vignettes/Introduction-to-SoilProfileCollection-Objects.Rmd
Introduction-to-SoilProfileCollection-Objects.Rmd
This is a basic introduction to the
SoilProfileCollection
class object defined in the
aqp
package for R.
The SoilProfileCollection
class was designed to simplify
the process of working with the collection of data associated with soil
profiles: site-level data, horizon-level data, spatial data, diagnostic
horizon data, metadata, etc.
Examples listed below are meant to be copied/pasted from this
document and interactively run within R. Comments
(preceded by #
symbol) briefly describe what the code in
each line does. Further documentation on objects and functions from the
aqp
package can be accessed by typing
help(aqp)
(or more generally, ?function_name
)
at the R console.
In this tutorial we will use some sample data included with the
aqp
package, based on characterization data from 10 soils
sampled on serpentinitic parent material as described in McGahan
et al, 2009.
To begin you will load required packages. You may have to first install these if missing:
# install CRAN release + dependencies
install.packages('aqp', dependencies = TRUE)
install.packages('remotes', dependencies = TRUE)
# install latest version from GitHub
remotes::install_github("ncss-tech/aqp", dependencies = FALSE)
# load sample data set, a data.frame object with horizon-level data from 10 profiles
data(sp4)
str(sp4)
#> 'data.frame': 30 obs. of 13 variables:
#> $ id : chr "colusa" "colusa" "colusa" "colusa" ...
#> $ name : chr "A" "ABt" "Bt1" "Bt2" ...
#> $ top : int 0 3 8 30 0 9 0 4 13 0 ...
#> $ bottom : int 3 8 30 42 9 34 4 13 40 6 ...
#> $ K : num 0.3 0.2 0.1 0.1 0.2 0.3 0.2 0.6 0.8 0.4 ...
#> $ Mg : num 25.7 23.7 23.2 44.3 21.9 18.9 12.1 12.1 17.7 16.4 ...
#> $ Ca : num 9 5.6 1.9 0.3 4.4 4.5 1.4 7 4.4 24.1 ...
#> $ CEC_7 : num 23 21.4 23.7 43 18.8 27.5 23.7 18 20 31.1 ...
#> $ ex_Ca_to_Mg: num 0.35 0.23 0.08 0.01 0.2 0.2 0.58 0.51 0.25 1.47 ...
#> $ sand : int 46 42 40 27 54 49 43 36 27 43 ...
#> $ silt : int 33 31 28 18 20 18 55 49 45 42 ...
#> $ clay : int 21 27 32 55 25 34 3 15 27 15 ...
#> $ CF : num 0.12 0.27 0.27 0.16 0.55 0.84 0.5 0.75 0.67 0.02 ...
# optionally read about it...
# ?sp4
# upgrade to SoilProfileCollection
# 'id' is the name of the column containing the profile ID
# 'top' is the name of the column containing horizon upper boundaries
# 'bottom' is the name of the column containing horizon lower boundaries
depths(sp4) <- id ~ top + bottom
# define "horizon designation" column name for the collection
hzdesgnname(sp4) <- 'name'
# check it out:
class(sp4)
#> [1] "SoilProfileCollection"
#> attr(,"package")
#> [1] "aqp"
print(sp4)
#> SoilProfileCollection with 10 profiles and 30 horizons
#> profile ID: id | horizon ID: hzID
#> Depth range: 16 - 49 cm
#>
#> ----- Horizons (6 / 30 rows | 10 / 14 columns) -----
#> id hzID top bottom name K Mg Ca CEC_7 ex_Ca_to_Mg
#> colusa 1 0 3 A 0.3 25.7 9.0 23.0 0.35
#> colusa 2 3 8 ABt 0.2 23.7 5.6 21.4 0.23
#> colusa 3 8 30 Bt1 0.1 23.2 1.9 23.7 0.08
#> colusa 4 30 42 Bt2 0.1 44.3 0.3 43.0 0.01
#> glenn 5 0 9 A 0.2 21.9 4.4 18.8 0.20
#> glenn 6 9 34 Bt 0.3 18.9 4.5 27.5 0.20
#> [... more horizons ...]
#>
#> ----- Sites (6 / 10 rows | 1 / 1 columns) -----
#> id
#> colusa
#> glenn
#> kings
#> mariposa
#> mendocino
#> napa
#> [... more sites ...]
#>
#> Spatial Data:
#> [EMPTY]
SoilProfileCollection
objects are typically created by
“promoting” data.frame
objects (rectangular tables of data)
that contain at least three essential columns:
The data.frame
is sorted internally according to the
profile ID and horizon top boundary. Formula notation is used to define
the columns used to promote a data.frame
object:
idcolumn ~ hz_top_column + hz_bottom_column
Small collections of soil profiles can be described using text
“templates” and quickSPC()
. Templates take the form of:
ID:AAA|BBB|CCCCCC
: relative thickness specified by
horizon designations, split by “|” symbol
ID:A-B-C
: horizons of random thickness specified by
horizon designations, split by “-” symbol
The “ID:” prefix is optional, with unique IDs generated by the digest package when omitted.
# character vector of horizon templates
# must all use the same formatting
x <- c(
'P1:AAA|BwBwBwBw|CCCCCCC|CdCdCdCd',
'P2:ApAp|AE|BhsBhs|Bw1Bw1|Bw2|CCCCCCC',
'P3:A|Bt1Bt1Bt1|Bt2Bt2Bt2|Bt3|Cr|RRRRRR'
)
# each horizon label is '10' depth-units (default)
s <- quickSPC(x)
# sketch profiles
par(mar = c(0, 0, 0, 0))
plotSPC(s, name.style = 'center-center',
cex.names = 1, depth.axis = FALSE,
hz.depths = TRUE
)
“Accessor” functions are used to extract specific components from
within SoilProfileCollection
objects.
Methods that return a column name. These are useful for extracting depths, horizon designations, IDs, etc. before taking an SPC apart for a specific task.
idname(sp4)
: extract profile ID name (column name used
to init SPC)horizonDepths(sp4)
: horizon top / bottom depth names
(used to init SPC)hzidname(sp4)
: horizon ID name (typically automatically
built at init time)hzdesgnname(sp4)
: horizon designation name (if
set)hztexclname(sp4)
: horizon texture class name (if
set)Methods that return a vector of values.
profile_id(sp4)
: profile IDs, in orderhzID(sp4)
: horizon IDs, in orderhzDesgn(sp4)
: horizon designations, in orderMethods that return site/horizon attribute column names.
names(sp4)
: site + horizon names concatenated into a
single vectorhorizonNames(sp4)
: horizon namessiteNames(sp4)
: site namesProfile and horizon totals.
length(sp4)
: number of profiles in collectionnrow(sp4)
: number of horizons in collectionOther methods.
depth_units(sp4)
: defaults to ‘cm’ at SPC creationmetadata(sp4)
: returns list
object with
base + optional (user-defined) metadata elementsHorizon data and information about key columns (unique profile ID and depths) are the primary input to the methods provided to create the object. Site data can be derived from unique profile-specific values within the input horizon data, or joined in from an external source based on profile ID.
Both site and horizon data are stored as data.frame
within the SoilProfileCollection; with one or more rows (per profile ID)
in the horizon table and one row (per profile ID) in the site table. The
SoilProfileCollection also supports data.table
, or
tibble
objects in the data.frame
slots.
Columns from site or horizon tables can be accessed with the
$
syntax notation, similar to the data.frame
.
New data can be assigned to either table in the same manner, as long as
the length of the new data is:
site(object)$new_column <- new_value
for new site data
and horizons(object)$new_column <- new value
for
horizonAssignment of new data to existing or new attributes can proceed as follows.
# site-level (based on length of assigned data == number of profiles)
sp4$elevation <- rnorm(n = length(sp4), mean = 1000, sd = 150)
# horizon-level (calculated from two horizon-level columns)
sp4$thickness <- sp4$bottom - sp4$top
# extraction of specific attributes by name
sp4$clay # vector of clay content (horizon data)
#> [1] 21 27 32 55 25 34 3 15 27 32 25 31 33 13 21 23 15 17 12 19 14 14 22 25 40 51 67 24 25 32
sp4$elevation # vector of simulated elevation (site data)
#> [1] 789.9935 1038.2976 634.4105 999.1643 1093.2329 1172.2617 726.7274 962.9012 963.3701
#> [10] 957.5942
# unit-length value explicitly targeting site data
site(sp4)$collection_id <- 1
# assign a single single value into horizon-level attributes
sp4$constant <- rep(1, times = nrow(sp4))
# unit-length value explicitly targeting horizon data
horizons(sp4)$analysis_group <- "SERP"
# _moves_ the named column from horizon to site
site(sp4) <- ~ constant
Horizon and site data can be modified via extraction to
data.frame
followed by replacement (horizon data) or join
(site data). Note that while this approach gives the most flexibility,
it is also the most dangerous–replacement of horizon data with new data
that don’t exactly conform to the original sorting may corrupt your
SoilProfileCollection
.
# extract horizon data to data.frame
h <- horizons(sp4)
# add a new column and save back to original object
h$random.numbers <- rnorm(n = nrow(h), mean = 0, sd = 1)
# _replace_ original horizon data with modified version
replaceHorizons(sp4) <- h
# extract site data to data.frame
s <- site(sp4)
# add a fake group to the site data
s$group <- factor(rep(c('A', 'B'), length.out = nrow(s)))
# join new site data with previous data: old data are _not_ replaced
site(sp4) <- s
# check
sp4
#> SoilProfileCollection with 10 profiles and 30 horizons
#> profile ID: id | horizon ID: hzID
#> Depth range: 16 - 49 cm
#>
#> ----- Horizons (6 / 30 rows | 10 / 17 columns) -----
#> id hzID top bottom name K Mg Ca CEC_7 ex_Ca_to_Mg
#> colusa 1 0 3 A 0.3 25.7 9.0 23.0 0.35
#> colusa 2 3 8 ABt 0.2 23.7 5.6 21.4 0.23
#> colusa 3 8 30 Bt1 0.1 23.2 1.9 23.7 0.08
#> colusa 4 30 42 Bt2 0.1 44.3 0.3 43.0 0.01
#> glenn 5 0 9 A 0.2 21.9 4.4 18.8 0.20
#> glenn 6 9 34 Bt 0.3 18.9 4.5 27.5 0.20
#> [... more horizons ...]
#>
#> ----- Sites (6 / 10 rows | 5 / 5 columns) -----
#> id elevation collection_id constant group
#> colusa 789.9935 1 1 A
#> glenn 1038.2976 1 1 B
#> kings 634.4105 1 1 A
#> mariposa 999.1643 1 1 B
#> mendocino 1093.2329 1 1 A
#> napa 1172.2617 1 1 B
#> [... more sites ...]
#>
#> Spatial Data:
#> [EMPTY]
Diagnostic horizons typically span several genetic horizons and may or may not be present in every profile.
To accommodate the wide range of possibilities, diagnostic horizon
data are stored as a data.frame
in “long format”: each row
corresponds to a diagnostic feature in a single profile, identified with
a column matching the ID column used to initialize the
SoilProfileCollection
object and a label reflecting the
feature kind.
For diagnostic horizon data there are no restrictions on data content, as long as each row has an ID that exists within the collection. Be sure to use the ID column name that was used to initialize the SoilProfileCollection object.
dh <- data.frame(
id = 'colusa',
kind = 'argillic',
top = 8,
bottom = 42,
stringsAsFactors = FALSE
)
# overwrite any existing diagnostic horizon data
diagnostic_hz(sp4) <- dh
# append to diagnostic horizon data
dh <- diagnostic_hz(sp4)
dh.new <- data.frame(
id = 'napa',
kind = 'argillic',
top = 6,
bottom = 20,
stringsAsFactors = FALSE
)
# overwrite existing diagnostic horizon data with appended data
diagnostic_hz(sp4) <- rbind(dh, dh.new)
Features that restrict root entry (fine or very fine roots) are commonly used to estimate functional soil depth. Restrictive features include salt accumulations, duripans, fragipans, paralithic materials, lithic contact, or an abrupt change in chemical property.
Not all soils have restrictive features, therefore these data are
stored as a data.frame
in “long format”. Each row
corresponds to a restrictive feature, associated depths, and identified
by profile_id()
. There may be more than one restrictive
feature per soil profile.
The example data sp4
does not describe the restrictive
features, so we will simulate some at the bottom of each profile +
20cm.
# get the depth of each profile
rf.top <- profileApply(sp4, max)
rf.bottom <- rf.top + 20
# the profile IDs can be extracted from the names attribute
pIDs <- names(rf.top)
# note: profile IDs must be stored in a column named for idname(sp4) -> 'id'
rf <- data.frame(
id = pIDs,
top = rf.top,
bottom = rf.bottom,
kind = 'fake',
stringsAsFactors = FALSE
)
# overwrite any existing diagnostic horizon data
restrictions(sp4) <- rf
# check
restrictions(sp4)
#> id top bottom kind
#> 1 colusa 42 62 fake
#> 2 glenn 34 54 fake
#> 3 kings 40 60 fake
#> 4 mariposa 49 69 fake
#> 5 mendocino 30 50 fake
#> 6 napa 20 40 fake
#> 7 san benito 20 40 fake
#> 8 shasta 40 60 fake
#> 9 shasta-trinity 40 60 fake
#> 10 tehama 16 36 fake
SoilProfileCollection metadata can be extracted and set using the
metadata()
and metadata<-
methods.
#> List of 6
#> $ aqp_df_class : chr "data.frame"
#> $ aqp_group_by : chr ""
#> $ aqp_hzdesgn : chr "name"
#> $ aqp_hztexcl : chr ""
#> $ depth_units : chr "cm"
#> $ stringsAsFactors: logi FALSE
# alter the depth unit metadata attribute
depth_units(sp4) <- 'inches' # units are really 'cm'
# add or replace custom metadata
metadata(sp4)$describer <- 'DGM'
metadata(sp4)$date <- as.Date('2009-01-01')
metadata(sp4)$citation <- 'McGahan, D.G., Southard, R.J, Claassen, V.P. 2009. Plant-Available Calcium Varies Widely in Soils on Serpentinite Landscapes. Soil Sci. Soc. Am. J. 73: 2087-2095.'
# check new values have been added
str(metadata(sp4))
#> List of 9
#> $ aqp_df_class : chr "data.frame"
#> $ aqp_group_by : chr ""
#> $ aqp_hzdesgn : chr "name"
#> $ aqp_hztexcl : chr ""
#> $ depth_units : chr "inches"
#> $ stringsAsFactors: logi FALSE
#> $ describer : chr "DGM"
#> $ date : Date[1:1], format: "2009-01-01"
#> $ citation : chr "McGahan, D.G., Southard, R.J, Claassen, V.P. 2009. Plant-Available Calcium Varies Widely in Soils on Serpentini"| __truncated__
# fix depth units, set back to 'cm'
depth_units(sp4) <- 'cm'
Spatial data and metadata can be stored within a
SoilProfileCollection
object. To initialize a “spatial”
SoilProfileCollection, use initSpatial()<-
method. This
allows you to specify which columns contain the geometry and optionally
the Coordinate Reference System of those data. The SoilProfileCollection
stores two metadata entries: coordinates
and
crs
which refer to the column names containing geometric
data and the CRS, respectively.
# generate some fake coordinates as site level attributes
sp4$x <- rnorm(n = length(sp4), mean = 354000, sd = 100)
sp4$y <- rnorm(n = length(sp4), mean = 4109533, sd = 100)
# initialize spatial coordinates (CRS optional)
initSpatial(sp4, crs = "EPSG:26911") <- ~ x + y
# extract coordinates as matrix
getSpatial(sp4)
#> x y
#> [1,] 354007.0 4109619
#> [2,] 353936.1 4109509
#> [3,] 353995.0 4109512
#> [4,] 353974.9 4109535
#> [5,] 354044.5 4109536
#> [6,] 354275.5 4109588
#> [7,] 354004.7 4109306
#> [8,] 354057.8 4109801
#> [9,] 354011.8 4109497
#> [10,] 353808.8 4109554
# get/set spatial reference system using prj()<-
prj(sp4) <- '+proj=utm +zone=11 +datum=NAD83'
# return CRS information
prj(sp4)
#> [1] "+proj=utm +zone=11 +datum=NAD83"
There are several SoilProfileCollection methods defined to identify corrupted objects/bad depth logic and fix them.
Usually an “invalid” or “corrupt” SoilProfileCollection comes from direct modification of the S4 object slot contents (not using standard methods), which can result in omissions or reordering that bypasses internal accounting methods.
Occasionally issues arise from illogical inputs (depths) or collections that are missing data. The latter cases (standard methods produce a corrupt SPC given some “degenerate” input data) are considered “bugs” that should reported on the issue tracker: https://github.com/ncss-tech/aqp/issues
checkSPC()
returns TRUE for a SoilProfileCollection that
contains all slots defined in the class prototype.
checkSPC(sp4)
spc_in_sync()
is used as the validity method for the
SoilProfileCollection, it determines if some reordering of the horizon
data relative to the unique profile ID / site order has occurred.
spc_in_sync(sp4)
#> nSites relativeOrder valid
#> 1 TRUE TRUE TRUE
rebuildSPC()
is used to re-construct all the required
slots of a SoilProfileCollection, given a source, possibly corrupt,
object. This function fixes major issues related to the internal
ordering of data in slots, as well as missing slots or metadata.
z <- rebuildSPC(sp4)
checkHzDepthLogic()
has the ability to perform logical
tests on whole profiles or individual horizons. Four different tests are
performed related to four common errors in horizon depths:
bottom depth shallower than top depth
equal top and bottom depth
missing (NA
) top or bottom depth
gap or overlap between adjacent horizons
checkHzDepthLogic(sp4)
#> id valid depthLogic sameDepth missingDepth overlapOrGap
#> 1 colusa TRUE FALSE FALSE FALSE FALSE
#> 2 glenn TRUE FALSE FALSE FALSE FALSE
#> 3 kings TRUE FALSE FALSE FALSE FALSE
#> 4 mariposa TRUE FALSE FALSE FALSE FALSE
#> 5 mendocino TRUE FALSE FALSE FALSE FALSE
#> 6 napa TRUE FALSE FALSE FALSE FALSE
#> 7 san benito TRUE FALSE FALSE FALSE FALSE
#> 8 shasta TRUE FALSE FALSE FALSE FALSE
#> 9 shasta-trinity TRUE FALSE FALSE FALSE FALSE
#> 10 tehama TRUE FALSE FALSE FALSE FALSE
checkHzDepthLogic(sp4, byhz = TRUE)
#> id top bottom valid hzID depthLogic sameDepth missingDepth overlapOrGap
#> 1 colusa 0 3 TRUE 1 FALSE FALSE FALSE NA
#> 2 colusa 3 8 TRUE 2 FALSE FALSE FALSE NA
#> 3 colusa 8 30 TRUE 3 FALSE FALSE FALSE NA
#> 4 colusa 30 42 TRUE 4 FALSE FALSE FALSE NA
#> 5 glenn 0 9 TRUE 5 FALSE FALSE FALSE NA
#> 6 glenn 9 34 TRUE 6 FALSE FALSE FALSE NA
#> 7 kings 0 4 TRUE 7 FALSE FALSE FALSE NA
#> 8 kings 4 13 TRUE 8 FALSE FALSE FALSE NA
#> 9 kings 13 40 TRUE 9 FALSE FALSE FALSE NA
#> 10 mariposa 0 3 TRUE 10 FALSE FALSE FALSE NA
#> 11 mariposa 3 14 TRUE 11 FALSE FALSE FALSE NA
#> 12 mariposa 14 34 TRUE 12 FALSE FALSE FALSE NA
#> 13 mariposa 34 49 TRUE 13 FALSE FALSE FALSE NA
#> 14 mendocino 0 2 TRUE 14 FALSE FALSE FALSE NA
#> 15 mendocino 2 8 TRUE 15 FALSE FALSE FALSE NA
#> 16 mendocino 8 30 TRUE 16 FALSE FALSE FALSE NA
#> 17 napa 0 6 TRUE 17 FALSE FALSE FALSE NA
#> 18 napa 6 20 TRUE 18 FALSE FALSE FALSE NA
#> 19 san benito 0 8 TRUE 19 FALSE FALSE FALSE NA
#> 20 san benito 8 20 TRUE 20 FALSE FALSE FALSE NA
#> 21 shasta 0 3 TRUE 21 FALSE FALSE FALSE NA
#> 22 shasta 3 40 TRUE 22 FALSE FALSE FALSE NA
#> 23 shasta-trinity 0 2 TRUE 23 FALSE FALSE FALSE NA
#> 24 shasta-trinity 2 5 TRUE 24 FALSE FALSE FALSE NA
#> 25 shasta-trinity 5 12 TRUE 25 FALSE FALSE FALSE NA
#> 26 shasta-trinity 12 23 TRUE 26 FALSE FALSE FALSE NA
#> 27 shasta-trinity 23 40 TRUE 27 FALSE FALSE FALSE NA
#> 28 tehama 0 3 TRUE 28 FALSE FALSE FALSE NA
#> 29 tehama 3 7 TRUE 29 FALSE FALSE FALSE NA
#> 30 tehama 7 16 TRUE 30 FALSE FALSE FALSE NA
SoilProfileCollection objects can be coerced to
data.frame
, list
and
SpatialPointsDataFrame
(when spatial slot has been set
up):
# check our work by viewing the internal structure
str(sp4)
# create a data.frame from horizon+site data
as(sp4, 'data.frame')
# or, equivalently:
as.data.frame(sp4)
# convert SoilProfileCollection to a named list containing all slots
as(sp4, 'list')
# extraction of site + spatial data as SpatialPointsDataFrame
as(sp4, 'SpatialPointsDataFrame')
SoilProfileCollection
objects can be subset using the
familiar [
-style notation used by matrix and
data.frame
objects, such that: spc[i, j]
will
return profiles identified by the integer vector i
, and
horizons identified by the integer vector j
.
Omitting either index will result in all profiles (i
omitted) or all horizons (j
omitted).
Typically, site-level attributes will be used as the subsetting
criteria. Functions that return an index to matches (such as
grep()
or which()
) provide the link between
attributes and an index to matching profiles.
Using the i index, select one or more profiles by numeric
index. An index greater than the number of profiles will return an empty
SoilProfileCollection
object.
# explicit string matching
idx <- which(sp4$group == 'A')
# numerical expressions
idx <- which(sp4$elevation < 1000)
# regular expression, matches any profile ID containing 'shasta'
idx <- grep('shasta', profile_id(sp4), ignore.case = TRUE)
# perform subset based on index
sp4[idx, ]
In an interactive session, it is often simpler to use
subset()
directly:
SoilProfileCollection
objects are combined by passing a
list of objects to the combine()
function.
Ideally all objects share the same internal structure, profile ID,
horizon ID, depth units, and other parameters of a
SoilProfileCollection
. Manually subset the example data
into 3 pieces, compile into a list, and then combine
back
together.
# subset data into chunks
s1 <- sp4[1:2, ]
s2 <- sp4[4, ]
s3 <- sp4[c(6, 8, 9), ]
# combine subsets
s <- combine(list(s1, s2, s3))
# double-check result
plotSPC(s)
It is possible to combine SoilProfileCollection
objects
with different internal structure. The final object will contain the all
site and horizon columns from the inputs, possibly creating sparse
tables. IDs and horizon depth names are taken from the first object.
# sample data as data.frame objects
data(sp1)
data(sp3)
# rename IDs horizon top / bottom columns
sp3$newid <- sp3$id
sp3$hztop <- sp3$top
sp3$hzbottom <- sp3$bottom
# remove originals
sp3$id <- NULL
sp3$top <- NULL
sp3$bottom <- NULL
# promote to SoilProfileCollection
depths(sp1) <- id ~ top + bottom
depths(sp3) <- newid ~ hztop + hzbottom
# label each group via site-level attribute
site(sp1)$g <- 'sp1'
site(sp3)$g <- 'sp3'
# combine
x <- combine(list(sp1, sp3))
# make grouping variable into a factor for groupedProfilePlot
x$g <- factor(x$g)
# check results
str(x)
# graphical check
# convert character horizon IDs into numeric
x$.horizon_ids_numeric <- as.numeric(hzID(x))
par(mar = c(0, 0, 3, 1))
plotSPC(x, color='.horizon_ids_numeric', col.label = 'Horizon ID')
groupedProfilePlot(x, 'g', color='.horizon_ids_numeric', col.label = 'Horizon ID', group.name.offset = -15)
The inverse of combine()
is split()
:
subsets of the SoilProfileCollection
are split into list
elements, each containing a new SoilProfileCollection
.
Duplicate the first profile in sp4
(Colusa) 8 times,
resulting in a new SoilProfileCollection
object containing
unique profile IDs.
# an example soil profile
x <- data.frame(
id = 'A',
name = c('A', 'E', 'Bhs', 'Bt1', 'Bt2', 'BC', 'C'),
top = c(0, 10, 20, 30, 40, 50, 100),
bottom = c(10, 20, 30, 40, 50, 100, 125),
z = c(8, 5, 3, 7, 10, 2, 12)
)
# init SPC
depths(x) <- id ~ top + bottom
hzdesgnname(x) <- 'name'
# horizon depth variability for simulation
horizons(x)$.sd <- 2
# duplicate several times
x.dupes <- duplicate(x, times = 5)
# simulate some new profiles based on example
# 2cm constant standard deviation of transition between horizons assumed
x.sim <- perturb(x, n = 5, thickness.attr = '.sd')
# graphical check
par(mar = c(0, 2, 0, 4))
# inspect unique results
plotSPC(unique(x.dupes, vars = c('top', 'bottom')),
name.style = 'center-center',
width = 0.15)
The plotSPC()
method for
SoilProfileCollection
objects generates sketches of
profiles within the collection based on horizon boundaries, vertically
aligned to an integer sequence from 1 to the number of profiles.
Horizon names are automatically extracted from a horizon-level
attribute name
(if present), or via an alternate attributed
given as an argument: name='column.name'
.
Horizon colors are automatically generated from the horizon-level
attribute soil_color
, or any other attribute of
R-compatible color description given as an argument:
color='column.name'
. This function is highly customizable,
therefore, it is prudent to consult help(plotSPC)
from time
to time. Soil colors in Munsell notation can be converted to
R-compatible colors via munsell2rgb()
.
Many of the following examples make use of par()
to set
graphical options such as mar
(customized margins) and/or
xpd = NA
(turn off clipping) to optimize display of
different numbers of profiles and various plotSPC()
arguments.
A detailed description of each argument to plotSPC()
(and many examples) can be found in the Soil Profile
Sketches tutorial.
The explainPlotSPC()
function is helpful for adjusting
some of the more obscure arguments to plotSPC()
.
A basic plot with debugging information overlaid:
par(mar = c(4, 3, 2, 2))
explainPlotSPC(sp4, name = 'name')
Make sketches wider:
par(mar = c(4, 3, 2, 2))
explainPlotSPC(sp4, name = 'name', width = 0.3)
Move soil surface at 0cm “down” 5cm:
par(mar = c(4, 3, 2, 2))
explainPlotSPC(sp4, name = 'name', y.offset = 5)
Move soil surface at 0cm “up” 10cm; useful for sketches of shallow profiles:
par(mar = c(4, 3, 2, 2))
explainPlotSPC(sp4, name = 'name', y.offset = -10)
Scale depths by 50%:
par(mar = c(4, 3, 2, 2))
explainPlotSPC(sp4, name = 'name', scaling.factor = 0.5)
A graphical explanation of how profiles are re-arranged via
plot.order
argument:
par(mar = c(4, 3, 2, 2))
explainPlotSPC(sp4, name = 'name', plot.order = length(sp4):1)
Leave room for an additional 2 profile sketches:
par(mar = c(4, 3, 2, 2))
explainPlotSPC(sp4, name = 'name', n = length(sp4) + 2)
Making quality figures with fewer than 5 soil profiles usually
requires more customization of the basic call to plotSPC
.
In general, the following are a good starting point:
par
png()
) dimensions
and resolutioncex.names
width
, typically within
0.15-0.35line
values,
e.g. (depth.axis = list(line = -2)
)Get some example data from the Official Series Descriptions:
data(osd)
x <- osd
Using 5x6 inch output device:
Using 7x6 inch output device, slight adjustments to
width
usually required:
# set margins and turn off clipping
par(mar = c(0, 2, 0, 4), xpd = NA)
plotSPC(x[1:2, ], cex.names = 1, width = 0.25)
Using 8x6 inch output device, slight adjustments to are usually required:
Horizon depths can be labeled for each profile as an alternative to a single depth axis.
par(mar = c(0, 0, 1, 1))
plotSPC(
x,
cex.names = 1,
name.style = 'center-center',
width = 0.3,
depth.axis = FALSE,
hz.depths = TRUE,
hz.depths.offset = 0.08
)
As of aqp version 1.41, it is possible to “fix” overlapping horizon
depth labels with the new fixLabelCollisions
argument. This
approach to labeling depths works best when moving horizon designations
with the name.style
argument. Note that setting
fixLabelCollisions = TRUE
may incur a performance penalty
when plotting large SoilProfileCollection
objects.
As of aqp version 2.0, fixLabelCollisions
is set to
TRUE
when setting hz.depths = TRUE
.
Relative horizontal positioning via relative.pos
argument. Can be used with plot.order
but be careful:
relative.pos
must be specified in the final
ordering of profiles. See ?plotSPC
for details.
par(mar = c(4, 3, 2, 2))
pos <- jitter(1:length(sp4))
explainPlotSPC(sp4, name = 'name', relative.pos = pos)
Relative positioning works well when the vector of positions is close to the default spacing along an integer sequence, but not when positions are closer than the width of a profile sketch.
par(mar = c(4, 3, 2, 2))
pos <- c(1, 1.2, 3, 4, 5, 5.2, 7, 8, 9, 10)
explainPlotSPC(sp4, name = 'name', relative.pos = pos)
The fixOverlap()
function can be used to find a suitable
arrangement of profiles based on a compromise between the suggested
relative positions and minimization of overlap.
This is an iterative procedure, based on random perturbations of
overlapping profiles, therefore it is possible for the algorithm to stop
at a sub-optimal configuration. Results can be controlled using
set.seed()
.
par(mar = c(4, 3, 2, 2))
new.pos <- fixOverlap(pos)
explainPlotSPC(sp4, name = 'name', relative.pos = new.pos)
There are several parameters available for optimizing horizontal
position in the presence of overlap. See ?fixOverlap
for
details and further examples.
par(mar = c(4, 3, 2, 2))
new.pos <- fixOverlap(pos, thresh = 0.7)
explainPlotSPC(sp4, name = 'name', relative.pos = new.pos)
The SPC plotting ideas tutorial contains several additional examples.
Horizon-level attributes can be symbolized with color, in this case using the horizon-level attribute “clay”:
par(mar = c(0, 0, 3, 0))
plotSPC(sp4,
name = 'name',
color = 'clay',
col.label = 'Clay Content (%)')
Use a different set of colors:
par(mar = c(0, 0, 3, 0))
plotSPC(
sp4,
name = 'name',
color = 'clay',
col.palette = RColorBrewer::brewer.pal(10, 'Spectral'),
col.label = 'Clay Content (%)'
)
Categorical properties can also be used to make a thematic sketch.
Colors are interpolated when there are more classes than colors provided
by col.palette
:
par(mar = c(0, 0, 3, 0))
plotSPC(
sp4,
name = 'name',
color = 'name',
col.palette = RColorBrewer::brewer.pal(5, 'Set1'),
col.label = 'Original Horizon Name'
)
Try with generalized horizon labels:
par(mar = c(0, 0, 3, 0))
# generalize horizon names into 3 groups
sp4$genhz <- generalize.hz(sp4$name, new = c('A', 'AB', 'Bt'), pat = c('A[0-9]?', 'AB', '^Bt'))
plotSPC(
sp4,
name = 'name',
color = 'genhz',
col.palette = RColorBrewer::brewer.pal(3, 'Spectral'),
col.label = 'Generalized Horizon Name'
)
Horizon-level attributes that represent a volume fraction
(e.g. coarse-fragment percentage) can be added to an existing figure.
See ?addVolumeFraction
for adding layers to an existing
plot based on these attributes.
par(mar = c(0, 0, 3, 0))
# convert coarse rock fragment proportion to percentage
sp4$frag_pct <- sp4$CF * 100
# label horizons with fragment percent
plotSPC(sp4, name = 'frag_pct', color = 'frag_pct')
# symbolize volume fraction data
addVolumeFraction(sp4, colname = 'frag_pct')
Annotation of depth-intervals can be accomplished using “brackets”.
Here we extract top/bottom depths associated with top and bottom depth A
horizons using the min/max variants of the depthOf()
function. See ?depthOf for more details.
# extract top/bottom depths associated with all A horizons
tops <- minDepthOf(sp4, pattern = '^A', hzdesgn = 'name', top = TRUE)
bottoms <- maxDepthOf(sp4, pattern = '^A', hzdesgn = 'name', top = FALSE)
IDs <- profile_id(sp4)
# assemble data.frame
a <- data.frame(id = IDs, top = tops$top, bottom = bottoms$bottom)
# check
a
#> id top bottom
#> 1 colusa 0 8
#> 2 glenn 0 9
#> 3 kings 0 4
#> 4 mariposa 0 3
#> 5 mendocino 0 2
#> 6 napa 0 6
#> 7 san benito 0 8
#> 8 shasta 0 3
#> 9 shasta-trinity 0 12
#> 10 tehama 0 3
par(mar = c(0, 0, 0, 0))
plotSPC(sp4)
# annotate A horizon depth interval with brackets
addBracket(a, col = 'red', offset = -0.4)
Add labels:
par(mar = c(0, 0, 0, 0))
plotSPC(sp4, name = 'name')
# addBracket() looks for a column `label`; add a ID for each bracket
a$label <- site(sp4)$id
# note that depth brackets "knows which profiles to use" via profile ID
addBracket(
a,
col = 'red',
label.cex = 0.75,
missing.bottom.depth = 25,
offset = -0.4
)
It is possible to arrange profile sketches by site-level grouping variable:
par(mar = c(0, 0, 0, 0))
groupedProfilePlot(sp4, groups = 'group')
addBracket(a, col = 'red', offset = -0.4)
There need not be brackets for all profiles in a collection:
par(mar = c(0, 0, 0, 0))
a.sub <- a[1:4,]
groupedProfilePlot(sp4, groups = 'group')
addBracket(a.sub, col = 'red', offset = -0.4)
When bottom depths are missing an arrow is used:
a$bottom <- NA
par(mar = c(0, 0, 0, 0))
groupedProfilePlot(sp4, groups = 'group')
addBracket(a, col = 'red', offset = -0.4)
Manually define bottom depth:
par(mar = c(0, 0, 0, 0))
groupedProfilePlot(sp4, groups = 'group')
addBracket(
a,
col = 'red',
label.cex = 0.75,
missing.bottom.depth = 25,
offset = -0.4
)
Further customization of brackets:
par(mar = c(0, 0, 0, 0))
plotSPC(sp4, max.depth = 75)
# copy root-restricting features
a <- restrictions(sp4)
# add a label: restrictive feature 'kind'
a$label <- a$kind
# add restrictions using vertical bars
addBracket(
a,
col = 'red',
label.cex = 0.75,
tick.length = 0,
lwd = 3,
offset = -0.4
)
These functions (addBracket()
,
addDiagnosticBracket()
, and
addVolumeFraction()
) will automatically compensate for
alternative sketch ordering or relative positioning of profiles.
Sketches for use in layout tools such as Adobe Illustrator or
Inkscape should be stored in a vector file format. SVG is compatible
with most software titles. WMF output is compatible with MS Office tools
and can be written with the win.metafile()
function.
# library(svglite)
# svglite(filename = 'e:/temp/fig.svg', width = 7, height = 6, pointsize = 12)
#
# par(mar = c(0,2,0,4), xpd = NA)
# plotSPC(x, cex.names=1, depth.axis = list(line = -0.2), width=0.3)
#
# dev.off()
The profileApply()
function is an extension of the
familiar apply() family of functions extended to
SoilProfileCollection
objects.
The function named in the FUN
argument is evaluated once
for each profile in the collection, typically returning a single value
per profile. In this case, the ordering of the results would match the
ordering of values in the site level attribute table.
# max() returns the depth of a soil profile
sp4$soil.depth <- profileApply(sp4, FUN = max)
# max() with additional argument give max depth to non-missing 'clay'
sp4$soil.depth.clay <- profileApply(sp4, FUN = max, v = 'clay')
# nrow() returns the number of horizons
sp4$n.hz <- profileApply(sp4, FUN = nrow)
# compute the mean clay content by profile using an inline function
sp4$mean.clay <- profileApply(sp4, FUN = function(i) mean(i$clay))
# estimate soil depth based on horizon designation
sp4$soil.depth <- profileApply(sp4, estimateSoilDepth, name = 'name')
When FUN
returns a vector of the same length as the
number of horizons in a profile, profileApply()
can be used
to create new horizon-level attributes. For example, the change in clay
content by horizon depth (delta.clay
, below) could be
calculated as:
# save as horizon-level attribute
sp4$delta.clay <- profileApply(sp4, FUN = function(i) c(NA, diff(i$clay)))
# check results:
horizons(sp4)[1:6, c('id', 'top', 'bottom', 'clay', 'delta.clay')]
#> id top bottom clay delta.clay
#> 1 colusa 0 3 21 NA
#> 2 colusa 3 8 27 6
#> 3 colusa 8 30 32 5
#> 4 colusa 30 42 55 23
#> 5 glenn 0 9 25 NA
#> 6 glenn 9 34 34 9
More complex summaries can be generated by writing a custom function
that is then called by profileApply()
. Note that each
profile is passed into this function and accessed via a temporary
variable (i
), which is a SoilProfileCollection
object containing a single profile. See help(profileApply)
for details.
A list of SoilProfileCollection
objects returned from a
custom function can be combined into a single
SoilProfileCollection
object via
combine()
.
# compute hz-thickness weighted mean exchangeable-Ca:Mg
wt.mean.ca.mg <- function(i) {
# use horizon thickness as a weight
thick <- i$bottom - i$top
# compute the thickness weighted mean, ignoring missing values
weighted.mean(i$ex_Ca_to_Mg, w = thick, na.rm = TRUE)
}
# apply our custom function and save results as a site-level attribute
sp4$wt.mean.ca.to.mg <- profileApply(sp4, wt.mean.ca.mg)
We can now use our some of our new site-level attributes to order the profiles when plotting.
In this case profiles are ordered based on the horizon-thickness weighted mean, exchangeable Ca:Mg values. Horizons are colored by exchangeable Ca:Mg values.
# plot the data using our new order based on Ca:Mg weighted average
# the result is an index of rank
new.order <- order(sp4$wt.mean.ca.to.mg)
par(mar = c(4, 0, 3, 0)) # tighten figure margins
plotSPC(sp4,
name = 'name',
color = 'ex_Ca_to_Mg',
plot.order = new.order)
# add an axis labeled with the sorting criteria
axis(1, at = 1:length(sp4), labels = round(sp4$wt.mean.ca.to.mg, 3), cex.axis = 0.75)
mtext(1, line = 2.25, text = 'Horizon Thickness Weighted Mean Ex. Ca:Mg', cex = 0.75)
dice()
Collections of soil profiles can be sliced (or re-sampled) into
regular depth-intervals with the dice()
function. The
slicing structure and variables of interest are defined via formula
notation:
# slice select horizon-level attributes
seq ~ var.1 + var.2 + var.3 + ...
# slice all horizon-level attributes
seq ~ .
where seq
is a sequence of integers
(e.g. 0:15
, c(5,10,15,20)
, etc.) and
var.1 + var.2 + var.3 + ...
are horizon-level attributes to
slice. Both continuous and categorical variables can be named on the
right-hand-side of the formula. The results returned by
dice()
is either a SoilProfileCollection
, or
data.frame
when called with the optional argument
SPC = FALSE
. For example, to slice our sample data set into
1-cm intervals, from 0–15 cm:
# resample to 1cm slices
s <- dice(sp4, fm = 0:15 ~ sand + silt + clay + name + ex_Ca_to_Mg)
# check the result
class(s)
#> [1] "SoilProfileCollection"
#> attr(,"package")
#> [1] "aqp"
# plot sliced data
par(mar = c(0, 0, 3, 0)) # tighten figure margins
plotSPC(s, color = 'ex_Ca_to_Mg')
Once soil profile data have been sliced, it is simple to extract
“chunks” of data by depth interval via [
-subsetting:
# slice from 0 to max depth in the collection
s <- dice(sp4, fm= 0:max(sp4) ~ sand + silt + clay + name + ex_Ca_to_Mg)
# extract all data over the range of 5--10 cm:
plotSPC(s[, 5:10])
# extract all data over the range of 25--50 cm:
plotSPC(s[, 25:50])
# extract all data over the range of 10--20 and 40--50 cm:
plotSPC(s[, c(10:20, 40:50)])
glom()
Select all horizons that overlap the interval of 5-15cm:
# truncate to the interval 5-15cm
clods <- glom(sp4, z1 = 5, z2 = 15)
# plot outlines of original profiles
par(mar = c(0, 0, 3, 1.5))
plotSPC(sp4, color = NA, name = NA, print.id = FALSE, depth.axis = FALSE, lwd = 0.5)
# overlay glom() depth criteria
rect(xleft = 0.5, ybottom = 15, xright = length(sp4) + 0.5, ytop = 5, lty = 2)
# add SoilProfileCollection with selected horizons
plotSPC(clods, add = TRUE, cex.names = 0.6, name = 'name', color = 'ex_Ca_to_Mg', name.style = 'center-center')
trunc()
Truncate the SPC to the interval of 5-15cm:
# truncate to the interval 5-15cm
sp4.truncated <- trunc(sp4, 5, 15)
# plot outlines of original profiles
par(mar = c(0, 0, 3, 1.5))
plotSPC(sp4, color = NA, name = NA, print.id = FALSE, lwd = 0.5)
# overlay truncation criteria
rect(xleft = 0.5, ybottom = 15, xright = length(sp4) + 0.5, ytop = 5, lty = 2)
# add truncated SoilProfileCollection
plotSPC(sp4.truncated, depth.axis = FALSE, add = TRUE, cex.names = 0.6, name = 'name', color = 'ex_Ca_to_Mg', name.style = 'center-center')
There is often a need for converting data to a new set of depth
intervals. This re-alignment of horizon depths via aggregation can be
considered a type of change of support. The operation on
SoilProfileCollection
objects requires some aggregation
function (mean, median, etc.) and possibly interpolation
(e.g. splines).
The slab()
function is the simplest way to implement a
change of depth support via aggregation. Profile grouping criteria and
horizon attribute selection is parameterized via formula: either
groups ~ var1 + var2 + var3
where named variables are
aggregated within groups
OR where named variables are
aggregated across the entire collection
~ var1 + var2 + var3
. The default summary function
(slab.fun
) computes the 5th, 25th, 50th, 75th, and 95th
percentiles via Harrell-Davis quantile estimator.
The depth structure (“slabs”) over which summaries are computed is
defined with the slab.structure
argument using:
10
): data are aggregated over a
regular sequence of 10-unit thickness slabsc(50, 60)
): data are
aggregated over depths spanning 50–60 unitsc(0, 5, 10, 50, 100)
): data are aggregated over the
depths spanning 0–5, 5–10, 10–50, 50–100 units
# aggregate a couple of the horizon-level attributes,
# across the entire collection,
# from 0--10 cm
# computing the mean value ignoring missing data
slab(
sp4,
fm = ~ sand + silt + clay,
slab.structure = c(0, 10),
slab.fun = mean,
na.rm = TRUE
)
#> variable all.profiles value contributing_fraction top bottom
#> 1 sand 1 47.63 1 0 10
#> 2 silt 1 31.15 1 0 10
#> 3 clay 1 21.11 1 0 10
# again, this time within groups defined by a site-level attribute:
slab(
sp4,
fm = group ~ sand + silt + clay,
slab.structure = c(0, 10),
slab.fun = mean,
na.rm = TRUE
)
#> variable group value contributing_fraction top bottom
#> 1 sand A 48.26 1 0 10
#> 2 sand B 47.00 1 0 10
#> 3 silt A 31.52 1 0 10
#> 4 silt B 30.78 1 0 10
#> 5 clay A 20.30 1 0 10
#> 6 clay B 21.92 1 0 10
# again, this time over several depth ranges
slab(
sp4,
fm = ~ sand + silt + clay,
slab.structure = c(0, 10, 25, 40),
slab.fun = mean,
na.rm = TRUE
)
#> variable all.profiles value contributing_fraction top bottom
#> 1 sand 1 47.63000 1.0000000 0 10
#> 2 sand 1 42.38931 0.8733333 10 25
#> 3 sand 1 32.14607 0.5933333 25 40
#> 4 silt 1 31.15000 1.0000000 0 10
#> 5 silt 1 29.41221 0.8733333 10 25
#> 6 silt 1 31.34831 0.5933333 25 40
#> 7 clay 1 21.11000 1.0000000 0 10
#> 8 clay 1 28.10687 0.8733333 10 25
#> 9 clay 1 36.26966 0.5933333 25 40
# again, this time along 1-cm slices, computing quantiles
agg <- slab(sp4, fm = ~ Mg + Ca + ex_Ca_to_Mg + CEC_7 + clay)
# see ?slab for details on the default aggregate function
head(agg)
#> variable all.profiles p.q5 p.q25 p.q50 p.q75 p.q95 contributing_fraction top bottom
#> 1 Mg 1 6.015 12.175 14.60 21.125 27.13 1 0 1
#> 2 Mg 1 6.015 12.175 14.60 21.125 27.13 1 1 2
#> 3 Mg 1 6.015 12.175 19.15 25.650 27.76 1 2 3
#> 4 Mg 1 6.195 13.175 21.05 25.050 30.73 1 3 4
#> 5 Mg 1 6.195 13.175 21.05 25.050 30.73 1 4 5
#> 6 Mg 1 6.195 13.175 21.05 26.250 31.72 1 5 6
# plot median +/i bounds defined by the 25th and 75th percentiles
# this is lattice graphics, syntax is a little rough
xyplot(top ~ p.q50 | variable, data = agg, ylab = 'Depth',
xlab = 'median bounded by 25th and 75th percentiles',
lower = agg$p.q25, upper = agg$p.q75, ylim = c(42, -2),
panel = panel.depth_function,
alpha = 0.25, sync.colors = TRUE,
par.settings = list(superpose.line = list(col = 'RoyalBlue', lwd = 2)),
prepanel = prepanel.depth_function,
cf = agg$contributing_fraction, cf.col = 'black', cf.interval = 5,
layout = c(5, 1), strip = strip.custom(bg = grey(0.8)),
scales = list(x = list(
tick.number = 4,
alternating = 3,
relation = 'free'
))
)
Depth-wise aggregation can be useful for visual evaluation of multivariate similarity among groups of profiles.
# processing the "napa" and tehama profiles
idx <- which(profile_id(sp4) %in% c('napa', 'tehama'))
napa.and.tehama <- slab(sp4[idx,], fm = ~ Mg + Ca + ex_Ca_to_Mg + CEC_7 + clay)
# combine with the collection-wide aggregate data
g <- make.groups(collection = agg, napa.and.tehama = napa.and.tehama)
# compare graphically:
xyplot(top ~ p.q50 | variable, groups = which, data = g, ylab = 'Depth',
xlab = 'median bounded by 25th and 75th percentiles',
lower = g$p.q25, upper = g$p.q75, ylim = c(42, -2),
panel = panel.depth_function,
alpha = 0.25, sync.colors = TRUE, cf = g$contributing_fraction, cf.interval = 10,
par.settings = list(superpose.line = list(
col = c('RoyalBlue', 'Red4'),
lwd = 2,
lty = c(1, 2)
)),
prepanel = prepanel.depth_function,
layout = c(5, 1),
strip = strip.custom(bg = grey(0.8)),
scales = list(x = list(
tick.number = 4,
alternating = 3,
relation = 'free'
)),
auto.key = list(columns = 2,
lines = TRUE,
points = FALSE)
)
The following example is based on a set of 9 randomly generated profiles, re-aligned to the Global Soil Map (GSM) standard depths.
library(data.table)
library(RColorBrewer)
# 9 random profiles
# 1 simulated property via logistic power peak (LPP) function
# 6, 7, or 8 horizons per profile
# result is a list of single-profile SPC
d <- lapply(
as.character(1:9),
random_profile,
n = c(6, 7, 8),
n_prop = 1,
method = 'LPP',
SPC = TRUE
)
# combine into single SPC
d <- combine(d)
# GSM depths
gsm.depths <- c(0, 5, 15, 30, 60, 100, 200)
# aggregate using mean: wt.mean within slabs
# see ?slab for ideas on how to parameterize slab.fun
d.gsm <- slab(d, fm = id ~ p1, slab.structure = gsm.depths, slab.fun = mean, na.rm = TRUE)
# note: result is in long-format
# note: horizon names are lost due to aggregation
head(d.gsm, 7)
#> variable id value contributing_fraction top bottom
#> 1 p1 1 45.44919 1.0000000 0 5
#> 2 p1 1 52.60948 1.0000000 5 15
#> 3 p1 1 50.62579 1.0000000 15 30
#> 4 p1 1 46.70293 1.0000000 30 60
#> 5 p1 1 37.54221 1.0000000 60 100
#> 6 p1 1 31.23363 0.7826087 100 200
#> 7 p1 2 16.26688 1.0000000 0 5
A simple graphical comparison of the original and re-aligned soil
profile data, after converting slab()
result from long
-> wide format with {data.table} dcast()
:
# reshape to wide format
# this scales to > 1 aggregated variables
d.gsm.pedons <- data.table::dcast(
data.table(d.gsm),
formula = id + top + bottom ~ variable,
value.var = 'value'
)
# init SPC
depths(d.gsm.pedons) <- id ~ top + bottom
# iterate over aggregated profiles and make new hz names
d.gsm.pedons$hzname <- profileApply(d.gsm.pedons, function(i) {
paste0('GSM-', 1:nrow(i))
})
# compare original and aggregated
par(mar = c(1, 0, 3, 3), mfrow = c(2, 1))
plotSPC(d, color = 'p1')
mtext('original depths', side = 2, line = -1.5)
plotSPC(d.gsm.pedons, color = 'p1')
mtext('GSM depths', side = 2, line = -1.5)
Note that re-aligned data may not represent reality (and should
therefore be used with caution) when the original soil depth is
shallower than the deepest of the new (re-aligned) horizon depths. The
contributing_fraction
metric returned by
slab()
can be useful for assessing how much real
data were used to generate the new set of re-aligned data.
# reshape to wide format
d.gsm.pedons.2 <- data.table::dcast(
data.table(d.gsm),
formula = id + top + bottom ~ variable,
value.var = 'contributing_fraction'
)
# init SPC
depths(d.gsm.pedons.2) <- id ~ top + bottom
# compare original and aggregated
par(mar = c(1, 0, 3, 3), mfrow = c(2, 1))
plotSPC(d.gsm.pedons, name = '', color = 'p1')
mtext('GSM depths', side = 2, line = -1.5)
plotSPC(
d.gsm.pedons.2,
name = '',
color = 'p1',
col.label = 'Contributing Fraction',
col.palette = RColorBrewer::brewer.pal(10, 'Spectral')
)
mtext('GSM depths', side = 2, line = -1.5)
NCSP()
Calculation of between-profile dissimilarity is performed using
NCSP()
(Numerical Classification of Soil Profiles). As of
aqp 2.0, NCSP()
replaces the older implementation in
profile_compare()
which is now deprecated. Dissimilarity
values depend on attributes selection (e.g. clay, CEC, pH , etc.),
optional depth-weighting parameter (k
), and a maximum depth
of evaluation (maxDepth
).
See the function manual page and this paper for details.
library(cluster)
library(sharpshootR)
# start fresh
data(sp4)
depths(sp4) <- id ~ top + bottom
hzdesgnname(sp4) <- 'name'
# eval dissimilarity:
# using Ex-Ca:Mg ratio and CEC at pH 7
# no depth-weighting (k = 0)
# to a maximum depth of 40 cm
d <- NCSP(sp4, vars = c('ex_Ca_to_Mg', 'CEC_7'), k = 0, maxDepth = 40)
# check distance matrix:
round(d, 1)
#> colusa glenn kings mariposa mendocino napa san benito shasta shasta-trinity
#> glenn 13.8
#> kings 16.0 13.1
#> mariposa 8.4 11.6 16.5
#> mendocino 12.1 8.2 17.0 15.5
#> napa 31.5 24.9 30.5 30.3 22.2
#> san benito 26.8 21.4 27.4 29.3 16.4 18.0
#> shasta 17.2 13.6 8.7 17.6 17.6 34.8 23.3
#> shasta-trinity 6.4 16.9 22.3 9.6 17.0 30.9 28.3 23.3
#> tehama 30.1 23.9 29.2 28.6 20.8 9.0 15.3 32.7 29.2
# visualize dissimilarity matrix via divisive hierarchical clustering
d.diana <- diana(d)
# this function is from the sharpshootR package
# requires some manual adjustments
par(mar = c(0, 0, 4, 0))
plotProfileDendrogram(
sp4,
d.diana,
scaling.factor = 0.9,
y.offset = 5,
cex.names = 0.7,
width = 0.3,
color = 'ex_Ca_to_Mg',
name.style = 'center-center',
hz.depths = TRUE,
depth.axis = FALSE
)
Some additional examples can be found in:
This document is based on aqp
version 2.1.0.