You have a collection of pedons that have been correlated to a soil series or component that you would would like to compute the Range in Characteristics (“low-rv-high” values; RIC) for.
For your exercise, we ask you to calculate Range in Characteristic(s) for a soil series or component.
To do this, you will assign Generalized Horizon Labels (GHLs) to pedons from your area of responsibility. These labels will be a grouping variable to help you to determine the RIC for one (or more) properties of your choice.
One way we can create GHLs is by matching patterns in the field horizon designations to correlate horizon data to a simpler set of labels. We have started to call this assignment of GHLs micro-correlation.
A SoilProfileCollection plot - showing the GHLs as horizon colors and field designation as the labels along the side of each profile.
Table of Range in Characteristics for your selected property in each GHL.
An R file, with comments for any changes you made to the default analysis. You may use this R file as a template.
Alternately, you may do the entire assignment in a .Rmd file (R Markdown) and submit the “knitted” HTML result.
You will need to move the code from the .R file to a .Rmd yourself. You can create a basic .Rmd from RStudio using File >> New File >> R Markdown… Here is a cheatsheet for basic R Markdown syntax: RStudio R Markdown Cheat Sheet
If you do the .Rmd file, include both the .Rmd and the knitted HTML result. If you use an R script, include the R script and PDF or screenshots of Results #1 and #2.
NOTE: There are some Exercise Tips at the end of the document to help you get going on modifications for your own analyses.
Send the results to your mentor with your first and last name in the file names. It may also be helpful to include a copy of your input pedon data (.Rda). Instructions for saving this can be found below.
Query NASIS database to load your selected set with some pedons. Replace the call to
data("loafercreek") and the subsetting of
loafercreek[1:20,]) with a routine that brings in your own data (e.g. using
Create a subset (in R) of soils to summarize (call it
pedons). You should have about
n = 20 for a managable demo that fits nicely in a SoilProfileCollection plot, but you are welcome to try the same routines on larger sets. STAT 1010: Chapter 2a, Section 4.2: Filtering
Inspect the field horizon designations (look at the pedons in R or NASIS, if needed). Think about which field horizon designation(s) should correlate to each “prototype” horizon.
Decide on “prototype” horizon designation scheme. Think of the “prototype” as as set of general horizon labels that are related – like the list of horizon designations that you provide for the Range in Characteristics in an OSD, or the list of layers you include in a SSURGO component.
Write a set of regular expressions (REGEX patterns) (you’ll need one pattern per generalized horizon) to do the correlations you thought about in #4. Test and learn more about regular expressions here: https://regexr.com/
Cross-tabulate your GHLs against the field horizon designations. This will show a table of the mapping from “old” to “new” (“field” to “correlated”).
Check if any horizon designations were NOT assigned a label (have label “not-used”). At a minimum you should be able to answer the question: “Which horizons were not assigned?” Bonus points if you can answer “Why [those horizons] weren’t assigned?”
Repeat steps 3 through 7 as needed. You don’t need to get the patterns perfect but we want you to think about how you could/would “improve” them, especially if they don’t work as intended.
Do a statistical summary on your horizon groups to produce a range in characteristics. You can use the example given below for
loafercreek (mean/sd/min/max clay content) or do something else of interest to you.
Be prepared to discuss issues you had with your mentor. In particular, what “decisions” did you have to make that might have influenced your final correlations and range in characteristics?
This document takes you through a demo of the exercise using a subset of the
loafercreek dataset from the soilDB package. You are encouraged to run through the code with
loafercreek before attempting it on your own data.
After reviewing this workflow, and with the help of your mentor, you should be able to apply this technique to your own data.
This assignment integrates several R/data analysis skills as well as brings on the “Great Unknown” of NASIS data inputs from across the country. With this type of uncharted territory, there is a lot of room for learning new things … and dealing with new problems. If your code does not work at first, do not be discouraged. Feel free to contact Andrew Brown (firstname.lastname@example.org), or your assigned mentor, if you have questions, issues or comments.
First, you will read over and run all the code in this document using the first 20 pedons from
loafercreek as a demonstration. This will help you get comfortable with the process. Then you will apply the same strategy to NASIS pedons from your area of responsibility, adjusting patterns and summaries as needed.
Here is a link to a mini-tutorial on using
When it comes time, you will replace this next block of code with code to get your own data. If you use
fetchNASIS(), note that it produces QC output that may be relevant to your analysis. 2.7: fetchNASIS data checks
library(aqp, warn.conflicts = FALSE) library(soilDB) ## STEP 1 # load sample `loafercreek` data from the soilDB package data("loafercreek") ## STEP 2 # keep only the first 20 pedons pedons <- loafercreek[1:20, ] # plot profile sketches par(mar=c(0,0,2,1)) plot(pedons, name='hzname', print.id=FALSE)
In order to help your mentor debug any issues you may be having, it is helpful to also provide them with an RData file containing the SoilProfileCollection of pedons, as below.
sThat way they won’t have to re-create your selected set to inspect your data.
# after loading your data as a SoilProfileCollection, save it save(pedons, file = "my_pedons.Rda")
Why use Generalized Horizon Labels?
We use Generalized Horizon Labels (GHL) to simplify the grouping in our input data. As soil scientists we put a lot of effort into our descriptions. We work hard to describe changes in profiles with corresponding changes in horizons. However, we don’t all necessarily have the same opinions on how that should be done.
Therefore, we generalize across profile descriptions, to deal with variation in:
description style / horizons designations used
horizon depths / boundaries
number of horizons described
When creating summaries of data we need a way to “relate” observations of particular horizons from particular pedons back to the typical set of horizons found in the “group” the data belong to (e.g. a series or a component).
Maybe we could use all the unique horizon designations in the data?
And then create a summary for each group?
## STEP 3 # tabulate hzname table(pedons$hzname)
## ## 2BC 2BCt 2Bt1 2Bt2 2Bt3 2Cr 2Crt 2R A A1 A2 BA BAt BC BCt Bt ## 1 1 2 3 2 1 1 2 16 3 3 10 3 1 3 3 ## Bt1 Bt2 Bt3 Bt4 Bw Bw1 Bw2 C Cr Crt H1 Oi R ## 15 14 7 2 4 1 1 2 9 1 1 1 8
# these are the _unique_ horizon designations in our subset `pedons` unique(pedons$hzname)
##  "A" "BA" "BAt" "Bt1" "Bt2" "Cr" "Bt3" "Oi" "Crt" "R" ##  "H1" "Bw" "BCt" "A1" "A2" "2Bt3" "2BCt" "2Cr" "Bt" "2Bt1" ##  "2Bt2" "2Crt" "Bw1" "Bw2" "BC" "Bt4" "C" "2R" "2BC"
With most decent-sized datasets, you will have a lot of groups when taking this simple approach to grouping.
Here we have 29 different horizon designations. Nobody would attempt to make separate ranges for each unique group, especially with such a small amount of data in some of the groups.
Depending on things like depth class or the nature of the parent material, the number of horizon RICs provided in a series or component will vary.
Many series concepts provide RICs for only a couple very generalized horizons (say, just an A and a Bt in a very deep soil) and avoided providing ranges for transitional horizons/vertical subdivisions. Other descriptions might have more layers broken out.
The great thing about the GHL approach is that you can “test” the effect of adding/removing groups. Then you can decide if it adds interpretive value (i.e. a layer with “significantly” different properties) to have more or less groups based on the data you have.
First, you first need your “prototype” scheme for the soil you are studying. Essentially this is the list of horizon labels that occurs in your hypothetical, idealized, “typical” soil.
This could be, for instance, the horizons that occur in the OSD/TUD/component pedon or some generalization of them.
For this exercise wen wil try to produce a set of REGEX patterns that correlate the field-observed horizon designations to your prototype horizons.
Let’s take a look at the horizon designations from the Loafercreek OSD for inspiration.
l <- fetchOSD('loafercreek') l$hzname
##  "Oi" "A" "BAt" "Bt1" "Bt2" "Bt3" "Crt" "R"
We might not be able to produce a unique RIC for each of those subdivisions of the Bt. And we probably don’t want to, even if we could. So we will have to generalize.
With GHLs, correlation decisions are being made on a horizon basis (in addition to at the pedon level), so we call it “micro-correlation.” In this process, we determine what data from each pedon contributes to each Range in Characteristics for the group the pedon is a member of.
This has always at least implicitly been a part of Soil Correlation – we are just making it explicit and reproducible by using R to track our decisions at the horizon level and facilitate analysis.
A simple micro-correlation would be: “this transitional AB horizon has ‘A’ as the first designation so it is be more like an ‘A’ than a ‘Bt’ horizon”. More complex micro-correlations take place when multiple properties are coming into consideration.
Grouping horizon observations by horizon designation is an excellent way to begin to explore the properties of a set of profiles. There are patterns and connotations in the way we are trained to designate soil horizons that will often yield useful groupings.
That said, you can (and should) look at more than just horizon designation. Often unusual data sneak through the cracks, either getting in a group they shouldn’t, or not getting matched at all – these need to be addressed with specific patterns or manual adjustment.
Here is an example of a prototype for horizonation for Loafercreek. It is a broad generalization of the labels we found in the Loafercreek OSD layers.
Our prototype labels include an A horizon, upper transitional horizon, argillic horizon, and a bedrock contact:
## STEP 4 # create 4 GHLs: A, upper transitional, argillic and bedrock prototype.labels <- c('A', 'BA', 'Bt', 'Cr')
Having just a single group for the argillic horizon (Bt) versus splitting out upper and lower (BAt/Bt1/Bt2), for instance, would be a great analysis to evaluate for your own extensions of this demo. We are deliberately keeping the example very general.
prototype.labels has 4 values in it. Therefore,
patterns.to.match must also contain 4 patterns.
Define the paired patterns for each
## STEP 5 # REGEX rules describing mapping from field data to prototype.labels patterns.to.match <- c('^A', '^B[^Ct]*$', '.*B.*t.*', 'Cr|R')
Here is a brief explanation of the function of each of the 4 patterns:
If the horizon designation starts with A, it goes in the “A” label
If the horizon designation starts with B, but does not contain “C” or “t”, it goes in the “BA” label
If the horizon designation contains “B” and “t”, it goes in the “Bt” label
If the horizon contains “R” or “Cr”, it goes in the “Cr”/bedrock label
Pay special attention to how caret
^ and dollar
$ symbols are used in REGEX. They function as anchors for the beginning and end of the string, respectively.
^ placed before an A horizon,
^A, will match any horizon designation that starts with A, such as Ap, Ap1, but not something merely containing A, such as BA.
$ after a Bt horizon,
Bt$, will match any horizon designation that ends with Bt, such as 2Bt or 3Bt, but not something with a vertical subdivision, such as Bt2.
Wrapping pattern with both
$ symbols will result only in exact matches – i.e. that start and end with the contents between
$. For example
^A$, will only match A, not Ap, Ap2, or Cg.
A pair of square brackets matches a single character, but could be one of many individual characters. For instance
[BC]+t matches “one or more of B or C followed by t”. This also makes use of
+ a quantifier operator (one or more).
* is another quantifier denoting zero or more.
Test and learn more about regular expressions here: https://regexr.com/
We use the
generalize.hz() to apply the patterns in
pedons$hzname and return the corresponding new GHL for horizons where a match is made.
Summarizing the previous two subsections, we created:
4 GHLs (
4 regular expression patterns (
patterns.to.match) to assign data to (
Importantly, the label for the last of the set of patterns to match is returned – so if the first and fourth pattern match the same horizon, the final result contains the fourth label.
loafercreek (and other SPCs coming out of
fetchNASIS()) already have a horizon-level variable called
genhz which has the contents of the NASIS Pedon Horizon Component Layer ID (
dspcomplayerid) by default (when populated).
At the end of this document there is an optional guide for importing the labels assigned by R into NASIS, which requires creating a special
|-delimited .txt file.
Since we don’t want to overwrite the data that came out of NASIS at this point, we will create a new horizon-level variable
newgenhz to hold our preliminary GHL assignments.
# apply prototype labels `new` to horizons matching `pat` pedons$newgenhz <- generalize.hz(x = pedons$hzname, new = prototype.labels, pat = patterns.to.match)
That’s it. We have generalized the horizons. Let’s take a look at how our patterns did.
We “cross-tabulate” the results of
generalize.hz() with the input data to see how our field-data got mapped to the new labels.
In particular we want to see if any horizons in the input data got “missed” by our patterns or if horizons are getting correlated to labels we did not expect.
## STEP 6 # cross-tabulate results oldvsnew <- addmargins(table(pedons$newgenhz, pedons$hzname)) oldvsnew
## ## 2BC 2BCt 2Bt1 2Bt2 2Bt3 2Cr 2Crt 2R A A1 A2 BA BAt BC BCt Bt ## A 0 0 0 0 0 0 0 0 16 3 3 0 0 0 0 0 ## BA 0 0 0 0 0 0 0 0 0 0 0 10 0 0 0 0 ## Bt 0 1 2 3 2 0 0 0 0 0 0 0 3 0 3 3 ## Cr 0 0 0 0 0 1 1 2 0 0 0 0 0 0 0 0 ## not-used 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ## Sum 1 1 2 3 2 1 1 2 16 3 3 10 3 1 3 3 ## ## Bt1 Bt2 Bt3 Bt4 Bw Bw1 Bw2 C Cr Crt H1 Oi R Sum ## A 0 0 0 0 0 0 0 0 0 0 0 0 0 22 ## BA 0 0 0 0 4 1 1 0 0 0 0 0 0 16 ## Bt 15 14 7 2 0 0 0 0 0 0 0 0 0 55 ## Cr 0 0 0 0 0 0 0 0 9 1 0 0 8 22 ## not-used 0 0 0 0 0 0 0 2 0 0 1 1 0 6 ## Sum 15 14 7 2 4 1 1 2 9 1 1 1 8 121
In this table you see that columns correspond to all the different horizon designations found in the original data.
And the rows correspond to our GHLs.
The numbers in each cell show how many observations (horizons) have that combination of field designation and GHL.
Note that the ‘not-used’ class is the default result when none of the patterns match. You can set alternate values for no-match case with
generalize.hz(..., non.matching.code = 'alternate-not-used-code').
# find which columns are greater than zero in row 'not-used' col.idx.not.used <- which(oldvsnew['not-used',] > 0) # what column indexes (field horizon designations) did not get mapped onto a row (generalized hz label)? col.idx.not.used
## 2BC BC C H1 Oi Sum ## 1 14 24 27 28 30
# show just those columns oldvsnew[, col.idx.not.used]
## ## 2BC BC C H1 Oi Sum ## A 0 0 0 0 0 22 ## BA 0 0 0 0 0 16 ## Bt 0 0 0 0 0 55 ## Cr 0 0 0 0 0 22 ## not-used 1 1 2 1 1 6 ## Sum 1 1 2 1 1 121
loafercreek example, we see that 2 “BC”, 2 “C” and 2 “Oi” horizons did not match any pattern.
Since we require a “t” to be in the “Bt” group, and “C” is not allowed in the “BA” group, the “BC” falls through the cracks. Likewise, “C” and “Oi” did not have patterns created to match them.
So, let’s say we’ve decided we don’t want these ‘not-used’ horizons lumped with our ‘A’, ‘BA’, ‘Bt’ OR ‘Cr’ groups. Therefore, we either need to add additional pairs of labels and patterns to match them OR leave them as ‘not-used’.
Since there are only a handful of observations for the C’s and O’s (4 and 2 of each, respectively) they may not be particularly “representative” for the “Loafercreek series.”
The lack of clay films (“t” subscript) may be a thing in common between “BC” and “C” – could they be combined?
If you were trying to apply generalized labels to Loafercreek, you could test the idea that they have an unusually large volume of rock fragments (
horizons(loafercreek)$total_frags_pct) – maybe some of them do and some don’t.
You could compare the range derived for your “C” to the range for “BC” to help you decide if they are similar to one another or not (if you were considering lumping them together). Do they have similar clay contents and colors?We will lump them for this demo, since it will be a small group with this subset no matter what.
If we had more observations of the
Oi we could estimate its thickness using the transition probabilities between GHLs. In this case (Loafercreek), they are seldom more than a few centimeters thick and are not much of an “O” horizon to speak of, so we have left this class out for now.
We apply the patterns as before, but create another GHL variable
pedons$newgenhz2 to hold the new result. This is to illustrate that the development of GHL patterns is an iterative process and your first pass may be far from perfect.
For a new
BC label pattern we match all horizons that contain
C and have zero or more characters that are NOT
t and put them in the
Because of the ordering of patterns,
Cr will be matched by patterns 4 and 5, but only the label for pattern 5 (
Cr) will be assigned. Let’s assign the new labels:
## REPEAT STEPS 4 AND 5 # create 5 generalized horizons: A, upper transitional, argillic, lower-transitional and bedrock prototype.labels.v2 <- c('A', 'BA', 'Bt', 'BC', 'Cr') # REGEX rules describing mapping from field data to prototype.labels patterns.to.match.v2 <- c('^A', '^B[^Ct]*$', '.*B.*t.*', 'C[^t]*', 'Cr|R') # use generalize.hz() to apply a set of patterns and paired labels # to the `pedons$hzname` character vector containing field designations pedons$newgenhz2 <- generalize.hz(x = pedons$hzname, new = prototype.labels.v2, pat = patterns.to.match.v2)
Now we cross-tabulate again, showing only
## REPEAT STEP 6 # create a second cross-tabulation, using the updated genhz oldvsnew2 <- addmargins(table(pedons$newgenhz2, pedons$hzname)) # find which table columns are greater than zero in row 'not-used' col.idx.not.used <- which(oldvsnew2['not-used',] > 0) # show just those columns oldvsnew2[, col.idx.not.used]
## ## H1 Oi Sum ## A 0 0 22 ## BA 0 0 16 ## Bt 0 0 51 ## BC 0 0 8 ## Cr 0 0 22 ## not-used 1 1 2 ## Sum 1 1 121
As you can see, the
C horizons that were
not-used before are now correlated to the
The only horizon data that are
not-used are the 2
Oi horizons. You can compare
pedons$newgenhz2 with the labels we created before
pedons$newgenhz and the labels loaded from NASIS Pedon Horizon Component Layer ID
pedons$genhz to see the differences.
# check for equality (assignment 1 versus assignment 2) pedons$newgenhz == pedons$newgenhz2
Let’s recreate the graph we did at the beginning, only now we will color horizons in the plot based on their GHL. This will make it clear how our patterns simplified the grouping of the pedon horizon data, and also provide us with a visual check on our logic.
Compare the coloring (based on
pedons$newgenhz2) with the field horizon designations (
pedons$hzname) to the right of each profile.
## RESULT #1 # plot profile sketches - first 20 profiles; color by gen hz. par(mar=c(0,0,3,1)) plotSPC(pedons, name = 'hzname', color = 'newgenhz2', print.id = FALSE)
Here are a few things that are evident for the Loafercreek example: Our upper transitional horizon (‘BA’ group) captures ‘BA’ as well as ‘Bw’. The bulk of the profile is the argillic horizon (Bt). Some pedons have lower gradational horizons (BC or C). Most pedons have Cr or Cr over R, but we treat the paralithic and lithic contacts equivalently for this demo.
In RStudio you can “Export” a plot from the drop down menu at top of “Plots” pane (after you run the code to make the plot).
Or save the plot using R code. See
?dev.off helpfiles for how to capture output sent to a graphics device (by
plot()) and save it to a file instead of sending it to the “Plots” pane.
We compare the the number of original horizon designations from the field data with the number of unique generalized horizon labels.
# original field data (27 levels) length(unique(pedons$hzname))
##  29
# new generalized data (6 levels, including not-used) length(unique(pedons$newgenhz2))
##  6
We went from 27 levels or “groups” in the field data to 5 groups “as correlated” (4 soil horizons + bedrock)
Let’s look at how we can generate RICs based on the labels we assigned (and subsequently revised).
Here we use
dplyr to produce statistical summaries for each of our generalized horizons.
We group our horizon data into sub-
data.frames using the GHLs we assigned in (
pedons$newgenhz2) as the grouping variable. Then we do some statistics on each “piece” (using
summarize) and combine the results for review.
## STEP 9 # get the horizon data frame out of the SPC hzdata <- horizons(pedons) library(dplyr, warn.conflicts = FALSE) # summarize horizon groups with single summary statistics # using mean, sd, min, max, quantile res_df <- hzdata %>% group_by(newgenhz2) %>% summarize(clay_mean = mean(clay, na.rm = TRUE), clay_sd = sd(clay, na.rm = TRUE), clay_min = min(clay, na.rm = TRUE), clay_max = max(clay, na.rm = TRUE), clay_Q05 = quantile(clay, probs = 0.05, na.rm = TRUE), clay_Q50 = quantile(clay, probs = 0.5, na.rm = TRUE), clay_Q95 = quantile(clay, probs = 0.95, na.rm = TRUE))
## `summarise()` ungrouping output (override with `.groups` argument)
This is an implementation of a Range in Characteristics for clay content. The 5th to 95th percentile range
clay_Q95 would, given the available data, include 90% of the observed data in each group, with the median
clay_Q50 being a central value.
Here is a summary of why it is beneficial to use percentiles for summary of soil survey data / development of Range in Characteristics:
Question: How do Loafercreek clay content mean, standard deviation, minimum and maximum compare to the selected quantiles? Why?
Question: Do you think there was any benefit (in terms of representing RIC for clay content) to adding the BC generalized horizon group? What else might you change about this analysis, now that you see the RIC?
You can save the table as a .csv or .Rda (Rdata object - in a file) file using
# save result #2 to file # save a text-based (comma-separated) version of the result table write.csv(res_df, file = "Your_RIC_table_output.csv") # save a binary file representation of the R object containing result table save(res_df, file = "Your_RIC_table_output.Rda")
To continue with your work, you might need these groups to be populated in NASIS Component Layer ID – learn how to do that next.
NOTE: THIS IS NOT REQUIRED PART OF THE EXERCISE – PROVIDED FOR YOUR INFORMATION AND FUTURE USE
In order to use/save GHLs in further analysis or perform manual adjustments, they need to be saved externally.
If you are a NASIS user then the following code will create a text file that can be read by NASIS and stored in the
dspcomplayerid field of the Pedon Horizon table.
The NASIS Pedon Horizon Calculation “Update horizon group aggregations using a text file” uses a text file
C:/data/horizon_agg.txt, which contains each
phiid (pedon horizon unique record ID) paired with a label to assign.
Here is the code to make a NASIS horizon group aggregation text file. This will write
newgenhz out to the
horizon_agg.txt file out for each
phiid in your object
# set output path genhz.file <- 'C:/data/horizon_agg.txt' # update genhz.var if you change the site(pedons) column with labels genhz.var <- 'newgenhz' # write blank output (gets rid of any old assignments saved in the file) write.table( data.frame(), file = genhz.file, row.names = FALSE, quote = FALSE, na = '', col.names = FALSE, sep = '|' ) # extract horizon data.frame h <- horizons(pedons) # strip-out 'not-used' genhz labels and retain horizon ID and genhz assignment h <- h[which(h[[genhz.var]] != 'not-used'), c('phiid', genhz.var)] # append to NASIS import file write.table( h, file = genhz.file, row.names = FALSE, quote = FALSE, na = '', col.names = FALSE, sep = '|', append = TRUE )
To import the file, run NASIS Pedon Horizon Calculation “Update horizon group aggregations using a text file.”
Some people prefer to adjust assignments in R while others prefer to make adjustments after loading the data into NASIS. Some combination of the two may be required depending on the type and extent of adjustments that need to be made.
Typically, NASIS is good for making final specific changes to relatively small numbers of micro-correlation decisions, whereas wholeseale re-assignments that affect many records in a consistent/programmatically-discernible way can be implemented much more efficiently in R.
You can also store temporary results in RData files.
# after updating genhz, save a new copy of the data save(pedons, file="my_pedons_genhz.Rda")
fetchNASIS() to get pedons from your selected set.
# then load data from the NASIS selected set into an R object called `pedons` pedons <- fetchNASIS(from = 'pedons')
Of course, you first need to query some from your NASIS Local Database to have them in there.
Then subset your
fetchNASIS() result to create a smaller group of pedons, called
# optionally subset the data, FOR INSTANCE: by taxon name - replace Loafercreek with your taxon name pedons <- pedons[grep(pattern = 'Loafercreek', x = f$taxonname, ignore.case = TRUE),]
Instead of using a hard-coded numeric index (for example:
1:20), you could subset your selected set using text-matching on a site/pedon attribute, for example, taxon name.
To subset on taxon name, we used the function
grep() to return just the numeric indices where
x = f$taxonname matches our pattern (
pattern='Loafercreek'). We set
ignore.case=TRUE so we will match “LOAFERCREEK”, “loafercreek” and “Loafercreek” – along with any other oddly-capitalized variants that might exist. There are numerous other attributes that we could have subsetted on. Finally, we use the data.frame notation for subsetting a SoilProfileCollection.
For this assignment, you must to do some sort of subset of your selected set using R – but it does not need to be complex.
Use any site or horizon level attribute. See the function
aqp::subset() for a slick way to do this for site- or horizon-level variables.
Now that you have seen the full demonstration and read the tips for applying this workflow to your own data, please return to the top of the document.
Run the code and perform analysis for your own pedons instead of
loafercreek. Consider the results and discussion that were provided for Loafercreek, BUT instead, consider the conditions in your data and adjust as needed. Discuss any issues you may have with your mentor.
This document is a demonstration of concepts from the presentation “Soil Data Aggregation in R” found here.
The contents are based on the ‘Assigning and Using Generalized Horizon Labels’ tutorial found here.