Introduction

This tutorial covers some of the new functionality in soilDB related to searching the USDA-NRCS Official (soil) Series Descriptions (OSDs) and working with summaries compiled by soil series name. Source data are the latest SSURGO snapshot (2018-10-01) and parsed OSD records as of 2018-10-01. These data were developed to support various components of SoilWeb, including SoilWeb Gmaps, Series Extent Explorer, and Series Data Explorer.

The Soil Series Concept

Excerpt from the Soil Survey Manual

The series represents a three-dimensional soil body having a unique combination of properties that distinguish it from neighboring series. The soil series concept was developed more than 100 years ago and somewhat followed the logic of the series as used to describe sediments in the geologic cross-section. Like the geologic formation, the soil series has served as the fundamental mapping concept. In geology, strata closely related in terms of their properties and qualities were members of a series in the sedimentary record. Initially, the soil series did not conform to a specific taxonomic class nor property class limits but rather to the predominant properties and qualities of the soil landscape, climate, and setting in which the soil occurred.

Today, the soil series category is the lowest level and the most homogeneous category in the U.S. system of taxonomy. As a class, a series is a group of soils or polypedons that have horizons similar in arrangement and in differentiating characteristics. The soils of a series have a relatively narrow range in sets of properties. Although part of Soil Taxonomy, soil series are not recorded in it.

Soil Science Division Staff. 2017. Soil survey manual. C. Ditzler, K. Scheffe, and H.C. Monger (eds.). USDA Handbook 18. Government Printing Office, Washington, D.C.

An Official (soil) Series Description

The Official Series Descriptions (OSD) contain definitions and relevant information on all soil series. The narrative style of the OSD makes it possible for specialists and non-specialists to extract relevant details. For example, a soil scientist may consult a number of OSDs to determine if a new set of samples can be allocated to an existing soil series. A farmer might use the “typical pedon” narrative and drainage class description when considering the purchase of a new parcel.

The OSDs are stored as text files with a standardized format. Deviations from the standard format, changes through time, local variation in style, and typos (manual and / or OCR-related) aren’t an issue for human readers but have complicated the analysis by machine for decades. There are plans in place for a structured database that would ultimately be used to generate the OSD narrative. Until that time, there are only a couple ways in which the OSD data can be searched:

As an example, here is the OSD for the Davidson soil series.

Setup

With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:

# stable packages from CRAN
install.packages('aqp', dep=TRUE)
install.packages('soilDB', dep=TRUE)
install.packages('sharpshootR', dep=TRUE)
install.packages('dendextend', dep=TRUE)

# latest versions from GitHub
devtools::install_github("ncss-tech/soilDB", dependencies=FALSE, upgrade_dependencies=FALSE)
devtools::install_github("ncss-tech/sharpshootR", dependencies=FALSE, upgrade_dependencies=FALSE)

Using this Tutorial

This tutorial can be used as a stand-alone demonstration of the functionality provided by fetchOSD and OSDquery, or as a template for your own project. Copy any paste blocks of code sequentially into your own script if you would like to follow-along.

First, you will need to load some packages. Be sure to install these if missing, and install the latest versions of soilDB and sharpshootR as much of this tutorial depends on recent updates.

library(soilDB)
library(sharpshootR)
library(dendextend)
library(latticeExtra)
library(ggplot2)
library(RColorBrewer)

Examples

The following examples can be readily adapted to your own needs by copying/pasting the code chunks into a new R script.

Getting Basic Soil Morphology and Taxonomic Data

The fetchOSD function provides an interface to the parsed OSD records, hosted by SoilWeb. These records contain a combination of horizon-level and site-level attributes, returned as a SoilProfileCollection object. Site-level attributes include: latest taxonomic data from the SC database and an acreage estimate. Horizon-level attributes include horizon depths (cm), designations, moist/dry colors, pH, and texture. The original horizon narrative is also included. Moist soil colors are converted by default to sRGB values suitable for on-screen display.

# a vector of named soil series
# the search is case insensitive
soils <- c('amador', 'pentz', 'pardee', 'auburn', 'loafercreek', 'millvilla')

# moist colors are converted from Munsell -> sRGB by default
s <- fetchOSD(soils)
# also convert dry colors
s.dry <- fetchOSD(soils, colorState = 'dry')

# quickly compare moist to dry colors
par(mar=c(1,0,2,1), mfrow=c(2,1))
plot(s) ; title('Moist Colors')
plot(s.dry) ; title('Dry Colors')

The SoilTaxonomyDendrogram function from sharpshootR package knows how to use elements from the soil classification (order/suborder/great group/subgroup) to build a unique layout of profile sketches. The dissimilarity matrix is computed using Gower’s distance metric for nominal-scale data.

par(mar=c(0,0,1,1))
SoilTaxonomyDendrogram(s, scaling.factor = 0.015)

Take a closer look at the object returned by fetchKSSL. Consult the SoilProfileCollection reference for details on the internal structure of s and relevant methods.

# internal structure, note that this is an S4 object with slots
str(s)
# site-level attribute names
siteNames(s)
# horizon-level attribute names
horizonNames(s)

Important Assumptions and Limitations

Here are the narratives associated with the first profile in the collection. Regular expression was used to extract specific morphologic data. Some data may be missing or incorrect due to formatting errors, inconcsistencies, or typos. Many of the OSDs were converted from paper copies via OCR.

A--0 to 15 cm; light gray (10YR 7/2) loam, dark grayish brown (10YR 4/2) moist; moderate coarse subangular blocky structure; slightly hard, friable, slightly sticky and slightly plastic; many very fine roots; many very fine and few fine tubular and many very fine interstitial pores; 10 percent gravel; strongly acid (pH 5.1); clear wavy boundary. (3 to 20 cm thick) 

Bw1--15 to 28 cm; light gray (10YR 7/2) loam, brown (10YR 5/3) moist; weak medium subangular blocky structure; slightly hard, friable, slightly sticky and slightly plastic; common very fine roots; many very fine and common fine tubular and many very fine interstitial pores; 5 percent gravel; strongly acid (pH 5.1); clear wavy boundary. 

Bw2--28 to 48 cm; light gray (10YR 7/2) loam, brown (10YR 5/3) moist; weak medium subangular blocky structure; slightly hard, friable, slightly sticky and slightly plastic; many very fine and common fine tubular and many very fine interstitial pores; few thin colloidal stains on mineral grains, 10 percent gravel mostly at the base of horizon; very strongly acid (pH 5.0); abrupt wavy boundary. (Combined thickness of the Bw horizons is 23 to 41 cm) 

Cr--48 to 56 cm; pale yellow (2.5Y 8/2) weakly consolidated rhyolitic tuffaceous sediments, pale yellow (2.5Y 7/4) moist; very strongly acid (pH 4.5). 

Here are the data elements extracted from the narratives above. Depths are always converted to cm. Note that not all of the morphologic data have been recovered.
top bottom hzname soil_color hue value chroma dry_hue dry_value dry_chroma texture_class cf_class pH pH_class
0 15 A #6F5F4CFF 10YR 4 2 10YR 7 2 loam NA 5.1 strongly acid
15 28 Bw1 #8F775BFF 10YR 5 3 10YR 7 2 loam NA 5.1 strongly acid
28 48 Bw2 #8F775BFF 10YR 5 3 10YR 7 2 loam NA 5.0 strongly acid
48 56 Cr #C5AD7DFF 2.5Y 7 4 2.5Y 8 2 NA NA 4.5 strongly acid

Extended Summaries

Take a peek at the new extended summaries. Background on how concepts such as “parent material”, “surface shape”, and “geomorphic component” are descirbed within USDA-NRCS soil survey data.

soils <- c('argonaut', 'pierre', 'zook', 'cecil')
s <- fetchOSD(soils, extended = TRUE)

# note additional data, packed into a list
str(s, 1)
## List of 11
##  $ SPC             :Formal class 'SoilProfileCollection' [package "aqp"] with 8 slots
##  $ competing       :'data.frame':    34 obs. of  3 variables:
##  $ geomcomp        :'data.frame':    4 obs. of  9 variables:
##  $ hillpos         :'data.frame':    4 obs. of  8 variables:
##  $ mtnpos          :'data.frame':    1 obs. of  9 variables:
##  $ pmkind          :'data.frame':    6 obs. of  5 variables:
##  $ pmorigin        :'data.frame':    18 obs. of  5 variables:
##  $ mlra            :'data.frame':    39 obs. of  4 variables:
##  $ climate.annual  :'data.frame':    32 obs. of  12 variables:
##  $ climate.monthly :'data.frame':    96 obs. of  14 variables:
##  $ soilweb.metadata:'data.frame':    12 obs. of  2 variables:
Competing soil series are those series that share a family-level classification. For example, the soil series competing with Cecil would have a family level classification of fine, kaolinitic, thermic typic kanhapludults. These data are sourced from the current snapshot of the Soil Classification database via SoilWeb.
series competing family
CECIL SAW fine, kaolinitic, thermic typic kanhapludults
CECIL SPARTANBURG fine, kaolinitic, thermic typic kanhapludults
CECIL TARRUS fine, kaolinitic, thermic typic kanhapludults
CECIL TUMBLETON fine, kaolinitic, thermic typic kanhapludults
CECIL WEDOWEE fine, kaolinitic, thermic typic kanhapludults
CECIL APPLING fine, kaolinitic, thermic typic kanhapludults
CECIL BETHLEHEM fine, kaolinitic, thermic typic kanhapludults
CECIL GEORGEVILLE fine, kaolinitic, thermic typic kanhapludults
CECIL HERNDON fine, kaolinitic, thermic typic kanhapludults
CECIL LAURENS fine, kaolinitic, thermic typic kanhapludults
CECIL MADISON fine, kaolinitic, thermic typic kanhapludults
CECIL NANFORD fine, kaolinitic, thermic typic kanhapludults
CECIL NANKIN fine, kaolinitic, thermic typic kanhapludults
CECIL NEESES fine, kaolinitic, thermic typic kanhapludults
CECIL PACOLET fine, kaolinitic, thermic typic kanhapludults
Hillslope position (1D) and geomorphic component (2D) probabilities have been estimated by series by dividing the number of components assigned the each hillslope position or geomorphic component by the total number of components. The total number of components and Shannon entropy are also included in the summary.
series Toeslope Footslope Backslope Shoulder Summit n shannon_entropy
ARGONAUT 0.10 0.12 0.38 0.36 0.05 42 0.84
CECIL 0.00 0.00 0.29 0.35 0.35 1070 0.68
PIERRE 0.00 0.14 0.73 0.01 0.13 465 0.51
ZOOK 0.99 0.01 0.00 0.00 0.00 542 0.04
series Interfluve Crest Head Slope Nose Slope Side Slope Base Slope n shannon_entropy
ARGONAUT 0.51 0.04 0.06 0.04 0.35 0 51 0.70
CECIL 0.56 0.03 0.00 0.02 0.39 0 843 0.55
PIERRE 0.06 0.00 0.03 0.09 0.83 0 105 0.39
ZOOK 0.00 0.00 0.00 0.00 0.00 1 95 0.00
Parent material kind and origin probabilities have been estimated in a similar manner. The summaries are in long format due to the large set of possible pmkind and pmorigin classes.
series pmkind n total P
ARGONAUT Residuum 50 61 0.8197
ARGONAUT Colluvium 11 61 0.1803
CECIL Residuum 766 892 0.8587
CECIL Saprolite 126 892 0.1413
PIERRE Residuum 105 105 1.0000
ZOOK Alluvium 283 283 1.0000
series pmorigin n total P
ARGONAUT Andesite 12 60 0.2000
ARGONAUT Metasedimentary rock 12 60 0.2000
ARGONAUT Igneous and metamorphic rock 10 60 0.1667
ARGONAUT Diabase 10 60 0.1667
ARGONAUT Metavolcanics 9 60 0.1500
ARGONAUT Metamorphic rock 5 60 0.0833
ARGONAUT Slate 1 60 0.0167
ARGONAUT Shale 1 60 0.0167
CECIL Granite and gneiss 455 982 0.4633
CECIL Schist 231 982 0.2352
CECIL Granite 123 982 0.1253
CECIL Gneiss 121 982 0.1232
CECIL Igneous and metamorphic rock 46 982 0.0468
CECIL Metamorphic rock 3 982 0.0031
CECIL Mica schist 3 982 0.0031
PIERRE Shale 63 105 0.6000
PIERRE Clayey shale 29 105 0.2762
PIERRE Sandstone and shale 13 105 0.1238
A rough approximation of MLRA membership has been computed by taking the intersection of all map unit polygons and current MLRA polygons. Membership values are based on intsersecting areas weighted by component percentage divided by total soil series area.
series mlra area_ac membership
ARGONAUT 18 36291 0.840
ARGONAUT 17 2745 0.064
ARGONAUT 20 2252 0.052
ARGONAUT 22A 1790 0.041
ARGONAUT 15 103 0.002
Annual and monthly climate summaries have been estimated from the SSR2 standard stack of 1981–2010 PRISM data. For now, percentiles are estimates from a single sample from within each map unit polygon, weighted by \(log(polygon\ area * component\ percentage)\).
series climate_var minimum q01 q05 q25 q50 q75 q95 q99 maximum n
ARGONAUT Elevation (m) 4.00 60.00 76.00 139.00 261.00 437.00 760.00 1034.00 1271.00 842
ARGONAUT Effective Precipitation (mm) -368.26 -284.42 -268.45 -132.91 -16.34 77.01 290.86 460.90 747.55 842
ARGONAUT Frost-Free Days 184.00 211.08 241.00 277.00 312.00 328.00 346.00 354.00 365.00 842
ARGONAUT Mean Annual Air Temperature (degrees C) 12.33 13.76 14.82 15.52 16.15 16.48 16.66 16.88 17.00 842
ARGONAUT Mean Annual Precipitation (mm) 498.00 568.00 580.00 730.00 830.00 905.00 1094.00 1238.00 1509.00 842
ARGONAUT Growing Degree Days (degrees C) 1894.00 2148.45 2318.00 2431.00 2544.00 2599.00 2627.00 2674.00 2745.00 842
ARGONAUT Fraction of Annual PPT as Rain 93.00 96.00 97.00 99.00 99.00 99.00 99.00 99.00 100.00 842
ARGONAUT Design Freeze Index (degrees C) 0.00 0.00 0.00 0.00 0.00 2.00 6.00 11.00 35.00 842
series climate_var minimum q01 q05 q25 q50 q75 q95 q99 maximum n month variable
9 ARGONAUT ppt1 94.00 101.00 104.00 126.00 144.00 153.00 182.00 223.00 250.00 842 1 Precipitation (mm)
10 ARGONAUT ppt2 90.00 98.00 101.00 126.00 144.00 156.00 181.00 210.00 253.00 842 2 Precipitation (mm)
11 ARGONAUT ppt3 74.00 88.00 89.00 107.00 126.00 140.00 165.00 190.00 224.00 842 3 Precipitation (mm)
21 ARGONAUT pet1 13.36 13.65 13.88 14.40 14.72 14.91 15.23 15.47 25.48 842 1 Potential ET (mm)
22 ARGONAUT pet2 14.51 16.53 18.30 20.65 21.27 21.77 22.36 22.83 26.54 842 2 Potential ET (mm)
23 ARGONAUT pet3 25.81 27.22 29.42 33.81 35.27 37.03 38.13 38.65 39.94 842 3 Potential ET (mm)

Climate Summaries

Here is an example visualization of the climate summaries using select percentiles.

# control color like this
trellis.par.set(plot.line=list(col='RoyalBlue'))

# control centers symbol and size here
res <- vizAnnualClimate(s$climate.annual, IQR.cex = 1.25, cex=1.1, pch=18)

print(res$fig)

One possible depiction of monthly PPT and PET.

# reasonable colors for a couple of groups
cols <- brewer.pal(9, 'Set1') 
cols <- cols[c(1:5,7,9)]

ggplot(s$climate.monthly, aes(x = month, group=series)) + 
  geom_ribbon(aes(ymin = q25, ymax = q75, fill=series)) + 
  geom_line(aes(month, q25)) + 
  geom_line(aes(month, q75)) + 
  geom_abline(intercept=0, slope = 0, lty=2) +
  xlab('') + ylab('mm') + 
  ggtitle('Monthly IQR') +
  scale_fill_manual(values=alpha(cols, 0.75)) +
  facet_wrap(vars(variable), scales = 'free_y') +
  theme_bw()

Vizualization of 1D and 2D Surface Shape Proportions

Hillslope position.

# result is a lattice graphics object
res <- vizHillslopePosition(s$hillpos)
print(res$fig)

Geomorphic component.

# result is a lattice graphics object
res <- vizGeomorphicComponent(s$geomcomp)
print(res$fig)

Series Siblings and Cousins

TODO: write some definitions

Siblings

TODO: document major/minor siblings.

Get sibling data for Amador soil series via SoilWeb.

# new function as of 2.2-8, single soil series at a time
s <- siblings('amador', component.data = TRUE)

Sibling data for the Amador soil series. The number of times each sibling occurs in a map unit with Amador is tabulated in the n column.

kable_styling(kable(s$sib, format = 'html'), full_width = FALSE)
series sibling majcompflag n
amador Miltonhills TRUE 2
amador Gillender TRUE 2
amador Pardee TRUE 1
amador Vleck TRUE 1
amador Pardee FALSE 8
amador Hornitos FALSE 5
amador Redding FALSE 4
amador Exchequer FALSE 4
amador Miltonhills FALSE 4
amador Pentz FALSE 4
amador Gillender FALSE 3
amador Ranchoseco FALSE 2
amador Peters FALSE 2
amador Vleck FALSE 2
amador Corning FALSE 1
amador Inks FALSE 1
# get basic OSD data for queried series and siblings
s.list <- unique(c(s$sib$series[1], s$sib$sibling))
h <- fetchOSD(s.list)

# plot dendrogram + profiles
SoilTaxonomyDendrogram(h)