1 Introduction

This tutorial covers some of the new functionality in soilDB related to searching the USDA-NRCS Official (soil) Series Descriptions (OSDs) and working with summaries compiled by soil series name. Source data are the latest SSURGO snapshot (2018-10-01) and parsed OSD records as of 2018-10-01. These data were developed to support various components of SoilWeb, including SoilWeb Gmaps, Series Extent Explorer, and Series Data Explorer.

2 The Soil Series Concept

Excerpt from the Soil Survey Manual

The series represents a three-dimensional soil body having a unique combination of properties that distinguish it from neighboring series. The soil series concept was developed more than 100 years ago and somewhat followed the logic of the series as used to describe sediments in the geologic cross-section. Like the geologic formation, the soil series has served as the fundamental mapping concept. In geology, strata closely related in terms of their properties and qualities were members of a series in the sedimentary record. Initially, the soil series did not conform to a specific taxonomic class nor property class limits but rather to the predominant properties and qualities of the soil landscape, climate, and setting in which the soil occurred.

Today, the soil series category is the lowest level and the most homogeneous category in the U.S. system of taxonomy. As a class, a series is a group of soils or polypedons that have horizons similar in arrangement and in differentiating characteristics. The soils of a series have a relatively narrow range in sets of properties. Although part of Soil Taxonomy, soil series are not recorded in it.

Soil Science Division Staff. 2017. Soil survey manual. C. Ditzler, K. Scheffe, and H.C. Monger (eds.). USDA Handbook 18. Government Printing Office, Washington, D.C.

2.1 An Official (soil) Series Description

The Official Series Descriptions (OSD) contain definitions and relevant information on all soil series. The narrative style of the OSD makes it possible for specialists and non-specialists to extract relevant details. For example, a soil scientist may consult a number of OSDs to determine if a new set of samples can be allocated to an existing soil series. A farmer might use the “typical pedon” narrative and drainage class description when considering the purchase of a new parcel.

The OSDs are stored as text files with a standardized format. Deviations from the standard format, changes through time, local variation in style, and typos (manual and / or OCR-related) aren’t an issue for human readers but have complicated the analysis by machine for decades. There are plans in place for a structured database that would ultimately be used to generate the OSD narrative. Until that time, there are only a couple ways in which the OSD data can be searched:

As an example, here is the OSD for the Davidson soil series.

3 Setup

With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:

# stable packages from CRAN
install.packages('aqp', dep=TRUE)
install.packages('soilDB', dep=TRUE)
install.packages('sharpshootR', dep=TRUE)
install.packages('dendextend', dep=TRUE)

# latest versions from GitHub
devtools::install_github("ncss-tech/aqp", dependencies=FALSE, upgrade_dependencies=FALSE)
devtools::install_github("ncss-tech/soilDB", dependencies=FALSE, upgrade_dependencies=FALSE)
devtools::install_github("ncss-tech/sharpshootR", dependencies=FALSE, upgrade_dependencies=FALSE)

3.1 Using this Tutorial

This tutorial can be used as a stand-alone demonstration of the functionality provided by fetchOSD and OSDquery, or as a template for your own project. Copy any paste blocks of code sequentially into your own script if you would like to follow-along.

First, you will need to load some packages. Be sure to install these if missing, and install the latest versions of soilDB and sharpshootR as much of this tutorial depends on recent updates.

library(aqp)
library(soilDB)
library(sharpshootR)
library(dendextend)
library(latticeExtra)
library(ggplot2)
library(RColorBrewer)

4 Examples

The following examples can be readily adapted to your own needs by copying/pasting the code chunks into a new R script.

4.1 Getting Basic Soil Morphology and Taxonomic Data

The fetchOSD function provides an interface to the parsed OSD records, hosted by SoilWeb. These records contain a combination of horizon-level and site-level attributes, returned as a SoilProfileCollection object. Site-level attributes include: latest taxonomic data from the SC database and an acreage estimate. Horizon-level attributes include horizon depths (cm), designations, moist/dry colors, pH, and texture. The original horizon narrative is also included. Moist soil colors are converted by default to sRGB values suitable for on-screen display.

# a vector of named soil series
# the search is case insensitive
soils <- c('amador', 'pentz', 'pardee', 'auburn', 'loafercreek', 'millvilla')

# moist colors are converted from Munsell -> sRGB by default
s <- fetchOSD(soils)
# also convert dry colors
s.dry <- fetchOSD(soils, colorState = 'dry')

# quickly compare moist to dry colors
par(mar = c(1,0,2,1), mfrow = c(2,1))
plotSPC(s, name.style = 'center-center') ; title('Moist Colors')
plotSPC(s.dry, name.style = 'center-center') ; title('Dry Colors')

The SoilTaxonomyDendrogram function from sharpshootR package knows how to use elements from the soil classification (order/suborder/great group/subgroup) to build a unique layout of profile sketches. The dissimilarity matrix is computed using Gower’s distance metric for nominal data.

id soilorder suborder greatgroup subgroup
AMADOR inceptisols xerepts haploxerepts typic haploxerepts
AUBURN inceptisols xerepts haploxerepts lithic haploxerepts
LOAFERCREEK alfisols xeralfs haploxeralfs ultic haploxeralfs
MILLVILLA alfisols xeralfs haploxeralfs ultic haploxeralfs
PARDEE alfisols xeralfs haploxeralfs lithic mollic haploxeralfs
PENTZ mollisols xerolls haploxerolls ultic haploxerolls
par(mar=c(0,0,1,1))
SoilTaxonomyDendrogram(s, scaling.factor = 0.02, width = 0.28, name.style = 'center-center', depth.axis = list(line = -4))

Take a closer look at the object returned by fetchKSSL. Consult the SoilProfileCollection reference for details on the internal structure of s and relevant methods.

# internal structure, note that this is an S4 object with slots
str(s)
# site-level attribute names
siteNames(s)
# horizon-level attribute names
horizonNames(s)

4.1.1 Important Assumptions and Limitations

Here are the narratives associated with the first profile in the collection. Regular expression was used to extract specific morphologic data. Some data may be missing or incorrect due to formatting errors, inconcsistencies, or typos. Many of the OSDs were converted from paper copies via OCR.

A–0 to 15 cm; light gray (10YR 7/2) loam, dark grayish brown (10YR 4/2) moist; moderate coarse subangular blocky structure; slightly hard, friable, slightly sticky and slightly plastic; many very fine roots; many very fine and few fine tubular and many very fine interstitial pores; 10 percent gravel; strongly acid (pH 5.1); clear wavy boundary. (3 to 20 cm thick)

Bw1–15 to 28 cm; light gray (10YR 7/2) loam, brown (10YR 5/3) moist; weak medium subangular blocky structure; slightly hard, friable, slightly sticky and slightly plastic; common very fine roots; many very fine and common fine tubular and many very fine interstitial pores; 5 percent gravel; strongly acid (pH 5.1); clear wavy boundary.

Bw2–28 to 48 cm; light gray (10YR 7/2) loam, brown (10YR 5/3) moist; weak medium subangular blocky structure; slightly hard, friable, slightly sticky and slightly plastic; many very fine and common fine tubular and many very fine interstitial pores; few thin colloidal stains on mineral grains, 10 percent gravel mostly at the base of horizon; very strongly acid (pH 5.0); abrupt wavy boundary. (Combined thickness of the Bw horizons is 23 to 41 cm)

Cr–48 to 56 cm; pale yellow (2.5Y 8/2) weakly consolidated rhyolitic tuffaceous sediments, pale yellow (2.5Y 7/4) moist; very strongly acid (pH 4.5).


Here are the data elements extracted from the narratives above. Depths are always converted to cm. Note that not all of the morphologic data have been recovered.
top bottom hzname soil_color hue value chroma dry_hue dry_value dry_chroma texture_class cf_class pH pH_class
0 15 A #6E5E4BFF 10YR 4 2 10YR 7 2 loam NA 5.1 strongly acid
15 28 Bw1 #8E775BFF 10YR 5 3 10YR 7 2 loam NA 5.1 strongly acid
28 48 Bw2 #8E775BFF 10YR 5 3 10YR 7 2 loam NA 5.0 very strongly acid
48 56 Cr #C5AD7DFF 2.5Y 7 4 2.5Y 8 2 NA NA 4.5 very strongly acid

4.2 Extended Summaries

Take a peek at the new extended summaries. Background on how concepts such as “parent material”, “surface shape”, and “geomorphic component” are descirbed within USDA-NRCS soil survey data.

soils <- c('argonaut', 'pierre', 'zook', 'cecil')
s <- fetchOSD(soils, extended = TRUE)

# note additional data, packed into a list
str(s, 1)
## List of 17
##  $ SPC             :Formal class 'SoilProfileCollection' [package "aqp"] with 9 slots
##  $ competing       :'data.frame':    34 obs. of  3 variables:
##  $ geog_assoc_soils:'data.frame':    27 obs. of  2 variables:
##  $ geomcomp        :'data.frame':    4 obs. of  9 variables:
##  $ hillpos         :'data.frame':    4 obs. of  8 variables:
##  $ mtnpos          :'data.frame':    1 obs. of  9 variables:
##  $ terrace         :'data.frame':    3 obs. of  5 variables:
##  $ flats           :'data.frame':    1 obs. of  7 variables:
##  $ shape_across    :'data.frame':    4 obs. of  8 variables:
##  $ shape_down      :'data.frame':    4 obs. of  8 variables:
##  $ pmkind          :'data.frame':    6 obs. of  5 variables:
##  $ pmorigin        :'data.frame':    18 obs. of  5 variables:
##  $ mlra            :'data.frame':    34 obs. of  4 variables:
##  $ climate.annual  :'data.frame':    32 obs. of  12 variables:
##  $ climate.monthly :'data.frame':    96 obs. of  14 variables:
##  $ NCCPI           :'data.frame':    4 obs. of  16 variables:
##  $ soilweb.metadata:'data.frame':    20 obs. of  2 variables:
Competing soil series are those series that share a family-level classification. For example, the soil series competing with Cecil would have a family level classification of fine, kaolinitic, thermic typic kanhapludults. These data are sourced from the current snapshot of the Soil Classification database via SoilWeb.
series competing family
CECIL SAW fine, kaolinitic, thermic typic kanhapludults
CECIL SPARTANBURG fine, kaolinitic, thermic typic kanhapludults
CECIL TARRUS fine, kaolinitic, thermic typic kanhapludults
CECIL TUMBLETON fine, kaolinitic, thermic typic kanhapludults
CECIL WEDOWEE fine, kaolinitic, thermic typic kanhapludults
CECIL APPLING fine, kaolinitic, thermic typic kanhapludults
CECIL BETHLEHEM fine, kaolinitic, thermic typic kanhapludults
CECIL GEORGEVILLE fine, kaolinitic, thermic typic kanhapludults
CECIL HERNDON fine, kaolinitic, thermic typic kanhapludults
CECIL LAURENS fine, kaolinitic, thermic typic kanhapludults
CECIL MADISON fine, kaolinitic, thermic typic kanhapludults
CECIL NANFORD fine, kaolinitic, thermic typic kanhapludults
CECIL NANKIN fine, kaolinitic, thermic typic kanhapludults
CECIL NEESES fine, kaolinitic, thermic typic kanhapludults
CECIL PACOLET fine, kaolinitic, thermic typic kanhapludults
Geographically associated soils.
series gas
CECIL APPLING
CECIL BETHLEHEM
CECIL CATAULA
CECIL CHESTATEE
CECIL CULLEN
CECIL DURHAM
CECIL LLOYD
CECIL LOUISBURG
CECIL MADISON
CECIL MECKLENBURG
CECIL PACOLET
CECIL RION
CECIL SAW
CECIL WEDOWEE
CECIL WORSHAM

Hillslope position (2D) and geomorphic component (3D) probabilities have been estimated by series by dividing the number of component records assigned the each hillslope position or geomorphic component by the total number of component records. The number of records used and Shannon entropy are included to asess uncertainty.

Hillslope position (2D):
series Toeslope Footslope Backslope Shoulder Summit n shannon_entropy
ARGONAUT 0.10 0.12 0.38 0.36 0.05 42 1.96
CECIL 0.00 0.00 0.31 0.35 0.34 1022 1.58
PIERRE 0.00 0.14 0.73 0.01 0.12 454 1.16
ZOOK 0.99 0.01 0.00 0.00 0.00 482 0.11
Geomorphic component (3D), for “hills”, “mountains”, “terraces”, and “flats”:
series Interfluve Crest Head Slope Nose Slope Side Slope Base Slope n shannon_entropy
ARGONAUT 0.51 0.04 0.06 0.04 0.35 0 51 1.63
CECIL 0.54 0.03 0.00 0.02 0.40 0 811 1.29
PIERRE 0.05 0.00 0.03 0.07 0.85 0 120 0.83
ZOOK 0.00 0.00 0.00 0.00 0.00 1 96 0.00
series Mountaintop Mountainflank Upper third of mountainflank Center third of mountainflank Lower third of mountainflank Mountainbase n shannon_entropy
ARGONAUT 0 1 0 0 0 0 1 0
series Tread Riser n shannon_entropy
ARGONAUT 1 0 1 0
PIERRE 1 0 2 0
ZOOK 1 0 125 0
series Dip Talf Flat Rise n shannon_entropy
ZOOK 0.28 0.72 0 0 653 0.85
Parent material kind and origin probabilities have been estimated in a similar manner. The summaries are in long format due to the large set of possible pmkind and pmorigin classes.
series pmkind n total P
ARGONAUT Residuum 50 61 0.8197
ARGONAUT Colluvium 11 61 0.1803
CECIL Residuum 751 877 0.8563
CECIL Saprolite 126 877 0.1437
PIERRE Residuum 105 105 1.0000
ZOOK Alluvium 284 284 1.0000
series pmorigin n total P
ARGONAUT Metasedimentary rock 12 60 0.2000
ARGONAUT Andesite 12 60 0.2000
ARGONAUT Igneous and metamorphic rock 10 60 0.1667
ARGONAUT Diabase 10 60 0.1667
ARGONAUT Metavolcanics 9 60 0.1500
ARGONAUT Metamorphic rock 5 60 0.0833
ARGONAUT Slate 1 60 0.0167
ARGONAUT Shale 1 60 0.0167
CECIL Granite and gneiss 444 967 0.4592
CECIL Schist 231 967 0.2389
CECIL Granite 123 967 0.1272
CECIL Gneiss 121 967 0.1251
CECIL Igneous and metamorphic rock 42 967 0.0434
CECIL Metamorphic rock 3 967 0.0031
CECIL Mica schist 3 967 0.0031
PIERRE Shale 63 105 0.6000
PIERRE Clayey shale 29 105 0.2762
PIERRE Sandstone and shale 13 105 0.1238
A rough approximation of MLRA membership has been computed by taking the intersection of all map unit polygons and current MLRA polygons. Membership values are based on intsersecting areas weighted by component percentage divided by total soil series area.
series mlra area_ac membership
ARGONAUT 18 36299 0.834
ARGONAUT 17 3073 0.071
ARGONAUT 19 2252 0.052
ARGONAUT 22A 1805 0.041
ARGONAUT 15 103 0.002
Annual and monthly climate summaries have been estimated from the SSR2 standard stack of 1981–2010 PRISM data. For now, percentiles are estimates from a single sample from within each map unit polygon, weighted by log(polygon area*component percentage).
series climate_var minimum q01 q05 q25 q50 q75 q95 q99 maximum n
ARGONAUT Elevation (m) 4.00 60.00 76.00 138.00 259.00 437.00 760.00 1034.00 1271.00 840
ARGONAUT Effective Precipitation (mm) -368.26 -284.42 -268.45 -132.91 -17.73 76.81 290.86 460.90 747.55 840
ARGONAUT Frost-Free Days 184.00 210.26 241.00 277.00 312.00 328.00 346.00 354.00 365.00 840
ARGONAUT Mean Annual Air Temperature (degrees C) 12.33 13.71 14.82 15.52 16.15 16.48 16.66 16.88 17.00 840
ARGONAUT Mean Annual Precipitation (mm) 498.00 568.00 580.00 730.00 828.00 904.00 1098.46 1238.00 1509.00 840
ARGONAUT Growing Degree Days (degrees C) 1894.00 2145.82 2318.00 2431.00 2544.00 2599.00 2627.00 2674.00 2745.00 840
ARGONAUT Fraction of Annual PPT as Rain 93.00 96.00 97.00 99.00 99.00 99.00 99.00 99.00 100.00 840
ARGONAUT Design Freeze Index (degrees C) 0.00 0.00 0.00 0.00 0.00 2.00 6.00 11.00 35.00 840
series climate_var minimum q01 q05 q25 q50 q75 q95 q99 maximum n month variable
9 ARGONAUT ppt1 94 101 104 126 144.00 153 182 223.0 250 840 1 Precipitation (mm)
10 ARGONAUT ppt2 90 98 101 126 144.00 156 181 210.1 253 840 2 Precipitation (mm)
11 ARGONAUT ppt3 74 88 89 107 125.57 140 165 190.0 224 840 3 Precipitation (mm)
21 ARGONAUT pet1 13 14 14 14 15.00 15 15 15.0 28 840 1 Potential ET (mm)
22 ARGONAUT pet2 14 17 18 21 21.00 22 22 23.0 28 840 2 Potential ET (mm)
23 ARGONAUT pet3 26 28 30 34 35.00 37 38 39.0 40 840 3 Potential ET (mm)

4.2.1 Climate Summaries

Here is an example visualization of the climate summaries using select percentiles.

# control color like this
trellis.par.set(plot.line = list(col = 'RoyalBlue'))

# control centers symbol and size here
res <- vizAnnualClimate(s$climate.annual, IQR.cex = 1.25, cex=1.1, pch=18)

print(res$fig)

One possible depiction of monthly PPT and PET.

# reasonable colors for a couple of groups
cols <- brewer.pal(9, 'Set1') 
cols <- cols[c(1:5,7,9)]

ggplot(s$climate.monthly, aes(x = month, group=series)) + 
  geom_ribbon(aes(ymin = q25, ymax = q75, fill=series)) + 
  geom_line(aes(month, q25)) + 
  geom_line(aes(month, q75)) + 
  geom_abline(intercept=0, slope = 0, lty=2) +
  xlab('') + ylab('mm') + 
  ggtitle('Monthly IQR') +
  scale_fill_manual(values=alpha(cols, 0.75)) +
  facet_wrap(vars(variable), scales = 'free_y') +
  theme_bw()

4.3 Vizualization of 2D and 3D Geomorphic Descriptions

Numbers on the left-hand side of proportion (bars) represent the number of copmonents uses in the estimate. Numbers of the right-hand side are the normalized Shannon entropy. Proportions based on a larger number of component records are likely more realistic. Shannon entropy values closer to 0 suggest simple association between series and geomorphology while values closer to 1 suggest more complexity (or confusion).

Hillslope position.

# result is a lattice graphics object
res <- vizHillslopePosition(s$hillpos)
print(res$fig)

Geomorphic component for “hills”.

# result is a lattice graphics object
res <- vizGeomorphicComponent(s$geomcomp)
print(res$fig)

7 A Classic Catena

Diagram showing relationship of dominant soils in Lloyd-Davidson association (Soil Survey of Morgan County, Georgia; 1965).

Code.

soils <- c('cecil', 'altavista', 'lloyd', 'wickham', 'wilkes',  'chewacla', 'congaree')

# get morphology + extended summaries
s <- fetchOSD(soils, extended = TRUE)

res <- vizHillslopePosition(s$hillpos)
print(res$fig)

par(mar=c(0, 0, 2, 0))
plotSPC(s$SPC, plot.order = res$order, name.style = 'center-center', cex.names = 0.66, hz.depths = TRUE, hz.depths.offset = 0.05, fixLabelCollisions = TRUE, depth.axis = FALSE)
title('Hydrologic Ordering via Hillslope Position Proportions')

res <- vizGeomorphicComponent(s$geomcomp)
print(res$fig)

par(mar=c(0, 0, 2, 0))
plot(s$SPC, plot.order = res$order, name.style = 'center-center', cex.names = 0.66, hz.depths = TRUE, hz.depths.offset = 0.05, fixLabelCollisions = TRUE, depth.axis = FALSE)
title('Hydrologic Ordering via Geomorphic Component Proportions')

par(mar=c(0,0,1,1))
SoilTaxonomyDendrogram(s$SPC, y.offset = 0.4, scaling.factor = 0.015, name.style = 'center-center', cex.names = 0.66, hz.depths = TRUE, hz.depths.offset = 0.05, fixLabelCollisions = TRUE, depth.axis = FALSE, width = 0.28)

This document is based on aqp version 2.0, soilDB version 2.7.8, and sharpshootR version 2.2.