1 Introduction

This document demonstrates querying Soil Data Access (SDA) for NCSS laboratory characterization data. A simplified, searchable interface to the associated metadata is included to assist with field names, units of measure, table names, and table linkages.

2 Setup

The following R packages are used in this document.

install.packages('soilDB')
install.packages('aqp')
install.packages('mapview')
install.packages('sf')
install.packages('soiltexture')
install.packages('scales')
install.packages('Hmisc')

3 Table Relationships

Open in a new browser tab for a larger version. Interactive version. table relationships

3.1 Table Descriptions

Table Description
lab_analysis_procedure The Analysis Procedure table is a lookup table which contains information about the analysiscprocedure. The analysis procedure is the basic analytical method applied to a prepared sample. This procedure may be applied to different sample preparations, but the default size-fraction is the most commonly used size-fraction. An analysis may have more than one procedure, one of which is chosen by the analyst to complete the analysis.
lab_analyte The Analyte table is a lookup table which contains information about an analyte (primary result of an analysis by a procedure). The LIMS uses this data to format the data, provide abbreviations and units of measure for reporting the data.
lab_area A lookup table containing the list of valid NRCS areas, both current and obsolete for Country, County, Soil Survey Area and MLRA. Note that a soil survey area code is a concatenation of a U.S. state FIPS code and a soil survey area number. Since this table is intended to record only U.S. and U.S. territory soil survey areas, every soil survey area code in this table should be unique.
lab_calculations_including_estimates_and_default_values This table contains the results of calculations using measured, estimated and default values.
lab_chemical_properties This table contains the results of the analysis of chemical properties of the soil that are measured by the KSSL and cooperating university laboratories.
lab_combine_nasis_ncss This table is a linkage betwwen NASIS Pedon morphological and the Lab data. This table stores information about locations, layers or horizons of a pedon that has had samples submitted to the SSL at the National Soil Survey Center or one of the various laboratories at NCSS cooperating universities. Data is periodically transferred from the SSL database to this table.
lab_layer The Layer table contains information about the layer, including the field identifier, horizon designation, reporting, and depths. Layer in the LIMS can be a true soil horizon or a reporting layer which does not correspond to the traditional soil horizon. The lay_rpt_seq_num sets the order of display for layer reports. The layer is the intended way analytical results are reported in the LIMS. New to the LIMS is the ability to allow stratified or vertically subdivided layers to be reported.
lab_major_and_trace_elements_and_oxides This table contains major and trace elements, total oxides, x-ray data and the method codes related to the methods used.
lab_method_code This table relates a non-connotative method code with the preparation, instrument, size fraction, analyte, procedure (if any) and calculation algorithm (if any) represented by that method code.
lab_mineralogy_glass_count The Optical Mineralogy glass count Count Data table holds the raw data for the sample. The table hold what type of optical count, the size fraction and replicate number.
lab_mir Spectroscopy is the study of the interaction between matter and electromagnetic radiation. It is based on the principle that molecular vibrations and electronic transitions associated with soil constituents absorb light while interacting with radiation.
lab_mir_wavelength The lab_mir_wavelength table is a lookup of the collection of wavelengths used for sampling the samples
lab_pedon The LIMS Pedon table holds general information about the pedon. The pedon is associated with a site. The converse is not necessarily true.
lab_physical_properties This table contains the results of analysis of physical properties of the soil that are measured by the KSSL and cooperating university laboratories.
lab_preparation The Preparation table holds the information about the type of preparation, size of the orginal and final fractions prepared, default container type and preparation sort order.
lab_rosetta_key ROSETTA, which implements five hierarchical pedotransfer functions (PTFs) for the estimation of water retention, and the saturated and unsaturated hydraulic conductivity.
lab_site The LIMS Site table holds the site (location) information about the place where the samples or pedon originated.
lab_webmap This table is used to develop a webmap which is used to interactive display of geographic information pertaining to laboratory information. The interactive map offers a national database of soil characterization data, allowing location of pedons spatially. Pedons are the smallest unit of soil containing all the soil horizons of a particular soil type. Clicking on a pedon location within the interactive map will enable users to access lab data about that pedon. View the new interactive map by clicking here offsite link image. You can also view the map on the Soil Lab Data Mart website (bottom left of page) here: https://ncsslabdatamart.sc.egov.usda.gov/.
lab_xray_and_thermal The X-ray Diffraction Data table holds the raw data for the each mineral in the sample.
table_name table_description

3.2 Column Descriptions

4 Manually Written Queries

4.1 Sequoia-Kings Canyon (CA792)

This example demonstrates a 2-stage approach to identifying pedons that have been sampled for laboratory characterization by soil survey area symbol.

Load required libraries

library(soilDB)
library(aqp)

Querying by soil survey area code is useful for reviewing those data collected to support a new soil survey or to perform updates to an existing soil survey.

sql <- "
SELECT
pedon_key, pedlabsampnum, pedoniid, upedonid, longitude_decimal_degrees, latitude_decimal_degrees,
area_code
FROM
lab_combine_nasis_ncss
JOIN
lab_area ON area_key = ssa_key AND area_type = 'ssa'
WHERE area_code = 'CA792'
;"

pedons <- SDA_query(sql)
nrow(pedons)
## [1] 39
head(pedons)
##   pedon_key pedlabsampnum pedoniid      upedonid longitude_decimal_degrees latitude_decimal_degrees
## 1     69913      13N97435  1147616 S2013CA107001                 -118.7045                 36.44555
## 2     69914      13N97436  1147617 S2013CA107002                 -118.9434                 36.72390
## 3     69915      13N97437  1147615 S2013CA019003                 -118.6617                 36.79047
## 4     70018       13N0500  1147611 S2012CA019001                 -118.5989                 37.10028
## 5     70019       13N0501  1147612 S2012CA019002                 -118.6431                 37.12112
## 6     70020       13N0502  1147613 S2012CA019003                 -118.6934                 37.15224
##   area_code
## 1     CA792
## 2     CA792
## 3     CA792
## 4     CA792
## 5     CA792
## 6     CA792

Review pedon locations with an interactive map.

library(mapview)
library(sf)

pedons.sub <- subset(pedons, subset = !is.na(longitude_decimal_degrees) & !is.na(latitude_decimal_degrees))

pedons.sf <- st_as_sf(pedons.sub, coords = c('longitude_decimal_degrees', 'latitude_decimal_degrees'), crs = 4326)
mapview(pedons.sf)

Connect physical and chemical properties to sampled “layers” (usually genetic soil horizons).

Join conditions: * lab_pedon -> lab_layer [pedon_key] * lab_layer -> lab_physical_properties [labsampnum]

# make a quoted vector of pedon keys, for use in the next query
pedon.keys <- format_SQL_in_statement(pedons$pedon_key)

## layer + physical properties
sql <- sprintf("
SELECT
l.pedon_key, l.labsampnum, hzn_top, hzn_bot, hzn_desgn,
sand_total, silt_total, clay_total, particle_size_method,
ph_h2o, ph_h2o_method,
bulk_density_third_bar, bulk_density_third_bar_method
FROM
lab_layer AS l
JOIN lab_physical_properties AS p ON l.labsampnum = p.labsampnum
JOIN lab_chemical_properties AS c ON l.labsampnum = c.labsampnum
WHERE l.pedon_key IN %s
ORDER BY l.pedon_key, hzn_top
;", pedon.keys)

hz <- SDA_query(sql)
nrow(hz)
## [1] 222
head(hz)
##   pedon_key labsampnum hzn_top hzn_bot hzn_desgn sand_total silt_total clay_total
## 1     69913   13N13137       0       2        Oi         NA         NA         NA
## 2     69913   13N13138       2       7         A       71.2       21.7        7.1
## 3     69913   13N13139       7      19       Bt1       75.0       19.1        5.9
## 4     69913   13N13140      19      39       Bt2       79.9       15.7        4.4
## 5     69913   13N13141      39      64       Bt3       83.0       11.9        5.1
## 6     69913   13N13142      64      92       Bt4       83.3       10.6        6.1
##   particle_size_method ph_h2o ph_h2o_method bulk_density_third_bar bulk_density_third_bar_method
## 1                 <NA>     NA          <NA>                     NA                          <NA>
## 2               3A1a1a    6.1        4C1a2a                   1.12                         DbWR1
## 3               3A1a1a    5.3        4C1a2a                   1.31                         DbWR1
## 4               3A1a1a    5.3        4C1a2a                   1.49                         DbWR1
## 5               3A1a1a    5.5        4C1a2a                   1.62                         DbWR1
## 6               3A1a1a    5.6        4C1a2a                   1.79                         DbWR1
# unique set of methods
table(hz$particle_size_method)
## 
## 3A1a1a 
##    201

Investigate soil texture.

# remove missing values
ssc <- na.omit(hz[, c('sand_total', 'silt_total', 'clay_total')])

# re-name columns, required by textureTriangleSummary()
names(ssc) <- c('SAND', 'SILT', 'CLAY')

textureTriangleSummary(ssc, cex = 0.5)

Review thematic (1:1 soil:water pH) soil profile sketches.

# make a copy and
x <- hz

# init SoilProfileCollection
depths(x) <- pedon_key ~ hzn_top + hzn_bot
hzdesgnname(x) <- 'hzn_desgn'

par(mar = c(0, 0, 3, 0))
plotSPC(
  x, 
  print.id = FALSE, 
  name.style = 'center-center', 
  width = 0.33, 
  color = 'ph_h2o', 
  depth.axis = list(cex = 0.85, line = -4)
)

4.2 Investigate Bx Horizons

library(soiltexture)
library(scales)

# SDA Interface, this requires a network connection but otherwise it just WORKS!
qq <- "SELECT
hzn_top, hzn_bot, hzn_desgn, sand_total AS sand, silt_total AS silt, clay_total AS clay
FROM lab_layer
JOIN lab_physical_properties ON lab_layer.labsampnum = lab_physical_properties.labsampnum
WHERE hzn_desgn LIKE 'B%x%' ;"

# run query
bx <- SDA_query(qq)

##
## summarize sand, silt, clay for Bx horizons  
##

# number of records
nrow(bx)
## [1] 4056
# preview data
head(bx)
##   hzn_top hzn_bot hzn_desgn sand silt clay
## 1     142     168       Bx3 26.5 55.3 18.2
## 2      66      91     Btx21 52.9 37.7  9.4
## 3     114     152      Btx3 45.9 41.6 12.5
## 4      81      99       Bx2 56.7 30.8 12.5
## 5      60      75      Bx/C 41.9 23.1 35.0
## 6      70     130        Bx 56.0 27.1 16.9
# extract components of texture, removing rows with missing data
ssc <- bx[, c('sand', 'silt', 'clay')]
ssc <- na.omit(ssc)

# adjust names for plotting with TT.plot()
# names must be SAND, SILT, CLAY
names(ssc) <- toupper(names(ssc))

# test of bogus data
ssc$sum <- rowSums(ssc[, c('SAND', 'SILT', 'CLAY')])
# > 5% deviation from 100%
idx <- which(abs(ssc$sum - 100) > 5)

# check errors: just one
ssc[idx, ]
##      SAND SILT CLAY  sum
## 3284 57.5 23.8  5.6 86.9
# plot data
# note that there are many arguments used to adjust style
TT.plot(
  class.sys= "USDA-NCSS.TT",    # use "our" texture triangle
  tri.data=ssc,                 # data.frame with sand, silt, clay values
  main= "Bx Horizons",          # title
  tri.sum.tst=FALSE,            # do not test for exact sum(sand, silt, clay) == 100
  cex.lab=0.75,                 # scaling of label text
  cex.axis=0.75,                # scaling of axis
  cex=0.5,                      # scaling of point symbols
  col=alpha('royalblue', 0.125),  # color of point symbols, with transparency
  frame.bg.col='white',         # background color
  class.lab.col='black',        # color for texture class labels
  lwd.axis=1.5,                    # line thickness for axis
  arrows.show=TRUE
)

5 Convenience Functions

5.1 fetchLDM() via SDA

Safely iterate over soil series names, querying “correlated as” when present, otherwise “sampled as” when NULL.

# useful for iteration with progress bar
library(purrr)

getData <- function(i) {
  
  # formulate WHERE clause for series 'i'
  .where <- sprintf("CASE WHEN corr_name IS NOT NULL THEN LOWER(corr_name) ELSE LOWER(samp_name) END = '%s' ", i)
  
  # query SDA
  lab <- fetchLDM(
    WHERE = .where, 
  )
  
  # return results
  return(lab)
}

# series names to iterate over
taxa <- c('pierre', 'lucy', 'zook')

# result is a list of SoilProfileCollection objects
s <- map(taxa, .f = getData, .progress = TRUE)

# flatten into a single SoilProfileCollection
s <- combine(s)

# graphicaly check a couple
par(mar = c(1, 1, 3, 3))
plotSPC(s[1:10, ], color = 'clay_total', width = 0.33, name.style = 'center-center', max.depth = 150)


This document is based on aqp version 2.0.3 and soilDB version 2.8.1.