Introduction

This is a short tutorial on how to interact with the Soil Data Access (SDA) web-service using R. Queries are written using a dialect of SQL. On first glance SQL appears similar to the language used to write NASIS queries and reports, however, these are two distinct languages. Soil Data Access is a "window" into the spatial and tabular data associated with the current SSURGO snapshot. Queries can contain spatial and tabular filters. If you are new to SDA or SQL, have a look at this page.

If this is your first time using SDA, please see a related tutorial to get started.

Additional tips on advanced spatial queries can be found here.

[details pending]

Follow along with the blocks of code below by copying / pasting into a new R "script" document. Each block of commands can be run by pasting them into the R console, or by "stepping through" lines of code by moving the cursor to the top of a block (in the R script panel) and repeatedly pressing ctrl + enter.

Install Required R Packages

You only need to do this once. If you haven't installed these packages, then copy the code below and paste into the RStudio "console" pane.

# run these commands in the R console
# stable version from CRAN + dependencies
install.packages("httr", dep=TRUE)
install.packages("soilDB", dep=TRUE)
install.packages("rgdal", dep = TRUE)
install.packages("raster", dep = TRUE)
install.packages("rgeos", dep = TRUE)

Critical Note: SSURGO vs. STATSGO

SSURGO (1:24k soil survey) and STATSGO (1:250k soil survey) records are stored together in SDA. Therefore, it is critical that evey query to SDA include some kind of filter for selecting the appropriate records. Filtering strategies include:

  • explicit exclusion of STATSGO records, via legend.areasymbol != 'US' in the WHERE clause
  • implicit exclusion of STATSGO records, via SSURGO areasymbol in the WHERE clause
  • spatial queries using SDA helper functions: e.g. SDA_Get_Mukey_from_intersection_with_WktWgs84()
  • explicit selection of SSURGO / STATSGO records by record ID: e.g. mukey, cokey, etc.

Simple Queries

These queries implicitly filter-out STATSGO via SDA_Get_MupolygonWktWgs84_from_Mukey.

library(soilDB)
library(rgeos)
library(sp)
library(raster)
library(maps)


# get polygons for a single mukey
q <- "SELECT G.MupolygonWktWgs84 as geom, '462594' as mukey from SDA_Get_MupolygonWktWgs84_from_Mukey('462594') as G"
res <- SDA_query(q)

# result is a data.frame, "MupolygonWktWgs84" contains WKT representation of geometry
str(res)
## 'data.frame':    38 obs. of  2 variables:
##  $ geom : chr  "POLYGON ((-120.77740109665115 37.628472182459824, -120.77724116990088 37.628467823952107, -120.77715902715808 3"| __truncated__ "POLYGON ((-120.87307467552661 37.62900777895004, -120.87300762026977 37.628995038704751, -120.87297861832062 37"| __truncated__ "POLYGON ((-120.64105231387907 37.63978440993904, -120.6407408423507 37.639925058964508, -120.64023222870367 37."| __truncated__ "POLYGON ((-120.84573357077223 37.618002507943245, -120.84559225131953 37.617927071505925, -120.84513543753337 3"| __truncated__ ...
##  $ mukey: int  462594 462594 462594 462594 462594 462594 462594 462594 462594 462594 ...
##  - attr(*, "SDA_id")= chr "Table"
# convert to SPDF
s <- processSDA_WKT(res)

# check
head(s@data)
##   gid  mukey
## 1   1 462594
## 2   2 462594
## 3   3 462594
## 4   4 462594
## 5   5 462594
## 6   6 462594
plot(s)

# get polygons associated with map units that contain "amador" as a major component
q <- "select G.MupolygonWktWgs84 as geom, mapunit.mukey, muname
FROM mapunit
CROSS APPLY SDA_Get_MupolygonWktWgs84_from_Mukey(mapunit.mukey) as G
WHERE mukey IN (SELECT DISTINCT mukey FROM component WHERE compname like 'amador%' AND majcompflag = 'Yes')"

# result is a data.frame, "MupolygonWktWgs84" contains WKT representation of geometry
res <- SDA_query(q)
str(res)
## 'data.frame':    262 obs. of  3 variables:
##  $ geom  : chr  "POLYGON ((-121.02343389703034 38.345582476645284, -121.02352810937913 38.345557666497072, -121.02377101663964 3"| __truncated__ "POLYGON ((-121.02221248788987 38.367868787138981, -121.022323632077 38.367780609633911, -121.02235095713316 38."| __truncated__ "POLYGON ((-121.02388803359727 38.390708806570395, -121.02378946242855 38.39055424377397, -121.02353549108859 38"| __truncated__ "POLYGON ((-121.02298714184944 38.350975729070726, -121.02289158760166 38.3507891479186, -121.02293316234896 38."| __truncated__ ...
##  $ mukey : int  1612048 1612048 1612048 1612048 1612048 1540891 1403440 1403440 1540891 1540891 ...
##  $ muname: chr  "Amador-Gillender complex, 2 to 15 percent slopes" "Amador-Gillender complex, 2 to 15 percent slopes" "Amador-Gillender complex, 2 to 15 percent slopes" "Amador-Gillender complex, 2 to 15 percent slopes" ...
##  - attr(*, "SDA_id")= chr "Table"
# convert to SPDF
s <- processSDA_WKT(res)

# check: OK
head(s@data)
##   gid   mukey                                           muname
## 1   1 1612048 Amador-Gillender complex, 2 to 15 percent slopes
## 2   2 1612048 Amador-Gillender complex, 2 to 15 percent slopes
## 3   3 1612048 Amador-Gillender complex, 2 to 15 percent slopes
## 4   4 1612048 Amador-Gillender complex, 2 to 15 percent slopes
## 5   5 1612048 Amador-Gillender complex, 2 to 15 percent slopes
## 6   6 1540891              Amador loam, 8 to 30 percent slopes
# map
par(mar=c(0,0,0,0))
map('county', 'California', xlim=c(-123.25, -118.75), ylim=c(36.5, 39))
plot(s, add=TRUE)


This document is based on soilDB version 2.0-1.