
Submit a query to the Soil Data Access (SDA) REST/JSON web service and return the results as a data.frame. Each query is limited to 100,000 records and 32 MB of serialized JSON. Queries should contain a WHERE clause or JOIN condition to limit the number of rows affected or returned. Consider wrapping calls to SDA_query() in a function that iterates over logical chunks (e.g., by areasymbol, mukey, or cokey). The function makeChunks() can help with such iteration. Every use of SDA_query() should handle the possibility of a try-error result, which is returned when the web service connection is down or an invalid query is passed to the endpoint.
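The chunked iteration described above can be sketched as follows. This is a minimal illustration, not part of soilDB: the helper name `query_by_chunk()` and its `query_fun` argument are hypothetical, and the sketch assumes `makeChunks()` returns one chunk label per element of its input.

```r
library(soilDB)

# Hypothetical helper (not part of soilDB): query component records for a
# vector of mukeys, one chunk at a time, and combine the results.
query_by_chunk <- function(mukeys, size = 1000, query_fun = SDA_query) {
  # one chunk label per mukey; labels are used to split the keys into groups
  chunks <- makeChunks(mukeys, size = size)

  res <- lapply(split(mukeys, chunks), function(keys) {
    q <- sprintf(
      "SELECT mukey, cokey, compname, comppct_r FROM component WHERE mukey IN (%s);",
      paste(keys, collapse = ", ")
    )
    x <- query_fun(q)
    # drop chunks that failed (try-error) or returned no rows (NULL)
    if (inherits(x, "try-error") || is.null(x)) {
      return(NULL)
    }
    x
  })

  # rbind() ignores NULL elements; the result is NULL if every chunk was dropped
  do.call(rbind, unname(res))
}
```

Passing the query function as an argument keeps the helper easy to exercise without a network connection, for example with a stand-in function that returns a small data.frame.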


SDA_query(q, dsn = NULL)



q

character. A valid T-SQL query surrounded by double quotes.


dsn

character. Default: NULL uses the Soil Data Access remote data source via the REST API. Alternatively, dsn may be a file path to an SQLite database using the SSURGO schema, or a DBIConnection that has already been created.


A data.frame result for queries that return a single table. A list of data.frame for queries that return multiple tables. NULL if result is empty, and try-error on error.
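Because the return value can take four different forms, calling code usually needs to branch on it before use. The sketch below shows one way to do that in base R; the helper name `describe_result()` is hypothetical and not part of soilDB.

```r
# Hypothetical helper (not part of soilDB): classify the value returned by
# SDA_query() so that calling code can branch on it.
describe_result <- function(x) {
  if (inherits(x, "try-error")) {
    return("error")            # connection problem or invalid query
  }
  if (is.null(x)) {
    return("empty")            # query ran but returned no rows
  }
  if (is.data.frame(x)) {
    return("single table")     # checked before is.list(): a data.frame is also a list
  }
  if (is.list(x)) {
    return("multiple tables")  # list of data.frame, one per result set
  }
  "unexpected"
}
```

A typical use would be `res <- SDA_query(q)` followed by a check such as `if (describe_result(res) == "single table") { ... }`.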


The SDA website can be found at and query examples can be found at A library of query examples can be found at

SSURGO (detailed soil survey) and STATSGO (generalized soil survey) data are stored together within SDA. This means that queries that don't specify an area symbol may result in a mixture of SSURGO and STATSGO records. See the examples below and the SDA Tutorial for details.


This function requires the httr, jsonlite, and xml2 packages.


D.E. Beaudette, A.G. Brown


# \donttest{
  ## get SSURGO export date for all soil survey areas in California
  # there is no need to filter STATSGO
  # because we are filtering on SSURGO area symbols
  q <- "SELECT areasymbol, saverest FROM sacatalog WHERE areasymbol LIKE 'CA%';"
  x <- SDA_query(q)
#> single result set, returning a data.frame
  head(x)
#>   areasymbol              saverest
#> 1      CA011  8/28/2023 9:44:47 PM
#> 2      CA013  9/12/2023 8:16:32 PM
#> 3      CA021   9/6/2023 8:39:55 PM
#> 4      CA031 8/31/2023 10:37:14 PM
#> 5      CA033  8/28/2023 9:47:32 PM
#> 6      CA041 9/11/2023 11:41:42 PM

  ## get SSURGO component data associated with the
  ## Amador series / major component only
  # this query must explicitly filter out STATSGO data
  q <- "SELECT cokey, compname, comppct_r FROM legend
    INNER JOIN mapunit mu ON mu.lkey = legend.lkey
    INNER JOIN component co ON mu.mukey = co.mukey
    WHERE legend.areasymbol != 'US' AND compname = 'Amador';"

  res <- SDA_query(q)
#> single result set, returning a data.frame
  str(res)
#> 'data.frame':	54 obs. of  3 variables:
#>  $ cokey    : int  24044962 24601226 24047878 24051493 24067614 24610752 24613003 24069065 24044875 24044696 ...
#>  $ compname : chr  "Amador" "Amador" "Amador" "Amador" ...
#>  $ comppct_r: int  45 10 3 3 10 10 85 5 3 3 ...
#>  - attr(*, "SDA_id")= chr "Table"

  ## get component-level data for a specific soil survey area (Yolo county, CA)
  # there is no need to filter STATSGO because the query contains
  # an implicit selection of SSURGO data by areasymbol
  q <- "SELECT
    component.mukey, cokey, comppct_r, compname, taxclname,
    taxorder, taxsuborder, taxgrtgroup, taxsubgrp
    FROM legend
    INNER JOIN mapunit ON mapunit.lkey = legend.lkey
    LEFT OUTER JOIN component ON component.mukey = mapunit.mukey
    WHERE legend.areasymbol = 'CA113' ;"

  res <- SDA_query(q)
#> single result set, returning a data.frame
  str(res)
#> 'data.frame':	609 obs. of  9 variables:
#>  $ mukey      : int  459154 459204 459205 459208 459208 459208 459208 459209 459209 459209 ...
#>  $ cokey      : int  24793311 24793145 24793604 24793157 24793158 24793159 24793156 24793317 24793320 24793318 ...
#>  $ comppct_r  : int  100 100 100 5 85 5 5 5 2 3 ...
#>  $ compname   : chr  "Water" "Gravel pits" "Water" "Positas" ...
#>  $ taxclname  : chr  NA NA NA NA ...
#>  $ taxorder   : chr  NA NA NA NA ...
#>  $ taxsuborder: chr  NA NA NA NA ...
#>  $ taxgrtgroup: chr  NA NA NA NA ...
#>  $ taxsubgrp  : chr  NA NA NA NA ...
#>  - attr(*, "SDA_id")= chr "Table"

  ## get tabular data based on result from spatial query
  # there is no need to filter STATSGO because
  # SDA_Get_Mukey_from_intersection_with_WktWgs84() implies SSURGO
  p <- wk::as_wkt(wk::rct(-120.9, 37.7, -120.8, 37.8))
  q <- paste0("SELECT mukey, cokey, compname, comppct_r FROM component
      WHERE mukey IN (SELECT DISTINCT mukey FROM
      SDA_Get_Mukey_from_intersection_with_WktWgs84('", p,
       "')) ORDER BY mukey, cokey, comppct_r DESC")

  x <- SDA_query(q)
#> single result set, returning a data.frame
  str(x)
#> 'data.frame':	337 obs. of  4 variables:
#>  $ mukey    : int  462527 462527 462527 462554 462554 462554 462555 462555 462555 462558 ...
#>  $ cokey    : int  24613423 24613424 24613425 24613451 24613452 24613453 24613021 24613022 24613023 24613673 ...
#>  $ compname : chr  "Alamo" "Madera" "San Joaquin" "Corning" ...
#>  $ comppct_r: int  85 10 5 85 5 10 85 5 10 85 ...
#>  - attr(*, "SDA_id")= chr "Table"
# }