Introduction

During FY15 the Las Cruces SSO was contacted by New Mexico State University and NM Department of Transportation, who were interested in dust mitigation for a large playa along interstate 10 near the AZ - NM border. The project area falls within the Hidalgo County Soil Survey (NM023). A majority of the playa was mapped as miscellaneous area “playa” components and component, Hondale soils (Fine, mixed, superactive, thermic Typic Natrargids). During field reconnaissance in FY14 & 16, the survey crew determined that there was a new component that was previously un-described. This component classifies as a Fine-loamy, mixed, superactive, thermic Sodic Haplocalcids, and there are currently no soil OSD’s that fit this classification within MLRA 41. It was determined during a field review that we needed to establish a new soil OSD to capture this component.

As the majority of my previous work in R focused in spatial data analysis and modeling, I decided that I would use this opportunity to work with the NASIS tabular data in R. This report contains my process of exploratory data analysis, to review the pedon data being used to create the Highlonesome soil OSD and it’s relevant range in characteristics. I have also taken this opportunity to use R Markdown to knit an HTML document as my final product.

*note this document currently has the following options set: knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE). I did this to clean up the output of the final document. I have also commented out (#) specific lines where I was examining the structure of an object.

Review of pedon data for the creation of Highlonesome soil OSD

Using the fetchNASIS function of the soilDB package to bring in pedon data from NASIS selected set. My selected set contains pedons collected during FY15 and FY16 for the Animas Valley Playa update soil survey project.

library(soilDB)

# get pedons from NASIS
p <- fetchNASIS() # no warning errors when loading pedons

length(p) 
## [1] 42
# There are 42 pedons in the selected set

# reviewing the SPC object p
class(p)
## [1] "SoilProfileCollection"
## attr(,"package")
## [1] "aqp"
#str(p)

# get list of site ids for reference puropsesx
s.id <- p@site$site_id

#print(s.id)

Pedon Locations

Creating a map of the pedon locations

library(sp)
library(maps)

# subsetting WGS84 decimal degree corrdinates from p SPC
p.locations <- site(p)[, c('site_id', 'x_std', 'y_std')]

# initialize coordinates in an SPDF
coordinates(p.locations) <- ~ x_std + y_std

# define coordinate system
proj4string(p.locations) <- '+proj=longlat +datum=WGS84'

# set plot margins
par(mar=c(1,1,1,1))

# plot county boundaries for all of NM
map('county', 'new mexico')

# add pedon data locations
points(p.locations, cex = 0.5, pch = 3, col = 'red')

# all of the pedons fall within Hidalgo County, need to zoom in to map

# get extent from p.locations

p.locations@bbox # used this as a rough quide for extent of map
##              min      max
## x_std -108.95114 108.6004
## y_std   31.80595  32.9217
# set up custom xy coordinates for new map extent
xmin <- p.locations@bbox[1,1]-.04
xmax <- p.locations@bbox[1,2]*-1 +.1
ymin <- p.locations@bbox[2,1]-.1
ymax <- p.locations@bbox[2,2]-.3

xlim <- c(xmin, xmax)
ylim <- c(ymin, ymax)

# create new map
map('county', 'New Mexico', xlim=xlim, ylim=ylim)

#add pedon data locations, not the symbol styling
points(p.locations, cex = 1, pch = 3, col = 'red')
box()

# write points to shape file for future use
library(rgdal)

writeOGR(p.locations, dsn = "C:/workspace1", layer = "highlonesome.shp", driver = "ESRI Shapefile", overwrite_layer = TRUE)

Adding Pedon Horizon Lab Results Data

Our office collects pH and EC in the field, using hand held digital meters. The current data structure of NASIS only allows us to populate EC in the pedon horizon lab results table. I realized that the SPC does not contain the pedon horizon lab results. The following was my work around. I have since realized that there is another way to do this utilizing the fetchNASISLabData() function or if using fetchNASIS(lab = TRUE).

# We are interested in seeing the range of EC values for the surface horizon, since we populate these values in the pedon horizon lab results, I will need to bring in the data as a csv and join to the SPC

library(readr)
# bringing in csv file containg ec data for every horizon
ec <- read_csv("C:/workspace1/ec.csv")

# inspecting ec object
head(ec)
## # A tibble: 6 x 6
##   `Pedon Rec ID` `Phorizon Rec ID` `Horizon Designation` `Top Depth`
##            <int>             <int> <chr>                       <int>
## 1        1093103           5073517 A                               0
## 2        1093103           5073518 Bn                             10
## 3        1093103           5073519 Bw1                            35
## 4        1093103           5073520 Bw2                            75
## 5        1093103           5073522 2Bn                           100
## 6        1093103           5073521 3Cn                           130
## # ... with 2 more variables: `Bottom Depth` <int>, EC <dbl>
#str(ec)
#head(p)

# changing column names
names(ec) <- c('peiid', 'phiid', 'name', 'top', 'bottom', 'ec')

# convert to data frame
ec.df <- as.data.frame(ec)

class(ec.df)
## [1] "data.frame"
head(ec)
## # A tibble: 6 x 6
##     peiid   phiid name    top bottom    ec
##     <int>   <int> <chr> <int>  <int> <dbl>
## 1 1093103 5073517 A         0     10  2.30
## 2 1093103 5073518 Bn       10     35  2.10
## 3 1093103 5073519 Bw1      35     75  4.10
## 4 1093103 5073520 Bw2      75    100  6.30
## 5 1093103 5073522 2Bn     100    130  2.50
## 6 1093103 5073521 3Cn     130    152  1.30
# set up new data frame ith only phiid and ec
ec.df <- as.data.frame(ec[, c(2,6)])

# extract horizon data from p
h <- horizons(p)

#str(h)

# add ec data to h 
h1 <- merge(h, ec.df, by = 'phiid')

#head(h1)

# create new SPC p1 from p and replace horizons with h1
p1 <- p
horizons(p1) <- h1
#str(p1)

Review of pedons

Highlonesome is a Fine-loamy, mixed, superactive, thermic, Sodic Haplocalcids soil formed in lacustrine and alluvial deposits of playa lakes found typically around the margins in playa steps.

Review of Landform data

library(Hmisc)# required for plot

# create lf object 
lf <- sort(table(p1$landform_string), decreasing = TRUE)

# plot of landforms
dotchart2(lf, col = 'black', xlim = c(0, max(lf)), cex.labels = 0.75)

# note 3 pedons should be reviewed one has fan piedmont, one has fan skirt, and one is swale, review in ArcGIS and fix in NASIS or remove from selected profiles
head(lf)
## 
##    playa step         playa alluvial flat   basin floor shore complex 
##            13            12             5             3             3 
##   playa slope 
##             2
#print(lf)
lf.rm <- lf[c(7:9)]
names(lf.rm)
## [1] "fan piedmont"           "fan skirt & playa step"
## [3] "swale"
# create an index with landforms to review
lf.idx <- which(p1$landform_string == 'swale' | p1$landform_string == 'fan piedmont' | p1$landform_string == 'fan skirt & playa step')

lf1 <- p1[lf.idx, ]

# site ids to review landforms on
print(lf1$site_id)
## [1] "2015NM023001" "2016NM023013" "2016NM023052"
#> print(lf1$site_id)
#[1] "2015NM023001" "2015NM023147" "2016NM023017"

Review of Textures

Due to the nature of these soils, rock fragments may be present but at less than 15 percent.

# looking into soil textures
sort(table(p$texture), decreasing=TRUE)
## 
##     SCL      CL       L    SICL      SL     SIL       C     SIC     COS 
##      51      41      27      18      12       9       7       2       1 
##      FS   GR-CL  GR-SCL GRV-COS GRV-SCL     LFS      LS 
##       1       1       1       1       1       1       1
# there are seversal gravelly and very gravelly textured pedons in NASIS, I want to remove them because Highlonesome sould have < 15% gravels

# inspecting data slots of SPC object p
#p$texture
#p$texture_class
#p$fragvoltot

# create an index for gravelly textures
# idx <- which(p$fragvoltot > 15)
#p2 <- p[idx, ]
#p2
# this method was not working for me used subsetProfiles below

p2 <- subsetProfiles(p, h = 'fragvoltot > 15')

table(p2$texture)
## 
##       C      CL   GR-CL  GR-SCL GRV-COS GRV-SCL       L     SCL      SL 
##       2       1       1       1       1       1       1       1       5
#table(p2$fragvoltot)


# plot profiles that have rock fragments
par(mar=c(0,0,2,1))
plot(p2, label='site_id')

# get site_id's for pedons with any horizon that has >% RF
print(p2$site_id)
## [1] "2015NM023074" "2015NM023147" "2015NM023034"
#> print(p2$site_id)
#[1] "2015NM023074" "2015NM023147" "2015NM023034"
# I changed these to Highlonesome family in NASIS

# I want to remove the profiles with > 15% RF from any further analysis
p1@site$pedon_id
##  [1] "2015NM023074" "2015NM023075" "2015NM023001" "2015NM023002"
##  [5] "2015NM023005" "2015NM023007" "2015NM023011" "2015NM023012"
##  [9] "2015NM023013" "2015NM023149" "2016NM023005" "2016NM023011"
## [13] "2016NM023020" "2016NM023021" "2016NM023022" "2016NM023053"
## [17] "2016NM023056" "2015NM023124" "2015NM023125" "2015NM023126"
## [21] "2015NM023127" "2015NM023129" "2015NM023132" "2015NM023133"
## [25] "2015NM023137" "2015NM023140" "2015NM023142" "2015NM023144"
## [29] "2015NM023147" "2016NM023013" "2016NM023014" "2016NM023015"
## [33] "2015NM023048" "2015NM023056" "2015NM023050" "2015NM023049"
## [37] "2015NM023035" "2015NM023034" "2016NM023017" "2016NM023019"
## [41] "2016NM023052" "2016NM023073"
# creating new spc objects without the pedons above
p3 <- p1[2:28]
p4 <- p1[30:36]
p5 <- p1[38:42]

length(p1) 
## [1] 40
# original number of pedons

# combining subset spcs into new spc called p1
p1 <- rbind(p3, p4, p5)

length(p1)
## [1] 37
# new number of pedons


# creating soil profile plots to look graphically look at the clay contents

par(mar=c(0,0,3,0)) # tighter figure margins
plot(p1[1:19, ], label = 'site_id', color = 'clay')

# note 2016NM023006 dosn't appear to be fine-loamy, review in NASIS
# this pedon is coarse-loamy remove from spc
p1@site$pedon_id
##  [1] "2015NM023075" "2015NM023001" "2015NM023002" "2015NM023005"
##  [5] "2015NM023011" "2015NM023012" "2015NM023013" "2015NM023149"
##  [9] "2016NM023005" "2016NM023011" "2016NM023020" "2016NM023021"
## [13] "2016NM023022" "2016NM023053" "2016NM023056" "2015NM023124"
## [17] "2015NM023125" "2015NM023126" "2015NM023127" "2015NM023129"
## [21] "2015NM023132" "2015NM023133" "2015NM023137" "2015NM023140"
## [25] "2015NM023142" "2015NM023144" "2015NM023147" "2016NM023014"
## [29] "2016NM023015" "2015NM023048" "2015NM023050" "2015NM023049"
## [33] "2015NM023035" "2015NM023034" "2016NM023019" "2016NM023052"
## [37] "2016NM023073"
# removing specific record
p1 <- p1[-10, ]

# verifying pedon id's
p1@site$pedon_id
##  [1] "2015NM023075" "2015NM023001" "2015NM023002" "2015NM023005"
##  [5] "2015NM023011" "2015NM023012" "2015NM023013" "2015NM023149"
##  [9] "2016NM023005" "2016NM023020" "2016NM023021" "2016NM023022"
## [13] "2016NM023053" "2016NM023056" "2015NM023124" "2015NM023125"
## [17] "2015NM023126" "2015NM023127" "2015NM023129" "2015NM023132"
## [21] "2015NM023133" "2015NM023137" "2015NM023140" "2015NM023142"
## [25] "2015NM023144" "2015NM023147" "2016NM023014" "2016NM023015"
## [29] "2015NM023048" "2015NM023050" "2015NM023049" "2015NM023035"
## [33] "2015NM023034" "2016NM023019" "2016NM023052" "2016NM023073"
par(mar=c(0,0,3,0)) # tighter figure margins
plot(p1[20:38, ], label = 'site_id', color = 'clay')

Range in Characteristics

Site Data

library(plyr)
library(soilReports)
library(reshape2)
library(knitr)

# get horizon and site level data from the SPC
h <- horizons(p1)
s <- site(p1)

Elevation, Slope, Aspect

# creating a vector of variables to use below
vars <- c('elev_field', 'slope_field')

# extracting elevation and slope from the site data
elev.slp <- subset(s, select = vars)

# not necessary, but renaiming variables
colnames(elev.slp) <- c("Elevation", "Slope")

# convert data from wide format to long for the plyr package
elev.slp <- melt(elev.slp)

# create a table with the ric
elev.slp.t <- ddply(elev.slp, .(variable), summarize, range = prettySummary(value))

# aspect - have to treat data differently since it is circular data
# creating a vector of which variable to select
vars <- c("aspect_field")

# extracting aspect data from site data
aspect <- subset(s, select = vars)

# renaming column
colnames(aspect) <- c("Aspect")

# converting data from wide format to long
aspect <- melt(aspect)

library(circular)# necessary for circular data
# extracting aspect values and replacing with circular data
aspect$value <- circular(aspect$value, template="geographic", units="degrees", modulo="2pi")

# create table with summarized ric
aspect.t <- ddply(aspect, .(variable), summarize, range = prettySummary(value))

# combined Elev, Slp, and Aspect using the rbind below
kable(rbind(elev.slp.t, aspect.t), caption = "Elevation, Slope, Aspect (min, 25th, median, 75th, max)(n)", align = 'c')
Elevation, Slope, Aspect (min, 25th, median, 75th, max)(n)
variable range
Elevation (1260, 1266, 1268, 1270, 1320)(36)
Slope (0, 0, 0.6, 1, 3.7)(36)
Aspect (135, 56, 0, 307, 182)(32)

Slope Shape

# inspecting slope shape
s$shapedown
##  [1] linear  linear  linear  linear  linear  linear  linear  linear 
##  [9] linear  linear  linear  linear  linear  linear  linear  linear 
## [17] concave linear  linear  linear  linear  linear  linear  linear 
## [25] linear  linear  linear  linear  linear  linear  linear  linear 
## [33] linear  linear  linear  linear 
## Levels: concave linear convex undulating complex
s$shapeacross
##  [1] linear linear linear linear linear linear linear linear linear linear
## [11] linear linear linear linear linear linear linear linear linear linear
## [21] linear linear linear linear linear linear linear linear linear linear
## [31] linear linear linear linear linear linear
## Levels: concave linear convex undulating complex
# creating a vector of variables
vars <- c('shapedown', 'shapeacross')

# subsetting site data selecting only variables
ss <- subset(s, select = vars)

# inspecting slope shape data table
#ss.t <- table(ss)

# dropping un-used data levels
ss.drop <- droplevels(ss)

# creating data table with slope shape
ss.t2 <- xtabs(~ shapedown + shapeacross, data = ss.drop, drop.unused.levels = TRUE)

# inspecting frequency table
ftable(ss.t2)
##           shapeacross linear
## shapedown                   
## concave                    1
## linear                    35
kable(ss.t2, caption = "Slope Shape (counts)", align = 'c')
Slope Shape (counts)
linear
concave 1
linear 35

Landform and Parent material

# review of gomorphic positions
library(Hmisc)# required for plot

# create lf object 
lf <- sort(table(p1$landform_string), decreasing = TRUE)

# plot of landforms
dotchart2(lf, col = 'black', xlim = c(0, max(lf)), cex.labels = 0.75)

## parent material vs landform

# select variables
vars <-  c('pmkind', 'landform_string')

# subset data
pm.l <- subset(s, select = vars)

# drop levels
pm.l <- droplevels(pm.l)

# create table
pm.l.t <- xtabs(~ pmkind + landform_string, data = pm.l)

# inspecting frequency table
ftable(pm.l.t)
##                            landform_string alluvial flat basin floor fan piedmont playa playa slope playa step shore complex swale
## pmkind                                                                                                                            
## alluvium                                               4           2            1     7           2         10             3     1
## alluvium & eolian deposits                             0           1            0     0           0          0             0     0
## alluvium & eolian sands                                0           0            0     0           0          1             0     0
## lacustrine deposits                                    0           0            0     3           0          1             0     0
kable(pm.l.t, caption = "Parent Material vs. Landform (counts)", align = 'c')
Parent Material vs. Landform (counts)
alluvial flat basin floor fan piedmont playa playa slope playa step shore complex swale
alluvium 4 2 1 7 2 10 3 1
alluvium & eolian deposits 0 1 0 0 0 0 0 0
alluvium & eolian sands 0 0 0 0 0 1 0 0
lacustrine deposits 0 0 0 3 0 1 0 0

Drainage

# select variables
vars <- c('drainagecl')

# subset data
drain <- subset(s, select = vars)

# rename cols
colnames(drain) <- c('Drainage Class')

# drop levels
#drain <- droplevels(drain)

# create table
drain.t <- table(drain)

drain.t # inspect table
## drain
##          excessively somewhat excessively                 well 
##                    0                    0                   36 
##      moderately well      somewhat poorly               poorly 
##                    0                    0                    0 
##          very poorly           subaqueous 
##                    0                    0
kable(drain.t, caption = "Drainage Class", align = 'c')
Drainage Class
drain Freq
excessively 0
somewhat excessively 0
well 36
moderately well 0
somewhat poorly 0
poorly 0
very poorly 0
subaqueous 0

Earth Cover Kind

earth cover kind 1 is on the x-axis and earth cover kind 2 is on the y-axis

# creating a vector of variables
vars <- c('earthcovkind1', 'earthcovkind2')

# subsetting data
eck <- subset(s, select = vars)

#eck #inspecting data element

eck <- droplevels(eck) #removing other levels

# create table
eck.t <- xtabs(~ earthcovkind1 + earthcovkind2, data = eck)

#eck.t #inspecting table

kable(eck.t, caption = "Earth Cover Kind - earth cover kind 1 is on the x-axis and earth cover kind 2 is on the y-axis", align = 'c')
Earth Cover Kind - earth cover kind 1 is on the x-axis and earth cover kind 2 is on the y-axis
rangeland, grassland rangeland, shrubby other grass/herbaceous cover native shrubs other shrub cover other barren
barren land 0 0 0 0 0 5
grass/herbaceous cover 18 0 2 0 0 0
shrub cover 0 9 0 1 1 0

Taxonomy

# create a vector of variables
vars <- c('pedon_id', 'taxonname', 'taxpartsize', 'taxsubgrp')

# selecting the subset of data
tax <- subset(s, select = vars)

# changing the col names
colnames(tax) <- c('Pedon ID', 'Taxon Name', 'Particle Size Class', 'Subgroup')

kable(tax, caption = 'Review of Pedon Taxonomy', align = 'c')
Review of Pedon Taxonomy
Pedon ID Taxon Name Particle Size Class Subgroup
2 2015NM023075 Highlonesome family fine-loamy sodic haplocalcids
3 2015NM023001 Highlonesome fine-loamy sodic haplocalcids
4 2015NM023002 Highlonesome, ponded fine-loamy sodic haplocalcids
5 2015NM023005 Highlonesome fine-loamy sodic haplocalcids
7 2015NM023011 Highlonesome, ponded fine-loamy sodic haplocalcids
8 2015NM023012 Highlonesome family fine-loamy sodic haplocalcids
9 2015NM023013 Highlonesome fine-loamy sodic haplocalcids
10 2015NM023149 Highlonesome, ponded fine-loamy sodic haplocalcids
11 2016NM023005 Highlonesome fine-loamy sodic haplocalcids
13 2016NM023020 Highlonesome, severely erodible fine-loamy sodic haplocalcids
14 2016NM023021 Highlonesome fine-loamy sodic haplocalcids
15 2016NM023022 Highlonesome fine-loamy sodic haplocalcids
16 2016NM023053 Highlonesome fine-loamy sodic haplocalcids
17 2016NM023056 Highlonesome family fine-loamy sodic haplocalcids
18 2015NM023124 Highlonesome, severley erodible fine-loamy sodic haplocalcids
19 2015NM023125 Highlonesome family fine-loamy sodic haplocalcids
20 2015NM023126 Highlonesome fine-loamy sodic haplocalcids
21 2015NM023127 Highlonesome, severely erodible fine-loamy sodic haplocalcids
22 2015NM023129 Highlonesome family fine-loamy sodic haplocalcids
23 2015NM023132 Highlonesome fine-loamy sodic haplocalcids
24 2015NM023133 Highlonesome fine-loamy sodic haplocalcids
25 2015NM023137 Highlonesome, severely erodible fine-loamy sodic haplocalcids
26 2015NM023140 Highlonesome family fine-loamy sodic haplocalcids
27 2015NM023142 Highlonesome family fine-loamy sodic haplocalcids
28 2015NM023144 Highlonesome fine-loamy sodic haplocalcids
29 2015NM023147 Highlonesome, family fine-loamy sodic haplocalcids
31 2016NM023014 Highlonesome fine-loamy sodic haplocalcids
32 2016NM023015 Highlonesome fine-loamy sodic haplocalcids
33 2015NM023048 Highlonesome fine-loamy sodic haplocalcids
35 2015NM023050 Highlonesome, severely erodible fine-loamy sodic haplocalcids
36 2015NM023049 Highlonesome fine-loamy sodic haplocalcids
37 2015NM023035 Highlonesome family fine-loamy sodic haplocalcids
38 2015NM023034 Highlonesome family fine-loamy sodic haplocalcids
40 2016NM023019 Highlonesome fine-loamy sodic haplocalcids
41 2016NM023052 Highlonesome, severely erodible fine-loamy sodic haplocalcids
42 2016NM023073 Highlonesome, severely erodible fine-loamy sodic haplocalcids

Diagnostic features

# extract diagnostic data from spc
d <- p1@diagnostic

#d #inspecting data

d <- d[, 2:4] # removing peiid col

#d #inspecting data

# using the transform function to add thickness
d <- transform(d, 
               thickness = featdepb - featdept,
               featkind = as.character(featkind) 
)

#d #inspecting data

# changing from wide to long format
d <- melt(d, id.vars = 'featkind', measure.vars = c('featdept', 'featdepb', 'thickness'))

#d #inspecting data

# create a table with the ric
d.t <- ddply(d, .(variable, featkind), summarize, range = prettySummary(value))

# dcast used to aggregate data back to feature kind on x axis and depths and thick on y
d.t <- dcast(d.t, featkind ~ variable, value.var = 'range')

kable(d.t, caption = "Diagnostic Features", align = 'c')
Diagnostic Features
featkind featdept featdepb thickness
calcic horizon (5, 20, 39, 50, 103)(36) (61, 118, 152, 152, 152)(36) (25, 62, 106, 132, 147)(36)
cambic horizon (4, 7, 14, 20, 31)(16) (38, 42, 58, 72, 103)(16) (22, 30, 46, 54, 85)(16)
ochric epipedon (0, 0, 0, 0, 0)(36) (4, 18, 18, 18, 18)(36) (4, 18, 18, 18, 18)(36)
salt accumulations (0, 0, 0, 25, 100)(34) (5, 152, 152, 152, 152)(34) (5, 121, 152, 152, 152)(34)

Surface Rock Fragments

# a direct copy from the Roecker reports

# grabbing the data from the site object
srf <- s[grepl("surface_", names(s))]

# changing the names
names(srf) <- gsub("surface_", "", names(srf))

# adding total surface fragments
srf <- within(srf, {
  total_srf = gravel + cobbles + stones + boulders + flagstones + channers
  gravel = gravel - fgravel
})

# creating a list of variables to select from
vars <- c("total_srf", "fgravel", "gravel", "cobbles", "stones", "boulders", "channers", "flagstones")

# transforming data from wide to long format
srf.lo <- melt(srf, measure.vars = vars)

# running statistical analysis
srf.5n <- ddply(srf.lo, .(variable), summarize,
                range = prettySummary(value)
)

kable(srf.5n, caption =  "Surface rock fragments (min, 25th, median, 75th, max)(n)", align = 'c')
Surface rock fragments (min, 25th, median, 75th, max)(n)
variable range
total_srf (0, 0, 0, 0.9, 60)(36)
fgravel (0, 0, 0, 0, 2)(36)
gravel (0, 0, 0, 0, 60)(36)
cobbles (0, 0, 0, 0, 0)(36)
stones (0, 0, 0, 0, 0)(36)
boulders (0, 0, 0, 0, 0)(36)
channers (0, 0, 0, 0, 0)(36)
flagstones (0, 0, 0, 0, 0)(36)

Generalized Horizon Designations

# viewing horizon designations
sort(table(h$hzname), decreasing = TRUE)
## 
##  Bkn1  Bkn2     A    An  Bkn3   Bkn Bknz1 Bknz2    Bk Bknz3   Bk1    Bn 
##    20    19    18    16    11     8     7     7     6     5     4     4 
##   Bk2    2C   Anz    Cn  2Bk2  2Bkn 2Bkn2 2Bkn3   2Bn   2Cn 2Cnzg   3Cn 
##     3     2     2     2     1     1     1     1     1     1     1     1 
##  Bkkn  Bkn4  Bknz Bknz4   Cng  Cnzg 
##     1     1     1     1     1     1
# generalized horizon labels
n <- c("An",
       "Bkn1",
       "Bkn2",
       "Cn")
# REGEX rules
r <- c("A",
       "B",
       "2|3|4|kk|^Bknz$",
       "C")

# Compute genhz labels and add to dataset
h$genhz <- generalize.hz(h$hzname, n, r)

p1$genhz <- generalize.hz(p1$hzname, n, r)

Comparison of genhz and hzname pattern matching

# select horizon names and gen hz lables
hz_t <- addmargins(table(h$genhz, h$hzname))

# create an index to limit the table size
idx <- pIndex(hz_t, 10)

# plot 10 horizon designations per row
for (i in unique(idx)){
  print(kable(hz_t[, c(idx == i)], align = 'c', digits = 0, caption = "Horizon designations vs generic horizon designations (counts)"))
}
## 
## 
## Table: Horizon designations vs generic horizon designations (counts)
## 
##             2Bk2    2Bkn    2Bkn2    2Bkn3    2Bn    2C    2Cn    2Cnzg    3Cn    A  
## ---------  ------  ------  -------  -------  -----  ----  -----  -------  -----  ----
## An           0       0        0        0       0     0      0       0       0     18 
## Bkn1         0       0        0        0       0     0      0       0       0     0  
## Bkn2         1       1        1        1       1     0      0       0       0     0  
## Cn           0       0        0        0       0     2      1       1       1     0  
## not-used     0       0        0        0       0     0      0       0       0     0  
## Sum          1       1        1        1       1     2      1       1       1     18 
## 
## 
## Table: Horizon designations vs generic horizon designations (counts)
## 
##             An    Anz    Bk    Bk1    Bk2    Bkkn    Bkn    Bkn1    Bkn2    Bkn3 
## ---------  ----  -----  ----  -----  -----  ------  -----  ------  ------  ------
## An          16     2     0      0      0      0       0      0       0       0   
## Bkn1        0      0     6      4      0      0       8      20      0       0   
## Bkn2        0      0     0      0      3      1       0      0       19      11  
## Cn          0      0     0      0      0      0       0      0       0       0   
## not-used    0      0     0      0      0      0       0      0       0       0   
## Sum         16     2     6      4      3      1       8      20      19      11  
## 
## 
## Table: Horizon designations vs generic horizon designations (counts)
## 
##             Bkn4    Bknz    Bknz1    Bknz2    Bknz3    Bknz4    Bn    Cn    Cng    Cnzg 
## ---------  ------  ------  -------  -------  -------  -------  ----  ----  -----  ------
## An           0       0        0        0        0        0      0     0      0      0   
## Bkn1         0       0        7        0        0        0      4     0      0      0   
## Bkn2         1       1        0        7        5        1      0     0      0      0   
## Cn           0       0        0        0        0        0      0     2      1      1   
## not-used     0       0        0        0        0        0      0     0      0      0   
## Sum          1       1        7        7        5        1      4     2      1      1   
## 
## 
## Table: Horizon designations vs generic horizon designations (counts)
## 
##              x  
## ---------  -----
## An          36  
## Bkn1        49  
## Bkn2        54  
## Cn           9  
## not-used     0  
## Sum         148

Profile Plots

library(RColorBrewer)

cols <- brewer.pal(n = length(levels(p1$genhz)), name = "Set1") 
hz.names <- levels(p1$genhz)

# assign a color to each generalized horizon label
p1$genhz.soil_color <- cols[match(p1$genhz, hz.names)]

# create an index to limit the number of plots to print
idx <- pIndex(p1, 10)

# plot 10 profiles at a time
for (i in unique(idx)){
  plot(p1[idx == i], name = 'hzname', color = 'genhz.soil_color', label = 'pedon_id')
  title("Soil profile plots")
  legend('bottomleft', legend = hz.names, pt.bg = cols, pch = 22, horiz = TRUE, pt.cex = 2, text.width = 1)
}

General Horizon Depths

# basically a direct copy from Roecker reports

# transforming data from wide to long format
genhz.lo <- melt(p1@horizons, id.vars="genhz", measure.vars = c('hzdept', 'hzdepb'))

# creating thickness data from horizon depths
genhz.thk <- ddply(p1@horizons, .(phiid, genhz), summarize, thickness=sum(hzdepb-hzdept))

# transforming data wide to long
genhz.lo2 <- melt(genhz.thk, id.vars = "genhz", measure.vars = 'thickness')

# combining hzn depths and thickness 
genhz.lo <- rbind(genhz.lo, genhz.lo2)

# creating summary statistics
genhz.5n <- ddply(genhz.lo, .(variable, genhz), summarize,
                  range = prettySummary(value)
)

# dcast is changing the structure so that genhz is on the x acis and the variable is on the y
kable(dcast(genhz.5n, genhz ~ variable, value.var = 'range'), align = 'c', caption = "Depths and thickness of generic horizons (min, 25th, median, 75th, max)(n)")
Depths and thickness of generic horizons (min, 25th, median, 75th, max)(n)
genhz hzdept hzdepb thickness
An (0, 0, 0, 0, 0)(36) (4, 10, 14, 20, 45)(36) (4, 10, 14, 20, 45)(36)
Bkn1 (4, 10, 20, 35, 75)(49) (20, 40, 65, 80, 110)(49) (10, 25, 35, 49, 70)(49)
Bkn2 (27, 57, 78, 99, 125)(54) (55, 106, 148, 152, 152)(54) (12, 35, 44, 62, 97)(54)
Cn (61, 75, 104, 120, 130)(9) (110, 152, 152, 152, 152)(9) (22, 32, 35, 49, 91)(9)

Texture and Rock Frags

# creating a vector of variable names
vars <- c('clay', 'silt', 'sand', 'total_frags_pct')

# getting data from variables and transforming it from wide to long format
h.tex <- melt(p1@horizons, id.vars = "genhz", measure.vars = vars)

# creating summary statistics
h.tex <- ddply(h.tex, .(variable, genhz), summarize, range = prettySummary(value))

# aggregating data by genetic horizon
h.tex <- dcast(h.tex, genhz ~ variable, value.var = 'range')

kable(h.tex, align = 'c', caption = "Particle size seperates by genetic horizon (min, 25th, median, 75th, max)(n)" )
Particle size seperates by genetic horizon (min, 25th, median, 75th, max)(n)
genhz clay silt sand total_frags_pct
An (10, 23, 27, 32, 45)(36) (10, 26, 45, 58, 63)(36) (5, 10, 28, 45, 70)(36) (0, 0, 0, 0, 8)(36)
Bkn1 (12, 25, 28, 33, 42)(49) (8, 18, 33, 37, 58)(49) (5, 30, 40, 60, 65)(49) (0, 0, 0, 0, 20)(49)
Bkn2 (14, 22, 28, 32, 50)(54) (5, 13, 22, 36, 66)(54) (10, 35, 48, 60, 70)(54) (0, 0, 0, 0, 40)(54)
Cn (5, 10, 14, 22, 39)(9) (10, 15, 19, 21, 52)(9) (38, 60, 65, 65, 80)(9) (0, 0, 0, 0, 5)(9)

Texture Class

# pulling texture classes as its own table
h.texcl <- h$texcl

# inspecting the frequency table
ftable(h.texcl)
## h.texcl cos  s fs vfs lcos ls lfs lvfs cosl sl fsl vfsl  l sil si scl cl sicl sc sic  c
##                                                                                        
##           0  0  0   0    0  1   1    0    0 10   0    0 23   9  0  43 39   16  0   2  4
# generating count table of texture classes by genetic horizon
hz_tex <- addmargins(xtabs(~ genhz + texture, data = p1@horizons))

# inspecting texture table
hz_tex
##           texture
## genhz        C  CL GR-CL GR-SCL GRV-SCL   L LFS  LS SCL SIC SICL SIL  SL
##   An         1   4     0      0       0   7   0   0   4   1   11   4   4
##   Bkn1       2  19     0      1       0   7   0   0  13   1    2   3   1
##   Bkn2       1  14     1      0       1   9   0   0  22   0    2   2   2
##   Cn         0   1     0      0       0   0   1   1   2   0    1   0   3
##   not-used   0   0     0      0       0   0   0   0   0   0    0   0   0
##   Sum        4  38     1      1       1  23   1   1  41   2   16   9  10
##           texture
## genhz      Sum
##   An        36
##   Bkn1      49
##   Bkn2      54
##   Cn         9
##   not-used   0
##   Sum      148
# rearranging cols
kable(hz_tex[, c(4,5,10,9,3,6,2,8,7,1)], caption = "Texture Class by genetic horizon (counts)", align = 'c')
Texture Class by genetic horizon (counts)
GR-SCL GRV-SCL SIC SCL GR-CL L CL LS LFS C
An 0 0 1 4 0 7 4 0 0 1
Bkn1 1 0 1 13 0 7 19 0 0 2
Bkn2 0 1 0 22 1 9 14 0 0 1
Cn 0 0 0 2 0 0 1 1 1 0
not-used 0 0 0 0 0 0 0 0 0 0
Sum 1 1 2 41 1 23 38 1 1 4
# create soil texture tiangle plots
library(soiltexture)


texture.df <- cbind.data.frame(p1@horizons$genhz, p1@horizons$sand, p1@horizons$silt, p1@horizons$clay)
colnames(texture.df) <- c("genhz", "SAND", "SILT", "CLAY")

# subset by gen hz
h.text.a <- subset(texture.df, texture.df$genhz == 'An')
h.text.b1 <- subset(texture.df, texture.df$genhz == 'Bkn1')
h.text.b2 <- subset(texture.df, texture.df$genhz == 'Bkn2')
h.text.c <- subset(texture.df, texture.df$genhz == 'Cn')


#texture plots
TT.plot( class.sys = "USDA-NCSS.TT",
         tri.data = h.text.a, 
         main = "Soil Texture for An horizons", 
         col = "red", 
         cex = .5, 
         grid.show = FALSE,
         cex.axis = .5,
         cex.lab = .7)

# kept getting an error in teh plot because the sums on some of the horizons were not totalling up to 100
# reviewed nasis data and made changes
#h.tex.b1$sum = h.tex.b1$SAND + h.tex.b1$SILT + h.tex.b1$CLAY

# Texture plot
TT.plot( class.sys = "USDA-NCSS.TT",
         tri.data = h.text.b1, 
         main = "Soil Texture for Bkn1 horizons", 
         col = "red", 
         cex = .5, 
         grid.show = FALSE,
         cex.axis = .5,
         cex.lab = .7)

#texture plot
TT.plot( class.sys = "USDA-NCSS.TT",
         tri.data = h.text.b2, 
         main = "Soil Texture for Bkn2 horizons", 
         col = "red", 
         cex = .5, 
         grid.show = FALSE,
         cex.axis = .5,
         cex.lab = .7)

#texture plot
TT.plot( class.sys = "USDA-NCSS.TT",
         tri.data = h.text.c, 
         main = "Soil Texture for Cn horizons", 
         col = "red", 
         cex = .5, 
         grid.show = FALSE,
         cex.axis = .5,
         cex.lab = .7)

Chemistry

# vector containing which variables are being selected
vars <- c('phfield', 'ec')

# data transformation from wide to long format
h.chem.1 <- melt(p1@horizons, id.vars = "genhz", measure.vars = vars)

# summarize data values by genetic horizon
h.chem <- ddply(h.chem.1, .(variable, genhz), summarize, range=prettySummary(value))

# aggregate data back by genetic horizon
h.chem <- dcast(h.chem, genhz ~ variable, value.var = 'range')

kable(h.chem, align = 'c', caption = "Chemical properties by genetic horizon (min, 25th, median, 75th, max)(n)")
Chemical properties by genetic horizon (min, 25th, median, 75th, max)(n)
genhz phfield ec
An (7.6, 8.2, 8.4, 9.2, 9.7)(36) (0.3, 1, 1.8, 3.4, 16.9)(36)
Bkn1 (7.8, 8.6, 9.2, 9.7, 10.5)(49) (0.9, 2.5, 4.6, 7.5, 20)(49)
Bkn2 (8, 8.9, 9.65, 10.1, 10.5)(52) (1.2, 5.1, 7.3, 11.8, 17.5)(54)
Cn (8.2, 9.2, 9.3, 9.4, 9.7)(9) (1.3, 2.5, 3, 6.1, 15.6)(9)
library(lattice)
library(latticeExtra)

# checking out the distribution of ph and ec data by genetic horizon
bwplot(genhz ~ value | variable, data = h.chem.1,
       main = "Box Plots of pH and EC by genetic horizon",
       scales=list(x="free"), axis = axis.grid,
       as.table = TRUE, layout = c(2,1)
       )

Effervescence

# changing name 
p1@horizons$effervescence[p1@horizons$effclass == "very slight"] <- "very.slight"

# storing levels as vector
eff <- c(violent = "violent", strong = "strong", slight = "slight", very.slight = "very slight", none = "none", missing = "missing")

# applying lables as a factor
p1@horizons$effervescence <- eff[p1@horizons$effervescence]
effclass <- factor(p1@horizons$effervescence, levels = eff)

# create table of effervescence class by genetic horizon
effclass.t <- xtabs(~ genhz + effclass, data = p1@horizons, drop.unused.levels = FALSE)

# viewing table
effclass.t
##           effclass
## genhz      very slight slight strong violent none
##   An                 0     12     13      10    1
##   Bkn1               0      6     23      20    0
##   Bkn2               0      7     23      24    0
##   Cn                 0      2      5       1    1
##   not-used           0      0      0       0    0
# creating table with ordered cols
kable(addmargins(effclass.t[, c(5,1,2,3,4)]), caption = "Effervescence by genetic horizon counts)", align = 'c')
Effervescence by genetic horizon counts)
none very slight slight strong violent Sum
An 1 0 12 13 10 36
Bkn1 0 0 6 23 20 49
Bkn2 0 0 7 23 24 54
Cn 1 0 2 5 1 9
not-used 0 0 0 0 0 0
Sum 2 0 27 64 55 148

Salinity

# assigning salinity class based on ec data directly to new saltclass slot

p1@horizons$saltclass <- ifelse(
  p1@horizons$ec < 2, "nonsaline",
  ifelse(
      p1@horizons$ec < 4, "very.slightly",
      ifelse(
          p1@horizons$ec <8, "slightly",
          ifelse(
              p1@horizons$ec < 16, "moderately",
              ifelse(
                  p1@horizons$ec >= 16, "strongly", 
                  "na"
                )
            )
        )
    )
)


# vector containing salt classes
salt <- c(nonsaline = "nonsaline", very.slightly = "very slightly", slightly = "slightly", moderately = "moderately", strongly = "strongly", na = "na")

# applying names to salt class 
p1@horizons$saltclass <- salt[p1@horizons$saltclass]
saltclass <- factor(p1@horizons$saltclass, levels =salt)

# create salt class table gy genetic horizon
saltclass.t <- xtabs(~ genhz + saltclass, data = p1@horizons, drop.unused.levels = FALSE)

# reorder table columns
saltclass.t[ , c(2,5,3,1,4)]
##           saltclass
## genhz      nonsaline very slightly slightly moderately strongly
##   An              19            11        5          0        1
##   Bkn1             6            13       20          8        2
##   Bkn2             3             3       26         20        2
##   Cn               1             5        1          2        0
##   not-used         0             0        0          0        0
# final table with reorderd cols 
kable(addmargins(saltclass.t[, c(2,5,3,1,4)]), caption = "Salinity Class by genetic horizon (counts)", align = 'c')
Salinity Class by genetic horizon (counts)
nonsaline very slightly slightly moderately strongly Sum
An 19 11 5 0 1 36
Bkn1 6 13 20 8 2 49
Bkn2 3 3 26 20 2 54
Cn 1 5 1 2 0 9
not-used 0 0 0 0 0 0
Sum 29 32 52 30 5 148

Alkalinity (pH Class)

# converting ph values to class

p1@horizons$phclass <- ifelse(
    p1@horizons$phfield < 6.0, "moderately.acid",
    ifelse(
      p1@horizons$phfield < 6.5, "slightly.acid",
      ifelse(
        p1@horizons$phfield < 7.3, "neutral",
        ifelse(
          p1@horizons$phfield < 7.9, "slightly.alkaline",
          ifelse(
            p1@horizons$phfield < 8.4, "modeately.alkaline",
            ifelse(
              p1@horizons$phfield <= 9, "strongly.alkaline",
              ifelse(
                p1@horizons$phfield > 9, "very.strongly.alkaline",
                "na"
              )
            )
          )
        )
      )
    )
)



# creating a vector of levels
ph <- c(moderately.acid = "moderately acid", slightly.acid = "slightly acid", neutral = "neutral", slightly.alkaline = "slightly alkaline", moderately.alkaline = "moderately alkaline", strongly.alkaline = "strongly alkaline", very.strongly.alkaline = "very strongly alkaline", na = "none")

# applying names
p1@horizons$phclass <- ph[p1@horizons$phclass]
phclass <- factor(p1@horizons$phclass, levels = ph)


kable(addmargins(xtabs(~ genhz + phclass, data = p1@horizons, drop.unused.levels = TRUE)), digits = 0, caption = "pH Class by generic horizon", align = 'c')
pH Class by generic horizon
slightly alkaline strongly alkaline very strongly alkaline Sum
An 5 4 14 23
Bkn1 1 15 26 42
Bkn2 0 14 34 48
Cn 0 0 7 7
Sum 6 33 81 120

Color

# Hue
kable(addmargins(xtabs(~ p1@horizons$genhz + p1@horizons$d_hue, data = p1@horizons, drop.unused.levels = TRUE)), digits = 0, caption = "Dry hue by generic horizon (counts)")
Dry hue by generic horizon (counts)
10YR 2.5Y 2.5YR 5Y 5YR 7.5YR Sum
An 10 0 0 0 0 26 36
Bkn1 16 0 0 0 2 31 49
Bkn2 18 10 1 1 3 21 54
Cn 3 3 0 0 0 2 8
Sum 47 13 1 1 5 80 147
kable(addmargins(xtabs(~ genhz + m_hue, data = p1@horizons, drop.unused.levels = TRUE)), digits = 0, caption = "Moist hue by generic horizon (counts)")
Moist hue by generic horizon (counts)
10YR 2.5Y 2.5YR 5Y 5YR 7.5R 7.5YR Sum
An 10 0 0 0 0 1 25 36
Bkn1 16 0 0 0 2 0 31 49
Bkn2 18 10 1 1 3 0 20 53
Cn 3 3 0 0 0 0 2 8
Sum 47 13 1 1 5 1 78 146
# Value and Chroma
vars <- c('d_value', 'd_chroma', 'm_value', 'm_chroma')

h.col <- melt(p1@horizons, id.vars = "genhz", measure.vars = vars)
h.col <- ddply(h.col, .(variable, genhz), summarize, range=prettySummary(value))
h.col <- dcast(h.col, genhz ~ variable, value.var = 'range')

kable(h.col, align = 'c', caption = "Color(Value and Chroma) by genetic horizon (min, 25th, median, 75th, max)(n)" )
Color(Value and Chroma) by genetic horizon (min, 25th, median, 75th, max)(n)
genhz d_value d_chroma m_value m_chroma
An (4, 5, 6, 7, 8)(36) (1, 2, 3, 3, 4)(36) (3, 4, 4, 5, 6)(36) (2, 3, 3, 3, 4)(36)
Bkn1 (4, 5, 6, 6, 7)(49) (1, 3, 3, 4, 6)(49) (3, 4, 4, 5, 6)(49) (2, 3, 3, 4, 4)(49)
Bkn2 (4, 5, 6, 7, 8)(54) (2, 2, 3, 3, 4)(54) (3, 4, 5, 5, 7)(53) (1, 2, 3, 3, 4)(53)
Cn (6, 6, 6, 7, 7)(8) (2, 2, 3, 3, 6)(8) (3, 4, 5, 5, 5)(8) (2, 3, 3, 4, 6)(8)