Background

The California Data Exchange Center (CDEC) hosts all kinds of useful climate data. You can get these easily by station and sensor code with the CDECquery() function. Lookup sensor codes by station with the CDEC_StationInfo() function.

Find CDEC stations via:

List of sensor codes by ID and associated units of measure.

Examples

Setup R Environment

With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:

# run these commands in the R console
install.packages('sharpshootR', dep=TRUE)
install.packages('soilDB', dep=TRUE)
install.packages('plyr', dep=TRUE)
install.packages('seas', dep=TRUE)
install.packages('viridis', dep=TRUE)
install.packages('latticeExtra', dep=TRUE)
install.packages('RColorBrewer', dep=TRUE)
install.packages('zoo', dep=TRUE)

# get latest version from GitHub
install.packages('remotes', dep=TRUE)
remotes::install_github("ncss-tech/sharpshootR", dependencies=FALSE, upgrade_dependencies=FALSE)

Load required packages.

library(latticeExtra)
library(plyr)
library(sharpshootR)
library(viridis)
library(seas)
library(RColorBrewer)
library(zoo)

Basic Usage

Get metadata for the Stanislaus Powerhouse, SPW, station. The results are stored in a list containing site and sensor metadata, along with any commentary related to the history of the station.

meta <- CDEC_StationInfo('SPW')
str(meta, 1)
## List of 3
##  $ site.meta  :'data.frame': 1 obs. of  11 variables:
##  $ sensor.meta:'data.frame': 15 obs. of  6 variables:
##  $ comments   : NULL

Site metadata.

kable_styling(kable(meta$site.meta, format = 'html'), font_size = 10)
Station ID River Basin Hydrologic Area Latitude Operator Elevation County Nearby City Longitude Data Collection Name
SPW STANISLAUS R SAN JOAQUIN RIVER 38.133 Pacific Gas & Electric 1060 CALAVERAS JACKSON 120.367 STANISLAUS POWERHOUSE

Sensor metadata.

kable_styling(kable(meta$sensor.meta, format = 'html'), font_size = 10)
sensor_details sensor interval sensor_name collection_method period_of_record
PRECIPITATION, ACCUMULATED, INCHES 2 (daily) (RAIN) COMPUTED 10/01/2003 to present
PRECIPITATION, INCREMENTAL, INCHES 45 (daily) (PPT INC) COMPUTED 08/25/1998 to present
TEMPERATURE, AIR AVERAGE, DEG F 30 (daily) (TEMP AV) COMPUTED 10/01/2005 to present
TEMPERATURE, AIR MAXIMUM, DEG F 31 (daily) (TEMP MX) COMPUTED 10/01/2005 to present
TEMPERATURE, AIR MINIMUM, DEG F 32 (daily) (TEMP MN) COMPUTED 10/01/2005 to present
BATTERY VOLTAGE, VOLTS 14 (event) (BAT VOL) SATELLITE 07/14/2015 to present
PRECIPITATION, ACCUMULATED, INCHES 2 (event) (RAIN) SATELLITE 07/14/2015 to present
RELATIVE HUMIDITY, % 12 (event) (REL HUM) SATELLITE 07/14/2015 to present
TEMPERATURE, AIR AVERAGE, DEG F 30 (event) (TEMP AV) SATELLITE 01/01/2014 to present
BATTERY VOLTAGE, VOLTS 14 (hourly) (BAT VOL) SATELLITE 08/24/1998 to present
PRECIPITATION, ACCUMULATED, INCHES 2 (hourly) (RAIN) SATELLITE 07/16/2015 to present
RELATIVE HUMIDITY, % 12 (hourly) (REL HUM) SATELLITE 10/01/2015 to present
TEMPERATURE, AIR AVERAGE, DEG F 30 (hourly) (TEMP AV) SATELLITE 11/08/2007 to present
PRECIPITATION, TIPPING BUCKET, INCHES 16 (hourly) (RAINTIP) SATELLITE 08/24/1998 to 07/14/2015
TEMPERATURE, AIR, DEG F 4 (hourly) (TEMP) SATELLITE 08/24/1998 to 11/08/2007

Get daily precipitation (sensor = 45, interval = ‘D’) for a specified time interval.

res <- CDECquery(id='SPW', sensor=45, interval='D', start='2018-01-01', end='2018-01-31')

# first 10 rows
kable_styling(kable(head(res, 10), format = 'html'), font_size = 10)
station_id dur_code sensor_num sensor_type value flag units datetime year month water_year water_day
SPW D 45 PPT INC 0.00 INCHES 2018-01-01 2018 January 2018 93
SPW D 45 PPT INC 0.00 INCHES 2018-01-02 2018 January 2018 94
SPW D 45 PPT INC 0.00 INCHES 2018-01-03 2018 January 2018 95
SPW D 45 PPT INC 0.20 INCHES 2018-01-04 2018 January 2018 96
SPW D 45 PPT INC 0.00 INCHES 2018-01-05 2018 January 2018 97
SPW D 45 PPT INC 1.19 INCHES 2018-01-06 2018 January 2018 98
SPW D 45 PPT INC 0.00 INCHES 2018-01-07 2018 January 2018 99
SPW D 45 PPT INC 1.43 INCHES 2018-01-08 2018 January 2018 100
SPW D 45 PPT INC 0.61 INCHES 2018-01-09 2018 January 2018 101
SPW D 45 PPT INC 0.09 INCHES 2018-01-10 2018 January 2018 102

Soil Temperature and Moisture

Some of the CDEC stations have below-ground sensors. The Highland Meadow station was the first one to have soil sensors installed at three depths. These data are collected every 20 minutes.

Get the 3 soil temperature and volumetric water content (VWC) sensor IDs for HHM. Queries are by sensor ID.

# CDEC station ID
stn <- 'HHM'

# note that station metadata is a list
stn.info <- CDEC_StationInfo(stn)
sensors <- stn.info$sensor.meta

# lookup sensor codes for soil temp and moisture
soil.temp.sensors <- sensors$sensor[grep('soil temp', sensors$sensor_details, ignore.case = TRUE)]
soil.moist.sensors <- sensors$sensor[grep('soil moistr', sensors$sensor_details, ignore.case = TRUE)]

Query the data. It might take a couple of seconds. Note that we are making two distinct requests, each for three sensors.

d.soil_temp <- CDECquery(id = stn, sensor = soil.temp.sensors, interval = 'E', start = '2017-09-15', end = '2019-09-15')
d.soil_moist <- CDECquery(id = stn, sensor = soil.moist.sensors, interval = 'E', start = '2017-09-15', end = '2019-09-15')

# quick check: nice!
kable_styling(kable(head(d.soil_temp, 10), format = 'html'), font_size = 10)
station_id dur_code sensor_num sensor_type value flag units datetime year month water_year water_day
HHM E 194 SOILTD1 50.1 DEG F 2017-09-15 00:00:00 2017 September 2017 350
HHM E 194 SOILTD1 49.9 DEG F 2017-09-15 00:20:00 2017 September 2017 350
HHM E 194 SOILTD1 49.7 DEG F 2017-09-15 00:40:00 2017 September 2017 350
HHM E 194 SOILTD1 49.2 DEG F 2017-09-15 01:00:00 2017 September 2017 350
HHM E 194 SOILTD1 49.2 DEG F 2017-09-15 01:20:00 2017 September 2017 350
HHM E 194 SOILTD1 49.0 DEG F 2017-09-15 01:40:00 2017 September 2017 350
HHM E 194 SOILTD1 48.8 DEG F 2017-09-15 02:00:00 2017 September 2017 350
HHM E 194 SOILTD1 48.8 DEG F 2017-09-15 02:20:00 2017 September 2017 350
HHM E 194 SOILTD1 48.4 DEG F 2017-09-15 02:40:00 2017 September 2017 350
HHM E 194 SOILTD1 48.4 DEG F 2017-09-15 03:00:00 2017 September 2017 350
# this is a lot of data
nrow(d.soil_temp)
## [1] 111001

Perform some basic filtering, note units.

# remove soil temperature less than 20 deg F
d.soil_temp <- subset(d.soil_temp, subset=value > 20)

# remove soil moisture (VWC) < 1%
d.soil_moist <- subset(d.soil_moist, subset=value > 1)

Plot soil moisture and temperature in a single figure. There are many ways to do this, here we are using concepts from the lattice package. Note that we stack the two data sets into “long format” for plotting.

# combine into long format for plotting
d <- make.groups('Soil Temperature (F)'=d.soil_temp, 'Soil Volumetric Water Content (%)'=d.soil_moist)

# define some reasonable plot colors and line widths
tps <- list(superpose.line=list(col=brewer.pal(3, 'Set1'), lwd=1.5))

# plot
xyplot(value ~ datetime | which, groups=sensor_type, data=d, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b %d\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)

Sometimes the daily median is sufficient.

# truncate datetime to date
d$date <- as.Date(d$datetime)

# aggregating over date / sensor group / specific sensor
a <- ddply(d, c('date', 'which', 'sensor_type'), .progress = 'text', .fun = plyr::summarize, q50=median(value, na.rm=TRUE))

# plot daily median
xyplot(q50 ~ date | which, groups=sensor_type, data=a, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)

Precipitation

Daily Precipitation

Query the daily precipitation measurements spanning the 2018-2019 water year from Sonora Ranger Station (SOR).

# get data, note sensor number
x <- CDECquery(id='sor', sensor=45, interval='D', start='2018-09-01', end='2019-09-01')

# compute cumulative PPT
# replace NA with 0 in cumulative calc
x$cumulative <- cumsum(ifelse(is.na(x$value), 0, x$value))

# plot
xyplot(value ~ datetime, data=x, type=c('h', 'g'), scales=list(x=list(alternating=3, format="%b %d\n%Y", tick.number=10)), xlab='', ylab='Daily PPT (inches)', main='Sonora Ranger Station\nDaily Total Precipitation (inches)')

xyplot(cumulative ~ datetime, data=x, type=c('s', 'g'), scales=list(x=list(alternating=3, format="%b %d\n%Y", tick.number=10)), xlab='', ylab='Cumulative PPT (inches)', main='Sonora Ranger Station\nCumulative Precipitation (inches)')

# total
sum(x$value, na.rm = TRUE)
## [1] 15.73

Monthly Precipitation

Query the entire record of monthly precipitation measurements from Sonora Ranger Station (SOR).

x <- CDECquery(id='SOR', sensor=2, interval='M', start='1900-01-01', end='2030-01-01')

# plot
bwplot(month ~ value, data=x, ylab='', main='Sonora Ranger Station\nVariation in Monthly Total PPT', xlab='Monthly Total PPT (inches)')

Water Year Summaries

Cumulative PPT over the course of an idealized water year (October 1 through September 30). Measurements are selected from water years with annual PPT close to the 5th, 25th, 50th, 75th, and 95th percentiles of annual PPT (over all water years). The current water year is symbolized with a red line. The red box and whisker plot (modified, whiskers are 5th and 95th percentiles) summarizes cumulative PPT on the current “water day”.

# define station of interest
s <- 'SOR'
# get metadata
s.info <- CDEC_StationInfo(s)
# format title for cumulative PPT
title.text <- sprintf("%s [%s]", s.info$site.meta$Name, s)

# get data
x <- CDECquery(id=s, sensor=45, interval='D', start='1900-01-01', end='2030-01-01')

## NOTE: requires sharpshootR >= 1.4.02
# plot
PCP_plot(x, ylab='Cumulative PPT (inches)', main=title.text)