1 Background

The California Data Exchange Center (CDEC) hosts all kinds of useful climate data. You can get these easily by station and sensor code with the CDECquery() function. Lookup sensor codes by station with the CDEC_StationInfo() function.

Find CDEC stations via:

List of sensor codes by ID and associated units of measure.

2 Examples

2.1 Setup R Environment

With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:

# run these commands in the R console
install.packages('sharpshootR', dep=TRUE)
install.packages('soilDB', dep=TRUE)
install.packages('plyr', dep=TRUE)
install.packages('seas', dep=TRUE)
install.packages('viridis', dep=TRUE)
install.packages('latticeExtra', dep=TRUE)
install.packages('tactile', dep=TRUE)
install.packages('RColorBrewer', dep=TRUE)
install.packages('zoo', dep=TRUE)

# get latest version from GitHub
install.packages('remotes', dep=TRUE)
remotes::install_github("ncss-tech/sharpshootR", dependencies=FALSE, upgrade_dependencies=FALSE)

Load required packages.

library(latticeExtra)
library(plyr)
library(sharpshootR)
library(viridis)
library(seas)
library(RColorBrewer)
library(zoo)
library(tactile)

2.2 Basic Usage

Get metadata for the Stanislaus Powerhouse, SPW, station. The results are stored in a list containing site and sensor metadata, along with any commentary related to the history of the station.

meta <- CDEC_StationInfo('SPW')
str(meta, 1)
## List of 3
##  $ site.meta  :'data.frame': 1 obs. of  11 variables:
##  $ sensor.meta:'data.frame': 15 obs. of  6 variables:
##  $ comments   : NULL

Site metadata.

kable_styling(kable(meta$site.meta, format = 'html'), font_size = 10)
Station ID River Basin Hydrologic Area Latitude Operator Elevation County Nearby City Longitude Maintenance Name
SPW STANISLAUS RIVER SAN JOAQUIN RIVER 38.133 Pacific Gas & Electric 1060 CALAVERAS JACKSON 120.367 .None Specified STANISLAUS POWERHOUSE

Sensor metadata.

kable_styling(kable(meta$sensor.meta, format = 'html'), font_size = 10)
sensor_details sensor interval sensor_name collection_method period_of_record
PRECIPITATION, ACCUMULATED, INCHES 2 (daily) (RAIN) COMPUTED 10/01/2003 to present
PRECIPITATION, INCREMENTAL, INCHES 45 (daily) (PPT INC) COMPUTED 08/25/1998 to present
TEMPERATURE, AIR AVERAGE, DEG F 30 (daily) (TEMP AV) COMPUTED 10/01/2005 to present
TEMPERATURE, AIR MAXIMUM, DEG F 31 (daily) (TEMP MX) COMPUTED 10/01/2005 to present
TEMPERATURE, AIR MINIMUM, DEG F 32 (daily) (TEMP MN) COMPUTED 10/01/2005 to present
BATTERY VOLTAGE, VOLTS 14 (event) (BAT VOL) SATELLITE 07/14/2015 to present
PRECIPITATION, ACCUMULATED, INCHES 2 (event) (RAIN) SATELLITE 07/14/2015 to present
RELATIVE HUMIDITY, % 12 (event) (REL HUM) SATELLITE 07/14/2015 to present
TEMPERATURE, AIR AVERAGE, DEG F 30 (event) (TEMP AV) SATELLITE 01/01/2014 to present
BATTERY VOLTAGE, VOLTS 14 (hourly) (BAT VOL) SATELLITE 08/24/1998 to present
PRECIPITATION, ACCUMULATED, INCHES 2 (hourly) (RAIN) SATELLITE 07/16/2015 to present
RELATIVE HUMIDITY, % 12 (hourly) (REL HUM) SATELLITE 10/01/2015 to present
TEMPERATURE, AIR AVERAGE, DEG F 30 (hourly) (TEMP AV) SATELLITE 11/08/2007 to present
PRECIPITATION, TIPPING BUCKET, INCHES 16 (hourly) (RAINTIP) SATELLITE 08/24/1998 to 07/14/2015
TEMPERATURE, AIR, DEG F 4 (hourly) (TEMP) SATELLITE 08/24/1998 to 11/08/2007

Get daily precipitation (sensor = 45, interval = ‘D’) for a specified time interval.

res <- CDECquery(id='SPW', sensor=45, interval='D', start='2018-01-01', end='2018-01-31')

# first 10 rows
kable_styling(kable(head(res, 10), format = 'html'), font_size = 10)
station_id dur_code sensor_num sensor_type value flag units datetime year month water_year water_day
SPW D 45 PPT INC 0.00 INCHES 2018-01-01 2018 January 2018 93
SPW D 45 PPT INC 0.00 INCHES 2018-01-02 2018 January 2018 94
SPW D 45 PPT INC 0.00 INCHES 2018-01-03 2018 January 2018 95
SPW D 45 PPT INC 0.20 INCHES 2018-01-04 2018 January 2018 96
SPW D 45 PPT INC 0.00 INCHES 2018-01-05 2018 January 2018 97
SPW D 45 PPT INC 1.19 INCHES 2018-01-06 2018 January 2018 98
SPW D 45 PPT INC 0.00 INCHES 2018-01-07 2018 January 2018 99
SPW D 45 PPT INC 1.43 INCHES 2018-01-08 2018 January 2018 100
SPW D 45 PPT INC 0.61 INCHES 2018-01-09 2018 January 2018 101
SPW D 45 PPT INC 0.09 INCHES 2018-01-10 2018 January 2018 102

2.3 Soil Temperature and Moisture

Some of the CDEC stations have below-ground sensors. The Highland Meadow station was the first one to have soil sensors installed at three depths. These data are collected every 20 minutes.

Get the 3 soil temperature and volumetric water content (VWC) sensor IDs for HHM. Queries are by sensor ID.

# CDEC station ID
stn <- 'HHM'

# note that station metadata is a list
stn.info <- CDEC_StationInfo(stn)
sensors <- stn.info$sensor.meta

# lookup sensor codes for soil temp and moisture
soil.temp.sensors <- sensors$sensor[grep('soil temp', sensors$sensor_details, ignore.case = TRUE)]
soil.moist.sensors <- sensors$sensor[grep('soil moistr', sensors$sensor_details, ignore.case = TRUE)]

Query the data. It might take a couple of seconds. Note that we are making two distinct requests, each for three sensors.

d.soil_temp <- CDECquery(id = stn, sensor = soil.temp.sensors, interval = 'E', start = '2020-09-15', end = '2021-09-15')
d.soil_moist <- CDECquery(id = stn, sensor = soil.moist.sensors, interval = 'E', start = '2020-09-15', end = '2021-09-15')

# quick check: nice!
kable_styling(kable(head(d.soil_temp, 10), format = 'html'), font_size = 10)
station_id dur_code sensor_num sensor_type value flag units datetime year month water_year water_day
HHM E 194 SOILTD1 56 DEG F 2020-09-14 23:15:00 2020 September 2020 351
HHM E 194 SOILTD1 55 DEG F 2020-09-15 00:15:00 2020 September 2020 351
HHM E 194 SOILTD1 54 DEG F 2020-09-15 01:15:00 2020 September 2020 351
HHM E 194 SOILTD1 53 DEG F 2020-09-15 02:15:00 2020 September 2020 351
HHM E 194 SOILTD1 52 DEG F 2020-09-15 03:15:00 2020 September 2020 351
HHM E 194 SOILTD1 52 DEG F 2020-09-15 04:15:00 2020 September 2020 351
HHM E 194 SOILTD1 51 DEG F 2020-09-15 05:15:00 2020 September 2020 351
HHM E 194 SOILTD1 51 DEG F 2020-09-15 06:15:00 2020 September 2020 351
HHM E 194 SOILTD1 50 DEG F 2020-09-15 07:15:00 2020 September 2020 351
HHM E 194 SOILTD1 51 DEG F 2020-09-15 08:15:00 2020 September 2020 351
# this is a lot of data
nrow(d.soil_temp)
## [1] 26046

Perform some basic filtering, note units.

# remove soil temperature less than 20 deg F
d.soil_temp <- subset(d.soil_temp, subset=value > 20)

# remove soil moisture (VWC) < 1%
d.soil_moist <- subset(d.soil_moist, subset=value > 1)

Plot soil moisture and temperature in a single figure. There are many ways to do this, here we are using concepts from the lattice package. Note that we stack the two data sets into “long format” for plotting.

# combine into long format for plotting
d <- make.groups('Soil Temperature (F)'=d.soil_temp, 'Soil Volumetric Water Content (%)'=d.soil_moist)

# define some reasonable plot colors and line widths
tps <- tactile.theme(superpose.line=list(col=brewer.pal(3, 'Set1'), lwd=1.5))

# plot
xyplot(value ~ datetime | which, groups=sensor_type, data=d, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b %d\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)

Sometimes the daily median is sufficient.

# truncate datetime to date
d$date <- as.Date(d$datetime)

# aggregating over date / sensor group / specific sensor
a <- ddply(d, c('date', 'which', 'sensor_type'), .progress = 'text', .fun = plyr::summarize, q50=median(value, na.rm=TRUE))

# plot daily median
xyplot(q50 ~ date | which, groups=sensor_type, data=a, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)

2.4 Precipitation

2.4.1 Daily Precipitation

Query the daily precipitation measurements spanning the 2018-2019 water year from Sonora Ranger Station (SOR).

# get data, note sensor number
x <- CDECquery(id='sor', sensor=45, interval='D', start='2018-09-01', end='2019-09-01')

# compute cumulative PPT
# replace NA with 0 in cumulative calc
x$cumulative <- cumsum(ifelse(is.na(x$value), 0, x$value))

# plot
xyplot(value ~ datetime, data=x, type=c('h', 'g'), scales=list(x=list(alternating=3, format="%b %d\n%Y", tick.number=10)), xlab='', ylab='Daily PPT (inches)', main='Sonora Ranger Station\nDaily Total Precipitation (inches)', par.settings = tactile.theme())

xyplot(cumulative ~ datetime, data=x, type=c('s', 'g'), scales=list(x=list(alternating=3, format="%b %d\n%Y", tick.number=10)), xlab='', ylab='Cumulative PPT (inches)', main='Sonora Ranger Station\nCumulative Precipitation (inches)', par.settings = tactile.theme())

# total
sum(x$value, na.rm = TRUE)
## [1] 34.16

2.4.2 Monthly Precipitation

Query the entire record of monthly precipitation measurements from Sonora Ranger Station (SOR).

x <- CDECquery(id='SOR', sensor=2, interval='M', start='1900-01-01', end='2030-01-01')

# plot
bwplot(month ~ value, data=x, ylab='', main='Sonora Ranger Station\nVariation in Monthly Total PPT', xlab='Monthly Total PPT (inches)', par.settings = tactile.theme())

2.4.3 Water Year Summaries

Cumulative PPT over the course of an idealized water year (October 1 through September 30). Measurements are selected from water years with annual PPT close to the 5th, 25th, 50th, 75th, and 95th percentiles of annual PPT (over all water years). The current water year is symbolized with a red line. The red box and whisker plot (modified, whiskers are 5th and 95th percentiles) summarizes cumulative PPT on the current “water day”.

# define station of interest
s <- 'SPW'
# get metadata
s.info <- CDEC_StationInfo(s)
# format title for cumulative PPT
title.text <- sprintf("%s [%s]", s.info$site.meta$Name, s)

# get data
x <- CDECquery(id=s, sensor=45, interval='D', start='2000-01-01', end='2030-01-01')

# plot
par(mar=c(4.5, 4.5, 2.5, 1.5))
PCP_plot(x, ylab='Cumulative PPT (inches)', main=title.text, this.year = 2021)

Key elements:

  • red line is the current water year’s cumulative PPT trajectory
  • red date and value are current date and cumulative PPT
  • red box-whisker are 5-25-50-75-95th percentiles of cumulative PPT on current day of water year
  • blue traces are cumulative PPT records corresponding to 5th, 25th, 50th, 75th, and 95th percentiles of historical (end of water-year) cumulative PPT
  • blue box-whisker indicates 5-25-50-75-95th percentiles of starting water-day where cumulative PPT > 0.1 (units)

2.4.4 Visualization Ideas

Query the entire record of daily precipitation measurements from New Hogan Lake (NHG).

x <- CDECquery(id='NHG', sensor=45, interval='D', start='1900-01-01', end='2030-01-01')
summary(x$value)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.00000 0.00000 0.00000 0.06376 0.00000 4.38000
head(x)
##    station_id dur_code sensor_num sensor_type value flag  units   datetime year    month water_year
## 1         NHG        D         45     PPT INC  0.00      INCHES 1985-01-01 1985  January       1985
## 7         NHG        D         45     PPT INC  0.61      INCHES 1985-01-07 1985  January       1985
## 21        NHG        D         45     PPT INC  0.00      INCHES 1985-01-21 1985  January       1985
## 28        NHG        D         45     PPT INC  0.08      INCHES 1985-01-28 1985  January       1985
## 35        NHG        D         45     PPT INC  0.00      INCHES 1985-02-04 1985 February       1985
## 49        NHG        D         45     PPT INC  0.00      INCHES 1985-02-18 1985 February       1985
##    water_day
## 1         93
## 7         99
## 21       113
## 28       120
## 35       127
## 49       141
nrow(x)
## [1] 11732

Plot raw data.

xyplot(value ~ datetime, data=x, type='h', main='NEW HOGAN LAKE (NHG)', ylab='Daily Precipitation (inches)', xlab='', par.settings = tactile.theme())

Sum daily precipitation by year/month and display as box-whisker plot.

a <- ddply(x, c('year','month'), plyr::summarise, ppt=sum(value, na.rm=TRUE))

# plot
bwplot(ppt ~ month, data=a, xlab='', ylab='Monthly Total Precipitation (inches)', main='NEW HOGAN LAKE (NHG)', scales=list(y=list(tick.number=10), x=list(cex=0.85, rot=45)), par.settings = tactile.theme())

Visualize monthly precipitation as grouped by year/month.

# color ramp for monthly PPT
# denote 0 with grey 
cols <- colorRampPalette(rev(c(viridis(10), grey(0.90))), space='Lab', interpolate='spline')

levelplot(ppt ~ factor(year) * month, data=a, col.regions=cols, xlab='', 
          ylab='', scales=list(x=list(tick.number=10, rot=45)), 
          main='NEW HOGAN LAKE (NHG)\nMonthly Total Precipitation (inches)')

Alternate representation via seas package.

# alter column names and convert units from inches to mm
x$date <- as.Date(x$datetime)
x$value <- x$value * 25.4
x$precip <- x$value

# init seas object
ss <- seas.sum(x, 'precip', width='month', start.day = as.Date('2015-09-01'), prime='precip')

# monthly summaries, arranged according to CA water year
plot(ss, main='NEW HOGAN LAKE (NHG)')

Alternate representation

image(ss)

Monthly normals.

sn <- seas.norm(ss)
plot(sn)

2.5 Reservoir Levels

# get daily reservoir storage (ac. ft) from Pinecrest, New Melones, Lyons, and Don Pedro Reservoirs
pinecrest <- CDECquery(id='swb', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')
new.melones <- CDECquery(id='nml', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')
lyons <- CDECquery(id='lys', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')
dp <- CDECquery(id='dnp', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')

# compute storage capacity, based on published capacities
pinecrest$capacity <- pinecrest$value / 18312 * 100
new.melones$capacity <- new.melones$value / 2400000 * 100
lyons$capacity <- lyons$value / 6228 * 100
dp$capacity <- dp$value /   2030000 * 100

# combine
g <- make.groups('New Melones'=new.melones, 'Lyons'=lyons, 'Pinecrest'=pinecrest, 'Don Pedro'=dp)

# resonable date scale
r <- range(g$datetime)
s.r <- seq(from=r[1], to=r[2] + 6000000, by='6 months')

# better colors
tps <- tactile.theme(superpose.line=list(lwd=2, col=brewer.pal(n=4, name='Set1')))

# plot
xyplot(capacity ~ datetime, groups=which, data=g, type='l', 
       xlab='', ylab='Capacity (%)',
       scales=list(alternating=3, y=list(tick.number=10),x=list(at=s.r, labels=format(s.r, "%B\n%Y"))),
       auto.key=list(columns=4, lines=TRUE, points=FALSE),
       par.settings=tps,
       panel=function(...) {
         panel.abline(h=seq(0, 100, by=10), col='grey')
         panel.abline(v=s.r, col='grey')
         panel.xyplot(...)
       })

Investigate monthly levels and capacity of New Melones Reservoir.

# New Melones monthly data, retrieve as far back in time as possible 
new.melones.monthly <- CDECquery(id='nml', sensor=15, interval='M', start='1900-01-01', end='2030-01-01')

# convert to pct. capacity
new.melones.monthly$capacity <- new.melones.monthly$value / 2400000 * 100

# color ramp
cols <- colorRampPalette(rev(viridis(10)), space='Lab', interpolate='spline')

# plot, each pixel is colored by the total precip by year/month
levelplot(capacity ~ year * month, data=new.melones.monthly, col.regions=cols, xlab='', ylab='', scales=list(x=list(tick.number=20)), main='New Melones Capacity (%)', par.settings = tactile.theme())

2.5.1 Rate of Change in New Melones Res.



This document is based on sharpshootR version 2.2.