The California Data Exchange
Center (CDEC) hosts all kinds of useful climate data. You can get
these easily by station and sensor code with the
CDECquery()
function. Lookup sensor codes by station with
the CDEC_StationInfo()
function.
Find CDEC stations via:
List of sensor codes by ID and associated units of measure.
With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:
# run these commands in the R console
install.packages('sharpshootR', dep=TRUE)
install.packages('soilDB', dep=TRUE)
install.packages('plyr', dep=TRUE)
install.packages('seas', dep=TRUE)
install.packages('viridis', dep=TRUE)
install.packages('latticeExtra', dep=TRUE)
install.packages('tactile', dep=TRUE)
install.packages('RColorBrewer', dep=TRUE)
install.packages('zoo', dep=TRUE)
# get latest version from GitHub
install.packages('remotes', dep=TRUE)
remotes::install_github("ncss-tech/sharpshootR", dependencies=FALSE, upgrade_dependencies=FALSE)
Load required packages.
library(latticeExtra)
library(plyr)
library(sharpshootR)
library(viridis)
library(seas)
library(RColorBrewer)
library(zoo)
library(tactile)
Get metadata for the Stanislaus Powerhouse, SPW, station. The results are stored in a list containing site and sensor metadata, along with any commentary related to the history of the station.
meta <- CDEC_StationInfo('SPW')
str(meta, 1)
## List of 3
## $ site.meta :'data.frame': 1 obs. of 11 variables:
## $ sensor.meta:'data.frame': 15 obs. of 6 variables:
## $ comments : NULL
Site metadata.
kable_styling(kable(meta$site.meta, format = 'html'), font_size = 10)
Station ID | River Basin | Hydrologic Area | Latitude | Operator | Elevation | County | Nearby City | Longitude | Maintenance | Name |
---|---|---|---|---|---|---|---|---|---|---|
SPW | STANISLAUS RIVER | SAN JOAQUIN RIVER | 38.133 | Pacific Gas & Electric | 1060 | CALAVERAS | JACKSON | 120.367 | .None Specified | STANISLAUS POWERHOUSE |
Sensor metadata.
kable_styling(kable(meta$sensor.meta, format = 'html'), font_size = 10)
sensor_details | sensor | interval | sensor_name | collection_method | period_of_record |
---|---|---|---|---|---|
PRECIPITATION, ACCUMULATED, INCHES | 2 | (daily) | (RAIN) | COMPUTED | 10/01/2003 to present |
PRECIPITATION, INCREMENTAL, INCHES | 45 | (daily) | (PPT INC) | COMPUTED | 08/25/1998 to present |
TEMPERATURE, AIR AVERAGE, DEG F | 30 | (daily) | (TEMP AV) | COMPUTED | 10/01/2005 to present |
TEMPERATURE, AIR MAXIMUM, DEG F | 31 | (daily) | (TEMP MX) | COMPUTED | 10/01/2005 to present |
TEMPERATURE, AIR MINIMUM, DEG F | 32 | (daily) | (TEMP MN) | COMPUTED | 10/01/2005 to present |
BATTERY VOLTAGE, VOLTS | 14 | (event) | (BAT VOL) | SATELLITE | 07/14/2015 to present |
PRECIPITATION, ACCUMULATED, INCHES | 2 | (event) | (RAIN) | SATELLITE | 07/14/2015 to present |
RELATIVE HUMIDITY, % | 12 | (event) | (REL HUM) | SATELLITE | 07/14/2015 to present |
TEMPERATURE, AIR AVERAGE, DEG F | 30 | (event) | (TEMP AV) | SATELLITE | 01/01/2014 to present |
BATTERY VOLTAGE, VOLTS | 14 | (hourly) | (BAT VOL) | SATELLITE | 08/24/1998 to present |
PRECIPITATION, ACCUMULATED, INCHES | 2 | (hourly) | (RAIN) | SATELLITE | 07/16/2015 to present |
RELATIVE HUMIDITY, % | 12 | (hourly) | (REL HUM) | SATELLITE | 10/01/2015 to present |
TEMPERATURE, AIR AVERAGE, DEG F | 30 | (hourly) | (TEMP AV) | SATELLITE | 11/08/2007 to present |
PRECIPITATION, TIPPING BUCKET, INCHES | 16 | (hourly) | (RAINTIP) | SATELLITE | 08/24/1998 to 07/14/2015 |
TEMPERATURE, AIR, DEG F | 4 | (hourly) | (TEMP) | SATELLITE | 08/24/1998 to 11/08/2007 |
Get daily precipitation (sensor = 45, interval = ‘D’) for a specified time interval.
res <- CDECquery(id='SPW', sensor=45, interval='D', start='2018-01-01', end='2018-01-31')
# first 10 rows
kable_styling(kable(head(res, 10), format = 'html'), font_size = 10)
station_id | dur_code | sensor_num | sensor_type | value | flag | units | datetime | year | month | water_year | water_day |
---|---|---|---|---|---|---|---|---|---|---|---|
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-01 | 2018 | January | 2018 | 93 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-02 | 2018 | January | 2018 | 94 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-03 | 2018 | January | 2018 | 95 | |
SPW | D | 45 | PPT INC | 0.20 | INCHES | 2018-01-04 | 2018 | January | 2018 | 96 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-05 | 2018 | January | 2018 | 97 | |
SPW | D | 45 | PPT INC | 1.19 | INCHES | 2018-01-06 | 2018 | January | 2018 | 98 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-07 | 2018 | January | 2018 | 99 | |
SPW | D | 45 | PPT INC | 1.43 | INCHES | 2018-01-08 | 2018 | January | 2018 | 100 | |
SPW | D | 45 | PPT INC | 0.61 | INCHES | 2018-01-09 | 2018 | January | 2018 | 101 | |
SPW | D | 45 | PPT INC | 0.09 | INCHES | 2018-01-10 | 2018 | January | 2018 | 102 |
Some of the CDEC stations have below-ground sensors. The Highland Meadow station was the first one to have soil sensors installed at three depths. These data are collected every 20 minutes.
Get the 3 soil temperature and volumetric water content (VWC) sensor IDs for HHM. Queries are by sensor ID.
# CDEC station ID
stn <- 'HHM'
# note that station metadata is a list
stn.info <- CDEC_StationInfo(stn)
sensors <- stn.info$sensor.meta
# lookup sensor codes for soil temp and moisture
soil.temp.sensors <- sensors$sensor[grep('soil temp', sensors$sensor_details, ignore.case = TRUE)]
soil.moist.sensors <- sensors$sensor[grep('soil moistr', sensors$sensor_details, ignore.case = TRUE)]
Query the data. It might take a couple of seconds. Note that we are making two distinct requests, each for three sensors.
d.soil_temp <- CDECquery(id = stn, sensor = soil.temp.sensors, interval = 'E', start = '2020-09-15', end = '2021-09-15')
d.soil_moist <- CDECquery(id = stn, sensor = soil.moist.sensors, interval = 'E', start = '2020-09-15', end = '2021-09-15')
# quick check: nice!
kable_styling(kable(head(d.soil_temp, 10), format = 'html'), font_size = 10)
station_id | dur_code | sensor_num | sensor_type | value | flag | units | datetime | year | month | water_year | water_day |
---|---|---|---|---|---|---|---|---|---|---|---|
HHM | E | 194 | SOILTD1 | 56 | DEG F | 2020-09-14 23:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 55 | DEG F | 2020-09-15 00:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 54 | DEG F | 2020-09-15 01:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 53 | DEG F | 2020-09-15 02:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 52 | DEG F | 2020-09-15 03:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 52 | DEG F | 2020-09-15 04:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 51 | DEG F | 2020-09-15 05:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 51 | DEG F | 2020-09-15 06:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 50 | DEG F | 2020-09-15 07:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 51 | DEG F | 2020-09-15 08:15:00 | 2020 | September | 2020 | 351 |
# this is a lot of data
nrow(d.soil_temp)
## [1] 26046
Perform some basic filtering, note units.
# remove soil temperature less than 20 deg F
d.soil_temp <- subset(d.soil_temp, subset=value > 20)
# remove soil moisture (VWC) < 1%
d.soil_moist <- subset(d.soil_moist, subset=value > 1)
Plot soil moisture and temperature in a single figure. There are many
ways to do this, here we are using concepts from the
lattice
package. Note that we stack the two data sets into
“long format” for plotting.
# combine into long format for plotting
d <- make.groups('Soil Temperature (F)'=d.soil_temp, 'Soil Volumetric Water Content (%)'=d.soil_moist)
# define some reasonable plot colors and line widths
tps <- tactile.theme(superpose.line=list(col=brewer.pal(3, 'Set1'), lwd=1.5))
# plot
xyplot(value ~ datetime | which, groups=sensor_type, data=d, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b %d\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)
Sometimes the daily median is sufficient.
# truncate datetime to date
d$date <- as.Date(d$datetime)
# aggregating over date / sensor group / specific sensor
a <- ddply(d, c('date', 'which', 'sensor_type'), .progress = 'text', .fun = plyr::summarize, q50=median(value, na.rm=TRUE))
# plot daily median
xyplot(q50 ~ date | which, groups=sensor_type, data=a, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)
Query the daily precipitation measurements spanning the 2018-2019 water year from Sonora Ranger Station (SOR).
# get data, note sensor number
x <- CDECquery(id='sor', sensor=45, interval='D', start='2018-09-01', end='2019-09-01')
# compute cumulative PPT
# replace NA with 0 in cumulative calc
x$cumulative <- cumsum(ifelse(is.na(x$value), 0, x$value))
# plot
xyplot(value ~ datetime, data=x, type=c('h', 'g'), scales=list(x=list(alternating=3, format="%b %d\n%Y", tick.number=10)), xlab='', ylab='Daily PPT (inches)', main='Sonora Ranger Station\nDaily Total Precipitation (inches)', par.settings = tactile.theme())
xyplot(cumulative ~ datetime, data=x, type=c('s', 'g'), scales=list(x=list(alternating=3, format="%b %d\n%Y", tick.number=10)), xlab='', ylab='Cumulative PPT (inches)', main='Sonora Ranger Station\nCumulative Precipitation (inches)', par.settings = tactile.theme())
# total
sum(x$value, na.rm = TRUE)
## [1] 34.16
Query the entire record of monthly precipitation measurements from Sonora Ranger Station (SOR).
x <- CDECquery(id='SOR', sensor=2, interval='M', start='1900-01-01', end='2030-01-01')
# plot
bwplot(month ~ value, data=x, ylab='', main='Sonora Ranger Station\nVariation in Monthly Total PPT', xlab='Monthly Total PPT (inches)', par.settings = tactile.theme())
Cumulative PPT over the course of an idealized water year (October 1 through September 30). Measurements are selected from water years with annual PPT close to the 5th, 25th, 50th, 75th, and 95th percentiles of annual PPT (over all water years). The current water year is symbolized with a red line. The red box and whisker plot (modified, whiskers are 5th and 95th percentiles) summarizes cumulative PPT on the current “water day”.
# define station of interest
s <- 'SPW'
# get metadata
s.info <- CDEC_StationInfo(s)
# format title for cumulative PPT
title.text <- sprintf("%s [%s]", s.info$site.meta$Name, s)
# get data
x <- CDECquery(id=s, sensor=45, interval='D', start='2000-01-01', end='2030-01-01')
# plot
par(mar=c(4.5, 4.5, 2.5, 1.5))
PCP_plot(x, ylab='Cumulative PPT (inches)', main=title.text, this.year = 2021)
Key elements:
Query the entire record of daily precipitation measurements from New Hogan Lake (NHG).
x <- CDECquery(id='NHG', sensor=45, interval='D', start='1900-01-01', end='2030-01-01')
summary(x$value)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00000 0.00000 0.00000 0.06376 0.00000 4.38000
head(x)
## station_id dur_code sensor_num sensor_type value flag units datetime year month water_year
## 1 NHG D 45 PPT INC 0.00 INCHES 1985-01-01 1985 January 1985
## 7 NHG D 45 PPT INC 0.61 INCHES 1985-01-07 1985 January 1985
## 21 NHG D 45 PPT INC 0.00 INCHES 1985-01-21 1985 January 1985
## 28 NHG D 45 PPT INC 0.08 INCHES 1985-01-28 1985 January 1985
## 35 NHG D 45 PPT INC 0.00 INCHES 1985-02-04 1985 February 1985
## 49 NHG D 45 PPT INC 0.00 INCHES 1985-02-18 1985 February 1985
## water_day
## 1 93
## 7 99
## 21 113
## 28 120
## 35 127
## 49 141
nrow(x)
## [1] 11732
Plot raw data.
xyplot(value ~ datetime, data=x, type='h', main='NEW HOGAN LAKE (NHG)', ylab='Daily Precipitation (inches)', xlab='', par.settings = tactile.theme())
Sum daily precipitation by year/month and display as box-whisker plot.
a <- ddply(x, c('year','month'), plyr::summarise, ppt=sum(value, na.rm=TRUE))
# plot
bwplot(ppt ~ month, data=a, xlab='', ylab='Monthly Total Precipitation (inches)', main='NEW HOGAN LAKE (NHG)', scales=list(y=list(tick.number=10), x=list(cex=0.85, rot=45)), par.settings = tactile.theme())
Visualize monthly precipitation as grouped by year/month.
# color ramp for monthly PPT
# denote 0 with grey
cols <- colorRampPalette(rev(c(viridis(10), grey(0.90))), space='Lab', interpolate='spline')
levelplot(ppt ~ factor(year) * month, data=a, col.regions=cols, xlab='',
ylab='', scales=list(x=list(tick.number=10, rot=45)),
main='NEW HOGAN LAKE (NHG)\nMonthly Total Precipitation (inches)')
Alternate representation via seas
package.
# alter column names and convert units from inches to mm
x$date <- as.Date(x$datetime)
x$value <- x$value * 25.4
x$precip <- x$value
# init seas object
ss <- seas.sum(x, 'precip', width='month', start.day = as.Date('2015-09-01'), prime='precip')
# monthly summaries, arranged according to CA water year
plot(ss, main='NEW HOGAN LAKE (NHG)')
Alternate representation
image(ss)
Monthly normals.
sn <- seas.norm(ss)
plot(sn)
# get daily reservoir storage (ac. ft) from Pinecrest, New Melones, Lyons, and Don Pedro Reservoirs
pinecrest <- CDECquery(id='swb', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')
new.melones <- CDECquery(id='nml', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')
lyons <- CDECquery(id='lys', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')
dp <- CDECquery(id='dnp', sensor=15, interval='D', start='2015-01-01', end='2030-01-01')
# compute storage capacity, based on published capacities
pinecrest$capacity <- pinecrest$value / 18312 * 100
new.melones$capacity <- new.melones$value / 2400000 * 100
lyons$capacity <- lyons$value / 6228 * 100
dp$capacity <- dp$value / 2030000 * 100
# combine
g <- make.groups('New Melones'=new.melones, 'Lyons'=lyons, 'Pinecrest'=pinecrest, 'Don Pedro'=dp)
# resonable date scale
r <- range(g$datetime)
s.r <- seq(from=r[1], to=r[2] + 6000000, by='6 months')
# better colors
tps <- tactile.theme(superpose.line=list(lwd=2, col=brewer.pal(n=4, name='Set1')))
# plot
xyplot(capacity ~ datetime, groups=which, data=g, type='l',
xlab='', ylab='Capacity (%)',
scales=list(alternating=3, y=list(tick.number=10),x=list(at=s.r, labels=format(s.r, "%B\n%Y"))),
auto.key=list(columns=4, lines=TRUE, points=FALSE),
par.settings=tps,
panel=function(...) {
panel.abline(h=seq(0, 100, by=10), col='grey')
panel.abline(v=s.r, col='grey')
panel.xyplot(...)
})
Investigate monthly levels and capacity of New Melones Reservoir.
# New Melones monthly data, retrieve as far back in time as possible
new.melones.monthly <- CDECquery(id='nml', sensor=15, interval='M', start='1900-01-01', end='2030-01-01')
# convert to pct. capacity
new.melones.monthly$capacity <- new.melones.monthly$value / 2400000 * 100
# color ramp
cols <- colorRampPalette(rev(viridis(10)), space='Lab', interpolate='spline')
# plot, each pixel is colored by the total precip by year/month
levelplot(capacity ~ year * month, data=new.melones.monthly, col.regions=cols, xlab='', ylab='', scales=list(x=list(tick.number=20)), main='New Melones Capacity (%)', par.settings = tactile.theme())
This document is based on sharpshootR
version 2.2.