The California Data Exchange
Center (CDEC) hosts all kinds of useful climate data. You can get
these easily by station and sensor code with the
CDECquery()
function. Lookup sensor codes by station with
the CDEC_StationInfo()
function.
Find CDEC stations via:
List of sensor codes by ID and associated units of measure.
With a recent version of R (>= 2.15), it is possible to get all of the packages that this tutorial depends on via:
# run these commands in the R console
install.packages('sharpshootR', dep=TRUE)
install.packages('soilDB', dep=TRUE)
install.packages('plyr', dep=TRUE)
install.packages('seas', dep=TRUE)
install.packages('viridis', dep=TRUE)
install.packages('latticeExtra', dep=TRUE)
install.packages('tactile', dep=TRUE)
install.packages('RColorBrewer', dep=TRUE)
install.packages('zoo', dep=TRUE)
# get latest version from GitHub
install.packages('remotes', dep=TRUE)
remotes::install_github("ncss-tech/sharpshootR", dependencies=FALSE, upgrade_dependencies=FALSE)
Load required packages.
library(latticeExtra)
library(plyr)
library(sharpshootR)
library(viridis)
library(seas)
library(RColorBrewer)
library(zoo)
library(tactile)
Get metadata for the Stanislaus Powerhouse, SPW, station. The results are stored in a list containing site and sensor metadata, along with any commentary related to the history of the station.
meta <- CDEC_StationInfo('SPW')
str(meta, 1)
## List of 3
## $ site.meta :'data.frame': 1 obs. of 11 variables:
## $ sensor.meta:'data.frame': 15 obs. of 6 variables:
## $ comments : NULL
Site metadata.
kable_styling(kable(meta$site.meta, format = 'html'), font_size = 10)
Station ID | River Basin | Hydrologic Area | Latitude | Operator | Elevation | County | Nearby City | Longitude | Maintenance | Name |
---|---|---|---|---|---|---|---|---|---|---|
SPW | STANISLAUS RIVER | SAN JOAQUIN RIVER | 38.133 | Pacific Gas & Electric | 1060 | CALAVERAS | JACKSON | 120.367 | .None Specified | STANISLAUS POWERHOUSE |
Sensor metadata.
kable_styling(kable(meta$sensor.meta, format = 'html'), font_size = 10)
sensor_details | sensor | interval | sensor_name | collection_method | period_of_record |
---|---|---|---|---|---|
PRECIPITATION, ACCUMULATED, INCHES | 2 | (daily) | (RAIN) | COMPUTED | 10/01/2003 to present |
PRECIPITATION, INCREMENTAL, INCHES | 45 | (daily) | (PPT INC) | COMPUTED | 08/25/1998 to present |
TEMPERATURE, AIR AVERAGE, DEG F | 30 | (daily) | (TEMP AV) | COMPUTED | 10/01/2005 to present |
TEMPERATURE, AIR MAXIMUM, DEG F | 31 | (daily) | (TEMP MX) | COMPUTED | 10/01/2005 to present |
TEMPERATURE, AIR MINIMUM, DEG F | 32 | (daily) | (TEMP MN) | COMPUTED | 10/01/2005 to present |
BATTERY VOLTAGE, VOLTS | 14 | (event) | (BAT VOL) | SATELLITE | 07/14/2015 to present |
PRECIPITATION, ACCUMULATED, INCHES | 2 | (event) | (RAIN) | SATELLITE | 07/14/2015 to present |
RELATIVE HUMIDITY, % | 12 | (event) | (REL HUM) | SATELLITE | 07/14/2015 to present |
TEMPERATURE, AIR AVERAGE, DEG F | 30 | (event) | (TEMP AV) | SATELLITE | 01/01/2014 to present |
BATTERY VOLTAGE, VOLTS | 14 | (hourly) | (BAT VOL) | SATELLITE | 08/24/1998 to present |
PRECIPITATION, ACCUMULATED, INCHES | 2 | (hourly) | (RAIN) | SATELLITE | 07/16/2015 to present |
RELATIVE HUMIDITY, % | 12 | (hourly) | (REL HUM) | SATELLITE | 10/01/2015 to present |
TEMPERATURE, AIR AVERAGE, DEG F | 30 | (hourly) | (TEMP AV) | SATELLITE | 11/08/2007 to present |
PRECIPITATION, TIPPING BUCKET, INCHES | 16 | (hourly) | (RAINTIP) | SATELLITE | 08/24/1998 to 07/14/2015 |
TEMPERATURE, AIR, DEG F | 4 | (hourly) | (TEMP) | SATELLITE | 08/24/1998 to 11/08/2007 |
Get daily precipitation (sensor = 45, interval = ‘D’) for a specified time interval.
res <- CDECquery(id='SPW', sensor=45, interval='D', start='2018-01-01', end='2018-01-31')
# first 10 rows
kable_styling(kable(head(res, 10), format = 'html'), font_size = 10)
station_id | dur_code | sensor_num | sensor_type | value | flag | units | datetime | year | month | water_year | water_day |
---|---|---|---|---|---|---|---|---|---|---|---|
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-01 | 2018 | January | 2018 | 93 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-02 | 2018 | January | 2018 | 94 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-03 | 2018 | January | 2018 | 95 | |
SPW | D | 45 | PPT INC | 0.20 | INCHES | 2018-01-04 | 2018 | January | 2018 | 96 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-05 | 2018 | January | 2018 | 97 | |
SPW | D | 45 | PPT INC | 1.19 | INCHES | 2018-01-06 | 2018 | January | 2018 | 98 | |
SPW | D | 45 | PPT INC | 0.00 | INCHES | 2018-01-07 | 2018 | January | 2018 | 99 | |
SPW | D | 45 | PPT INC | 1.43 | INCHES | 2018-01-08 | 2018 | January | 2018 | 100 | |
SPW | D | 45 | PPT INC | 0.61 | INCHES | 2018-01-09 | 2018 | January | 2018 | 101 | |
SPW | D | 45 | PPT INC | 0.09 | INCHES | 2018-01-10 | 2018 | January | 2018 | 102 |
Some of the CDEC stations have below-ground sensors. The Highland Meadow station was the first one to have soil sensors installed at three depths. These data are collected every 20 minutes.
Get the 3 soil temperature and volumetric water content (VWC) sensor IDs for HHM. Queries are by sensor ID.
# CDEC station ID
stn <- 'HHM'
# note that station metadata is a list
stn.info <- CDEC_StationInfo(stn)
sensors <- stn.info$sensor.meta
# lookup sensor codes for soil temp and moisture
soil.temp.sensors <- sensors$sensor[grep('soil temp', sensors$sensor_details, ignore.case = TRUE)]
soil.moist.sensors <- sensors$sensor[grep('soil moistr', sensors$sensor_details, ignore.case = TRUE)]
Query the data. It might take a couple of seconds. Note that we are making two distinct requests, each for three sensors.
d.soil_temp <- CDECquery(id = stn, sensor = soil.temp.sensors, interval = 'E', start = '2020-09-15', end = '2021-09-15')
d.soil_moist <- CDECquery(id = stn, sensor = soil.moist.sensors, interval = 'E', start = '2020-09-15', end = '2021-09-15')
# quick check: nice!
kable_styling(kable(head(d.soil_temp, 10), format = 'html'), font_size = 10)
station_id | dur_code | sensor_num | sensor_type | value | flag | units | datetime | year | month | water_year | water_day |
---|---|---|---|---|---|---|---|---|---|---|---|
HHM | E | 194 | SOILTD1 | 56 | DEG F | 2020-09-14 23:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 55 | DEG F | 2020-09-15 00:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 54 | DEG F | 2020-09-15 01:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 53 | DEG F | 2020-09-15 02:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 52 | DEG F | 2020-09-15 03:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 52 | DEG F | 2020-09-15 04:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 51 | DEG F | 2020-09-15 05:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 51 | DEG F | 2020-09-15 06:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 50 | DEG F | 2020-09-15 07:15:00 | 2020 | September | 2020 | 351 | |
HHM | E | 194 | SOILTD1 | 51 | DEG F | 2020-09-15 08:15:00 | 2020 | September | 2020 | 351 |
# this is a lot of data
nrow(d.soil_temp)
## [1] 26046
Perform some basic filtering, note units.
# remove soil temperature less than 20 deg F
d.soil_temp <- subset(d.soil_temp, subset=value > 20)
# remove soil moisture (VWC) < 1%
d.soil_moist <- subset(d.soil_moist, subset=value > 1)
Plot soil moisture and temperature in a single figure. There are many
ways to do this, here we are using concepts from the
lattice
package. Note that we stack the two data sets into
“long format” for plotting.
# combine into long format for plotting
d <- make.groups('Soil Temperature (F)'=d.soil_temp, 'Soil Volumetric Water Content (%)'=d.soil_moist)
# define some reasonable plot colors and line widths
tps <- tactile.theme(superpose.line=list(col=brewer.pal(3, 'Set1'), lwd=1.5))
# plot
xyplot(value ~ datetime | which, groups=sensor_type, data=d, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b %d\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)
Sometimes the daily median is sufficient.
# truncate datetime to date
d$date <- as.Date(d$datetime)
# aggregating over date / sensor group / specific sensor
a <- ddply(d, c('date', 'which', 'sensor_type'), .progress = 'text', .fun = plyr::summarize, q50=median(value, na.rm=TRUE))
# plot daily median
xyplot(q50 ~ date | which, groups=sensor_type, data=a, type=c('g', 'l'), auto.key=list(cex=0.5, columns=2, points=FALSE, lines=TRUE), strip=strip.custom(bg=grey(0.90)), par.settings=tps, layout=c(1,2), scales=list(alternating=3, x=list(format="%b\n%Y", tick.number=10), y=list(relation='free', rot=0)), xlab='', ylab='', sub=stn)