1 Introduction

This document is designed to show you a few ways to implement the conditioned Latin hypercube sampling (cLHS) package to generate sampling points. This is a stratified random sampling method using environmental covariate (raster) data. for more information see:

Minasny, B., and A.B. McBratney. 2006. A conditioned Latin hypercube method for sampling in the presence of ancillary information. Comput. Geosci. 32(9): 1378-1388. doi: 10.1016/j.cageo.2005.12.009.

Roudier, P. 2011. clhs: a R package for conditioned Latin hypercube sampling. R package version 0.5-1. R Found. Vienna Available HttpCRAN R-Proj. Orgpackage Clhs Accessed Febr. 2016.

Roudier, P., A.E. Hewitt, and D.E. Beaudette. 2012. A conditioned Latin hypercube sampling algorithm incorporating operational constraints.

2 Setup

Load Required Libraries

library(rgdal)
## Loading required package: sp
## rgdal: version: 1.2-16, (SVN revision 701)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.2.0, released 2017/04/28
##  Path to GDAL shared files: C:/Users/Dylan.Beaudette/Documents/R/win-library/3.4/rgdal/gdal
##  GDAL binary built with GEOS: TRUE 
##  Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
##  Path to PROJ.4 shared files: C:/Users/Dylan.Beaudette/Documents/R/win-library/3.4/rgdal/proj
##  Linking to sp version: 1.2-5
library(raster)
library(clhs)

Set working directory and download data

# create a new directory to store the data
dir.create("C:/workspace2/clhs", recursive = TRUE)

# setting the working directory
setwd("C:/workspace2/clhs/")

# download data
download.file(url = "http://github.com/dave-white2/data/raw/master/clhs/clhs_data.zip", destfile = "clhs.zip", method = "auto")

# unzip data
unzip(zipfile = "C:/workspace2/clhs/clhs.zip", overwrite = TRUE)

3 Load Raster Data

Data needs to be of the same extent and resolution

# load raster data of same extent and resolution
r.claymin <- raster("claymin.tif")
r.mrrtf <- raster("mrrtf.tif")
r.mrvbf <- raster("mrvbf.tif")
r.ndvi <- raster("ndvi.tif")
r.sagawi <- raster("sagawi.tif")
r.cost <- raster("cost.tif")

Combine into a raster stack and assign names

r.stack <- stack(r.claymin, r.mrrtf, r.mrvbf, r.ndvi, r.sagawi)
names(r.stack) <- c('claymin', 'mrrtf', 'mrvbf', 'ndvi', 'sagawi')

# load raster stack into memory for faster sampling
r.stack <- readAll(r.stack)

4 cLHS - Basic Method

Convert the raster stack to a SpatialPointsDataFrame

During this process every pixel becomes a row in the resulting data.frame. WARINING: if you have a very large dataset this might not be appropriate as it could take a long time or exceed available memory.

s <- rasterToPoints(r.stack, spatial=TRUE)

Apply cLHS to entire raster stack - this process takes a bit of time (size: number of sample points/bins) - result is as a cLHS_result object. To see a progress bar of how long it takes to run cLHS, set progress = TRUE.

s.clhs <- clhs(s, size = 10, progress = FALSE, iter = 1000, simple = FALSE)

Diagnostic plots You can produce plots illustrating the result of the cLHS sampling procedure. For more info type ?plot.cLHS_result into your console.

plot(s.clhs, mode=c("obj", "box"))

Extract indices

subset.idx <- s.clhs$index_samples

Plot points to inspect and save to shp file

# check visually:
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
points(s[subset.idx, ], bg = 'red', pch=21)

# save cLHS points to shp
writeOGR(s[subset.idx, ], dsn = 'C:/workspace2/clhs', layer='clhs_points', driver = 'ESRI Shapefile', overwrite_layer = TRUE)

5 cLHS - Large Raster Datasets

Sometimes, you may have a large raster stack, both in number of rasters and in size. When this occurs the above method can be entirely too slow. We can utilize a technique where a regular grid of points is created over the raster stack. Then the values at this regular grid are used to create the cLHS object. This is created by regular sampling of the raster stack.

# result is a SpatialPointsDataFrame
# size is the number of points to generate
s <- sampleRegular(r.stack, size = 500, sp = TRUE)

# plot s to see the grid of points
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
plot(s, add=TRUE)

Apply cLHS to SpatialPixelsDataFrame (size: number of sample points/bins) - result is as a cLHS_result object

s.clhs <- clhs(s, size = 10, progress = FALSE, iter = 1000, simple = FALSE)

Diagnostic plots

plot(s.clhs, mode=c("obj", "box"))

Extract indices

subset.idx <- s.clhs$index_samples

Plot points to inspect and save to shp file

# check visually:
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
points(s[subset.idx, ], bg = 'red', pch=21)

# save cLHS points to shp
writeOGR(s[subset.idx, ], dsn = 'C:/workspace2/clhs', layer = 'clhs_points_rgrd', driver = 'ESRI Shapefile', overwrite_layer = TRUE)

6 cLHS - Cost Implementation

cLHS has a built in cost function. Using cost, you can constrain the points to particular locations. For instance, you may not have access to certain private lands, or the terrain is too rugged. You can generate a cost raster where larger pixel values are “more expensive” to visit. Remember that the cost raster needs to be the same size and extent of the rasters in the raster stack.
This cost raster was created in SAGA using the Accumulated Cost (Anisotropic) function with slope as the cost grid, aspect as the direction grid, and a roads grid as the destination points. The accumulated cost raster created from this process is shown below.

# plot the cost raster
par(mar = c(1,1,1,1))
plot(r.cost, axes=FALSE)

The cost raster needs to be combined with the raster stack for this to work

r.stack <- stack(r.claymin, r.mrrtf, r.mrvbf, r.ndvi, r.sagawi, r.cost)
names(r.stack) <- c('claymin', 'mrrtf', 'mrvbf', 'ndvi', 'sagawi', 'cost')
r.stack <- readAll(r.stack)

Convert the raster stack to a SpatialPointsDataFrame by sampling along a regular grid.

s <- sampleRegular(r.stack, size = 10000, sp=TRUE)

To implement the cost function in cLHS, there are a few changes to the code.
Apply cLHS to entire raster stack - this process takes a bit of time (size: number of sample points/bins) - result is as a cLHS_result object.

# pending a fix in clhs package (https://github.com/pierreroudier/clhs/issues/3)
# NA cost values must be filtered from the samples
s <- s[which(!is.na(s$cost)), ]

# cost surface is activated via `cost` argument
s.clhs <- clhs(s, size = 10, progress = FALSE, iter = 1000, cost = 'cost', simple = FALSE)

Diagnostic plots

plot(s.clhs, mode = c("obj", "box"))

Extract indices

subset.idx <- s.clhs$index_samples

Plot points to inspect and save to shp file

# check visually on the sagawi raster
par(mar = c(1,1,1,1))
plot(r.sagawi, axes=FALSE)
contour(r.cost, nlevels=10, col='black', add=TRUE)
points(s[subset.idx, ], bg = 'red', pch=21)

# check visualy on the cost raster
par(mar = c(0,0,0,0))
plot(r.cost, axes=FALSE)
points(s[subset.idx, ], bg = 'red', pch=21)

# save cLHS points to shp
writeOGR(s[subset.idx, ], dsn = 'C:/workspace2/clhs', layer = 'clhs_points_cost', driver = 'ESRI Shapefile', overwrite_layer = TRUE)