Statistics for Soil Survey - Part 1
Statistics for Soil Survey - Part 2
Pre-course Assignment
0.1
Setup
0.2
Additional Reading
1
Introduction
1.1
Outline
2
Model Evaluation
2.1
Introduction
2.1.1
Examples - Dispersion
2.1.2
Examples - Variation and Certainty
2.2
Theory of Uncertainty
2.3
Resampling to Estimate Uncertainty
2.3.1
Examples - Confidence Intervals
2.3.2
Exercise 1
2.4
Performance Metrics
2.4.1
Regression Metrics
2.4.2
Examples
2.4.3
Exercise 2
2.4.4
Categorical
2.4.5
Examples
2.4.6
Exercise 3
2.5
Validation
2.5.1
Internal Validation
2.5.2
External Validation
3
Numerical Taxonomy and Ordination
3.1
Introduction
3.1.1
Objectives
3.2
Whirlwind Tour
3.2.1
Similarity, Dissimilarity, and Distance
3.2.2
Standardization of Characteristics
3.2.3
Missing Data
3.2.4
Visualizing Pair-Wise Distances: The Dendrogram
3.2.5
Cluster Analysis: Finding Groups in Data
3.2.6
Ordination: Visualization in a Reduced Space
3.2.7
Pair-Wise Distances Between Soil Profiles
3.2.8
Final Discussion
3.3
Elaboration with Examples
3.3.1
Set Up the R Session
3.3.2
Data Sources
3.3.3
Try it!
3.3.4
Evaluating Missing Data
3.3.5
More on the Distance Matrix and How to Make One
3.3.6
Thematic Distance Matrix Plots
3.3.7
Hierarchical Clustering
3.3.8
Centroid and Medoid (Partitioning) Clustering
3.3.9
Ordination
3.4
Practical Applications
3.4.1
Pair-Wise Distances Between Soil Profiles
3.4.2
Pair-Wise Distances Between Soil Series
3.4.3
Pair-Wise Distances Between Subgroup-Level Taxa
3.4.4
Soil Color Signatures
3.4.5
Clustering of Soil Colors
3.4.6
“How Do the Interpretations Compare?”
3.4.7
MLRA Concepts via Climate and Elevation Samples
4
Linear Regression
4.1
Introduction
4.2
Linear Regression Example
4.3
Data
4.3.1
Henry Mount Database
4.3.2
Aggregate Time Series
4.3.3
Final Dataset
4.4
Spatial data
4.4.1
Plot Coordinates
4.4.2
Extracting Spatial Data
4.5
Exploratory Data Analysis (EDA)
4.5.1
Compare Samples vs Population
4.6
Linear modeling
4.6.1
Diagnostics
4.6.2
Variable selection/reduction
4.6.3
Final model & accuracy assessment
4.6.4
Model Interpretation
4.7
Generate spatial predictions
4.8
Create Map
4.9
Exercise
4.10
Additional reading
5
Generalized Linear Models
5.1
Introduction
5.2
Logistic Regression
5.3
Examples
5.4
Exercise
5.4.1
Load packages
5.4.2
Read in data
5.5
Exploratory analysis (EDA)
5.5.1
Data wrangling
5.5.2
Geomorphic data
5.5.3
Soil Scientist Bias
5.5.4
Plot coordinates
5.6
Exercise 1: View the data
5.6.1
Extracting spatial data
5.6.2
Examine spatial data
5.7
Constructing the model
5.7.1
Diagnostic
5.7.2
Variable Selection & model validation
5.7.3
Final model & accuracy
5.7.4
Model effects
5.8
Generate spatial predictions
5.9
Exercise
5.10
Additional reading
6
Tree-based Models
6.1
Introduction
6.2
Exploratory Data Analysis
6.2.1
Getting Data Into R and Exporting to Shapefile
6.2.2
Examining Data in R
6.2.3
Exercise 1
6.3
Classification and Regression Trees (CART)
6.3.1
Exercise 2: rpart
6.4
Random Forest
6.4.1
Exercise 3: randomForest
6.5
Prediction using Tree-based Models
6.6
Summary
Appendix
A
Accuracy and Uncertainty for Categorical Predictions
A.1
Status Quo
A.2
Theses
A.3
Soap Box Time
A.4
Concept Demonstration via Simulated Data
A.5
Accuracy
A.5.1
Confusion Matrix / Area Under ROC
A.5.2
Brier Scores
A.5.3
Tau and Weighted Tau (class-similarity)
A.6
Uncertainty
A.6.1
Shannon Entropy
A.7
Review
A.8
Example Implementation
A.9
Resources
A.9.1
Evaluating Accuracy of Categorical / Probabilistic Predictions
A.9.2
Sampling and Stability of Estimates
B
Commentary: Selecting a Modeling Framework
B.1
Soil Temperature Regime Modeling in CA792
References
Statistics for Soil Survey - Part 2
Chapter 1 Introduction
Finish this.
1.1 Outline