History and Evolution of Digital (Predictive) Soil Mapping. R. A. MacMillan LandMapper Environmental Solutions Inc.


Outline
Unifying DSM Framework: Universal Model of Variation, Z(s) = Z*(s) + ε(s) + ε
Past: Early History of Development of DSM (pre 2003)
  Theory, Concepts, Models, Software, Inputs, Developments
  Examples of early methods and outputs
Key Recent Developments in DSM (post 2003)
  Theory, Concepts, Models, Software, Inputs, Developments
  Examples of recent methods and outputs
Future Trends: How do I See DSM Developing?
  Theory, Concepts, Inputs, Models, Software, Developments
  From Static Maps to Dynamic Real-Time Models
Discussion and Conclusions
  Constraints and pitfalls to be avoided, technical/political

Introduction Universal Model of Soil Variation A Unifying Framework for DSM

Universal Model of Soil Variation: A Unifying Framework for Digital Soil Mapping
Z(s) = Z*(s) + ε(s) + ε
Z(s): predicted soil type or soil property value. The predicted spatial pattern of some soil property or class, including uncertainty of the estimate.
Z*(s): deterministic part of the predictive model. The part of the variation that is predictable by means of some statistical or heuristic soil-landscape model.
ε(s): stochastic part of the predictive model. The part of the variation that shows spatial structure and can be modelled with a variogram.
ε: pure noise part of the predictive model. The part of the variation that can't be predicted at the current scale with the available data and models.
Source: Burrough, 1986, eq. 8.14

Deterministic Part of Prediction Model: Z*(s)
Conceptual Models
  Conceptual or mental soil-landscape models
  Produce area-class maps
Statistical Models
  Scorpan: relate soils/soil properties to covariates
  Explain spatial distribution of soils in terms of known soil forming factors as represented by covariates
[Figure: diagram labelled with soil series EOR, DYD, KLM, FMN and COR]
[Figure: salinity hazard map on a 100 x 100 m grid. Individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils, land surface) are combined using layer weightings (2x, 1x, 2x, 1x, 3x) into a total salinity hazard rating.]

Stochastic Part of Prediction Model: ε(s)
Geostatistical Estimation
  Predict soil properties: point or block kriging
  Predict soil classes: indicator kriging
  Predict error of estimate
Correct Deterministic Part
  Error in deterministic part is computed (residuals)
  If structure exists in error then krige error & subtract

Pure Noise Part of Prediction Model: ε
Some Variation is not Predictable
  Have to be honest about this
  Should quantify and report it
Deterministic Prediction (Mental and Statistical Models)
  Not perfect: often lack suitable covariates to predict target variable
  Lack covariates at finer resolution
Geostatistical Prediction
  Insufficient point input data
  Can't predict at less than the smallest spacing of input point data
[Figure: semi-variogram showing nugget, sill and range; axes: semi-variance vs. lag (distance), lags d1 to d4]
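The nugget, sill and range named on the variogram figure can be made concrete with a small model function. The sketch below assumes a spherical variogram model, which the slides do not specify; the parameter names (`nugget`, `partial_sill`, `range_`) and the numbers are illustrative only. The nugget is the semi-variance that remains at very short lags, which corresponds to the unpredictable noise term ε.

```python
def spherical_variogram(h, nugget, partial_sill, range_):
    """Semi-variance at lag distance h for a spherical model (assumed model form).

    nugget       -- semi-variance at very short lags (unpredictable noise)
    partial_sill -- spatially structured variance; total sill = nugget + partial_sill
    range_       -- lag distance beyond which observations are uncorrelated
    """
    if h < 0:
        raise ValueError("lag distance must be non-negative")
    if h == 0:
        return 0.0  # a value compared with itself has zero semi-variance
    if h >= range_:
        return nugget + partial_sill  # the curve has levelled off at the sill
    ratio = h / range_
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)


# Illustrative parameters only: nugget 2, total sill 10, range 100 m.
for lag in (0, 10, 50, 100, 150):
    print(lag, spherical_variogram(lag, nugget=2.0, partial_sill=8.0, range_=100.0))
```

A large nugget relative to the sill signals that much of the variation cannot be predicted from the available point spacing.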

Past Early History of DSM Development (pre 2003) On Digital Soil Mapping McBratney et al., 2003

Early History of Development of DSM Deterministic Stochastic Soil Classes Soil Properties Soil Classes Soil Properties

Past Theory: Deterministic Component Z*(s), Classed Conceptual Models
Jenny (1941): CLORPT (note: no N = space)
Simonson (1959): Process model of additions, removals, translocations, transformations
Ruhe (1975): Erosional-depositional surfaces, open/closed basins
Dalrymple et al. (1968): Nine unit hill slope model
Milne (1936a, 1936b): Catena concept, toposequences

Past Concepts: Deterministic Component Z*(s), Classed Conceptual Models
Soil = f(C, O, R, P, T, ...)
C = Climate, O = Organisms, R = Topography, P = Parent Material, T = Time
Source: Lin, 2005, Frontiers in Soil Science, http://www7.nationalacademies.org/soilfrontiers/

http://solim.geography.wisc.edu/index.htm

Past Models: Deterministic Component Z*(s), Classed Statistical Predictions
Fuzzy Inference: Zhu, 1997; Zhu et al., 1996; MacMillan et al., 2000, 2005
Neural Networks: Zhu, 2000
Expert Knowledge (Bayesian): Skidmore et al., 1991; Cook et al., 1996; Corner et al., 1997
Regression Trees: Moran and Bui, 2002; Bui and Moran, 2003
[Figure: salinity hazard map on a 100 x 100 m grid, built from individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils, land surface) and layer weightings (2x, 1x, 2x, 1x, 3x)]
Source: Jones et al., 2000

Past Software: Deterministic Component Z*(s), Classed Statistical Predictions
Regression Trees
  CUBIST: Rulequest Research, 2000
  CART: Breiman et al., 1984
  C4.5 & See5: Quinlan, 1992
  JMP (SAS): http://www.jmp.com/
  R: http://www.r-project.org/
Fuzzy Logic
  SoLIM: Zhu et al., 1996, 1997
  LandMapR, FuzME
Bayesian Logic
  Prospector: Duda et al., 1978
  Expector: Skidmore et al., 1991
  Netica: Norsys.com/netica

Past Inputs: Deterministic Component Z*(s), Classed Statistical Predictions
C = Climate
  Temp, Ppt, ET, Solar Rad
  Mean, min, max, variance
  Annual, monthly, indices
O = Organisms
  Manual maps: land use, vegetation
  Remotely sensed imagery: classified RS imagery; NDVI, EVI, other ratios
R = Relief (topography)
  Primary attributes: slope, aspect, curvatures, slope position, roughness
  Secondary attributes: CTI, WI, SPI, STC
P = Parent Material
  Published geology maps
  Gamma radiometrics
  Thermal IR, RS ratios
A = Age

Past Inputs: Deterministic Component Z*(s), Classed Statistical Predictions
Common Topo Inputs
  Profile Curvature
  Plan (Contour) Curvature
  Slope Gradient (& Aspect)
  CTI or Wetness Index (sometimes, not always)
Less Common Topo Inputs
  Surface Roughness
  Relief within a window
  Relief relative to drainage
  Pit, peak, ridge, channel
[Figure panels: Profile Curvature, Slope Gradient, Pit to Peak Relief, Plan Curvature, Wetness Index, Divide to Channel]
Source: MacMillan, 2005

Past Inputs: Non-DEM, Airborne Radiometrics
Radiometrics for subsurface: infer parent material
Source: Mayr, 2005

Past Inputs: Non-DEM Satellite Imagery Grassland Land Cover Types Alpine Land Cover Types

Past Models: Deterministic Component Z*(s) Examples of Predictions of Soil Class Maps

Approaches to Producing Predictive Area-Class Maps

Knowledge-Based Classification In SoLIM Source: Zhu, SoLIM Handbook

Source: Thompson et al., 2010 WCSS Knowledge-Based Classification Using Boolean Decision Tree in USA Component Soils Gilpin Pineville Laidig Guyandotte Dekalb Craigsville Meckesville Cateache Shouns

Knowledge-Based Classification In LandMapR Source: Steen and Coupé, 1997 Source: MacMillan, 2005

Knowledge-Based Classification In Utah, Knowledge-Based PURC Approach Note: Not simple slope elements but complex patterns Source: Cole and Boettinger, 2004

Approaches to Producing Predictive Area-Class Maps

Source: Zhou et al., 2004, JZUS Supervised Classification Using Regression Trees Note similarity of supervised rules and classes to typical soil-landform conceptual classes Note numeric estimate of likelihood of occurrence of classes

Source: Zhou et al., 2004, JZUS Supervised Classification Using Bayesian Analysis of Evidence/Classification Trees

Predicting Area-Class Soil Maps Using Discriminant Analysis Source: Scull et al., 2005, Ecological Modelling

Predicting Area-Class Soil Maps Using Regression Trees Extrapolation Uncertainty of prediction Bui and Moran (2003) Geoderma 111:21-44 Source: Bui and Moran., 2003

Supervised Classification Using Fuzzy Logic: Shi et al., 2004
  Used multiple cases of reference sites
  Each site was used to establish fuzzy similarity of unclassified locations to reference sites
  Used Fuzzy-minimum function to compute fuzzy similarity
  Hardened class using largest (Fuzzy-maximum) value
  Considered distance to each reference site in computing fuzzy similarity
[Figure: fuzzy likelihood of being a broad ridge]
Source: Shi et al., 2004
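The fuzzy-minimum and fuzzy-maximum steps can be illustrated with a small sketch. It is not the Shi et al. implementation: the Gaussian-shaped membership curve, the covariate names (`slope_pct`, `rel_position`), the class names and all numbers are assumptions made for illustration, and the distance-to-reference-site weighting mentioned on the slide is left out.

```python
import math


def membership(value, optimum, width):
    """Assumed Gaussian-shaped membership in [0, 1]; equals 1 at the optimum."""
    return math.exp(-0.5 * ((value - optimum) / width) ** 2)


def fuzzy_similarity(cell, reference):
    """Fuzzy-minimum: the least favourable covariate limits the similarity."""
    return min(
        membership(cell[name], optimum, width)
        for name, (optimum, width) in reference.items()
    )


def harden(cell, references):
    """Fuzzy-maximum: assign the class with the largest similarity value."""
    sims = {cls: fuzzy_similarity(cell, ref) for cls, ref in references.items()}
    return max(sims, key=sims.get), sims


# Hypothetical reference sites: covariate -> (optimum value, curve width).
references = {
    "broad_ridge": {"slope_pct": (2.0, 3.0), "rel_position": (0.9, 0.15)},
    "footslope": {"slope_pct": (6.0, 4.0), "rel_position": (0.2, 0.15)},
}
cell = {"slope_pct": 2.5, "rel_position": 0.85}
label, sims = harden(cell, references)
print(label, sims)
```

Keeping the full `sims` dictionary, not only the hardened label, preserves the fuzzy likelihood surface for each class.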

Approaches to Producing Predictive Area-Class Maps

Credit: J. Balkovič & G. Čemanová Concept of Fuzzy K-means Clustering Source: Sobocká et al., 2003

Example of Application of Fuzzy K-means Unsupervised Classification
From: Burrough et al., 2001, Landscape Ecology
Note similarity of unsupervised classes to conceptual classes

Example of Application of Disaggregation of a Soil Map by Clustering into Components Source: Faine, 2001

Developments: Deterministic Component Z*(s), Classed Predictive Maps in Past
Characteristics of Models
  Models largely ignored ε
    Seldom estimate error
    Rarely correct for error
  Mainly use DEM inputs
    Initially 3x3 windows
    Slope, aspect, curvatures
    Maybe wetness index
    Later improvements were measures of slope position
  Rarely use ancillary data
    Exceptions: Bui, Scull, Zhu
  Operate at single scale
Characteristics of Models (continued)
  Many use expert knowledge
    Data mining is the exception
    Training data seldom used
  Specialty software prevails
    Software for DEM analysis: SoLIM, TAPES-G, TOPAZ, TOPOG, TAS, LandMapR, SAGA, IDRISI, 3dMapper
    Software for extracting rules: Expector, Netica, CART, See5, Cubist, Prospector
    Software for applying rules: ESRI, SoLIM, SIE, SAGA

Past Models: Deterministic Component Z*(s) for Continuous Soil Properties Approaches Aimed at Predicting Continuous Soil Properties

Past Concepts: Deterministic Component Z*(s), Continuous Soil Properties
Same Theory-Concepts as for Classed Maps
  Soil = f(C, O, R, P, T, ...)
  Except theory applied to individual soil properties
  Initially referred to as environmental correlation
Soil properties related to
  Landscape attributes
  Climate variables
  Geology, lithology, soil pm
Key Papers
  Moore et al., 1993: Linear regression
  McSweeney et al., 1994
  McKenzie & Austin, 1993
  Gessler et al., 1995: GLMs in S-Plus
  McKenzie & Ryan, 1999: Regression Trees

Past Models: Deterministic Component Z*(s) Continuous Soil Properties Regression Trees McKenzie & Ryan, 1998, Odeh et al., 1994 Fuzzy Logic-Neural Networks Zhu, 1997 Bayesian Expert Knowledge Skidmore et al., 1996 Cook et al., 1996, Corner et al., 1997 GLMs General Linear Models McKenzie & Austin, 1993 Gessler et al., 1995 Source: McKenzie and Ryan, 1998

Past Inputs: Deterministic Component Z*(s) for Continuous Soil Properties
Similar to Classed Maps, But:
  Many innovations originated with continuous modelers
  Increased use of non-DEM attributes: climate, radiometrics, imagery
  Improved DEM derivatives
    Wetness Index & CTI
    Upslope means for slope, etc.
    Inverted DEMs to compute
      Down slope dispersal
      Down slope means
      New slope position data
Source: McKenzie and Ryan, 1998

Past Models: Deterministic Component Z*(s) for Continuous Soil Properties Examples of Predictions of Soil Property Maps

Past Models: Deterministic Component Z*(s) Continuous Maps Aandahl, 1948 (Note Date!) Regression model Predicted Average Nitrogen (3-24 inch) Total Nitrogen by depth Total Organic Carbon by depth interval Depth of profile to loess Predictor (covariate) Slope position as expressed by length of slope from shoulder Lost in the depths of time Source: Aandahl, 1948

Past Models: Deterministic Component Z*(s) for Continuous Soil Properties Moore et al., 1993 Seminal paper Focus on topography Small sites Other covariates were assumed constant Got people thinking About quantifying environmental correlation, especially soil-topography relationships Source: Moore et al, 1993

Source: McKenzie and Ryan, 1998 Past Models: Deterministic Component Z*(s) for Continuous Soil Properties McKenzie & Ryan, 1998 Regression Tree: Soil Depth

Past Models: Deterministic Component Z*(s) for Continuous Soil Properties
Gessler et al., 1995: GLMs, largely based on topography (CTI), with other factors held steady
Source: McKenzie and Ryan, 1998; Gessler, 2005

Past Models: Deterministic Component Z*(s) for Continuous Soil Properties
[Figure: regression tree with splits on texture class (C vs. S, LS, L, CL, LiC), bulk density (BD < 1.43 vs. BD > 1.43; BD < 1.42 vs. BD > 1.42) and clay content (Clay < 46.5 vs. Clay > 46.5)]
Source: Minasny and McBratney

Developments: Deterministic Component Z*(s), Predictive Maps up to 2003
Main Developments
  Better DEM derivatives
    More and better measures of landform position or context (Qin et al., 2012)
  Some recognition of scale and resolution effects
    Different window sizes
    Different grid resolutions
  More non-DEM inputs
    Increased use of imagery
    New surrogates for PM
Main Developments (continued)
  Integration of single models into multi-purpose software
    ArcGIS, ArcSIE, ArcView
    SAGA, Whitebox, IDRISI
  Improved processing ability
    Bigger files, faster processing
  Emergence of 2 main scales
    Hillslope elements (series): quite similar across models
    Landscape patterns (domains): similar to associations

Early History of Development of DSM Deterministic Stochastic Soil Classes Soil Properties Soil Classes Soil Properties

Past Theory: Stochastic Component ε(s)
Waldo Tobler (1970): First law of geography
  Everything is related to everything else, but near things are more related than distant things
Matheron (1971): Theory of regionalized variables
Webster and Cuanalo (1975): clay, silt, pH, CaCO3, colour value, and stoniness on transect
Burgess and Webster (1980a, b): Soil property maps by kriging; universal kriging (drift) of EC

Past Models: Stochastic Component ε(s), Universal Model of Variation
  Matheron (1971)
  Burgess and Webster (1980a, b)
  Webster and Burrough (1980)
  Burrough (1986)
  Webster and McBratney (1987)
  Oliver (1989)
Source: Oliver, 1989

Past Models: Stochastic Component ε(s), Optimal Interpolation by Kriging
1. Collect point sample observations (irregular spatial distribution of observed point values)
2. Compute semi-variance at different lag distances
3. Fit semi-variogram to lag data
4. Estimate values and error at fixed grid locations
[Figure: scattered point observations in x,y and the resulting grid of kriged estimates]
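The step of computing semi-variance at different lag distances can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the fixed-width lag bins and the four-point transect are hypothetical, and real software adds options such as directional variograms and model fitting. For each lag bin, the semi-variance is half the mean squared difference between all pairs of observations whose separation falls in that bin.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, lag_width, n_lags):
    """Bin all point pairs by separation distance and average per bin.

    points: list of (x, y, z) observations.
    Returns {lag_index: (mean_distance, semivariance, n_pairs)}, where bin k
    covers distances in [k * lag_width, (k + 1) * lag_width).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # distance sum, sq. diff sum, pairs
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            k = int(d // lag_width)
            if k < n_lags:
                acc = sums[k]
                acc[0] += d
                acc[1] += (zi - zj) ** 2
                acc[2] += 1
    return {
        k: (acc[0] / acc[2], acc[1] / (2 * acc[2]), acc[2])
        for k, acc in sorted(sums.items())
    }


# Hypothetical transect: four observations spaced 10 m apart.
transect = [(0, 0, 6.0), (10, 0, 5.0), (20, 0, 6.0), (30, 0, 7.0)]
for lag, (dist, gamma, pairs) in empirical_semivariogram(transect, 10.0, 3).items():
    print(lag, dist, gamma, pairs)
```

A variogram model (spherical, exponential or similar) would then be fitted to these binned values before kriging.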

Past Software: Stochastic Component ε(s)
Earlier: Stand Alone
  Pc-Geostat (PC-Raster): early version of GSTAT
  VESPER: variogram estimation and spatial prediction with error (Minasny et al., 2005), http://sydney.edu.au/agriculture/pal/software/vesper
  GEO-EAS (DOS, 1991): http://www.epa.gov/ada/csmos/models/geoeas.html
Later: More Integrated
  GSTAT: Pebesma and Wesseling, 1998
    Incorporated into IDRISI
    Now incorporated into R and S-Plus packages (Pebesma, 2004), http://www.gstat.org/index.html
  ArcGIS Geostatistical Analyst
  SGeMS (Stanford Univ): http://sgems.sourceforge.net/

Past Inputs: Stochastic Component ε(s)
Essentially Just x,y,z Values at Point Locations
1. Start with a set of soil property values irregularly distributed in x,y Cartesian space
2. Locate the regularly spaced grid nodes where predicted soil property values are to be calculated
3. Locate the n soil property data points within a search window around the current grid cell for which a value is to be calculated
4. Compute a new value for each location as the weighted average of the n neighbouring soil property values, with weights established by the semi-variogram
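The weighted-average step can be sketched as ordinary kriging at a single grid node. This is an illustrative sketch, not the method of any package named in the slides: it assumes NumPy is available, uses an exponential variogram with made-up parameters in place of a fitted one, and uses all data points instead of a search window. The weights come from solving the kriging system built from the variogram, and they sum to one.

```python
import numpy as np


def exp_variogram(h, nugget=0.0, partial_sill=1.0, range_=100.0):
    """Exponential variogram (assumed model); semi-variance is 0 at zero lag."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill * (1.0 - np.exp(-3.0 * h / range_))
    return np.where(h > 0, gamma, 0.0)


def ordinary_kriging(xy, z, target, variogram=exp_variogram):
    """Return (estimate, kriging variance, weights) at one target location."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(z)
    # Left-hand side: semi-variances between data points, bordered by ones
    # for the constraint that forces the weights to sum to one.
    dists = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = variogram(dists)
    lhs[n, n] = 0.0
    # Right-hand side: semi-variances between data points and the target.
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(np.linalg.norm(xy - target, axis=1))
    solution = np.linalg.solve(lhs, rhs)
    weights, lagrange = solution[:n], solution[n]
    estimate = float(weights @ z)
    variance = float(weights @ rhs[:n] + lagrange)
    return estimate, variance, weights


# Hypothetical observations at the corners of a 100 m square.
xy = [(0, 0), (100, 0), (0, 100), (100, 100)]
z = [6.0, 5.0, 7.0, 6.0]
estimate, variance, weights = ordinary_kriging(xy, z, target=(50, 50))
print(estimate, variance, weights)
```

With a zero nugget the estimate reproduces the observed value at a sampled location, and the kriging variance there is zero.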

Past Models: Stochastic Component ε(s) for Continuous Soil Properties Examples of Predictions of Soil Property Maps by Kriging

Continuous Soil Property Maps by Kriging: Very Early Alberta Example
Lacombe Research Station
  Sampled soils on a 50 m grid
  Sand, silt, clay, pH, OC, EC, others
  3 depths (0-15, 15-50, 50-100)
Used custom written software
  Compute variograms
  Interpolate using the variograms
Only visualised as contour maps
  Only got 3D drapes in 1988
  Used PC-Raster to drape
  Saw strong soil-landscape pattern
[Figure: semi-variogram for A-horizon % sand, Lacombe site (1985); axes: semi-variance (0 to 160) vs. lag (1 lag = 30 m)]
Source: MacMillan, 1985, unpublished

Continuous Soil Property Maps by Kriging Source: http://sydney.edu.au/agriculture/pal/software/vesper.shtml

Continuous Soil Property Maps by Kriging: Yasribi et al., 2009
Simple ordinary kriging of soil properties (OK)
  No co-kriging
  No regression prediction
Relies on presence of
  Sufficient point samples
  Spatial structure over distances longer than the smallest sampling interval
Source: Yasribi et al., 2009

Continuous Soil Property Maps by Kriging: Shi, 2009
Comparison of pH by four different methods: a) HASM b) Kriging c) IDW d) Splines
Source: Yasribi et al., 2009

Developments: Stochastic Component ε(s), Predictive Maps up to 2003
Main Developments
  Theory: becomes better understood and accepted
  Concepts: regression-kriging evolves to include a separate part for regression prediction
  Models
    Understanding and use of universal model grows
    Directional, local variograms
Main Developments (continued)
  Software
    From stand alone and single purpose to integrated software
    Improvements in visualization, capacity to process large data sets, automated variogram fitting, ease of use
  Inputs: developments in sampling designs and sampling theory

Present and Recent Past Key Developments in DSM Since 2003 (2003-2012) On Digital Soil Mapping McBratney et al., 2003

Developments in DSM Since 2003 Increasing Convergence and Interplay Deterministic Stochastic Soil Classes Soil Properties Soil Classes Soil Properties Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Theory: Key Developments Since 2003 Deterministic Part Pretty much unchanged But Still based on attempting to elucidate quantitative relationships between soils & environmental covariates Scorpan elaboration highlights importance of the spatial component (n) and of spatially correlated error ε(s) Stochastic Part Same underlying theory But Still based on theory of regionalized variables Increasing realization that the structural part of variation (non-stationary mean or drift) can be better modelled by a deterministic function than by purely spatial calculations

Concepts: Key Developments Since 2003 Deterministic Part Scorpan Model Explicitly recognizes soil data (s) as a potential input to predict other soil data Soil inputs can include soil maps, point observations, even expert knowledge Explicitly recognizes space (n) or location as a factor in predicting soil data Space as in x,y location Space as in context, kriging Factors as predictors Factors explicitly seen as quantitative predictors in prediction function Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Concepts: Key Developments Since 2003, Stochastic Part
Emergence of Regression Kriging (RK)
  Key difference to ordinary kriging is that it is no longer assumed that the mean of a variable is constant
  Local variation or drift can be modelled by some deterministic function
  Local regression lowers error, improves predictions
  Local regression function can even be a soil map
Source: Heuvelink, personal communication

Models: Key Developments Since 2003, Deterministic Part
Improvements in Data Mining and Knowledge Extraction
  Supervised Classification
    Training data obtained from both points and maps (sample maps at points)
    Ensemble or multiple realization models (100 x): boosting, bagging, Random Forests, ANN, regression tree
Improvements in Data Mining and Knowledge Extraction (continued)
  Expert Knowledge Extraction
    Bayesian Analysis of Evidence
    Prototype Category Theory
    Fuzzy Neural Networks
    Tools for manual extraction of fuzzy expert knowledge: ArcSIE, SoLIM
  Unsupervised classification
    Fuzzy k-means, c-means

Models: Key Developments Since 2003, Stochastic Part
Regression Kriging
  Recognized as equivalent to universal kriging or kriging with external drift
  Use of external knowledge and maps made easier
  Incorporation of soft data
  Made more accessible through implementation in commercial (ESRI) and open source software (R)
Regression Kriging: Key References
  Odeh et al., 1995
  McBratney et al., 2003
  Hengl et al., 2003, 2004, 2007
  Heuvelink, 2006
  Hengl how-to books: http://spatialanalyst.net/book/ and http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf

Software: Key Developments Since 2003
Commercial Software
  JMP (SAS) (McBratney): http://www.jmp.com/
  S-Plus, Matlab: used by soil researchers
  See5, CUBIST, CART: regression trees
  Netica (Bayesian): Norsys.com/netica
  Improvements: better visualization, better interfaces
Non-commercial Software
  Fuzzy Logic
    SoLIM: Zhu et al., 1996, 1997
    ArcSIE (Shi), FuzME
  Bayesian Logic
  Full Range of Options: R, http://www.r-project.org
    Regression Kriging
    Random Forests
    Regression Trees
    GLMs
    GSTAT (in R)

Source: Schmidt and Andrew., 2005 Inputs: Key Developments Since 2003 Terrain Attributes More and better measures Primarily contextual and related to landform position Real advances related to Multi-scale analysis varying window size and grid resolution Window-based and flowbased hill slope context Systematic examination of relationships of properties and processes to scale Source: Smith et al., 2006

Inputs: Key Developments Since 2003, Terrain Attributes
Multi-scale analysis
  Varying window size and grid resolution
  Identifies that some variables are more useful when computed over larger windows or coarser grids
  Finer resolution grids not always needed or better
  Drop off in predictive power of DEMs after about 30-50 m grid resolution
Source: Deng et al., 2007

MrVBF: Multi-scale DEM Analysis
Smooth and subsample: Original (25 m), Generalised (75 m), Generalised (675 m)
[Figure panels: Flatness, Bottomness, Valley Bottom Flatness]
Source: Gallant, 2012

Multiple Resolution Landform Position MrVBF Example Outputs Broader Scale 9 DEM MRVBF for 25 m DEM Source: Gallant, 2012

Developments: Improved Measures of Landform Position SAGA-RHSP: relative hydrologic slope position SAGA-ABC: altitude above channel Source: C. Bulmer, unpublished Calculation based on: MacMillan, 2005 Source: C. Bulmer, unpublished

Developments: Improved Measures of Landform Position TOPHAT Schmidt and Hewitt (2004) Slope Position Hatfield (1996) Source: Schmidt & Hewitt, (2004) Source: Hatfield (1996)

Developments: Improved Measures of Landform Position - Scilands Source: Rüdiger Köthe, 2012

Measures of Relative Slope Length (L) Computed by LandMapR Percent L Pit to Peak Percent L Channel to Divide MEASURE OF REGIONAL CONTEXT MEASURE OF LOCAL CONTEXT Source: MacMillan, 2005 Image Data Copyright the Province of British Columbia, 2003

Measures of Relative Slope Position Computed by LandMapR Percent Diffuse Upslope Area Percent Z Channel to Divide SENSITIVE TO HOLLOWS & DRAWS RELATIVE TO MAIN STREAM CHANNELS Image Data Copyright the Province of British Columbia, 2003 Source: MacMillan, 2005

Developments: Improved Classification of Landform Patterns
Iwahashi & Pike (2006): Iwahashi landform underlying 1:650k soil map
Terrain Series
  Fine texture, High convexity
  Fine texture, Low convexity
  Coarse texture, High convexity
  Coarse texture, Low convexity
[Figure legend: terrain classes 1 to 16, arranged from steep to gentle]
Source: Reuter, H.I. (unpublished)

Inputs: Key Developments Since 2003 Non-Terrain Attributes Systematic analysis of environmental covariates Detect distances and scales over which each covariate exhibits a strong relationship with a soil or property to be predicted or just with itself Vary window sizes and grid resolutions and compute regressions on derivatives analyse range of variation inherent to each covariate» Functional relationships are dependent on scale Source: Park, 2004

Inputs: Key Developments Since 2003 Non-Terrain Attributes Systematic analysis of scale of environmental covariates Select and use input covariates at the most appropriate scale Explicitly recognize the hierarchical nature of environmental controls on soils Select variables at the scales, resolutions or window sizes with the strongest predictive power for each property or class to be predicted. Source: Park, 2004

Inputs: Key Developments Since 2003 Harmonization of soil profile depth data through spline fitting Source: David Jacquier, 2010

Inputs: Key Developments Since 2003, From Discrete Soil Classes to Continuous Soil Properties
Harmonization of soil profile data through spline fitting
  Modal profile: Clearfield soil series, Wapello County, Iowa (Mukey: 411784, Musym: 230C)
  Fit mass-preserving spline (fitted spline)
  Estimate averages for spline at standardised depth ranges, e.g., GlobalSoilMap depth ranges (spline averages at specified depth ranges)
Source: Sun et al., 2010
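The last step, averaging over standardised depth ranges, can be illustrated with a deliberately simpler stand-in. The sketch below does not fit a mass-preserving spline: it treats each horizon as a constant value (a step function) and takes thickness-weighted averages over the GlobalSoilMap depth ranges. A spline would replace the step function with a smooth curve that preserves the same horizon means. The function name and the example profile are hypothetical.

```python
# GlobalSoilMap standard depth ranges, in cm.
GSM_DEPTHS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]


def resample_profile(horizons, targets=GSM_DEPTHS):
    """Thickness-weighted average of horizon values over each target range.

    horizons: list of (top_cm, bottom_cm, value), assumed non-overlapping.
    Returns one value per target range, or None where no horizon overlaps.
    Step-function stand-in for the spline described in the slides.
    """
    results = []
    for top, bottom in targets:
        weighted_sum = 0.0
        covered = 0.0
        for h_top, h_bottom, value in horizons:
            overlap = min(bottom, h_bottom) - max(top, h_top)
            if overlap > 0:
                weighted_sum += overlap * value
                covered += overlap
        results.append(weighted_sum / covered if covered > 0 else None)
    return results


# Hypothetical profile: organic carbon (%) for three horizons.
profile = [(0, 20, 2.0), (20, 50, 1.0), (50, 120, 0.5)]
for depth_range, value in zip(GSM_DEPTHS, resample_profile(profile)):
    print(depth_range, value)
```

Where a target range is only partly covered by the profile, the average here uses the covered part only; whether to report or mask such values is a design choice the slides do not address.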

Outputs: Key Developments Since 2003, From Classes to Properties
Non-disaggregated soil maps
  Weighted averages by polygon by soil property and depth
  Calling version 0.5
  Source: Hempel et al., 2011
Disaggregated Soil Class Maps
  Estimate soil property values at every grid cell location & depth
  Based on weighted likelihood value of occurrence of each of n soils times property value for that soil at that depth
  Likelihood value can come from various methods
  Source: Sun et al., 2010
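The weighted-likelihood calculation for a disaggregated map can be written out for one grid cell and one depth. This is a minimal sketch: the soil names, likelihoods and clay values are hypothetical, and the likelihoods are normalised inside the function in case they do not sum exactly to one.

```python
def weighted_property(likelihoods, property_by_soil):
    """Likelihood-weighted mean of a soil property at one cell and depth.

    likelihoods:      soil name -> likelihood of occurrence at this cell
    property_by_soil: soil name -> property value for that soil at this depth
    """
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")
    weighted = sum(likelihoods[s] * property_by_soil[s] for s in likelihoods)
    return weighted / total


# Hypothetical cell: three candidate soils and their clay content (%) at one depth.
likelihoods = {"soil_A": 0.6, "soil_B": 0.3, "soil_C": 0.1}
clay_pct = {"soil_A": 20.0, "soil_B": 30.0, "soil_C": 40.0}
print(weighted_property(likelihoods, clay_pct))
```

The same function applies regardless of how the likelihoods were produced (fuzzy similarity, tree probabilities or map unit composition).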

Outputs: Key Developments Since 2003 From Classes to Properties Disaggregated Soil Class Maps Estimate soil property values at every grid cell location Source: Zhu et al., 1997

Recent Models Recent Examples of Predictions of Soil Class Maps

Predicting Area-Class Soil Maps Clovis Grinand, Dominique Arrouays,Bertrand Laroche, and Manuel Pascal Martin. Extrapolating regional soil landscapes from an existing soil map: Sampling intensity, validation procedures, and integration of spatial context. Geoderma 143, 180-190 Source: Grinand et al., 2008

Source: Park et al, 2004 Recent Knowledge-Based Classification In Africa, Multi-scale, Hierarchical Landforms Elevation + Slope + UPA + Catena ( 2 km support) SOTER Soil and landforms (1:1 million 1.5 million

Digital Soil Mapping in England & Wales using Legacy Data
[Workflow diagram. Labels: DEM; TOPAZ, TAPES-G, LandMapR; Training data; Modelling (Netica); Outputs; Point data; Detailed soil maps; Covariates; Expert knowledge; Predicted soil series; Accuracy assessment]
Source: Mayr, 2010

Predicting Area-Class Soil Maps Using Multiple Regression Trees (100 x)
Prepare a database and tables of mapping units & soil series, and covariates
Select 1/n of the points systematically (n=100)
Repeat n times:
Sample soil series randomly from the multinomial distribution of mapping unit composites
Construct decision tree, using See5 (RuleQuest Research, 2009)
Predict soil series at all pixels
Calculate the soil series statistics based on the n predictions for each pixel
Calculate the probability for each soil series
Generate soil series maps
Source: Sun et al., 2010
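The repeat-and-tally structure of this procedure can be sketched in a few lines. This is a simplified illustration, not the authors' workflow: the decision-tree step is replaced by a hypothetical `fit_and_predict` callable (here a trivial majority-class stub rather than See5), the systematic 1/n point selection is omitted, and the map unit names and compositions are invented.

```python
import random
from collections import Counter

def sample_series(composition, rng):
    """Draw one soil series from a map unit's composition (multinomial draw)."""
    names = list(composition)
    weights = [composition[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def ensemble_probabilities(point_units, compositions, fit_and_predict, n=100, seed=0):
    """Repeat sampling and prediction n times; return per-pixel series probabilities."""
    rng = random.Random(seed)
    tallies = None
    for _ in range(n):
        # Assign a sampled soil series to every training point.
        labels = [sample_series(compositions[u], rng) for u in point_units]
        # Hypothetical model step: fit on the labels, predict every pixel.
        predictions = fit_and_predict(labels)
        if tallies is None:
            tallies = [Counter() for _ in predictions]
        for tally, series in zip(tallies, predictions):
            tally[series] += 1
    return [{s: c / n for s, c in t.items()} for t in tallies]

def majority_stub(labels, n_pixels=3):
    """Stand-in for a decision tree: predicts the most common label everywhere."""
    top = Counter(labels).most_common(1)[0][0]
    return [top] * n_pixels

compositions = {"mu1": {"series_a": 0.6, "series_b": 0.4}, "mu2": {"series_b": 1.0}}
probs = ensemble_probabilities(["mu1", "mu1", "mu2"], compositions, majority_stub, n=100)
most_likely = [max(p, key=p.get) for p in probs]
print(probs[0], most_likely[0])
```

The most likely series and its probability per pixel correspond to the two map products the procedure generates.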

Predicting Area-Class Soil Maps Using Multiple Regression Trees (100 x)
[Figure: a closer look at the junction point in the middle of 4 combined maps, (a) the original map units, and (b) the most likely soil series map and its associated probability. The length of the image is approximately 14 km. Legend: monr_comppct, High: 100, Low: 7]
Source: Sun et al., 2010

Recent Models Recent Examples of Predictions of Continuous Soil Property Maps

Continuous Soil Property Maps by Kriging & RK
Comparison of topsoil thickness by four different methods
a) Point locations
b) Soil Map only
c) Ordinary Kriging
d) Plain Regression
e) Regression-kriging
Evidence supports RK
Source: Hengl et al., 2004

Recent Example: Regression-Kriging (scorpan + ε)
Assemble field data: 300 soil point data
Source: Minasny et al., 2010

Recent Example: Regression-Kriging (scorpan + ε) Source: Minasny et al., 2010 Assemble covariates for the predictive model

Recent Example: Regression-Kriging (scorpan + ε)
Perform regression to build a predictive model
Linear model: OC = f(x) + e
Predictors: Elevation, Aspect, Landsat band 6, NDVI, Land-use, Soil-Landscape Unit
Source: Minasny et al., 2010

Recent Example: Regression-Kriging (scorpan + ε)
Predict both property value and standard error over the entire area
Source: Minasny et al., 2010

Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) Fit a variogram to the residuals

Recent Example: Regression-Kriging (scorpan + ε)
Krige the residuals
Source: Minasny et al., 2010

Recent Example: Regression-Kriging (scorpan + ε)
Add interpolated residuals to the prediction from regression
Linear model + Residuals = Final prediction
Source: Minasny et al., 2010
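The regression-kriging steps walked through on these slides (regress, take residuals, krige the residuals, add them back) can be sketched end to end in plain Python. This is a minimal sketch under several assumptions, not the workflow of Minasny et al.: it uses one covariate instead of six, simple kriging with a zero mean for the residuals, and an exponential covariance whose sill and range are supplied by hand rather than fitted to a variogram. All data values are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - b1 * mx, b1

def exp_cov(h, sill, range_):
    """Exponential covariance model (assumed; parameters are not fitted here)."""
    return sill * math.exp(-h / range_)

def regression_kriging(coords, covar, obs, target, target_covar, sill, range_):
    # 1. Regression on the covariate.
    b0, b1 = fit_line(covar, obs)
    # 2. Residuals at the sampled locations.
    resid = [o - (b0 + b1 * c) for o, c in zip(obs, covar)]
    # 3. Simple kriging of the residuals at the target location.
    n = len(coords)
    cov = [[exp_cov(math.dist(coords[i], coords[j]), sill, range_) for j in range(n)]
           for i in range(n)]
    cov0 = [exp_cov(math.dist(p, target), sill, range_) for p in coords]
    weights = solve(cov, cov0)
    kriged = sum(w * r for w, r in zip(weights, resid))
    # 4. Final prediction = regression prediction + interpolated residual.
    return b0 + b1 * target_covar + kriged

# Hypothetical data: four sample points, elevation covariate, organic carbon (%).
coords = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
elevation = [100.0, 120.0, 110.0, 150.0]
oc = [2.0, 1.5, 1.8, 1.0]
print(regression_kriging(coords, elevation, oc, (5.0, 5.0), 115.0, sill=0.05, range_=15.0))
```

With no nugget term the kriged residual reproduces the observed residual at sampled locations, so the final prediction honours the data there; far from any sample the residual term fades and the prediction falls back to the regression alone.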

Recent Example: Regression-Kriging (scorpan + ε)
Add regression variance and kriging variance to get total variance
$(\text{Std. err. of regression})^2 + (\text{Std. err. of kriging})^2 = \text{Total variance}$; the mapped total standard error is $(\text{Total variance})^{1/2}$
Source: Minasny et al., 2010
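The variance combination on this slide is a sum of squares followed by a square root, applied cell by cell. A minimal sketch with hypothetical standard-error values; summing the variances in this way treats the regression and kriging errors as independent:

```python
import math

def total_std_error(se_regression, se_kriging):
    """Square root of (regression variance + kriging variance)."""
    return math.sqrt(se_regression ** 2 + se_kriging ** 2)

# Hypothetical standard errors for three grid cells.
se_reg = [3.0, 0.6, 1.0]
se_krig = [4.0, 0.8, 0.0]
total = [total_std_error(r, k) for r, k in zip(se_reg, se_krig)]
print(total)
```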

Recent Example: Regression-Kriging
Regression model: $C = 100 - 1.2\,EC - 5.2\,REF - 0.6\,REF^2 - 2.1\,EL$
[Workflow diagram. Labels: C predicted for sampled locations; Regression model; Residuals; C predicted for all grid locations; Kriging; Final C map (legend in Mg C/ha, 15 to 95)]
Summary statistics: Mean 64.0, Min 27.0, Max 87.9, CV% 18.4, RMSE 9.8, RI (%) 19.7

Source: Mayr et al., 2010 Continuous Soil Property Maps by Hybrid Bayesian Analysis

Future Trends Personal View of Likely Future DSM Development (Post 2012)

Possibility to move from single snapshot mapping of static soil properties to continuous update and improvement of maps of both static and dynamic properties within a structured and consistent framework.

The Future: Let's Go Back and Talk About the Universal Model of Variation Again
$Z(s) = Z^*(s) + \varepsilon(s) + \varepsilon$
Lots of things qualify as regression!
Deterministic part of the predictive model
Regression just means minimizing variance
Stochastic part of the predictive model
What is all this talk about optimization?
Source: Heuvelink et al., 2004

The Future: Maybe Progress Towards True Regression will be Stepwise
$Z(s) = Z^*(s) + \varepsilon(s) + \varepsilon$
Lots of things qualify as regression!
Regression depends on having enough point data
Source: Zhu et al., 2010

The Future: A Conceptual Framework for GSIF
A Global Soil Information Facility
Collaborative and open collection, input and sharing of geo-registered field evidence (Open Soil Profiles)
Collaborative and open production, assembly and sharing of covariate data (World Grids)
Collaborative and open modelling on an interactive, web-based, server-side platform
Everything is accessible, transparent and repeatable
Maps we can all contribute to, access, use, modify and update, continuously and transparently
Source: Hengl et al., 2011

The Future: Functionality for GSIF
A Global Soil Information Facility
Possibility of making use of existing legacy soil maps (even new soil maps) needed for soil prediction anywhere
Possibility to assess error and correct for it everywhere
Possibility of rescuing, sharing, harmonizing and archiving soil profile point data needed for soil prediction anywhere
Possibility to develop and use global models (even for local mapping)
Possibility to develop and use multi-scale and multi-resolution hierarchical models
Source: Hengl et al., 2011

Source: Hengl et al., 2011 The Future: Conceptual Framework for GSIF Open Soil Profiles

Source: Hengl et al., 2011 The Future: Conceptual Framework for GSIF World Grids

The Future: Collaborative Global, Multi-Scale Mapping through GSIF
Possibility for combining Top-Down and Bottom-up mapping through weighted averaging of 2 or more sets of predictions
Possibility to develop and use global models (even for local mapping)
Source: Hengl et al., 2011
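Combining top-down and bottom-up maps by weighted averaging can be sketched as a cell-by-cell merge. The slide does not say how the weights are chosen, so they are plain inputs here; the prediction values and weights are hypothetical.

```python
def merge_predictions(predictions, weights):
    """Weighted average, cell by cell, of two or more sets of predictions.

    predictions: list of equal-length lists, one per model (e.g. global, local)
    weights:     one non-negative weight per model
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per set of predictions")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_cells = len(predictions[0])
    return [
        sum(w * p[i] for p, w in zip(predictions, weights)) / total
        for i in range(n_cells)
    ]

# Hypothetical soil property predictions for two grid cells.
global_model = [10.0, 20.0]   # top-down
local_model = [20.0, 40.0]    # bottom-up
merged = merge_predictions([global_model, local_model], weights=[1.0, 3.0])
print(merged)
```

One common choice, offered here only as a possibility, is to weight each model by the inverse of its prediction variance so that the more certain map dominates locally.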

Source: Hengl et al., 2011 The Future: Global, Multi-Scale Modeling of Soil Properties through GSIF Possibility to develop and use multi-scale and multi-resolution hierarchical models Possibility to develop and use global models (even for local mapping)

Source: Hengl et al., 2011 The Future: Global, Multi-Scale Modeling of Soil Properties through GSIF Global Models inform and improve local mapping

Source: Hengl et al., 2011 The Future: Functionality for GSIF A Global Soil Information Facility Anyone can access and display the maps

The Future: Functionality for GSIF A Global Soil Information Facility With Google Earth everyone has a GIS to view free soil maps and data Slide credit: Tom Hengl, 2011 Source: Hengl et al., 2011

The Future: Collaborative Global, Multi-Scale Mapping through GSIF
A Global Collaboratory!
Working together we can map the world one tile at a time!
The next generation of soil surveyors is everyone!
Source: Hengl et al., 2011