History and Evolution of Digital (Predictive) Soil Mapping R. A. MacMillan LandMapper Environmental Solutions Inc.
Outline Unifying DSM Framework: Universal Model of Variation Z(s) = Z*(s) + ε(s) + ε Past: Early History of Development of DSM (pre 2003) Theory, Concepts, Models, Software, Inputs, Developments Examples of early methods and outputs Key Recent Developments in DSM post 2003 Theory, Concepts, Models, Software, Inputs, Developments Examples of recent methods and outputs Future Trends: How do I See DSM Developing? Theory, Concepts, Inputs, Models, Software, Developments From Static Maps to Dynamic Real-Time Models Discussion and Conclusions Constraints and pitfalls to be avoided, technical/political
Introduction Universal Model of Soil Variation A Unifying Framework for DSM
Source: Burrough, 1986 eq. 8.14 Universal Model of Soil Variation A Unifying Framework for Digital Soil Mapping Z(s) = Z*(s) + ε(s) + ε
Z(s): Predicted soil type or soil property value. Predicted spatial pattern of some soil property or class, including uncertainty of the estimate.
Z*(s): Deterministic part of the predictive model. The part of the variation that is predictable by means of some statistical or heuristic soil-landscape model.
ε(s): Stochastic part of the predictive model. The part of the variation that shows spatial structure and can be modelled with a variogram.
ε: Pure noise part of the predictive model. The part of the variation that can't be predicted at the current scale with the available data and models.
Deterministic Part of Prediction Model: Z*(s) Conceptual Models Conceptual or mental soil-landscape models Produce area-class maps Statistical Models Scorpan relate soils/soil properties to covariates Explain spatial distribution of soils in terms of known soil forming factors as represented by covariates [Figure: soil series EOR, DYD, KLM, FMN and COR] [Figure: salinity hazard map on a 100 x 100 m grid; individual salinity hazard ratings for each layer are multiplied by layer weightings and summed to a total salinity hazard rating; layers shown: landscape curvature, vegetation, rainfall, geology, soils, land surface]
Stochastic Part of Prediction Model: ε(s) Geostatistical Estimation Predict soil properties Point or block kriging Predict soil classes Indicator kriging Predict error of estimate Correct Deterministic Part Error in deterministic part is computed (residuals) If structure exists in error then krige error & subtract
Pure Noise Part of Prediction Model: ε Some Variation not Predictable Have to be honest about this Should quantify and report it Deterministic Prediction Mental and Statistical Models Not perfect; often lack suitable covariates to predict target variable Lack covariates at finer resolution Geostatistical Prediction Insufficient point input data Can't predict at less than the smallest spacing of input point data [Figure: semi-variogram of semi-variance against lag (distance), showing nugget, sill and range]
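The nugget, sill and range named on this slide can be made concrete with a variogram model. The sketch below assumes a spherical model, which the slide does not name; the function and parameter names are illustrative only. The nugget is the semi-variance that remains at the shortest lags, which is one way to quantify the part of the variation that cannot be predicted from the available point data.

```python
def spherical_variogram(h, nugget, sill, range_dist):
    """Semi-variance at lag distance h under an assumed spherical model.

    nugget:     semi-variance just above zero lag (unpredictable part)
    sill:       total semi-variance reached at the range
    range_dist: lag distance beyond which observations are uncorrelated
    """
    if h <= 0:
        return 0.0  # an observation compared with itself
    if h >= range_dist:
        return sill
    ratio = h / range_dist
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of the curve.
    for lag in (0, 50, 100, 200, 400):
        print(lag, spherical_variogram(lag, nugget=10.0, sill=100.0, range_dist=200.0))
```

A large nugget relative to the sill suggests that much of the variation occurs at distances shorter than the smallest sample spacing.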
Past Early History of DSM Development (pre 2003) On Digital Soil Mapping McBratney et al., 2003
Early History of Development of DSM Deterministic Stochastic Soil Classes Soil Properties Soil Classes Soil Properties
Past Theory: Deterministic Component Z*(s) Classed Conceptual Models Jenny (1941) CLORPT (Note no N=space) Simonson (1959) Process Model of additions, removals, translocations, transformations Ruhe (1975) Erosional -Depositional surfaces, open/closed basins Dalrymple et al., (1968) Nine unit hill slope model Milne (1936a, 1936b) Catena concept, toposequences
Source: Lin, 2005 Frontiers in Soil Science http://www7.nationalacademies.org/soilfrontiers/ Past Concepts: Deterministic Component Z*(s) Classed Conceptual Models Soil = f(C, O, R, P, T, ...) Climate Topography Organisms Soil Parent Material Time
http://solim.geography.wisc.edu/index.htm
Past Models: Deterministic Component Z*(s) Classed Statistical Predictions Fuzzy Inference Zhu, 1997, Zhu et al., 1996 MacMillan et al., 2000, 2005 Neural Networks Zhu, 2000 Expert Knowledge (Bayesian) Skidmore et al., 1991 Cook et al., 1996, Corner et al., 1997 Regression Trees Moran and Bui, 2002, Bui and Moran, 2003 [Figure: salinity hazard map on a 100 x 100 m grid; individual salinity hazard ratings for each layer are multiplied by layer weightings and summed to a total salinity hazard rating; layers shown: landscape curvature, vegetation, rainfall, geology, soils, land surface] Source: Jones et al., 2000
Past Software: Deterministic Component Z*(s) Classed Statistical Predictions Regression Trees CUBIST Rulequest Research, 2000 CART Breiman et al., 1984 C4.5 & See5 Quinlan, 1992 JMP (SAS) http://www.jmp.com/ R http://www.r-project.org/ Fuzzy Logic SoLIM Zhu et al., 1996, 1997 LandMapR, FuzME Bayesian Logic Prospector Duda et al., 1978 Expector Skidmore et al., 1991 Netica Norsys.com/netica
Past Inputs: Deterministic Component Z*(s) Classed Statistical Predictions C = Climate Temp, Ppt, ET, Solar Rad Mean, min, max, variance Annual, monthly, indices O = Organisms Manual Maps Land Use Vegetation Remotely Sensed Imagery Classified RS imagery NDVI, EVI, other ratios R = Relief (topography) Primary Attributes Slope, aspect, curvatures Slope Position, roughness Secondary Attributes CTI, WI, SPI, STC P = Parent Material Published geology maps Gamma radiometrics Thermal IR, RS Ratios A = Age
Past Inputs: Deterministic Component Z*(s) Classed Statistical Predictions Common Topo Inputs Profile Curvature Plan (Contour) Curvature Slope Gradient (& Aspect) CTI or Wetness Index Sometimes, not always Less Common Topo Inputs Surface Roughness Relief within a window Relief relative to drainage (pit, peak, ridge, channel) [Figure panels: Profile Curvature, Slope Gradient, Pit 2 Peak Relief, Plan Curvature, Wetness Index, Divide 2 Channel] Source: MacMillan, 2005
Past Inputs: Non-DEM Airborne Radiometrics Radiometrics 4 Subsurface Infer Parent Material Source: Mayr, 2005
Past Inputs: Non-DEM Satellite Imagery Grassland Land Cover Types Alpine Land Cover Types
Past Models: Deterministic Component Z*(s) Examples of Predictions of Soil Class Maps
Approaches to Producing Predictive Area-Class Maps
Knowledge-Based Classification In SoLIM Source: Zhu, SoLIM Handbook
Source: Thompson et al., 2010 WCSS Knowledge-Based Classification Using Boolean Decision Tree in USA Component Soils Gilpin Pineville Laidig Guyandotte Dekalb Craigsville Meckesville Cateache Shouns
Knowledge-Based Classification In LandMapR Source: Steen and Coupé, 1997 Source: MacMillan, 2005
Knowledge-Based Classification In Utah, Knowledge-Based PURC Approach Note: Not simple slope elements but complex patterns Source: Cole and Boettinger, 2004
Approaches to Producing Predictive Area-Class Maps
Source: Zhou et al., 2004, JZUS Supervised Classification Using Regression Trees Note similarity of supervised rules and classes to typical soil-landform conceptual classes Note numeric estimate of likelihood of occurrence of classes
Source: Zhou et al., 2004, JZUS Supervised Classification Using Bayesian Analysis of Evidence/Classification Trees
Predicting Area-Class Soil Maps Using Discriminant Analysis Source: Scull et al., 2005, Ecological Modelling
Predicting Area-Class Soil Maps Using Regression Trees Extrapolation Uncertainty of prediction Bui and Moran (2003) Geoderma 111:21-44 Source: Bui and Moran., 2003
Supervised Classification Using Fuzzy Logic Shi et al., 2004 Used multiple cases of reference sites Each site was used to establish fuzzy similarity of unclassified locations to reference sites Used Fuzzy-minimum function to compute fuzzy similarity Harden class using largest (Fuzzy-maximum) value Considered distance to each reference site in computing Fuzzy-similarity Fuzzy likelihood of being a broad ridge Source: Shi et al., 2004
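The fuzzy-minimum and fuzzy-maximum steps described on this slide can be sketched as follows. This is a minimal illustration, not the implementation of Shi et al. (2004): the Gaussian-shaped similarity curve, the `widths` parameter, the covariate names and the rule of taking the best-matching site per class are all assumptions, and the distance-to-site weighting mentioned on the slide is left out.

```python
import math


def site_similarity(cell, site, widths):
    """Fuzzy-minimum similarity between one grid cell and one reference site.

    cell, site: dicts of covariate name -> value
    widths:     dict of covariate name -> assumed tolerance (curve width)
    """
    similarities = []
    for name, width in widths.items():
        diff = (cell[name] - site[name]) / width
        # Assumed bell-shaped curve: 1.0 when identical, falling toward 0.
        similarities.append(math.exp(-0.5 * diff * diff))
    # Fuzzy minimum: the least similar covariate limits the overall similarity.
    return min(similarities)


def classify_cell(cell, reference_sites, widths):
    """Return (hardened class, memberships) for one unclassified cell.

    reference_sites: list of (class name, covariate dict); a class may
    have several reference sites (multiple cases).
    """
    memberships = {}
    for class_name, site in reference_sites:
        similarity = site_similarity(cell, site, widths)
        # Assumption: a class takes the similarity of its best-matching site.
        memberships[class_name] = max(similarity, memberships.get(class_name, 0.0))
    # Harden using the largest (fuzzy-maximum) membership value.
    hardened = max(memberships, key=memberships.get)
    return hardened, memberships


if __name__ == "__main__":
    # Hypothetical covariates and reference sites.
    widths = {"slope": 5.0, "curvature": 0.5}
    sites = [
        ("broad ridge", {"slope": 2.0, "curvature": 0.3}),
        ("broad ridge", {"slope": 4.0, "curvature": 0.5}),
        ("footslope", {"slope": 8.0, "curvature": -0.4}),
    ]
    print(classify_cell({"slope": 3.0, "curvature": 0.4}, sites, widths))
```

The memberships dictionary holds the fuzzy likelihood for every class, so a map such as "fuzzy likelihood of being a broad ridge" is simply one entry of it evaluated at every cell.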
Approaches to Producing Predictive Area-Class Maps
Credit: J. Balkovič & G. Čemanová Concept of Fuzzy K-means Clustering Source: Sobocká et al., 2003
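The concept of fuzzy k-means can be sketched in a few lines: every location receives a partial membership in every class, based on its distance in covariate space to each class centroid. The code below uses the standard fuzzy k-means (fuzzy c-means) update equations; the fuzziness exponent of 2.0, the fixed iteration count and the toy data are assumptions for illustration.

```python
import math


def fuzzy_memberships(point, centroids, fuzziness=2.0):
    """Membership of one point in each class; memberships sum to 1."""
    dists = [math.dist(point, c) for c in centroids]
    if 0.0 in dists:
        # The point sits exactly on a centroid: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (fuzziness - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [value / total for value in inverse]


def fuzzy_kmeans(points, centroids, fuzziness=2.0, iterations=20):
    """Alternate membership and centroid updates for a fixed number of passes."""
    dims = len(points[0])
    for _ in range(iterations):
        u = [fuzzy_memberships(p, centroids, fuzziness) for p in points]
        updated = []
        for i in range(len(centroids)):
            # Centroid = mean of all points weighted by membership ** fuzziness.
            weights = [row[i] ** fuzziness for row in u]
            total = sum(weights)
            updated.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)
            ))
        centroids = updated
    return centroids, [fuzzy_memberships(p, centroids, fuzziness) for p in points]


if __name__ == "__main__":
    # Hypothetical two-covariate data forming two loose groups.
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centres, memberships = fuzzy_kmeans(data, [(0, 0), (10, 10)])
    print(centres)
    print(memberships[0])
```

Hardening each location to its highest membership gives a conventional area-class map, while the memberships themselves express how typical each location is of each class.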
Example of Application of Fuzzy K-means Unsupervised Classification From: Burrough et al., 2001, Landscape Ecology Note similarity of unsupervised classes to conceptual classes
Example of Application of Disaggregation of a Soil Map by Clustering into Components Source: Faine, 2001
Developments: Deterministic Component Z*(s) Classed Predictive Maps in Past Characteristics of Models Models largely ignored ε Seldom estimate error Rarely correct for error Mainly use DEM inputs Initially 3x3 windows Slope, aspect, curvatures Maybe wetness index Later improvements were measures of slope position Rarely use ancillary data Exceptions Bui, Scull, Zhu Operate at single scale Characteristics of Models Many use expert knowledge Data mining is the exception Training data seldom used Specialty software prevails Software for DEM analysis SoLIM, TAPES-G, TOPAZ, TOPOG, TAS, LandMapR, SAGA, IDRISI, 3dMapper Software for extracting rules Expector, Netica, CART, See5, Cubist, Prospector Software for applying rules ESRI, SoLIM, SIE, SAGA
Past Models: Deterministic Component Z*(s) for Continuous Soil Properties Approaches Aimed at Predicting Continuous Soil Properties
Past Concepts: Deterministic Component Z*(s) Continuous Soil Properties Same Theory-Concepts as for Classed Maps Soil = f(C, O, R, P, T, ...) Except theory applied to individual soil properties Initially referred to as environmental correlation Soil properties related to Landscape attributes Climate variables Geology, lithology, soil pm Key Papers Moore et al., 1993 Linear regression McSweeney et al., 1994 McKenzie & Austin, 1993 Gessler et al., 1995 GLMs in S-Plus McKenzie & Ryan, 1999 Regression Trees
Past Models: Deterministic Component Z*(s) Continuous Soil Properties Regression Trees McKenzie & Ryan, 1998, Odeh et al., 1994 Fuzzy Logic-Neural Networks Zhu, 1997 Bayesian Expert Knowledge Skidmore et al., 1996 Cook et al., 1996, Corner et al., 1997 GLMs General Linear Models McKenzie & Austin, 1993 Gessler et al., 1995 Source: McKenzie and Ryan, 1998
Past Inputs: Deterministic Component Z*(s) for Continuous Soil Properties Similar to Classed Maps But: Many innovations originated with continuous modelers Increased use of non-DEM attributes climate, radiometrics, imagery Improved DEM derivatives Wetness Index & CTI Upslope means for slope, etc. Inverted DEMs to compute: down slope dispersal, down slope means, new slope position data Source: McKenzie and Ryan, 1998
Past Models: Deterministic Component Z*(s) for Continuous Soil Properties Examples of Predictions of Soil Property Maps
Past Models: Deterministic Component Z*(s) Continuous Maps Aandahl, 1948 (Note Date!) Regression model Predicted Average Nitrogen (3-24 inch) Total Nitrogen by depth Total Organic Carbon by depth interval Depth of profile to loess Predictor (covariate) Slope position as expressed by length of slope from shoulder Lost in the depths of time Source: Aandahl, 1948
Past Models: Deterministic Component Z*(s) for Continuous Soil Properties Moore et al., 1993 Seminal paper Focus on topography Small sites Other covariates were assumed constant Got people thinking About quantifying environmental correlation, especially soil-topography relationships Source: Moore et al, 1993
Source: McKenzie and Ryan, 1998 Past Models: Deterministic Component Z*(s) for Continuous Soil Properties McKenzie & Ryan, 1998 Regression Tree: Soil Depth
Source: McKenzie and Ryan, 1998 Past Models: Deterministic Component Z*(s) for Continuous Soil Properties Gessler et al., 1995 GLMs Largely based Topo CTI Others held Steady Source: Gessler, 2005
Credit: Minasny & McBratney Past Models: Deterministic Component Z*(s) for Continuous Soil Properties [Figure: regression tree with splits on texture class (C versus S, LS, L, CL, LiC), bulk density (BD < 1.43 versus BD > 1.43, BD < 1.42 versus BD > 1.42) and clay content (Clay < 46.5 versus Clay > 46.5)] Source: Minasny and McBratney
Developments: Deterministic Component Z*(s) Predictive Maps up to 2003 Main Developments Better DEM derivatives More and better measures of landform position or context (Qin et al., 2012) Some recognition of scale and resolution effects Different window sizes Different grid resolutions More non-dem inputs Increased use of imagery New surrogates for PM Main Developments Integration of single models into multi-purpose software ArcGIS, ArcSIE, ArcView SAGA, Whitebox, IDRISI Improved processing ability Bigger files, faster processing Emergence of 2 main scales Hillslope elements (series) Quite similar across models Landscape patterns (domains) Similar to associations
Early History of Development of DSM Deterministic Stochastic Soil Classes Soil Properties Soil Classes Soil Properties
Past Theory: Stochastic Component ε(s) Waldo Tobler (1970) First law of geography Everything is related to everything else, but near things are more related than distant things Matheron (1971) Theory of regionalized variables Webster and Cuanalo (1975) clay, silt, pH, CaCO3, colour value, and stoniness on transect Burgess and Webster (1980a, b) Soil Property maps by kriging Universal kriging (drift) of EC
Source: Oliver, 1989 Past Models: Stochastic Component ε(s) Universal Model of Variation Matheron (1971) Burgess and Webster (1980 ab) Webster and Burrough (1980) Burrough (1986) Webster and McBratney (1987) Oliver (1989)
Past Models: Stochastic Component ε(s) Optimal Interpolation by Kriging Collect point sample observations Irregular spatial distribution (of observed point values) Compute semi-variance at different lag distances Fit Semi-variogram to lag data Estimate values and error at fixed grid locations [Figure: irregularly distributed point observations in x, y and the resulting regular grid of kriged estimates]
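The step "compute semi-variance at different lag distances" can be sketched as below. The code uses the usual estimator, half the mean squared difference between all pairs of observations whose separation falls in a lag bin; the function name, the binning by equal lag width and the example transect are assumptions for illustration.

```python
import math
from collections import defaultdict


def empirical_semivariogram(samples, lag_width, n_lags):
    """Semi-variance per lag bin from (x, y, value) point observations.

    Returns a list of (bin centre distance, semi-variance, pair count)
    for every bin that contains at least one pair.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, yj, zj = samples[j]
            separation = math.hypot(xi - xj, yi - yj)
            lag_bin = int(separation // lag_width)
            if lag_bin < n_lags:
                sums[lag_bin] += (zi - zj) ** 2
                counts[lag_bin] += 1
    return [
        ((k + 0.5) * lag_width, sums[k] / (2 * counts[k]), counts[k])
        for k in sorted(counts)
    ]


if __name__ == "__main__":
    # Hypothetical transect with a steady trend in the measured property.
    transect = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 3.0)]
    for centre, gamma, pairs in empirical_semivariogram(transect, 1.0, 3):
        print(centre, gamma, pairs)
```

A variogram model is then fitted to these binned values before any kriging is done.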
Past Software: Stochastic Component ε(s) Earlier Stand Alone Pc-Geostat (PC-Raster) Early version of GSTAT VESPER Variogram estimation and spatial prediction with error Minasny et al., 2005 http://sydney.edu.au/agriculture/pal/software/vesper GEO-EAS (DOS, 1991) http://www.epa.gov/ada/csmos/models/geoeas.html Later More Integrated GSTAT Pebesma and Wesseling, 1998 Incorporated into IDRISI Now incorporated into R and S-Plus packages Pebesma, 2004 http://www.gstat.org/index.html ArcGIS Geostatistical Analyst SGeMS (Stanford Univ) http://sgems.sourceforge.net/
Past Inputs: Stochastic Component ε(s) Essentially Just x,y,z Values at Point Locations
1. Start with set of soil property values irregularly distributed in x,y Cartesian space
2. Locate the regularly spaced grid nodes where predicted soil property values are to be calculated
3. Locate the n soil property data points within a search window around the current grid cell for which a value is to be calculated
4. Compute a new value for each location as the weighted average of the n neighboring soil property values, with weights established by the semi-variogram
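Steps 3 and 4 can be sketched as an ordinary kriging estimate at a single grid node. This is a minimal sketch under stated assumptions: the exponential variogram model and its parameters are illustrative, the search window is simplified to the n nearest points, and all function names are hypothetical. Only the Python standard library is used, so the small linear system is solved by hand.

```python
import math


def exponential_variogram(h, sill=1.0, range_param=100.0):
    """Assumed variogram model (no nugget): semi-variance at lag distance h."""
    return sill * (1.0 - math.exp(-h / range_param))


def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ordinary_kriging(samples, target, variogram, n_neighbours=8):
    """Return (estimate, kriging variance) at target = (x, y).

    samples: list of (x, y, value) soil property observations.
    """
    # Step 3: keep the n nearest observations (a simple search window).
    nearest = sorted(samples, key=lambda s: math.dist(s[:2], target))[:n_neighbours]
    n = len(nearest)
    # Kriging system: semi-variances between observations, plus the
    # constraint that weights sum to 1 (via a Lagrange multiplier).
    a = [[variogram(math.dist(p[:2], q[:2])) for q in nearest] + [1.0] for p in nearest]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p[:2], target)) for p in nearest] + [1.0]
    solution = solve_linear_system(a, b)
    weights, multiplier = solution[:n], solution[n]
    # Step 4: weighted average of the neighbouring soil property values.
    estimate = sum(w * p[2] for w, p in zip(weights, nearest))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + multiplier
    return estimate, variance


if __name__ == "__main__":
    # Hypothetical observations of a soil property at three points.
    points = [(0.0, 0.0, 4.0), (2.0, 0.0, 6.0), (0.0, 3.0, 5.0)]
    model = lambda h: exponential_variogram(h, sill=1.0, range_param=2.0)
    print(ordinary_kriging(points, (1.0, 1.0), model))
```

Looping this over every grid node from step 2 yields both the predicted map and a map of the kriging variance.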
Past Models: Stochastic Component ε(s) for Continuous Soil Properties Examples of Predictions of Soil Property Maps by Kriging
Continuous Soil Property Maps by Kriging Very Early Alberta Example Lacombe Research Station Sampled soils on a 50 m grid Sand, Silt, Clay, pH, OC, EC, others 3 depths (0-15, 15-50, 50-100) Used custom written software Compute variograms Interpolate using the variograms Only visualised as contour maps Only got 3D drapes in 1988 Used PC-Raster to drape Saw strong soil-landscape pattern Source: MacMillan, 1985 unpublished [Figure: semi-variogram for A-horizon % sand, Lacombe site (1985); semi-variance against lag (1 lag = 30 m)]
Continuous Soil Property Maps by Kriging Source: http://sydney.edu.au/agriculture/pal/software/vesper.shtml
Continuous Soil Property Maps by Kriging Yasribi et al., 2009 Simple ordinary kriging of soil properties (OK) No co-kriging No regression prediction Relies on presence of Sufficient point samples Spatial structure over distances longer than the smallest sampling interval Source: Yasribi et al., 2009
Continuous Soil Property Maps by Kriging Shi, 2009 Comparison of pH by four different methods a) HASM b) Kriging c) IDW d) Splines Source: Yasribi et al., 2009
Developments: Stochastic Component ε(s) Predictive Maps up to 2003 Main Developments Theory Becomes better understood and accepted Concepts Regression-kriging evolves to include a separate part for regression prediction Models Understanding and use of universal model grows Directional, local variograms Main Developments Software From stand alone and single purpose to integrated software Improvements in Visualization Capacity to process large data sets Automated variogram fitting Ease of use Inputs Developments in sampling designs and sampling theory
Present and Recent Past Key Developments in DSM Since 2003 (2003-2012) On Digital Soil Mapping McBratney et al., 2003
Developments in DSM Since 2003 Increasing Convergence and Interplay Deterministic Stochastic Soil Classes Soil Properties Soil Classes Soil Properties Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation
Theory: Key Developments Since 2003 Deterministic Part Pretty much unchanged But Still based on attempting to elucidate quantitative relationships between soils & environmental covariates Scorpan elaboration highlights importance of the spatial component (n) and of spatially correlated error ε(s) Stochastic Part Same underlying theory But Still based on theory of regionalized variables Increasing realization that the structural part of variation (non-stationary mean or drift) can be better modelled by a deterministic function than by purely spatial calculations
Concepts: Key Developments Since 2003 Deterministic Part Scorpan Model Explicitly recognizes soil data (s) as a potential input to predict other soil data Soil inputs can include soil maps, point observations, even expert knowledge Explicitly recognizes space (n) or location as a factor in predicting soil data Space as in x,y location Space as in context, kriging Factors as predictors Factors explicitly seen as quantitative predictors in prediction function Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation
Concepts: Key Developments Since 2003 Stochastic Part Emergence of Regression Kriging (RK) Key difference to ordinary kriging is that it is no longer assumed that the mean of a variable is constant Local variation or drift can be modelled by some deterministic function Local regression lowers error, improves predictions Local regression function can even be a soil map Source: Heuvelink, personal communication
Models: Key Developments Since 2003 Deterministic Part Improvements in Data Mining and Knowledge Extraction Supervised Classification Training data obtained from both points and maps (sample maps at points) Ensemble or multiple realization models (100 x): boosting, bagging; Random Forests; ANN, Regression tree Deterministic Part Improvements in Data Mining and Knowledge Extraction Expert Knowledge Extraction Bayesian Analysis of Evidence Prototype Category Theory Fuzzy Neural Networks Tools for Manual Extraction of Fuzzy Expert Knowledge (ArcSIE, SoLIM) Unsupervised classification Fuzzy k-means, c-means
Models: Key Developments Since 2003 Stochastic Part Regression Kriging Recognized as equivalent to universal kriging or kriging with external drift Use of external knowledge and maps made easier Incorporation of soft data Made more accessible through implementation in commercial (ESRI) and open source software (R) Stochastic Part Regression Kriging Odeh et al., 1995 McBratney et al., 2003 Hengl et al., 2003, 2004, 2007 Heuvelink, 2006 Hengl how-to books http://spatialanalyst.net/book/ http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf
Software: Key Developments Since 2003 Commercial Software JMP (SAS) (McBratney) http://www.jmp.com/ S-Plus, Matlab, Used by soil researchers See5, CUBIST, CART Regression Trees Netica (Bayesian) Norsys.com/netica Improvements Better visualization Better interfaces Non-commercial Software Fuzzy Logic SoLIM Zhu et al., 1996, 1997 ArcSIE Shi, FuzME Bayesian Logic Full Range of Options R http://www.r-project.org Regression Kriging Random Forests Regression Trees GLMs GSTAT (in R)
Source: Schmidt and Andrew, 2005 Inputs: Key Developments Since 2003 Terrain Attributes More and better measures Primarily contextual and related to landform position Real advances related to Multi-scale analysis varying window size and grid resolution Window-based and flow-based hill slope context Systematic examination of relationships of properties and processes to scale Source: Smith et al., 2006
Inputs: Key Developments Since 2003 Terrain Attributes Multi-scale analysis Varying window size and grid resolution Identifies that some variables are more useful when computed over larger windows or coarser grids Finer resolution grids not always needed or better Drop off in predictive power of DEMs after about 30-50 m grid resolution Source: Deng et al., 2007
MrVBF: Multi-scale DEM Analysis Smooth and subsample Source: Gallant, 2012 [Figure: panels of flatness, bottomness and valley bottom flatness at Original: 25 m, Generalised: 75 m and Generalised: 675 m]
Multiple Resolution Landform Position MrVBF Example Outputs Broader Scale 9 DEM MRVBF for 25 m DEM Source: Gallant, 2012
Developments: Improved Measures of Landform Position SAGA-RHSP: relative hydrologic slope position SAGA-ABC: altitude above channel Source: C. Bulmer, unpublished Calculation based on: MacMillan, 2005 Source: C. Bulmer, unpublished
Developments: Improved Measures of Landform Position TOPHAT Schmidt and Hewitt (2004) Slope Position Hatfield (1996) Source: Schmidt & Hewitt, (2004) Source: Hatfield (1996)
Developments: Improved Measures of Landform Position - Scilands Source: Rüdiger Köthe, 2012
Measures of Relative Slope Length (L) Computed by LandMapR Percent L Pit to Peak Percent L Channel to Divide MEASURE OF REGIONAL CONTEXT MEASURE OF LOCAL CONTEXT Source: MacMillan, 2005 Image Data Copyright the Province of British Columbia, 2003
Measures of Relative Slope Position Computed by LandMapR Percent Diffuse Upslope Area Percent Z Channel to Divide SENSITIVE TO HOLLOWS & DRAWS RELATIVE TO MAIN STREAM CHANNELS Image Data Copyright the Province of British Columbia, 2003 Source: MacMillan, 2005
Source: Reuter, H.I. (unpublished) Developments: Improved Classification of Landform Patterns Iwahashi & Pike (2006) Iwahashi landform underlying 1:650k soil map Terrain Series Fine texture, High convexity Fine texture, Low convexity Coarse texture, High convexity Coarse texture, Low convexity Terrain Classes 1 5 9 13 3 7 11 15 2 6 10 14 4 8 12 16 steep gentle
Inputs: Key Developments Since 2003 Non-Terrain Attributes Systematic analysis of environmental covariates Detect distances and scales over which each covariate exhibits a strong relationship with a soil or property to be predicted or just with itself Vary window sizes and grid resolutions and compute regressions on derivatives Analyse range of variation inherent to each covariate: functional relationships are dependent on scale Source: Park, 2004
Inputs: Key Developments Since 2003 Non-Terrain Attributes Systematic analysis of scale of environmental covariates Select and use input covariates at the most appropriate scale Explicitly recognize the hierarchical nature of environmental controls on soils Select variables at the scales, resolutions or window sizes with the strongest predictive power for each property or class to be predicted. Source: Park, 2004
Inputs: Key Developments Since 2003 Harmonization of soil profile depth data through spline fitting Source: David Jacquier, 2010
Inputs: Key Developments Since 2003 From discrete soil classes to continuous soil properties Clearfield soil series Wapello County, Iowa Mukey: 411784 Musym: 230C Harmonization of soil profile data through spline fitting Modal profile Source: Sun et al., (2010) Fit mass-preserving spline Fitted Spline Estimate averages for spline at standardised depth ranges, e.g., GlobalSoilMap depth ranges Spline averages at specified depth ranges
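The final step, averaging over standardised depth ranges, can be sketched without the spline itself. The code below is a deliberate simplification: it treats each horizon value as constant with depth (a step function) and takes thickness-weighted averages, whereas the mass-preserving spline on this slide fits a smooth curve through the horizons before averaging. The function name and example profile are hypothetical; the depth ranges are the six standard GlobalSoilMap intervals.

```python
# Standard GlobalSoilMap depth ranges in cm.
GSM_DEPTHS_CM = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]


def harmonize_profile(horizons, depth_ranges=GSM_DEPTHS_CM):
    """Thickness-weighted averages of horizon values per standard depth range.

    horizons: list of (top_cm, bottom_cm, value) for one soil profile.
    Returns one value per depth range, or None where the profile has no data.
    This is a step-function stand-in for a fitted spline, not the spline itself.
    """
    averages = []
    for top, bottom in depth_ranges:
        weighted_sum = 0.0
        covered = 0.0
        for h_top, h_bottom, value in horizons:
            overlap = min(bottom, h_bottom) - max(top, h_top)
            if overlap > 0:
                weighted_sum += overlap * value
                covered += overlap
        averages.append(weighted_sum / covered if covered > 0 else None)
    return averages


if __name__ == "__main__":
    # Hypothetical profile: organic carbon (%) for three horizons.
    profile = [(0, 20, 2.0), (20, 50, 1.0), (50, 120, 0.5)]
    for depth_range, value in zip(GSM_DEPTHS_CM, harmonize_profile(profile)):
        print(depth_range, value)
```

Whichever curve is fitted, the output has the same shape: one value per standard depth range, which makes profiles described with different horizon boundaries directly comparable.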
Source: Hempel et al., 2011 Outputs: Key Developments Since 2003 From Classes to Properties Non-disaggregated soil maps Weighted averages by polygon by soil property and depth Calling version 0.5 Disaggregated Soil Class Maps Estimate soil property values at every grid cell location & depth Based on weighted likelihood value of occurrence of each of n soils times property value for that soil at that depth Likelihood value can come from various methods Source: Sun et al, 2010
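The weighted-likelihood estimate described for disaggregated soil class maps can be written out directly: at each grid cell and depth, multiply the likelihood of each of the n soils by the property value of that soil at that depth, then sum. In the sketch below the function name, the soil names and the normalisation by the sum of likelihoods (in case they do not add to exactly 1) are assumptions.

```python
def property_from_likelihoods(likelihoods, property_by_soil):
    """Likelihood-weighted property value for one grid cell and one depth.

    likelihoods:      dict of soil name -> likelihood of occurrence at the cell
    property_by_soil: dict of soil name -> property value at this depth
    """
    total = sum(likelihoods.values())
    if total == 0:
        return None  # no soil is considered possible at this cell
    weighted = sum(p * property_by_soil[soil] for soil, p in likelihoods.items())
    # Dividing by the total guards against likelihoods that do not sum to 1.
    return weighted / total


if __name__ == "__main__":
    # Hypothetical soils and clay contents (%) for one cell at one depth.
    cell_likelihoods = {"Soil A": 0.6, "Soil B": 0.3, "Soil C": 0.1}
    clay_percent = {"Soil A": 20.0, "Soil B": 30.0, "Soil C": 50.0}
    print(property_from_likelihoods(cell_likelihoods, clay_percent))
```

The likelihoods can come from any of the methods shown earlier, for example fuzzy memberships or the proportion of ensemble predictions.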
Outputs: Key Developments Since 2003 From Classes to Properties Disaggregated Soil Class Maps Estimate soil property values at every grid cell location Source: Zhu et al., 1997
Recent Models Recent Examples of Predictions of Soil Class Maps
Predicting Area-Class Soil Maps Clovis Grinand, Dominique Arrouays, Bertrand Laroche, and Manuel Pascal Martin. Extrapolating regional soil landscapes from an existing soil map: Sampling intensity, validation procedures, and integration of spatial context. Geoderma 143, 180-190 Source: Grinand et al., 2008
Source: Park et al., 2004 Recent Knowledge-Based Classification In Africa, Multi-scale, Hierarchical Landforms Elevation + Slope + UPA + Catena (2 km support) SOTER Soil and landforms (1:1 million 1.5 million)
DEM Digital Soil Mapping in England & Wales using Legacy Data Predicted soil series TOPAZ TAPES-G LandMapR TRAINING DATA MODELLING (NETICA) OUTPUTS Point Data Detailed soil maps Covariates Expert knowledge Accuracy assessment Source: Mayr, 2010
Predicting Area-Class Soil Maps Using Multiple Regression Trees (100 x) Prepare a database and tables of mapping units & soil series, and covariates Select 1/n of the points systematically (n=100) Repeat n times Sample soil series randomly from the multinomial distribution of mapping unit composites Used See5 (RuleQuest Research, 2009) Construct decision tree Predict soil series at all pixels Calculate the soil series statistics based on the n predictions for each pixel Calculate the probability for each soil series Generate soil series maps Source: Sun et al., 2010
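Two of the steps above lend themselves to a short sketch: drawing a soil series for a training point from the composition of its mapping unit, and turning the n predictions for a pixel into per-series probabilities. The decision tree fitting itself (done with See5 in the source) is omitted here; the function names, series names and percentages are hypothetical.

```python
import random
from collections import Counter


def sample_series(composition, rng):
    """Draw one soil series from a mapping unit's composition.

    composition: dict of series name -> component percentage
    rng:         a random.Random instance, so runs can be reproduced
    """
    names = list(composition)
    weights = [composition[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def series_probabilities(predictions):
    """Probability of each series at one pixel from its n predictions."""
    counts = Counter(predictions)
    return {series: count / len(predictions) for series, count in counts.items()}


def most_likely_series(predictions):
    """Most frequently predicted series and its probability."""
    probabilities = series_probabilities(predictions)
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical mapping unit: 60% series A, 30% series B, 10% series C.
    unit = {"Series A": 60, "Series B": 30, "Series C": 10}
    training_labels = [sample_series(unit, rng) for _ in range(100)]
    print(Counter(training_labels))
    # Hypothetical outputs of n = 5 trees at a single pixel.
    print(most_likely_series(["Series A", "Series B", "Series A", "Series A", "Series C"]))
```

Repeating the sampling for each of the n runs is what lets the ensemble reflect uncertainty in the map unit composition rather than treating each polygon as a single pure class.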
Predicting Area-Class Soil Maps Using Multiple Regression Trees (100 x) A closer look at the junction point in the middle of 4 combined maps, (a) the original map units, and (b) the most likely soil series map and its associated probability. The length of the image is approximately 14 km. (a) Legend monr_comppct Value High : 100 Low : 7 (b) Source: Sun et al., 2010
Recent Models Recent Examples of Predictions of Continuous Soil Property Maps
Source: Hengl et al., 2004 Continuous Soil Property Maps by Kriging & RK Hengl et al., 2004 Comparison of topsoil thickness by four different methods a) Point locations b) Soil Map only c) Ordinary Kriging d) Plain Regression e) Regression-kriging Evidence supports RK
Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) 300 soil point data Assemble field data
Recent Example: Regression-Kriging (scorpan + ε) Source: Minasny et al., 2010 Assemble covariates for the predictive model
Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) Perform regression to build a predictive model Linear Model OC = f(x) + e Predictors Elevation Aspect Landsat band 6 NDVI Land-use Soil-Landscape Unit
Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) Predict both property value and standard error over the entire area
Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) Fit a variogram to the residuals
Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) Krige the residuals
Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) Add interpolated residuals to the prediction from regression: Linear Model + Residuals = Final Prediction
Source: Minasny et al., 2010 Recent Example: Regression-Kriging (scorpan + ε) Add regression variance and kriging variance to get total variance: (Std. err. of regression)^2 + (Std. err. of kriging)^2 = Total Variance; total standard error = (Total Variance)^(1/2)
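The last two steps of the regression-kriging example reduce to simple arithmetic per grid cell, sketched below. The function name and example numbers are hypothetical, and summing the two variances treats the regression error and the kriging error as independent, as the slide does.

```python
import math


def regression_kriging_cell(regression_pred, kriged_residual, se_regression, se_kriging):
    """Final prediction and total standard error for one grid cell.

    regression_pred: value predicted by the regression (scorpan) model
    kriged_residual: residual interpolated to this cell by kriging
    se_regression:   standard error of the regression prediction
    se_kriging:      standard error (square root of kriging variance)
    """
    prediction = regression_pred + kriged_residual
    # Variances add; the standard error is the square root of the total.
    total_variance = se_regression ** 2 + se_kriging ** 2
    return prediction, math.sqrt(total_variance)


if __name__ == "__main__":
    # Hypothetical values for a single cell.
    print(regression_kriging_cell(50.0, 4.0, 3.0, 4.0))
```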
Recent Example: Regression-Kriging C predicted for sampled locations Regression model: C = 100 - 1.2 EC - 5.2 REF - 0.6 REF^2 - 2.1 EL Residuals C predicted for all grid locations Kriging Final C map (Mg C/ha) Mean 64.0 Min 27.0 Max 87.9 CV% 18.4 RMSE 9.8 RI (%) 19.7
Source: Mayr et al., 2010 Continuous Soil Property Maps by Hybrid Bayesian Analysis
Future Trends Personal View of Likely Future DSM Development (Post 2012)
Possibility to move from single snapshot mapping of static soil properties to continuous update and improvement of maps of both static and dynamic properties within a structured and consistent framework.
Source: Heuvelink et al., 2004 The Future: Let's Go Back and Talk About the Universal Model of Variation Again Z(s) = Z*(s) + ε(s) + ε Lots of things qualify as regression! Deterministic part of the predictive model Regression just means minimizing variance Stochastic part of the predictive model What is all this talk about optimization?
Source: Zhu et al., 2010 The Future: Maybe Progress Towards True Regression will be Stepwise Z(s) = Z*(s) + ε(s) + ε Lots of things qualify as regression! Regression depends on having enough point data
The Future: A Conceptual Framework for GSIF A Global Soil Information Facility Collaborative and open collection, input and sharing of geo-registered field evidence (Open Soil Profiles) Collaborative and open production, assembly and sharing of covariate data (World Grids) Collaborative and open modelling on an interactive, web-based server-side platform Everything is accessible, transparent and repeatable Maps we can all contribute to, access, use, modify and update, continuously and transparently Source: Hengl et al., 2011
Source: Hengl et al., 2011 The Future: Functionality for GSIF A Global Soil Information Facility Possibility of making use of existing legacy soil maps (even new soil maps) needed for soil prediction anywhere Possibility to assess error and correct for it everywhere Possibility of rescuing, sharing, harmonizing and archiving soil profile point data needed for soil prediction anywhere Possibility to develop and use global models (even for local mapping) Possibility to develop and use multi-scale and multi-resolution hierarchical models
Source: Hengl et al., 2011 The Future: Conceptual Framework for GSIF Open Soil Profiles
Source: Hengl et al., 2011 The Future: Conceptual Framework for GSIF World Grids
Source: Hengl et al., 2011 The Future: Conceptual Framework for GSIF World Grids
The Future: Collaborative Global, Multi-Scale Mapping through GSIF Possibility for combining Top-Down and Bottom-up mapping through weighted averaging of 2 or more sets of predictions Possibility to develop and use global models (even for local mapping) Source: Hengl et al., 2011
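Weighted averaging of two or more sets of predictions can be sketched for a single grid cell as follows. The slide does not say how the weights are chosen; this sketch assumes inverse-variance weights, so the more certain prediction counts for more, and it assumes the prediction errors are independent. The function name and example numbers are hypothetical.

```python
def combine_predictions(predictions):
    """Inverse-variance weighted average of predictions for one grid cell.

    predictions: list of (value, variance) pairs, for example one from a
    global (top-down) model and one from a local (bottom-up) model.
    Returns (combined value, combined variance). All variances must be > 0.
    """
    weights = [1.0 / variance for _, variance in predictions]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, predictions)) / total
    # Valid only under the assumed independence of the prediction errors.
    return value, 1.0 / total


if __name__ == "__main__":
    # Hypothetical cell: global model says 10 (variance 1), local says 20 (variance 4).
    print(combine_predictions([(10.0, 1.0), (20.0, 4.0)]))
```

With equal variances this reduces to a plain mean; where a local model has few points and a large variance, the global model dominates.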
Source: Hengl et al., 2011 The Future: Global, Multi-Scale Modeling of Soil Properties through GSIF Possibility to develop and use multi-scale and multi-resolution hierarchical models Possibility to develop and use global models (even for local mapping)
Source: Hengl et al., 2011 The Future: Global, Multi-Scale Modeling of Soil Properties through GSIF Global Models inform and improve local mapping
Source: Hengl et al., 2011 The Future: Functionality for GSIF A Global Soil Information Facility Anyone can access and display the maps
The Future: Functionality for GSIF A Global Soil Information Facility With Google Earth everyone has a GIS to view free soil maps and data Slide credit: Tom Hengl, 2011 Source: Hengl et al., 2011
Source: Hengl et al., 2011 The Future: Collaborative Global, Multi-Scale Mapping through GSIF A Global Collaboratory! Working together we can map the world one tile at a time! The next generation of soil surveyors is everyone!