golder 2013 dsm_introduction_presentation_feb6_ram_version1

Introduction to Digital Soil Mapping

(DSM)

R. A. (Bob) MacMillan, LandMapper Environmental Solutions Inc.

Presented to Golder Associates: Feb 6, 2013

Outline

• Unifying DSM Framework: Universal Model of Variation
– Z(s) = Z*(s) + ε(s) + ε

• Past: Early History of Development of DSM (pre 2003)
– Theory, Concepts, Models, Software, Inputs, Developments
– Examples of early methods and outputs

• Key Recent Developments in DSM post 2003
– Theory, Concepts, Models, Software, Inputs, Developments
– Examples of recent methods and outputs

• Future Trends: How do I See DSM Developing?
– Theory, Concepts, Inputs, Models, Software, Developments
– From Static Maps to Dynamic Real-Time Models

Introduction

Universal Model of Soil Variation

A Unifying Framework for DSM

Universal Model of Soil Variation

• A Unifying Framework for Digital Soil Mapping

Z(s) = Z*(s) + ε(s) + ε

• Z(s): Predicted soil type or soil property value. The predicted spatial pattern of some soil property or class, including uncertainty of the estimate.

• Z*(s): Deterministic part of the predictive model. The part of the variation that is predictable by means of some statistical or heuristic soil-landscape model.

• ε(s): Stochastic part of the predictive model. The part of the variation that shows spatial structure and can be modelled with a variogram.

• ε: Pure noise. The part of the variation that can't be predicted at the current scale with the available data and models.

Source: Burrough, 1986 eq. 8.14

Deterministic Part of Prediction Model:

Z*(s)

• Conceptual Models

– Conceptual or mental soil-landscape models

– Produce area-class maps

• Statistical Models

– Scorpan – relate soils/soil properties to covariates

– Explain spatial distribution of soils in terms of known soil forming factors as represented by covariates

[Figure: conceptual soil-landscape model showing the EOR, DYD, KLM, FMN and COR Series; axis values 15, 40, 60]

[Figure: salinity hazard map on a 100 x 100 m grid. Individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils) carry layer weightings (2x, 1x, 2x, 1x, 3x) and are combined into a total salinity hazard rating for the land surface.]

Stochastic Part of Prediction Model:

ε(s)

• Geostatistical Estimation
– Predict soil properties
• Point or block kriging
– Predict soil classes
• Indicator kriging
– Predict error of estimate

• Correct Deterministic Part
– Error in deterministic part is computed (residuals)
– If structure exists in the error, then krige the error & use it to correct the prediction

Pure Noise Part of Prediction Model:

ε

• Some Variation not Predictable

– Have to be honest about this
• Should quantify and report it

• Deterministic Prediction
– Mental and Statistical Models
• Not perfect – often lack suitable covariates to predict target variable
• Lack covariates at finer resolution

• Geostatistical Prediction
– Insufficient point input data
• Can’t predict at less than the smallest spacing of input point data

[Figure: semi-variogram of semi-variance against lag (distance) for lags d1 to d4, showing the nugget, sill and range]
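The semi-variogram in the figure can be illustrated with a minimal sketch of how semi-variance is computed at different lag distances. The function name, the equal-width lag bins and the sample transect are assumptions made for illustration; they do not come from the presentation.

```python
import math

def empirical_semivariogram(points, lag_width, n_lags):
    """Bin point pairs by separation distance and return, for each
    non-empty lag bin, (mean distance, semi-variance, number of pairs).

    points: list of (x, y, z) observations (hypothetical input layout).
    """
    sq_diff = [0.0] * n_lags
    dist_sum = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)
            k = int(h // lag_width)          # index of the lag bin
            if k < n_lags:
                sq_diff[k] += (zi - zj) ** 2
                dist_sum[k] += h
                counts[k] += 1
    result = []
    for k in range(n_lags):
        if counts[k]:
            # semi-variance = half the mean squared difference
            gamma = sq_diff[k] / (2 * counts[k])
            result.append((dist_sum[k] / counts[k], gamma, counts[k]))
    return result

# Hypothetical transect: four samples one distance unit apart
transect = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0), (3, 0, 7.0)]
for lag, gamma, n_pairs in empirical_semivariogram(transect, 1, 3):
    print(lag, gamma, n_pairs)
```

No pair of points is closer than the sample spacing, so the first bin is empty: the data say nothing about variation at shorter distances, which is the limitation named on the slide.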

Past

Early History of DSM Development

(pre 2003)

On Digital Soil Mapping

McBratney et al., 2003

Early History of Development of DSM

Deterministic

Soil Classes

Soil Properties

Stochastic

Soil Classes

Soil Properties

Past Theory: Deterministic Component

Z*(s) Classed Conceptual Models

– Jenny (1941)
• CLORPT (Note no N=space)
– Simonson (1959)
• Process Model of additions, removals, translocations, transformations
– Ruhe (1975)
• Erosional-Depositional surfaces, open/closed basins
– Dalrymple et al. (1968)
• Nine unit hill slope model
– Milne (1936a, 1936b)
• Catena concept, toposequences

Past Concepts: Deterministic Component

Z*(s) Classed Conceptual Models

Climate

Topography

Parent

Material

Organisms

Time

Soil

Soil = f (C, O, R, P, T, …)

Source: Lin, 2005 Frontiers in Soil Science

http://www7.nationalacademies.org/soilfrontiers/

http://solim.geography.wisc.edu/index.htm

Past Models: Deterministic Component

Z*(s) Classed Statistical Predictions

• Fuzzy Inference

– Zhu, 1997, Zhu et al., 1996

– MacMillan et al., 2000, 2005

• Neural Networks

– Zhu, 2000

• Expert Knowledge (Bayesian)

– Skidmore et al., 1991

– Cook et al., 1996, Corner et al., 1997

• Regression Trees

– Moran and Bui, 2002, Bui and Moran, 2003

[Figure: salinity hazard map on a 100 x 100 m grid. Individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils) carry layer weightings (2x, 1x, 2x, 1x, 3x) and are combined into a total salinity hazard rating for the land surface.]

Source: Jones et al., 2000
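The weighted-overlay idea behind the salinity hazard figure can be sketched as follows. This is an illustration only: the pairing of weightings to layers, the rating values and the function name are assumptions, not values taken from Jones et al., 2000.

```python
def hazard_map(layer_grids, weights):
    """Combine per-layer hazard ratings into a total rating per grid cell
    as a weighted sum (the weighted-overlay approach).

    layer_grids: dict of layer name -> 2-D list of ratings (same shape).
    weights:     dict of layer name -> layer weighting.
    """
    names = list(weights)
    n_rows = len(layer_grids[names[0]])
    n_cols = len(layer_grids[names[0]][0])
    return [
        [sum(weights[n] * layer_grids[n][r][c] for n in names)
         for c in range(n_cols)]
        for r in range(n_rows)
    ]

# ASSUMPTION: weightings are paired with layers in the order listed
# in the figure; the figure text alone does not confirm this pairing.
weights = {"curvature": 2, "vegetation": 1, "rainfall": 2,
           "geology": 1, "soils": 3}

# Hypothetical 2 x 2 grids of ratings for each layer
layers = {
    "curvature":  [[1, 2], [3, 1]],
    "vegetation": [[2, 2], [1, 3]],
    "rainfall":   [[1, 1], [2, 2]],
    "geology":    [[3, 1], [1, 2]],
    "soils":      [[2, 3], [1, 1]],
}

total = hazard_map(layers, weights)
print(total)  # [[15, 18], [15, 14]]
```

A weighted sum like this is a purely deterministic, expert-driven Z*(s): the weights encode knowledge rather than being fitted to data, and no error estimate comes with it.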

Past Software: Deterministic Component

Z*(s) Classed Statistical Predictions

• Regression Trees
– CUBIST
• Rulequest Research, 2000
– CART
• Breiman et al., 1984
– C4.5 & See5
• Quinlan, 1992
– JMP (SAS)
• http://www.jmp.com/
– R
• http://www.r-project.org/

• Fuzzy Logic

– SoLIM

• Zhu et al., 1996, 1997

– LandMapR, FuzME

• Bayesian Logic

– Prospector

• Duda et al., 1978

– Expector

• Skidmore et al., 1991

– Netica

• Norsys.com/netica

Past Inputs: Deterministic Component

Z*(s) Classed Statistical Predictions

• C = Climate

– Temp, Ppt, ET, Solar Rad

• Mean, min, max, variance

• Annual, monthly, indices

• O = Organisms

– Manual Maps

• Land Use

• Vegetation

– Remotely Sensed Imagery

• Classified RS imagery

• NDVI, EVI, other ratios

• R = Relief (topography)

– Primary Attributes

• Slope, aspect, curvatures

• Slope Position, roughness

– Secondary Attributes

• CTI, WI, SPI, STC

• P = Parent Material

– Published geology maps

– Gamma radiometrics

– Thermal IR, RS Ratios

• A = Age


Z*(s) Classed Statistical Predictions

• Common Topo Inputs
– Profile Curvature
– Plan (Contour) Curvature
– Slope Gradient (& Aspect)
– CTI or Wetness Index
• Sometimes, not always

• Less Common Topo Inputs
– Surface Roughness
– Relief within a window
– Relief relative to drainage
• Pit, peak, ridge, channel

[Figure panels: Profile Curvature, Plan Curvature, Slope Gradient, Wetness Index, Pit 2 Peak Relief, Divide 2 Channel]

Source: MacMillan, 2005

Past Inputs: Non-DEM Airborne

Radiometrics

• Radiometrics for Subsurface

• Infer Parent Material

Source: Mayr, 2005

Past Inputs: Non-DEM Satellite Imagery

Grassland Land Cover Types Alpine Land Cover Types


Z*(s)

Examples of Predictions of Soil Class

Maps

Approaches to Producing Predictive Area-

Class Maps

Knowledge-Based Classification In SoLIM

Source: Zhu, SoLIM Handbook

Knowledge-Based Classification Using

Boolean Decision Tree in USA

Gilpin

Pineville

Laidig

Guyandotte

Dekalb

Component Soils

Craigsville

Meckesville

Cateache

Shouns

Source: Thompson et al., 2010 WCSS

Knowledge-Based Classification In LandMapR

Source: Steen and Coupé, 1997


Knowledge-Based Classification In LandMapR

Source: Global Forest Watch Canada, 2012

Note: Not simple slope elements but complex patterns

Source: Cole and Boettinger, 2004

Knowledge-Based Classification In Utah,

Knowledge-Based PURC Approach


Class Maps

Supervised Classification Using Regression Trees

Note similarity of supervised rules and classes to typical soil-landform conceptual classes

Note numeric estimate of likelihood of occurrence of classes

Source: Zhou et al., 2004,

JZUS

Supervised Classification Using Bayesian

Analysis of Evidence/Classification Trees

Source: Zhou et al., 2004,

JZUS

Predicting Area-Class Soil Maps Using

Discriminant Analysis

Source: Scull et al., 2005, Ecological Modelling

Uncertainty of prediction

Bui and Moran (2003)

Geoderma 111:21-44

Extrapolation

Source: Bui and Moran., 2003


Regression Trees

Supervised Classification Using Fuzzy Logic

• Shi et al., 2004
– Used multiple cases of reference sites
– Each site was used to establish fuzzy similarity of unclassified locations to reference sites
– Used Fuzzy-minimum function to compute fuzzy similarity
– Harden class using largest (Fuzzy-maximum) value
– Considered distance to each reference site in computing Fuzzy-similarity

Fuzzy likelihood of being a broad ridge

Source: Shi et al., 2004
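The fuzzy-minimum and fuzzy-maximum steps listed for Shi et al., 2004 can be sketched roughly as below. This is not their implementation: the bell-shaped membership curve, the covariate names, the reference-site values and the rule of taking the best-matching site per class are all assumptions, and the distance-to-reference-site weighting is left out.

```python
def membership(value, typical, width):
    """Bell-shaped similarity (assumed shape): 1.0 at the typical value,
    0.5 when the value is one 'width' away from it."""
    return 1.0 / (1.0 + ((value - typical) / width) ** 2)

def similarity_to_site(cell, site):
    """Fuzzy minimum: the least similar covariate limits the overall
    similarity of the cell to the reference site."""
    return min(
        membership(cell[k], site["typical"][k], site["width"][k])
        for k in site["typical"]
    )

def classify(cell, reference_sites):
    """Score each class by its best-matching reference site, then harden
    to the class with the largest (fuzzy maximum) similarity."""
    scores = {}
    for site in reference_sites:
        s = similarity_to_site(cell, site)
        scores[site["cls"]] = max(s, scores.get(site["cls"], 0.0))
    hardened = max(scores, key=scores.get)
    return hardened, scores

# Hypothetical reference sites described by two terrain covariates
sites = [
    {"cls": "broad ridge",
     "typical": {"slope_pct": 2.0, "rel_position": 90.0},
     "width": {"slope_pct": 3.0, "rel_position": 15.0}},
    {"cls": "backslope",
     "typical": {"slope_pct": 15.0, "rel_position": 50.0},
     "width": {"slope_pct": 5.0, "rel_position": 20.0}},
]

cell = {"slope_pct": 3.0, "rel_position": 85.0}
label, scores = classify(cell, sites)
print(label, scores)
```

Keeping the full `scores` dictionary, rather than only the hardened label, is what allows a map such as "fuzzy likelihood of being a broad ridge" to be drawn.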


Class Maps

Concept of Fuzzy K-means Clustering

Credit: J. Balkovič & G. Čemanová

Source: Sobocká et al., 2003

Example of Application of Fuzzy K-means

Unsupervised Classification

From: Burrough et al., 2001, Landscape Ecology

Note similarity of unsupervised

classes to conceptual classes

Example of Application of Disaggregation of

a Soil Map by Clustering into Components

Source: Faine, 2001

Developments: Deterministic Component

Z*(s) Classed Predictive Maps in Past

• Characteristics of Models
– Models largely ignored ε
• Seldom estimate error
• Rarely correct for error
– Mainly use DEM inputs
• Initially 3x3 windows
• Slope, aspect, curvatures
• Maybe wetness index
• Later improvements were measures of slope position
– Rarely use ancillary data
• Exceptions like Bui, Scull
– Operate at single scale

• Characteristics of Models
– Many use expert knowledge
• Data mining is the exception
• Training data seldom used
– Specialty software prevails
• Software for DEM analysis
– SoLIM, TAPES-G, TOPAZ, TOPOG, TAS, SAGA, ESRI, IDRISI, LandMapR
• Software for extracting rules
– Expector, Netica, CART, See5, Cubist, Prospector
• Software for applying rules
– ESRI, SoLIM, SIE, SAGA


Z*(s) for Continuous Soil Properties

Approaches Aimed at Predicting

Continuous Soil Properties

Past Concepts: Deterministic Component

Z*(s) Continuous Soil Properties

• Same Theory-Concepts as for Classed Maps
– Except theory applied to individual soil properties
– Initially referred to as environmental correlation
– Soil properties related to
• Landscape attributes
• Climate variables
• Geology, lithology, soil parent material

• Key Papers

– Moore et al., 1993

• Linear regression

– McSweeney et al., 1994

– McKenzie & Austin, 1993

– Gessler et al., 1995

• GLMs in S-Plus

– McKenzie & Ryan, 1999

• Regression Trees

Soil = f (C, O, R, P, T, …)


Z*(s) Continuous Soil Properties

• Regression Trees

– McKenzie & Ryan, 1998, Odeh et al., 1994

• Fuzzy Logic-Neural Networks

– Zhu, 1997

• Bayesian Expert Knowledge

– Skidmore et al., 1996

– Cook et al., 1996, Corner et al., 1997

• GLMs – Generalized Linear Models

– McKenzie & Austin, 1993

– Gessler et al., 1995

Source: McKenzie and Ryan, 1998



• Similar to Classed Maps But:

– Many innovations originated with continuous modelers

• Increased use of non-DEM attributes

– climate, radiometrics, imagery

• Improved DEM derivatives

– Wetness Index & CTI

– Upslope means for slope, etc.

– Inverted DEMs to compute

» Down slope dispersal

» Down slope means

» New slope position data

Source: McKenzie and Ryan, 1998



Examples of Predictions of Soil

Property Maps


Z*(s) Continuous Maps

• Aandahl, 1948 (Note Date!)

– Regression model

• Predicted

– Average Nitrogen (3-24 inch)

– Total Nitrogen by depth

– Total Organic Carbon by depth interval

– Depth of profile to loess

• Predictor (covariate)

– Slope position as expressed by length of slope from shoulder

– Lost in the depths of time

Source: Aandahl, 1948



• Moore et al., 1993

– Seminal paper

– Focus on topography
• Small sites
• Other covariates were assumed constant
– Got people thinking
• About quantifying environmental correlation, especially soil-topography relationships

Source: Moore et al, 1993



• McKenzie & Ryan, 1998

– Regression Tree: Soil Depth




• Gessler et al., 1995
– GLMs
– Largely based on topography (CTI)
– Other covariates held steady


Source: Gessler, 2005

Credit: Minasny & McBratney

[Figure: regression tree with splits on soil texture class (C versus S, LS, L, CL, LiC), bulk density (BD < 1.43 versus BD > 1.43; BD < 1.42 versus BD > 1.42) and clay content (Clay < 46.5 versus Clay > 46.5); node values not reproduced]



Source: Minasny and McBratney

Developments: Deterministic Component

Z*(s) Predictive Maps up to 2003

• Main Developments

– Better DEM derivatives

• More and better measures of

landform position or

context

• Some recognition of scale

and resolution effects

– Different window sizes

– Different grid resolutions

– More non-DEM inputs

• Increased use of imagery

• New surrogates for PM


– Integration of single models

into multi-purpose software

• ArcGIS, ArcSIE, ArcView

• SAGA, Whitebox, IDRISI

– Improved processing ability

• Bigger files, faster processing

– Emergence of 2 main scales

• Hillslope elements (series)

– Quite similar across models

• Landscape patterns (domains)

– Similar to associations

Early History of Development of DSM

Deterministic

Soil Classes

Soil Properties

Stochastic

Soil Classes

Soil Properties

Past Theory: Stochastic Component

ε(s)

– Waldo Tobler (1970)
• First law of geography
– Everything is related to everything else, but near things are more related than distant things
– Matheron (1971)
• Theory of regionalized variables
– Webster and Cuanalo (1975)
• clay, silt, pH, CaCO3, colour value, and stoniness on transect
– Burgess and Webster (1980a, b)
• Soil Property maps by kriging
• Universal kriging (drift) of EC

Past Models: Stochastic Component

ε(s)

– Universal Model of Variation

• Matheron (1971)

• Burgess and Webster (1980a, b)

• Webster and Burrough (1980)

• Burrough (1986)

• Webster and McBratney (1987)

• Oliver (1989)

Source: Oliver, 1989

Past Models: Stochastic Component

ε(s) Optimal Interpolation by Kriging

[Figure: kriging workflow in four panels]
– Collect point sample observations: irregular spatial distribution (of observed point values) in x, y
– Compute semi-variance at different lag distances
– Fit semi-variogram to lag data
– Estimate values and error at fixed grid locations (shown as a regular grid of interpolated values)

Past Software: Stochastic Component

ε(s)

• Earlier Stand Alone
– Pc-Geostat (PC-Raster)
• Early version of GSTAT
– VESPER
• Variogram estimation and spatial prediction with error
• Minasny et al., 2005
• http://sydney.edu.au/agriculture/pal/software/vesper
– GEO-EAS (DOS, 1991)
• http://www.epa.gov/ada/csmos/models/geoeas.html

• Later More Integrated
– GSTAT
• Pebesma and Wesseling, 1998
• Incorporated into IDRISI
• Now incorporated into R and S-Plus packages
– Pebesma, 2004
• http://www.gstat.org/index.html
– ArcGIS
• Geostatistical Analyst
– SGeMS (Stanford Univ)
• http://sgems.sourceforge.net/

Past Inputs: Stochastic Component ε(s)

• Essentially Just x,y,z Values at Point Locations

1. Start with set of soil property values irregularly distributed in x,y Cartesian space
2. Locate the regularly spaced grid nodes where predicted soil property values are to be calculated
3. Locate the n soil property data points within a search window around the current grid cell for which a value is to be calculated
4. Compute a new value for each location as the weighted average of the n neighboring soil property values, with weights established by the semi-variogram
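The four steps above can be sketched for a single grid node with ordinary kriging. This is a minimal illustration under stated assumptions, not the software used in the presentation: the spherical variogram model, its nugget, sill and range values, and the sample points are all hypothetical, and a small Gaussian-elimination solver is included only to keep the sketch free of external libraries.

```python
import math

def spherical(h, nugget, sill, rng):
    """Spherical semi-variogram model (assumed form): 0 at h = 0,
    rising from the nugget to the sill at the range."""
    if h == 0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ordinary_kriging(neighbors, target, gamma):
    """Estimate the value at target (x, y) as a weighted average of the
    neighbors [(x, y, z), ...]; weights come from the semi-variogram
    function gamma(h) and are constrained to sum to 1."""
    n = len(neighbors)
    a = [[gamma(math.dist(p[:2], q[:2])) for q in neighbors] + [1.0]
         for p in neighbors]
    a.append([1.0] * n + [0.0])           # unbiasedness constraint row
    b = [gamma(math.dist(p[:2], target)) for p in neighbors] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]         # mu is the Lagrange multiplier
    estimate = sum(w * p[2] for w, p in zip(weights, neighbors))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights

# Hypothetical variogram parameters and sample points
gamma = lambda h: spherical(h, nugget=0.5, sill=4.0, rng=30.0)
samples = [(0, 0, 6.0), (10, 0, 7.0), (0, 10, 5.0), (10, 10, 8.0)]
estimate, variance, weights = ordinary_kriging(samples, (5, 5), gamma)
print(estimate, variance, weights)
```

For a target at the centre of the four samples the weights are equal by symmetry, so the estimate is the plain mean. The kriging variance returned alongside it is the "error at fixed grid locations" mentioned on the earlier slide.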

Past Models: Stochastic Component ε(s)

for Continuous Soil Properties

Examples of Predictions of Soil

Property Maps by Kriging

Continuous Soil Property Maps by

Kriging

• Very Early Alberta Example

– Lacombe Research Station

• Sampled soils on a 50 m grid

– Sand, Silt, Clay,

– pH, OC, EC, others

– 3 depths (0-15, 15-50, 50-100)

• Used custom written software

– Compute variograms

– Interpolate using the variograms

• Only visualised as contour maps

– Only got 3D drapes in 1988

– Used PC-Raster to drape

– Saw strong soil-landscape pattern

[Figure: semi-variogram for A-horizon % sand, Lacombe site (1985). X-axis: lag (1 lag = 30 m), lags 1 to 19. Y-axis: semi-variance, 0 to 160.]

Source: MacMillan, 1985 unpublished


Kriging

Source: http://sydney.edu.au/agriculture/pal/software/vesper.shtml


Kriging

• Yasribi et al., 2009

– Simple ordinary kriging

of soil properties (OK)

• No co-kriging

• No regression prediction

– Relies on presence of
• Sufficient point samples
• Spatial structure over distances longer than the smallest sampling interval

Source: Yasribi et al., 2009


Kriging

• Shi, 2009

– Comparison of pH by

four different methods

• a) HASM

• b) Kriging

• c) IDW

• d) Splines

Source: Yasribi et al., 2009

Developments: Stochastic Component

ε(s) Predictive Maps up to 2003


– Theory

• Becomes better understood

and accepted

– Concepts

• Regression-kriging evolves

to include a separate part for

regression prediction

– Models

• Understanding and use of

universal model grows

• Directional, local variograms


– Software

• From stand alone and single

purpose to integrated software

• Improvements in

– Visualization

– Capacity to process large

data sets

– Automated variogram fitting

– Ease of use

– Inputs

• Developments in sampling

designs and sampling theory

Present and Recent Past

Key Developments in DSM Since 2003

(2003-2012)

On Digital Soil Mapping

McBratney et al., 2003

Developments in DSM Since 2003

Deterministic

Soil Classes

Soil Properties

Stochastic

Soil Classes

Soil Properties

Increasing Convergence and Interplay

Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Theory: Key Developments Since 2003

• Deterministic Part

– Pretty much unchanged

• Still based on attempting to

elucidate quantitative

relationships between soils

& environmental covariates

– But

• Scorpan elaboration

highlights importance of

the spatial component (n)

and of spatially correlated

error ε(s)

• Stochastic Part

– Same underlying theory

• Still based on theory of

regionalized variables

– But

• Increasing realization that

the structural part of

variation (non-stationary

mean or drift) can be better

modelled by a deterministic

function than by purely

spatial calculations

Concepts: Key Developments Since 2003


– Scorpan Model

• Explicitly recognizes soil data

(s) as a potential input to

predict other soil data

– Soil inputs can include soil

maps, point observations,

even expert knowledge

• Explicitly recognizes space

(n) or location as a factor in

predicting soil data

– Space as in x,y location

– Space as in context, kriging

• Factors as predictors

– Factors explicitly seen as

quantitative predictors in

prediction function

Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Concepts: Key Developments Since 2003

• Stochastic Part

– Emergence of Regression

Kriging (RK)

• Key difference to ordinary

kriging is that it is no longer

assumed that the mean of a

variable is constant

• Local variation or drift can

be modelled by some

deterministic function

– Local regression lowers

error, improves predictions

– Local regression function can even be a soil map

Source: Heuvelink, personal communication

Models: Key Developments Since 2003


– Improvements in Data

Mining and Knowledge

Extraction

• Supervised Classification

– Training data obtained

from both points and maps

» Sample maps at points

– Ensemble or multiple

realization models (100 x)

» Boosting, bagging

» Random Forests

» ANN, Regression tree


– Improvements in Data

Mining and Knowledge

Extraction

• Expert Knowledge Extraction

– Bayesian Analysis of Evidence

– Prototype Category Theory

– Fuzzy Neural Networks

– Tools for Manual Extraction

of Fuzzy Expert Knowledge

» ArcSIE, SoLIM

• Unsupervised classification

– Fuzzy k-means, c-means

Models: Key Developments Since 2003

• Stochastic Part

– Regression Kriging

• Recognized as equivalent to

universal kriging or kriging

with external drift

• Use of external knowledge

and maps made easier

– Incorporation of soft data

• Made more accessible

through implementation in

commercial (ESRI) and

open source software (R)

• Stochastic Part


• Odeh et al., 1995

• McBratney et al., 2003

• Hengl et al., 2003, 2004, 2007
• Heuvelink, 2006
• Hengl how-to books
– http://spatial-analyst.net/book/
– http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf

Comparison of Soil Property Maps by

Kriging & RK

• Hengl et al., 2012

– Comparison of ordinary

kriging and regression

kriging

• Evidence supports RK as

explaining more of the

variation than OK alone

– Greater spatial detail

– Fewer extrapolation

areas

– Better fit to data

Source: Hengl et al., 2012

Software: Key Developments Since 2003

• Commercial Software

– JMP (SAS) (McBratney)
• http://www.jmp.com/
– S-Plus, Matlab
• Used by soil researchers
– See5, CUBIST, CART
• Regression Trees

– Netica (Bayesian)

• Norsys.com/netica

– Improvements

• Better visualization

• Better interfaces

• Non-commercial Software

– Fuzzy Logic

• SoLIM Zhu et al., 1996, 1997

• ArcSIE Shi, FuzME

– Bayesian Logic

– Full Range of Options

• R

– http://www.r-project.org


– Random Forests

– Regression Trees

– GLMs

• GSTAT (in R)

Inputs: Key Developments Since 2003

• Terrain Attributes

– More and better measures

• Primarily contextual and

related to landform position

– Real advances related to

• Multi-scale analysis

– varying window size and

grid resolution

• Window-based and flow-

based hill slope context

• Systematic examination of

relationships of properties

and processes to scale

Source: Smith et al., 2006

Source: Schmidt and Andrew., 2005


• Terrain Attributes

– Multi-scale analysis

• Varying window size and

grid resolution

• Identifies that some

variables are more useful

when computed over larger

windows or coarser grids

– Finer resolution grids not

always needed or better

– Drop off in predictive

power of DEMs after

about 30-50 m grid

resolution

Source: Deng et al., 2007


• ConMAP: Hyper-scale Contextual Analysis of Topographic Parameters

Source: Behrens et al., in press

– Neighborhood example
• Diameter
– 21 km
• Predictors
– 775


• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters


ConStat (ConMap)- neighborhood reduction

a) Full neighborhood; b) Reduction of radii; c) Reduction on radii; d) Combination of b and c


• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters



• Hyper-scale Terrain

Analysis in ConSTAT

– Systematic analysis of relative importance of terrain measures at different scales

• Compute statistics of terrain

measures at different scales

– Use data mining (Random

Forests) to identify

importance of different

statistics at different scales

and at each different location


MrVBF: Multi-scale DEM Analysis

[Figure: the DEM is smoothed and subsampled from the original 25 m grid to generalised 75 m and 675 m grids; panels show flatness, bottomness and valley bottom flatness at each scale]

Source: Gallant, 2012

Multiple Resolution Landform Position

MrVBF Example Outputs

Source: Gallant, 2012

Broader Scale 9” DEM

MRVBF for 25 m DEM

Developments: Improved Measures of

Landform Position

• SAGA-RHSP: relative

hydrologic slope position

• SAGA-ABC: altitude

above channel

Source: C. Bulmer, unpublished

Calculation based on: MacMillan, 2005

Source: C. Bulmer, unpublished


Landform Position

• TOPHAT – Schmidt

and Hewitt (2004)

• Slope Position – Hatfield

(1996)

Source: Hatfield (1996)

Source: Schmidt & Hewitt (2004)


Landform Position - Scilands

Source: Rüdiger Köthe , 2012

Measures of Relative Slope Length (L)

Computed by LandMapR
• Percent L Pit to Peak
• Percent L Channel to Divide

[Figure captions: measure of local context; measure of regional context]

Image Data Copyright the Province of British Columbia, 2003


Image Data Copyright the Province of British Columbia, 2003

Measures of Relative Slope Position

Computed by LandMapR
• Percent Diffuse Upslope Area
• Percent Z Channel to Divide

[Figure captions: relative to main stream channels; sensitive to hollows & draws]


Developments: Improved Classification of

Landform Patterns Iwahashi & Pike (2006)

• Iwahashi landform underlying 1:650k soil map

Source: Reuter, H.I. (unpublished)

[Figure legend: 16 terrain classes arranged in four terrain series (fine texture, high convexity; fine texture, low convexity; coarse texture, high convexity; coarse texture, low convexity), each ranging from steep to gentle]


• Non-Terrain Attributes

– Systematic analysis of

environmental covariates

• Detect distances and scales

over which each covariate

exhibits a strong relationship

with a soil or property to be

predicted or just with itself

– Vary window sizes and grid

resolutions and compute

regressions on derivatives

– analyse range of variation

inherent to each covariate

» Functional relationships

are dependent on scale

Source: Park, 2004


• Non-Terrain Attributes

– Systematic analysis of scale of

environmental covariates

• Select and use input covariates

at the most appropriate scale

– Explicitly recognize the

hierarchical nature of

environmental controls on

soils

– Select variables at the scales,

resolutions or window sizes

with the strongest predictive

power for each property or

class to be predicted.

Source: Park, 2004


Source: David Jacquier, 2010

Harmonization of soil profile depth data through spline fitting

Inputs: Key Developments Since 2003

From discrete soil classes to continuous soil properties

[Figure: a ‘modal’ profile for the Clearfield soil series (Wapello County, Iowa; Mukey: 411784, Musym: 230C) is fitted with a mass-preserving spline; averages of the fitted spline are then estimated at standardised depth ranges, e.g., globalsoilmap depth ranges]

Source: Sun et al., (2010)

Harmonization of soil profile

data through spline fitting
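The goal of depth harmonization, reporting a property at the same standard depth ranges for every profile, can be illustrated with a deliberately simplified stand-in. The sketch below does not fit a mass-preserving spline. It treats each horizon as a constant value and takes thickness-weighted averages over the GlobalSoilMap depth intervals (0-5, 5-15, 15-30, 30-60, 60-100 and 100-200 cm). The function name and the profile values are hypothetical.

```python
# GlobalSoilMap standard depth intervals in cm
GSM_DEPTHS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]

def harmonize(horizons, intervals=GSM_DEPTHS):
    """Thickness-weighted average of a horizon property over each
    standard depth interval.

    horizons: list of (top_cm, bottom_cm, value), one per horizon.
    Returns a list of (top, bottom, mean); mean is None where the
    profile has no data in that interval.

    NOTE: a step-function stand-in, NOT the mass-preserving spline
    named on the slide; a spline would also smooth across horizon
    boundaries.
    """
    out = []
    for top, bottom in intervals:
        total = 0.0
        covered = 0.0
        for h_top, h_bottom, value in horizons:
            overlap = min(bottom, h_bottom) - max(top, h_top)
            if overlap > 0:
                total += overlap * value
                covered += overlap
        out.append((top, bottom, total / covered if covered else None))
    return out

# Hypothetical profile: organic carbon (%) for three horizons
profile = [(0, 20, 3.0), (20, 50, 1.5), (50, 120, 0.5)]
harmonized = harmonize(profile)
for top, bottom, mean in harmonized:
    print(f"{top}-{bottom} cm: {mean}")
```

In this example the 15-30 cm interval straddles the first two horizons, so its value is a blend of both. Blending across horizon boundaries is what makes profiles with different horizon depths comparable.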

Outputs: Key Developments Since 2003

• From Classes to Properties

– Non-disaggregated soil maps

• Weighted averages by polygon

by soil property and depth

– Calling version 0.5

– Disaggregated Soil Class Maps

• Estimate soil property values at

every grid cell location & depth

– Based on weighted likelihood

value of occurrence of each of

n soils times property value for

that soil at that depth

– Likelihood value can come

from various methods

Source: Sun et al, 2010

Source: Hempel et al., 2011
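The weighted-likelihood calculation described for disaggregated soil class maps can be sketched as below for one grid cell and one depth. The series names, likelihoods and property values are hypothetical, and dividing by the sum of the likelihoods is an assumption made so that the weights need not add up to exactly 1.

```python
def property_estimate(likelihoods, class_values):
    """Estimate a soil property at one grid cell and depth as the
    likelihood-weighted average of the property values of the
    candidate soils.

    likelihoods:  dict of soil -> likelihood of occurrence at the cell.
    class_values: dict of soil -> property value for that soil at that depth.
    """
    total = sum(likelihoods.values())
    weighted = sum(likelihoods[s] * class_values[s] for s in likelihoods)
    return weighted / total   # normalise in case likelihoods do not sum to 1

# Hypothetical cell: three candidate soils and their clay content (%)
cell_likelihoods = {"Series A": 0.6, "Series B": 0.3, "Series C": 0.1}
clay_percent = {"Series A": 20.0, "Series B": 35.0, "Series C": 50.0}

clay_estimate = property_estimate(cell_likelihoods, clay_percent)
print(round(clay_estimate, 2))  # 27.5
```

Because the likelihood values can come from various methods (fuzzy membership, tree-based probabilities, map unit composition), the same function applies whichever method produced them.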

Outputs: Key Developments Since 2003

• From Classes to Properties

– Disaggregated Soil Class Maps

• Estimate soil property values at

every grid cell location

Source: Zhu et al., 1997

Recent Models

Recent Examples of Predictions of

Soil Class Maps

Predicting Area-Class Soil Maps

Source: Grinand et al., 2008

Clovis Grinand, Dominique Arrouays,

Bertrand Laroche, and Manuel Pascal Martin.

Extrapolating regional soil landscapes from an

existing soil map: Sampling intensity,

validation procedures, and integration of

spatial context. Geoderma 143, 180-190

Recent Knowledge-Based Classification In

Africa, Multi-scale, Hierarchical Landforms

Source: Park et al, 2004

Elevation + Slope + UPA + Catena (2 km support)

SOTER Soil and landforms (1:1 million – 1.5 million)

[Figure: workflow in three stages. Training data: DEM (processed with TAPES-G, TOPAZ and LandMapR), point data, detailed soil maps, covariates and expert knowledge. Modelling: NETICA. Outputs: predicted soil series and accuracy assessment.]

Digital Soil Mapping

in England & Wales

using Legacy Data

Source: Mayr, 2010

Source: Sun et al., 2010


Multiple Regression Trees (100 x)

Prepare a database and tables of mapping units & soil series, and covariates

Select 1/n of the points systematically (n=100)

Sample soil series randomly from the multinomial distribution of mapping unit composites

Construct decision tree

Predict soil series at all pixels

Calculate the soil series statistics based on the n predictions for each pixel

Calculate the probability for each soil series

Generate soil series maps

Repeat n times

Used See5 (RuleQuest Research, 2009)

Source: Sun et al., 2010
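The repeat-n-times workflow above can be sketched in outline. This is not the See5 workflow of Sun et al., 2010: the decision tree is replaced by a trivial lookup model (most common sampled series per covariate class), the systematic 1/n subsampling of points is omitted, and all names and data are hypothetical. What it keeps is the core idea of sampling series from map unit composition, refitting and converting n predictions per pixel into probabilities.

```python
import random
from collections import Counter, defaultdict

def sample_series(composition, rng):
    """Draw one soil series from a map unit's composition
    (dict of series -> percent of the map unit)."""
    names = list(composition)
    return rng.choices(names, weights=[composition[n] for n in names], k=1)[0]

def fit_lookup(training):
    """STAND-IN for the decision tree: for each covariate class,
    remember the most common sampled series."""
    by_class = defaultdict(Counter)
    for covariate_class, series in training:
        by_class[covariate_class][series] += 1
    return {k: c.most_common(1)[0][0] for k, c in by_class.items()}

def realisations(points, map_units, pixels, n=100, seed=0):
    """Repeat sampling, fitting and prediction n times and return, for
    each pixel, the probability of each predicted soil series.

    points:    list of (covariate_class, map_unit_id) training points.
    map_units: dict of map_unit_id -> composition.
    pixels:    dict of pixel_id -> covariate_class.
    """
    rng = random.Random(seed)
    votes = {pid: Counter() for pid in pixels}
    for _ in range(n):
        training = [(cov, sample_series(map_units[mu], rng))
                    for cov, mu in points]
        model = fit_lookup(training)
        for pid, cov in pixels.items():
            votes[pid][model[cov]] += 1
    return {pid: {s: c / n for s, c in v.items()} for pid, v in votes.items()}

# Hypothetical map units, training points and pixels
map_units = {"MU1": {"A": 100}, "MU2": {"B": 70, "C": 30}}
points = [("ridge", "MU1")] * 5 + [("slope", "MU2")] * 5
pixels = {"p1": "ridge", "p2": "slope"}

probs = realisations(points, map_units, pixels, n=100, seed=1)
print(probs)
```

The most likely series map and its associated probability then follow by taking, for each pixel, the series with the largest probability.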


Multiple Regression Trees (100 x)

A closer look at the junction point in the middle of 4 combined maps,

(a) the original map units, and

(b) the most likely soil series map and its associated probability.

The length of the image is approximately 14 km.

Legend

monr_comppct

Value

High : 100

Low : 7

(a)

(b)

Recent Models

Recent Examples of Predictions of

Continuous Soil Property Maps


Kriging & RK

• Hengl et al., 2004

– Comparison of topsoil

thickness by four

different methods

• a) Point locations

• b) Soil Map only

• c) Ordinary Kriging

• d) Plain Regression

• e) Regression-kriging

– Evidence supports RK


300 soil point data

Assemble

field data

Source: Minasny et al., 2010

Recent Example: Regression-Kriging

(scorpan + ε)

Assemble covariates for



(scorpan + ε)


Linear Model

OC = f(x) + e

Predictors
• Elevation
• Aspect
• Landsat band 6
• NDVI
• Land-use
• Soil-Landscape Unit

Perform regression to

build a predictive model


(scorpan + ε)


Predict both

property value

and standard

error over the

entire area


(scorpan + ε)


Fit a variogram to the

residuals


(scorpan + ε)


Krige the residuals


(scorpan + ε)


Linear Model + Residuals = Final Prediction

Add interpolated

residuals to the

prediction from

regression


(scorpan + ε)


(Std. err. of regression)^2 + (Std. err. of kriging)^2 = Total Variance

Total std. err. = (Total Variance)^(1/2)

Add regression variance

and kriging variance to

get total variance
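The last two steps of the regression-kriging example, adding the kriged residual to the regression prediction and adding the two variances, can be written out for a single grid cell. The function name and the numbers are hypothetical. Summing the variances follows the slide and treats the regression error and the kriging error as independent.

```python
import math

def regression_kriging_cell(trend, trend_se, kriged_residual, kriging_se):
    """Combine the deterministic and stochastic parts for one grid cell.

    trend:           value predicted by the regression model.
    trend_se:        standard error of the regression prediction.
    kriged_residual: residual interpolated by kriging at this cell.
    kriging_se:      standard error of the kriged residual.
    Returns (final prediction, total standard error).
    """
    prediction = trend + kriged_residual
    total_variance = trend_se ** 2 + kriging_se ** 2
    return prediction, math.sqrt(total_variance)

# Hypothetical cell values
prediction, total_se = regression_kriging_cell(
    trend=60.0, trend_se=6.0, kriged_residual=4.0, kriging_se=8.0)
print(prediction, total_se)  # 64.0 10.0
```

In terms of the universal model, `trend` plays the role of Z*(s), `kriged_residual` the role of ε(s), and the total standard error reports what remains unexplained.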


(scorpan + ε)


[Figure: final C map (Mg C/ha, legend from 15 to 95). Summary statistics: Mean 64.0, Min 27.0, Max 87.9, CV% 18.4, RMSE 9.8, RI (%) 19.7]

Regression model: C = 100 - 1.2 EC - 5.2 REF - 0.6 REF^2 - 2.1 EL

[Figure: workflow from C predicted for sampled locations and C predicted for all grid locations, through kriging of the residuals, to the final map]



Hybrid Bayesian Analysis

Source: Mayr et al., 2010

Future Trends

Personal View of Likely Future DSM

Development

(Post 2012)

The Future: Let's Go Back and Talk About

the Universal Model of Variation Again

Source: Heuvelink et al., 2004

Deterministic part of


Stochastic part of the

predictive model

Lots of things qualify

as regression!

Regression just

means minimizing

variance

What is all this talk

about optimization?

Z(s) = Z*(s) + ε(s) + ε

The Future: A Conceptual Framework for

GSIF – A Global Soil Information Facility


Collaborative and open modelling on an interactive, web-based server-side platform

Collaborative and

open production,

assembly and sharing

of covariate data

(World Grids)

Collaborative and

open collection,

input and sharing of

geo-registered field

evidence

(Open Soil Profiles)

Maps we can all contribute to, access, use, modify and

update, continuously and transparently

Everything is

accessible,

transparent and

repeatable

The Future: Functionality for GSIF – A

Global Soil Information Facility


Possibility of making

use of existing

legacy soil maps

(even new soil maps)

needed for soil

prediction anywhere

Possibility of

rescuing, sharing,

harmonizing and

archiving soil

profile point data

needed for soil

prediction anywhere

Possibility to

develop and use

global models (even

for local mapping)

Possibility to

develop and use

multi-scale and

multi-resolution

hierarchical models

Possibility to assess

error and correct for

it everywhere

The Future: Conceptual Framework for

GSIF – World Soil Profiles


The Future: Implemented Framework for

GSIF – World Soil Profiles

Source: www.worldsoilprofiles.org

The Future: Conceptual Framework for

GSIF – World Grids



GSIF – World Grids

Source: www.worldgrids.org


GSIF

The Future: Collaborative Global, Multi-

Scale Mapping through GSIF


Possibility for combining Top-Down and Bottom-up mapping through weighted averaging of 2 or more sets of predictions
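One possible way to combine top-down (global) and bottom-up (local) predictions by weighted averaging is sketched below. The presentation does not say how the weights would be chosen. Inverse-variance weighting is an assumption made here for illustration, as are the function name and the numbers, and the combined variance shown holds only if the prediction errors are independent.

```python
def combine_predictions(predictions):
    """Weighted average of two or more predictions for the same cell.

    predictions: list of (value, variance) pairs, one per model.
    ASSUMPTION: weights are proportional to 1 / variance, so the more
    certain prediction counts for more.
    Returns (combined value, combined variance).
    """
    weights = [1.0 / variance for _, variance in predictions]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, predictions)) / total
    return value, 1.0 / total

# Hypothetical cell: a coarse global model and a finer local model
global_prediction = (30.0, 16.0)   # (value, variance)
local_prediction = (24.0, 4.0)

combined_value, combined_variance = combine_predictions(
    [global_prediction, local_prediction])
print(round(combined_value, 2), round(combined_variance, 2))  # 25.2 3.2
```

With weights of this kind, a global model dominates where local data are sparse and gives way to the local model where local evidence is strong.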

Possibility to

develop and use

global models (even

for local mapping)

The Future: Global, Multi-Scale Modeling

of Soil Properties through GSIF


Possibility to

develop and use

global models (even

for local mapping)

Possibility to

develop and use

multi-scale and

multi-resolution

hierarchical models



Global Models

inform and

improve local

mapping

• Global DSM Models

– Make use of ALL data

• From everywhere in

the world

– Provide initial coarse

local predictions

• That can be refined

and improved with:

– More & finer local data

– Local model runs

Source: Hengl personal communication, 2013




Global Models

inform and

improve local

mapping




Anyone can

access and

display the

maps




Slide credit: Tom Hengl,

2011

With Google

Earth everyone

has a GIS to

view free soil

maps and data

The Future: Collaborative Global, Multi-

Scale Mapping through GSIF


A Global

Collaboratory!

Working together

we can map the

world one tile at a

time!

The next generation

of soil surveyors is

everyone!

The Future: From Mapping to

Continuously Updated Modelling

Possibility to move from single

snapshot mapping of static soil

properties to continuous update and

improvement of maps of both static and

dynamic properties within a structured

and consistent framework.
