SPATIAL INTERPOLATION TECHNIQUES

Interpolation refers to the process of estimating the unknown data values for specific locations using the known data values for other points. In many instances we may wish to model a feature as a continuous field (i.e. a 'surface'), yet we only have data values for a finite number of points. It therefore becomes necessary to interpolate (i.e. estimate) the values for the intervening points. For example, we may have measurements of the depth of a particular geological stratum from a number of bore holes, but if we want to model the stratum in 3 dimensions then we need to estimate its depth for places where we do not have bore hole information.

Interpolation may also be required in other situations. For example, we might want to convert from a raster with a particular cell size or orientation to a raster with a different cell size or orientation. This procedure is known as convolution. Alternatively we might want to convert from a raster data model to a TIN, or vice versa. Interpolation would again be required.

When interpolating from a sample of points we would normally express the estimated values of the intervening points in the same units as were used for the measurements at the sample points, but sometimes we may be more interested in the probability that a certain value is exceeded or that the interpolated value is within a certain range.

Most interpolation methods can be divided into two main types called global and local. Global interpolators use all the available data to provide estimates for the points with unknown values; local interpolators use only the information in the vicinity of the point being estimated. Global interpolators are often used to remove the effects of major trends before using local interpolators to analyse the residuals. Kriging is a particular type of local interpolation using more advanced geostatistical techniques.

Interpolation methods may also be classified as exact or inexact. Using exact interpolation, the predicted values at the points for which the data values are known will be the known values; inexact interpolation methods remove this constraint (i.e. the observed data values and the interpolated values for a given point are not necessarily the same).
Inexact interpolation may produce a smoother (and arguably more plausible) surface.

Interpolation methods may be either deterministic or stochastic. Deterministic methods provide no indication of the extent of possible errors, whereas stochastic methods provide probabilistic estimates.

We will begin by looking at some of the more common forms of spatial sampling; then we will look at global and then local interpolators, paying particular attention to a geostatistical technique known as kriging (a local, exact and stochastic method).

To make the discussion more concrete we will make occasional references to the example discussed by Burrough and McDonnell in which the objective is to estimate the concentration of zinc in the soils in a region in the southern Netherlands, using data measured at 98 sample points. Apart from zinc concentrations, data were recorded at the sample points on other easily measured variables such as elevation, distance from the river Maas, frequency of flooding, and soil type. The data are listed in Appendix 3 of Burrough and McDonnell. Burrough and McDonnell compare the results of different techniques using these data. If you wish to experiment with the data yourself, they may be downloaded in Excel, dBase and SPSS formats using the Gs9.bat option from the icon for this module on the desktop.

SPATIAL SAMPLING

Interpolation involves making estimates of the value of an attribute variable at points for which we have no information using the data for a sample of points where we do have information. In general the more sample points we have the better. Likewise it is best to have a good spread of sample points across the study area. The spatial arrangement of these points can take different forms (see figure overleaf). Regular sampling guarantees a good spread of points, but it can result in biases if the attribute to be mapped has regular fluctuations (e.g. soil moisture where there are regularly spaced drains). It is therefore preferable to have some form of random sampling, where the co-ordinates of the points are selected using random numbers. Pure random sampling tends to produce a pattern with little clusters in some areas and a sparse coverage in other areas, so a
stratified random sample in which points are randomly allocated within a regularly spaced lattice provides a good compromise.

The other three types shown overleaf are for special purposes: cluster (or nested) sampling provides detailed information on selected areas and is sometimes used to examine spatial variation at different scales (e.g. you could compare the amount of variation within a cluster with the variation between clusters); transect sampling is used to survey profiles; and contour sampling is sometimes used to sample printed maps to make a DEM (although, as we shall see next day, this can be problematic).

The area or volume of the sample at each sample point upon which measurements are made is referred to as the support. For example, if measuring the mineral content of soils, the support would be the amount of soil (e.g. a block of soil with dimensions given in cm) which is taken for analysis at each location. In human geography, the support might be an irregularly sized and shaped area (e.g. an ED).

GLOBAL INTERPOLATORS

There are two broad approaches to global interpolation. One uses classification techniques to infer the values of one variable attribute based upon a knowledge of the values of another attribute. The other uses regression techniques to infer the value of the variable of interest based upon a knowledge of attributes that are easy to measure.

Classification Techniques

This approach may be used if spatial data are very sparse, although regression techniques (see below) would generally be preferred if there are sufficient data. Classification techniques are global, inexact and deterministic. The basic assumption is that the value of the variable of interest is strongly influenced by another variable which can be used to classify the study area into zones. Given that the main source of the heavy metals in the soils in our case study area is flooding by the river Maas, it seems plausible that the zinc levels are a function of the frequency of flooding. The study area can be divided into three zones based upon the frequency of flooding. For each point in the study area, the zinc concentrations can be thought of as comprising three components:

$$z(x_0) = \mu + \alpha_k + \varepsilon$$
where $z$ is the zinc concentration at location $x_0$, $\mu$ is the overall mean zinc concentration, $\alpha_k$ is the mean zinc concentration in flooding class k (relative to the overall mean), and $\varepsilon$ is a random noise factor (which governs the variations within each class). The best estimate of the zinc concentrations for each point in flood class k (i.e. $\mu + \alpha_k$) can be calculated by taking the mean of the values for the sample points falling within flood class k. The actual (but unknown) value at each point to be interpolated will vary from the estimated value by a random amount determined by $\varepsilon$.

Standard analysis of variance tests can be conducted to gauge the usefulness of the classification. If the F ratio (defined as the ratio of the between class variance to the pooled within class variance) based on the information for the sample points is low, then the classification is weak.

The approach makes a number of basic assumptions:
1. Variations in the value of z within a flooding class are random (i.e. there is no spatial autocorrelation).
2. The variance of the noise is the same for all flooding classes.
3. The zinc concentrations are normally distributed.
4. Spatial changes in zinc concentrations occur in steps at the boundaries of the flooding classes.

Most of these assumptions probably do not hold. If the data are non-normal, they can be transformed to make them more normal, but there is little that can be done about the other assumptions.

Trend Surface Analysis

Trend surface analysis is global, inexact and deterministic. A trend surface can be thought of as a high order three dimensional regression surface. To understand this, let us begin with a simple situation in which we can model the data values along a transect using a simple regression model:

$$z(x) = b_0 + b_1 x + \varepsilon$$

where $z(x)$ is the zinc concentration at location x, $b_0$ is the intercept (i.e. the value of z when x = 0), $b_1$ is the slope or gradient, and $\varepsilon$ is the residual (error term or noise). In some cases the data values cannot be adequately summarised as a linear function, in which case a higher order polynomial may provide a better summary.
For example, a second order polynomial (or quadratic) equation might provide a better fit:

$$z(x) = b_0 + b_1 x + b_2 x^2 + \varepsilon$$

Trend surfaces are similar except, instead of having data values along a transect, the sample points would be in two dimensions (measured by x and y co-ordinates) with the attribute values modelled as a third dimension. A first order trend surface (analogous to a simple regression line) is an inclined plane with the formula:

$$z(x, y) = b_0 + b_1 x + b_2 y + \varepsilon$$

A second order trend surface would be an undulating surface with the formula:

$$z(x, y) = b_0 + b_1 x + b_2 y + b_3 x^2 + b_4 xy + b_5 y^2 + \varepsilon$$

It should be noted that this model includes a cross product term (i.e. $b_4 xy$). Higher order trend surfaces not only include even higher powers of x and y, but also more cross product terms. A third order trend surface has a total of 10 terms.

Higher order trend surfaces are more convoluted than lower order trend surfaces, and provide closer fits to the observed data values. However, this does not necessarily result in more accurate predictions for the points in between. In fact, trend surfaces higher than third order tend to become counter productive. The objective in many instances, consequently, is not to get a 'perfect' fit of the observed data values using a higher order trend, but to identify areas where there are spatially autocorrelated residuals from a low order trend surface, as this may indicate the presence of locally important influences upon the variable of interest.
The values of the b terms can be easily determined using the standard regression options available in most statistical packages. The significance of a trend surface can be tested using an analysis of variance test. There is also an analysis of variance test to test whether a trend surface of a given order represents a significant improvement on a trend surface one order lower. (See Burrough and McDonnell for details.)

Regression Techniques

The trend surface techniques discussed in the previous section use only information measured on the variable of interest (i.e. zinc concentrations) and the location of the sample points, whereas the classification methods discussed earlier make use of 'external' information (i.e. frequency of flooding). A third strategy is to make use of external information using regression techniques. Regression techniques are global, inexact and stochastic.

There is no restriction on the type of external information that may be used, provided that the regression model is intuitively plausible. However, it obviously makes sense to use information which can be readily obtained. The example provided by Burrough and McDonnell models zinc concentrations using the model:

$$z(x) = b_0 + b_1 P_1 + b_2 P_2 + \varepsilon$$

where $P_1$ is distance from the river, and $P_2$ is elevation. Information on distance from the river and elevation can be readily calculated for any given point using a GIS. In the absence of a DEM from another source, the elevation data collected for the sample points can be used to construct one. Both variables can be assumed to exercise an influence upon the likelihood of flooding and therefore of zinc deposition. The regression model determines the relative importance of the two variables using other information available for the sample points.

The parameters of the regression model can be estimated using standard regression techniques. Likewise the significance of the regression can be tested using standard techniques. This type of regression, which empirically estimates the values of a variable using external information, is sometimes referred to as a transfer function.
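Both the trend surfaces and the regression model above come down to an ordinary least squares fit. The following is a minimal sketch of fitting a trend surface of a given order, not the procedure used by any particular statistical package. It assumes the sample points are held as (x, y, z) tuples; the function names (solve, trend_terms, fit_trend_surface, predict) and the sample values are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def trend_terms(order):
    """Exponent pairs (i, j) for the terms x^i y^j, in the order
    b0, b1 x, b2 y, b3 x^2, b4 xy, b5 y^2, ..."""
    return [(i, total - i) for total in range(order + 1) for i in range(total, -1, -1)]


def fit_trend_surface(samples, order=1):
    """Least squares estimates of the b terms via the normal equations."""
    terms = trend_terms(order)
    rows = [[x ** i * y ** j for i, j in terms] for x, y, _ in samples]
    z = [s[2] for s in samples]
    k = len(terms)
    xtx = [[sum(row[p] * row[q] for row in rows) for q in range(k)] for p in range(k)]
    xtz = [sum(row[p] * zi for row, zi in zip(rows, z)) for p in range(k)]
    return solve(xtx, xtz)


def predict(coeffs, order, x, y):
    """Value of the fitted trend surface at (x, y)."""
    return sum(b * x ** i * y ** j for b, (i, j) in zip(coeffs, trend_terms(order)))


# Hypothetical sample points lying exactly on the plane z = 2 + 3x - y.
samples = [(x, y, 2 + 3 * x - y) for x in range(3) for y in range(3)]
b = fit_trend_surface(samples, order=1)
print(b, predict(b, 1, 0.5, 0.5))
```

The regression model with external variables can be fitted the same way by replacing the polynomial terms with a row of the form [1, P1, P2] for each sample point. Solving the normal equations directly is adequate for a low order surface, but a statistical package would normally use a more numerically stable method.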
Other Global Interpolators

More complex mathematical techniques, such as spectral analysis or Fourier analysis, can be used to model the surface in a manner analogous to trend surface analysis. The main difference is that, instead of using polynomials, the surface is modelled as the sum of a number of sinusoidal functions with different wavelengths. These techniques generally require large amounts of data at different scales of resolution.

A note on notation: the dependent variable in the regression model could be represented as z(x,y) to maintain consistency with the trend surface formulas. However, the handout follows the notation used by Burrough and McDonnell - i.e. the dependent variable is represented as z(x), where x refers to the set of cartesian co-ordinates for a particular point. Other texts ignore the locational qualifiers completely, and simply refer to the dependent variable as z.

LOCAL INTERPOLATORS

When using global interpolators local variations tend to be dismissed as random noise. However, intuitively this does not make sense as the data values for each point often tend to be very similar to those for neighbouring points. Local interpolators therefore attempt to estimate the data values for unknown points using the known data values for points nearby.

The general procedure is to identify a lattice of points for which data values are to be estimated. For each point, the procedure involves the following steps:
1. A search area (neighbourhood) is defined around the point;
2. The sample points within the search area are identified;
3. A mathematical function is selected to model the local variation between these points;
4. The data value for the point is estimated from the function.
Different results will tend to arise depending upon the size of the search area and the type of mathematical function selected. We will discuss a few of the more common functions.

Thiessen Polygons

In this approach, Thiessen polygons (also known as Dirichlet or Voronoi diagrams) are constructed around each sample point. All points within a polygon are assumed to have the same data value as the sample point around which the polygon is constructed. This is equivalent to saying that each point has the same data value as its nearest sample point. Interpolation using Thiessen polygons would be classed as local, exact and deterministic.

Thiessen polygons are constructed by drawing lines between neighbouring points - in our case sample points. These lines form the sides of Delaunay triangles. A Delaunay triangle has the property that a circle drawn through its three corners will never contain any other sample point. If the circle contains a sample point, then the triangles need to be redrawn. To construct the Thiessen polygons, a second set of lines is then constructed to bisect the first set of lines (i.e. the edges of the Delaunay triangles) at right angles at their mid-points. The second set of lines form the boundaries of the Thiessen polygons whilst their intersections form the corners of the polygons (see note 3). The key property of a Thiessen polygon is that all points within a polygon lie closer to the point around which it was constructed than to any other point.

Using Thiessen polygons to interpolate results in sharp jumps in data values as you move from one polygon to the next, so a technique known as pycnophylactic interpolation is sometimes used to smooth the transition (see diagram below). This technique was developed by Waldo Tobler to 'blur' the boundaries in choropleth maps showing features like population density for administrative regions. The technique preserves 'volume' (i.e. the total number of people per area), but 'moves' them around within the areas to form a continuous surface. Burrough and McDonnell (pp.6-7) provide details of the mathematics.
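Because a Thiessen polygon contains exactly those locations that are closer to its sample point than to any other, the interpolated value can be found without constructing the polygons at all. The sketch below illustrates that nearest sample point rule only. It assumes sample points stored as (x, y, z) tuples; the function name thiessen_estimate and the sample values are hypothetical.

```python
import math


def thiessen_estimate(samples, x, y):
    """Return the data value of the sample point nearest to (x, y).

    This gives the same answer as locating the Thiessen polygon that
    contains (x, y) and reading off the value of its sample point.
    """
    nearest = min(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    return nearest[2]


# Hypothetical sample points: (x, y, value).
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]
print(thiessen_estimate(samples, 1.0, 1.0))
print(thiessen_estimate(samples, 9.0, 1.0))
```

A location that is equidistant from two sample points lies on a polygon boundary; this sketch simply returns whichever of the two comes first in the list.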
Note 3: Figure 5.7 in Burrough and McDonnell, purporting to show Thiessen polygons (and replicated here), is poorly constructed, resulting in some areas being allocated to the wrong polygon.

A more traditional cartographic technique, known as dasymetric mapping, uses a somewhat similar approach, except that the location of different shading categories within a polygon is based upon external information. For example, if the mean population density in an administrative area is 500 people per square mile, but it is known that exactly half the area is an uninhabited bog, then the population density in the bog area could be allocated a shading indicating 0 people per square mile, and the rest of the area could be allocated a shading indicating 1,000 people per square mile.

Weighted Moving Average Methods

This family of techniques estimates the data value for each point by calculating a distance weighted average of the points within the search radius. These techniques are local, exact and deterministic. The general formula is:

$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i)$$

where $z(x_i)$ are the data values for the n points ($x_1 \ldots x_n$) within the search radius, and $\lambda_i$ are weights to be applied to the data values for each point. One constraint is that the weights must add up to 1.0. The weights are usually some function of the distance between the point for which the estimate is being made and the sample points. The most common function is the inverse distance weighting (IDW) predictor. The above formula then becomes:

$$\hat{z}(x_j) = \frac{\sum_{i=1}^{n} z(x_i) \, d_{ij}^{-r}}{\sum_{i=1}^{n} d_{ij}^{-r}}$$

where j represents the point whose value is being interpolated, $d_{ij}$ is the distance from point j to sample point i, and r is an arbitrary value which can be selected by the investigator. If r is set equal to 1, then this becomes a simple linear interpolator. The r value is frequently set to 2, in which case the influence of each sample point is inversely proportional to the square of its distance from the point to be interpolated. However, r can be set to higher values if required. High values of r give even more weight to the nearer sample points.

If the point to be interpolated corresponds exactly with one of the sample points, $d_{ij}$ would be zero and the sample point would be assigned an infinite weight, leaving the estimate undefined. This is obviously undesirable, so in such situations the point to be interpolated is simply assigned the same value as the sample point.

One problem with this particular technique is that isolated data points tend to produce 'duck-egg' patterns. One solution might be to increase the search radius, but this may have the undesirable effect of over-smoothing other parts of the surface. Another variant is to base the interpolation at each point on a fixed number of sample points. The search radius will then be small in areas where there is a high density of sample points, but larger where there is a lower density of points.
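The inverse distance weighting predictor described above can be sketched in a few lines. This is an illustration under stated assumptions rather than a reference implementation: sample points are assumed to be (x, y, z) tuples, and the function name idw, the optional radius argument and the sample values are hypothetical.

```python
import math


def idw(samples, x, y, r=2.0, radius=None):
    """Inverse distance weighted estimate at (x, y).

    r is the distance exponent. If radius is given, only sample points
    within that search radius are used.
    """
    numerator = 0.0
    denominator = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(xi - x, yi - y)
        if d == 0.0:
            # The point coincides with a sample point: use its value directly.
            return zi
        if radius is not None and d > radius:
            continue
        weight = d ** -r
        numerator += weight * zi
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no sample points within the search radius")
    return numerator / denominator


# Hypothetical sample points: (x, y, value).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(samples, 1.0, 0.0))        # midway between the two sample points
print(idw(samples, 0.5, 0.0, r=2))   # closer to the first sample point
```

Dividing by the sum of the weights is what makes the weights add up to 1.0, so the estimate always lies between the smallest and largest values of the sample points used.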
If it is believed that there may be a directional bias, the search window could even be divided into, say, 4 quadrants and its radius might be increased until there was a minimum of a specified number of points in each quadrant.

Splines

Before computers, cartographers used flexible rulers called splines to fit smooth curves through a number of fixed points. Spline functions are the mathematical equivalents of these flexible rulers. Splines are piece-wise functions - i.e. they consist of a number of sections, each of which is exactly fitted to a small number of points, in such a way that the sections join up at points referred to as break points. One advantage of piece-wise functions is that if there is a change in the data value at a point, then it is only necessary to make a local adjustment (i.e. to the relevant section) whereas, in contrast, the whole surface would need to be completely recalculated if using a trend surface. Splines may be classed as local, exact and deterministic.

The splines are normally fitted using low order polynomials (i.e. second or third order) constrained to join up. The splines may be two dimensional (e.g. if smoothing a contour line) or three dimensional (if modelling a surface). Third order three dimensional splines are frequently used. These are sometimes referred to as bicubic splines.
Most practical applications use a special type of spline called a B-spline. These are described by Burrough and McDonnell as 'themselves the sums of other splines that by definition have the value of zero outside the interval of interest' (p.9).

The results of applying a spline are dependent upon various decisions taken by the investigator. These include the order of polynomial selected, the number of break points, and also whether the break points are selected to correspond with sample points or at points in between.

It is also possible to relax the requirement that the spline should fit the sample points exactly. Some exact splines can produce excessively high or low values. Thin plate splines (an inexact alternative) are therefore sometimes used to smooth the surface, subject to the constraint that the differences between the observed data values and those given by the function are minimised.

KRIGING

Kriging is a form of local interpolation using geostatistical methods developed by a French geostatistician called Georges Matheron and a South African mining engineer called D.G. Krige. Kriging is local, exact and stochastic. Whilst much more complex than the methods discussed above, it provides a number of advantages:
1. Given sufficient data, kriging produces better estimates than the other methods because the method takes explicit account of the effects of random noise.
2. Although the investigator can choose between different types of kriging, kriging is less susceptible to arbitrary decisions taken by the investigator (e.g. search distance, number of sample points to use, location of break points, etc.). Kriging identifies the optimal interpolation weights and search radius.
3. Kriging provides an indication of the reliability of the estimates.

Kriging begins with the recognition that the spatial variation of any continuous attribute is usually too complex to be modelled by a simple, smooth mathematical function. So, instead, it is modelled as a stochastic surface or random field. Regionalized variable theory assumes that the spatial variation of any variable can be expressed as the sum of three components:
1. A structural component having a constant mean or trend;
2. A random, but spatially correlated component, known as the regionalized variable; and
3. Spatially uncorrelated random noise, i.e. a residual component.

The value of a random variable Z at x is given as:

$$Z(x) = m(x) + \varepsilon'(x) + \varepsilon''$$

where $m(x)$ is a structural function describing the structural component, $\varepsilon'(x)$ is the stochastic but spatially autocorrelated residuals from $m(x)$ (i.e. the regionalized variable), and $\varepsilon''$ is random noise having a normal distribution with a mean of 0 and a variance $\sigma^2$.

The first step is to decide on a suitable function for $m(x)$. In the simplest case this can be thought of as a flat surface with no trend. The mean value of $m(x)$ is the mean value within the sample area. This means the expected difference in the values for two points x and x+h (where h is the distance between the points) is zero, i.e.

$$E[Z(x) - Z(x+h)] = 0$$

It is also assumed that the variance of the differences is a function of the distance between the points (i.e. they are spatially autocorrelated, meaning that near points are more likely to have similar values, or smaller differences, than distant points), i.e.

$$E[\{Z(x) - Z(x+h)\}^2] = E[\{\varepsilon'(x) - \varepsilon'(x+h)\}^2] = 2\gamma(h)$$

where $\gamma(h)$ is known as the semivariance. Under these two assumptions (i.e. stationarity of difference and stationarity in the variance of differences), the original model can be expressed as:

$$Z(x) = m(x) + \gamma(h) + \varepsilon''$$

The semivariance can be estimated from the sample data using the formula:
$$\hat{\gamma}(h) = \frac{1}{2n} \sum_{i=1}^{n} \{z(x_i) - z(x_i + h)\}^2$$

where n is the number of pairs of sample points separated by distance h.

The semivariance can be calculated for different values of h. A plot of the calculated semivariance values against h is referred to as an experimental variogram. Experimental variograms typically have a number of characteristic features, as illustrated in the diagram:
1. The variance is small for low values of h, but increases as the value of h gets larger. However, beyond a certain point the graph levels off to form what is known as the sill.
2. The distance at which the graph levels off is known as the range. At distances less than the range, points closer together are more likely to have similar values than points further apart. At distances larger than the range, points do not exert an influence upon one another. The range therefore provides an indication of how large the search radius needs to be when doing a distance weighted interpolation.
3. The fitted model does not pass through the origin - i.e. according to the graph the semivariance when h is zero has a positive value referred to as the nugget. However, one would expect the variance to be zero (i.e. one would expect the difference between points and themselves to be zero). The nugget provides an indication of the amount of non-spatially autocorrelated noise (i.e. $\varepsilon''$).

The semivariance depicted in an experimental variogram must be modelled by a mathematical function taking the form $\gamma(h) = \ldots$ Different models may be used, each with a different formula. The choice between these models is determined by the shape of the experimental variogram. The figure below shows some of the more commonly used models. A spherical model (a) is used when the variogram has the 'classic' shape, an exponential model (b) is used when the approach to the sill is more gradual, a Gaussian model (d) may provide a good fit when the nugget is small and the variation is very smooth, and a linear model (c) may be the most appropriate when there is no sill within the study area. Fitting the most appropriate model to the data requires a lot of skill.

Other variogram shapes may indicate that a different course of action is required. For example:
1. A variogram that becomes increasingly steep with larger values of h indicates that there is a trend in the data that should be modelled separately.
2. If the nugget variance is large and the variogram shows no tendency to diminish with smaller values of h, then interpolation is not really sensible. The best estimate of z(x) is simply the overall mean of the sample points.
3. A noisy variogram showing no particular pattern may indicate that there are too few sample points.
4. If the range is smaller than the distance between sample points, then the sample points are too far apart to influence one another. The best estimate of z(x) is again the overall mean of the sample points.
5. If the variogram dips at distances further than the range to create a 'hole effect', then the study area may be too small to capture some long wave-length variation in the data.
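The experimental variogram itself is straightforward to compute from the sample points. The sketch below groups pairs of points into distance classes (lags) and applies the semivariance formula, one over 2n times the sum of squared differences, to each class. It assumes sample points stored as (x, y, z) tuples; the function name experimental_variogram, the lag_width and max_lag parameters and the sample values are hypothetical, and the way pairs are binned to the nearest lag is one simple choice among several.

```python
import math
from collections import defaultdict


def experimental_variogram(samples, lag_width, max_lag):
    """Return a list of (lag distance, semivariance, number of pairs).

    Each pair of sample points is assigned to the nearest multiple of
    lag_width. Pairs further apart than max_lag are ignored.
    """
    squared_diffs = defaultdict(float)
    pairs = defaultdict(int)
    for a in range(len(samples)):
        xa, ya, za = samples[a]
        for b in range(a + 1, len(samples)):
            xb, yb, zb = samples[b]
            h = math.hypot(xa - xb, ya - yb)
            if h > max_lag:
                continue
            k = int(h / lag_width + 0.5)  # index of the nearest lag
            squared_diffs[k] += (za - zb) ** 2
            pairs[k] += 1
    return [(k * lag_width, squared_diffs[k] / (2 * pairs[k]), pairs[k])
            for k in sorted(pairs)]


# Hypothetical transect with a steady linear trend in the data values.
transect = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 2.0), (3.0, 0.0, 3.0)]
for lag, semivariance, n in experimental_variogram(transect, 1.0, 3.0):
    print(lag, semivariance, n)
```

In this made-up transect the semivariance keeps rising ever more steeply with h and never reaches a sill, which is the signature of an unmodelled trend described in point 1 of the list above.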
More complex variogram models can be developed where it is felt they are required. For example:
1. Anisotropic models may be developed if it is believed that there may be directional effects (i.e. that the relationship between the semivariance and distance is different in different directions). This essentially involves calculating a different experimental variogram for each direction. These can be shown as a contour 'map' rather than a graph.
2. A complex variogram may be required if it is believed that the total variation is the sum of the effects of two or more regionalized variables.
3. Different variograms may be required for different cover classes (e.g. rock types, land covers, etc.). In such situations a separate variogram should be calculated for each cover class.

Once the variogram has been modelled, the next step is to determine the weights $\lambda_i$ required for local interpolation. The weights for each point should sum to 1.0. They are also selected so that the estimated data value for the point is unbiased, and the variance of the estimation is minimised - i.e. the estimate is a best linear unbiased estimate (BLUE). It can be shown that the estimation variance $\sigma_e^2$ is minimised when

$$\sum_{i=1}^{n} \lambda_i \, \gamma(x_i, x_j) + \phi = \gamma(x_j, x_0) \quad \text{for all } j$$

where n is the number of sample points used to interpolate the data value for point $x_0$ and $\phi$ is a Lagrange multiplier (which ensures that the $\lambda_i$ values add up to 1.0). This equation can be solved by substituting the estimated values of $\gamma$ from the variogram model. Burrough and McDonnell provide a worked example on pp. 4-4.

Finally, the data value for each point can be estimated by putting the weights into the formula:

$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i)$$

The estimation variance $\sigma_e^2$ (also known as the kriging variance) for each point can be calculated using the formula:

$$\hat{\sigma}_e^2 = \sum_{i=1}^{n} \lambda_i \, \gamma(x_i, x_0) + \phi$$

The estimation variance $\sigma_e^2$ (or more usually the kriging standard deviation $\sigma_e$) can be mapped. This provides an indication of the reliability of the estimate at each location.
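The kriging equations above form a small system of linear equations, with one row per sample point plus one row for the constraint that the weights sum to 1.0. The sketch below solves that system for a single location. It is an illustration under stated assumptions, not a substitute for a geostatistics package: the variogram is assumed to have been fitted already and is passed in as a function gamma, a spherical model with made-up nugget, sill and range is used as the example, and all function names and sample values are hypothetical.

```python
import math


def spherical(h, nugget, sill, rng):
    """Spherical variogram model; sill is the rise above the nugget."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return nugget + sill
    ratio = h / rng
    return nugget + sill * (1.5 * ratio - 0.5 * ratio ** 3)


def solve(a, b):
    """Solve the linear system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ordinary_kriging(samples, x0, y0, gamma):
    """Return (estimate, estimation variance, weights) at (x0, y0).

    Assumes the sample points are at distinct locations.
    """
    n = len(samples)
    # One row per sample point j: sum_i lambda_i * gamma(x_i, x_j) + phi
    a = [[gamma(math.hypot(xi - xj, yi - yj)) for xi, yi, _ in samples] + [1.0]
         for xj, yj, _ in samples]
    a.append([1.0] * n + [0.0])  # the weights must add up to 1.0
    b = [gamma(math.hypot(xj - x0, yj - y0)) for xj, yj, _ in samples] + [1.0]
    solution = solve(a, b)
    weights, phi = solution[:n], solution[n]
    estimate = sum(w * s[2] for w, s in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + phi
    return estimate, variance, weights


# Hypothetical variogram parameters and sample points.
gamma = lambda h: spherical(h, 0.0, 1.0, 10.0)
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(ordinary_kriging(samples, 1.0, 0.0, gamma))
```

With a zero nugget, evaluating the function at the location of a sample point returns that point's value with an estimation variance of zero, which is the sense in which kriging is an exact interpolator.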
The validity of the variogram model may also be tested for consistency by comparing the actual value for a data point with the estimated value calculated using a variogram calculated using all the other data points. This process, known as cross validation, is repeated omitting each point in turn. If the mean difference between the predicted values and actual values is close to zero with a small variance then the variogram can be assumed to be unbiased.

The above method, known as ordinary kriging, is the most common form of kriging. However, there are other variants. Block kriging is used to interpolate data values to areas larger than the support; simple kriging can be used under an assumption of second order stationarity and a known mean, but as these conditions rarely apply it is seldom used on its own; non-linear kriging can be used if the data are non-normally distributed (e.g. log-normal data); indicator kriging can be used to interpolate binary data; stratified kriging is used to interpolate data stratified by different cover classes; co-kriging can be used to make use of data different from, but correlated with, the variable to be estimated; whilst universal kriging is used to incorporate information about trends. Finally, conditional simulation techniques (or stochastic imaging) may be used to identify the most likely data value for each place (but at the expense of continuity between adjacent areas). Burrough and McDonnell provide further details on most of these methods.

REFERENCES

Burrough and McDonnell, Chapters 5 and 6.