J. J. Kennedy, 1 N. A. Rayner, 1 R. O. Smith, 2 D. E. Parker, 1 and M. Saunby Introduction

Transcription

1 Reassessig biases ad other ucertaities i sea-surface temperature observatios measured i situ sice 85, part : measuremet ad samplig ucertaities J. J. Keedy, N. A. Rayer, R. O. Smith, D. E. Parker, ad M. Sauby Abstract. New estimates of measuremet ad samplig ucertaities of gridded i situ sea-surface temperature aomalies are calculated for 85 to 6. The measuremet ucertaities accout for correlatios betwee errors i observatios made by the same ship or buoy due, for example, to miscalibratio of the thermometer. Correlatios betwee the errors icrease the estimated ucertaities o grid-box averages. I grid boxes where there are may observatios from oly a few ships or driftig buoys, this icrease ca be large. The correlatios also icrease ucertaities of regioal, hemispheric ad global averages above ad beyod the icrease arisig solely from the iflatio of the grid-box ucertaities. This is due to correlatios i the errors betwee grid boxes visited by the same ship or driftig buoy. At times whe reliable estimates ca be made, the ucertaities i global-average, souther-hemisphere ad tropical sea-surface temperature aomalies are betwee two ad three times as large as whe calculated assumig the errors are ucorrelated. Ucertaities of orther hemisphere averages approximately double. A ew estimate is also made of samplig ucertaities. They are largest i regios of high seasurface temperature variability such as the wester boudary currets ad alog the orther boudary of the Souther Ocea. The samplig ucertaities are geerally smaller i the tropics ad i the ocea gyres.. Itroductio I order to uderstad chages i sea-surface temperature (SST) at global ad regioal levels, it is importat to quatify the expected ucertaities i the observatios. A umber of studies have attempted to quatify the measuremet errors i observatios made by ships, driftig buoys ad moored buoys. Typically, studies have cosidered two separate, but related, problems. Oe is the problem of estimatig ucertaities associated with radom measuremet errors, which are assumed to be ucorrelated from oe observatio to the ext (for example Emery et al. [], Ket ad Challeor [6], Rayer et al. [6]). The secod is that of idetifyig biases i the data due to chages i the way that measuremets were take. For example, may early SST measuremets were made by collectig water samples i poorly isulated buckets. The buckets lost heat as they were hauled to the deck thus itroducig a persistet cold bias ito records of SST (e.g. Follad ad Parker [995], Smith ad Reyolds []). Part of this paper (Keedy et al. [a]) deals with estimatig biases ad their ucertaities o log space ad time scales due to chagig measuremet methods. However, the two types of errors caot be separated so easily. Although measuremets from ships exhibit characteristic average biases arisig from the methods used to make the measuremets, o two ships or buoys are idetical. Therefore, oe would expect biases that vary from ship to ship, or from driftig buoy to driftig buoy. By comparig Met Office Hadley Cetre, FitzRoy Road, Exeter, EX 3PB, UK. Ocea Physics Group, Departmet of Marie Sciece, Uiversity of Otago, Duedi, New Zealad Copyright by the America Geophysical Uio. 48-7//$9. SST observatios to SST fields output from the Met Office Numerical Weather Predictio model, Ket ad Berry [8] estimated the stadard deviatios of these microbiases, which they referred to as iter-platform errors to distiguish them from actually-radom itra-platform errors. They foud sigificat iter- ad itra-platform errors i observatios from ships, driftig buoys ad moored buoys. Keedy et al. [b] foud similar errors i comparisos of i situ observatios with measuremets made by the Alog Track Scaig Radiometer. The iter-platform ad itraplatform desigatios ca be cofusig so here we refer to micro-bias errors ad radom measuremet errors istead. I additio to measuremet ucertaities, gridded data sets of SST also icur a additioal ucertaity from estimatig area averages from a fiite umber of observatios. Rayer et al. [6] estimated combied measuremet ad samplig ucertaities from the SST data, but did ot explicitly estimate the samplig ucertaity aloe. Samplig ucertaities for lad air temperatures have bee estimated by Joes et al. [997], Broha et al. [6] ad She et al. [7]. The formalism used i these studies depeds o the average correlatios betwee poits withi a grid box. The situatio is slightly more complicated for marie data because the observatios are radomly placed i time as well as i space. Morrissey ad Greee [9] exteded the average correlatio cocept to marie data, which is the startig poit for this aalysis. I the preset paper, the estimates of micro-bias ad radom measuremet ucertaities are used to estimate the ucertaities of gridded SST fields ad o global ad regioal average SSTs for the period 85 to 6. Sectio briefly describes the data source used i the aalysis. Sectio 3 details the theoretical basis for the error model used i the subsequet sectios. Sectio 4 deals with the practical problem of estimatig ucertaities particularly whe it is ot possible to idetify idividual ships because the ships call sig, or ship ame is ot cotaied i the meteorological reports. Sectio 5 describes the results. I part of the paper Keedy et al. [a] a bias adjusted versio of the

2 Met Office Hadley Cetre SST data set is described. The data set ad ucertaity estimates together costitute versio 3 of that data set. HadSST3 rus from 85 to 6 ad icludes bias adjustmets ad more comprehesive error estimates.. Data The sea-surface temperature data for 85 to 6 come from versio.5 of the Iteratioal Comprehesive Ocea Atmosphere Data Set (ICOADS, Woodruff et al. []). ICOADS comprises meteorological measuremets origiatig from ships, oceaographic statios, moored buoys, driftig buoys ad research vessels. SST data from ICOADS were quality cotrolled ad processed accordig to the method detailed i Rayer et al. [6] to produce mothly 5 latitude 5 logitude grids. There are three pricipal types of platform measurig SST i situ: ships, driftig buoys ad moored buoys. Ships ad buoys are idetified by a uique call sig, or other idetifier. For may historical reports, however, this iformatio is abset. Where preset, call sig iformatio is recorded i ICOADS metadata ad was recorded i Global Telecommuicatio System (GTS) reports util December 7. After this date, the callsig iformatio was removed from some GTS feeds ad ecrypted o others owig to cocers about ship security. Due to the lack of public call sig iformatio it was ot possible to complete this ucertaity aalysis after 6. Most of the ship-based data i ICOADS were take by ships egaged o other busiess. The Volutary Observig Ships (VOS) were recruited ito atioal fleets ad issued with stadardised equipmet ad istructios for their use. Although coutries made a effort to stadardise equipmet ad istructios withi their fleets at ay oe time, there have always bee differeces betwee coutries cocerig best practice. The size of the VOS fleet peaked aroud 985, whe there were more tha 75 ships o the World Meteorological Orgaisatio s VOS fleet list. Numbers have declied sice, with fewer tha 4 ships remaiig o the list today ( Driftig buoys cosist of a plastic ball, approximately 3 cm i diameter, attached to a drogue. The drogue esures that the buoy remais correctly orieted ad that it drifts with the currets i the mixed layer. The SST sesor is embedded i the uderside of the buoy ad measures at a depth of approximately 5cm i calm seas. Movemet of the buoy ad the actio of waves mea that the measuremet is represetative of the upper m of the water colum (Lumpki ad Pazos [7]). The desig of driftig buoys was stadardised i the early 99s; cosequetly, measuremets from driftig buoys should be cosistet at all times ad places thereafter. I cotrast, moored buoys come i a wide variety of shapes ad sizes, from the m discus buoys to the.5m fixed buoys deployed i the North Sea. The samplig characteristics of these three platform types are quite distictive. Ships travel betwee ports makig regular observatios so the observatios from a sigle ship ca provide a represetative sample for a large area. However, most ship observatios are made i the stadard shippig laes so the coverage obtaied i this way ca be limited. Driftig buoys, as their ames suggests, drift. How far they drift depeds o the prevailig currets, but durig a moth they do ot ofte travel far. They typically take hourly observatios ad cosequetly provide dese samplig of a limited area i ay give moth. Drifters are geerally deployed to provide a quasi-uiform coverage of the oceas. Moored buoys take regular measuremets at a fixed poit, typically i coastal areas, but there are also a umber of tropical moorigs i the ope oceas. 3. Error model - theory Cosider a SST measuremet, O ij, take by ship i at poit j that has bee coverted to a aomaly from a referece climatology. This observatio is a combiatio of the true SST aomaly at that poit, T ij, a radom measuremet error, M ij, that is differet for every observatio ad a costat offset - the micro-bias error - for measuremets from that ship, B i. Here ship is used to refer to ay sigle etity - be it ship, driftig buoy, or moored buoy - that takes SST measuremets. O ij ca thus be writte as: O ij = T ij + M ij + B i () M ij has mea zero ad stadard deviatio σ mi ad is differet at each poit j. B i is draw for each ship, i, from a sample with mea zero ad stadard deviatio σ bi. A mea of zero assumes that the bias adjustmets discussed i Part of this paper have adjusted successfully for the mea bias for each measuremet type. The idices i ad j ca be used to keep track of observatios i a sigle grid box. I this istace i =,,..., m; j =,,..., i. i.e. there are m ships ad ship i takes i observatios. The total umber of observatios is. The grid-box average, G, is therefore, G = i (T ij + M ij + B i). () j= It is assumed that the Ms ad Bs are idepedet of the T s ad the Ms are idepedet of the Bs. Hece var(g) = m i i [var (T ij) + var (M ij) (3) j= +var j= i (B i)] (4) j= ad all the covariace terms betwee the differet sources of error are equal to zero. The variace of the first term i the square brackets ca be expressed (Kaga [966]; Yevjevich [97]) as, var i (T ij) = σ s [ + ( ) r] (5) j= where σs is the stadard deviatio of the SST aomalies at a fixed poit, which is assumed to be a costat i ay give grid box, ad r is the average correlatio of the SST aomalies measured at ay pair of poits withi the grid box. The Ms are idepedet of each other, as are the Bs, so we have var ad var i (M ij) = j= i (B i) = j= i var (M ij) = j= iσm i (6) i var[ (B i)] = (7) var[ ib i] = j= i σb i. (8)

3 The total variace of the grid-box average is therefore σ tot = iσm i + i σb i + σ s [ + ( ) r]. The total variace icludes a compoet represetig the variability of the true grid-box average from oe moth to the ext. This variability is give by the costat term (σs r) ad represets the climate sigal. Subtractig σs r from the total variace leaves oly those terms that together provide a estimate of the excess variace due to uder-samplig errors ad measuremet errors: σerror = iσ m i + i σb i + σ s [ r] () This formula gives the ucertaity o the grid-box average for a give set of observatios made withi that grid box ad teds to zero as ad m icrease. To better uderstad the formula it is istructive to cosider the case where each of the m ships takes the same umber ( i = /m) of observatios ad has the same radom measuremet ad micro-bias error characteristics, the the formula above reduces to σ error = σ m + σ b (9) m + σ s ( r). () The three terms correspod to the radom measuremet ucertaity, the micro-bias ucertaity ad the samplig ucertaity respectively. This is the miimum ucertaity for a give umber of observatios ad ships m ad shows that it is oly possible to reduce sigificatly the ucertaity of the grid-box average by icreasig both the umber of observatios () ad the umber of ships (m) makig those observatios. It is also iterestig to cosider the limitig cases for a give umber of observatios. The ucertaity of the grid-box average will be a miimum whe m = ad a maximum whe m =. Betwee these extremes, the ucertaity will ted to reduce with more slowly tha. The micro-bias error term will also lead to correlatios betwee the errors i two grid boxes where the same ship makes observatios i both. This will be explored further i the ext sectio. 3.. Iter-grid box error correlatios The micro-bias error, σb i, for a give ship i equatio is correlated both withi a grid box ad betwee ay two grid boxes that the ship visits. Radom measuremet ad samplig errors are ucorrelated betwee grid boxes. Therefore the covariace, C p,q, betwee two grid boxes is: C p,q p q = k p k q k σ b k N pn q () where pk ad qk are the umber of observatios from ship k i grid boxes p ad q respectively. k rus from to the umber of ships that are preset i both grid boxes. N p ad N q are the total umber of observatios i boxes p ad q. If there are o ships that made measuremets i both grid boxes the C p,q is zero. C p,p is give by the error variace o the grid-box average i Equatio. To do this calculatio it is ecessary to kow which ships took measuremets i each grid box, how may observatios were made by each ship ( pk ad qk ) ad the estimated bias error associated with that ship (σb k ). Give this iformatio, the covariace matrix ca easily be calculated durig griddig. Ufortuately, as was oted above, ot all SST observatios made by ships ca be uambiguously associated with a idividual vessel. This problem is discussed i Sectio Samplig error Samplig errors arise whe the area-average of a spatially-varyig quatity is estimated from a fiite umber of observatios. Eve if the idividual measuremet errors were zero, the mea of the observatios i the grid box would ot be equal to the true spatial-average. As discussed above, the total variace of a grid-box average of perfect observatios is give (Kaga [966]; Yevjevich [97]) by σ grid = σ s ( + r ( )). (3) The formula assumes that the variace at every poit withi the grid box is equal to σ s ad that the poits are radomly distributed. These are likely to be good approximatios for most grid boxes, but this additioal ucertaity is discussed i Sectio 4.5. The samplig ucertaity term is the part that depeds o i.e. the excess variace caused by uder samplig: σ se = σ s ( r). (4) It is possible to estimate σ s for a give grid box by calculatig the grid-box average variace for moths whe is large. σ = = σ s r (5) The value of r ca be estimated from the data as i Joes et al. [997] ad Broha et al. [6] who used the correlatios betwee grid boxes to estimate the average correlatio withi a grid box for lad statios. Lad statios geerally make two or more observatios a day every day, thus esurig perfect temporal samplig. The oly cocer the is that the measuremets made at the statio are ot represetative of the whole area of the grid box. Estimatig r is slightly more complicated for marie data because observatios are sampled radomly i both space ad time. Morrissey ad Greee [9] examied the case of SST samplig ad exteded the r cocept to iclude a time dimesio. The special case where the observatios are radomly distributed ad there is o correlatio betwee the space ad time compoets is used here. It will ted to lead to a slightly larger estimate of the samplig error tha for the case where the two are correlated. I Joes et al. [997] r spacewas estimated by calculatig a correlatio decay legth from iter-grid box correlatios. A similar method is followed here. The correlatio betwee two poits is assumed to be of the form: r = exp( x x ) (6) where x is the distace betwee two poits ad x is the characteristic legth scale. I most grid boxes a expoetial form for r gives a better fit to the data tha a gaussia form. SST data were take from 96 to, usig HadSST to exted the series from 7-. The data were detreded usig a 5-term polyomial fit to the time series of observatios i each grid box. Wisorised correlatios (Wilcox []) were calculated betwee the de-treded time series of SST from a give grid box with de-treded time series i all grid boxes withi ± of logitude ad ± of latitude. A miimum of 3 moths of coicidet data were

4 9N (a) 9N (b) 9S 8 9W 9E 8 9S 8 9W 9E N (c) 9N (d) 9S 8 9W 9E 8 9S 8 9W 9E Figure. (a) r space, (b) r time, (c) r all ad (d) σ s( r), the samplig error ( C) o a sigle observatio. required to estimate the correlatios. The value of x that miimised the RMS differece betwee the observed correlatios ad the model was calculated. The average spatial correlatio, r space, withi each grid box was estimated by choosig pairs of poits from withi the grid box, calculatig the correlatios usig equatio 6 ad takig the average. The map of r space is show i Figure. r time was calculated i a aalogous maer by fittig a expoetially decayig lag correlatio fuctio to the de-treded (usig a low-pass filter) mothly time series of SST aomalies i each grid box. The resultig field of r time was multiplied by r spaceto get the fial estimate of r all for each grid box (Figure ). Typically the average correlatios are highest i the tropics, particularly i the easter Pacific. They are lower i the regio of the wester boudary currets, o the orther boudary of the Souther Ocea ad at the edges of the field ear areas of seasoal sea ice cover. The true stadard deviatio at a poit i a give grid box, σs, was calculated from σ= = σs r. This was estimated by Rayer et al. [6] from the observed data. A map of σs( r), the samplig ucertaity of a grid box average calculated from a sigle observatio, is show i Figure. Samplig ucertaities are highest i the wester boudary currets ad alog the orther boudary of the Souther Ocea where SST ca chage rapidly ad spatial temperature gradiets are strog. Samplig ucertaities are smaller i the tropics - particularly i the Idia ad Atlatic Oceas - where the spatial ad temporal correlatios are large. I the easter tropical Pacific there is a area of higher variability i the regio of the El Niño cold togue. I the North Pacific the Kuroshio extesio icreases samplig ucertaity as far east as Hawaii. The patter ad magitude of the samplig ucertaity is similar to that calculated from sub-samplig complete satellite data (Rayer et al. [9]) The full error covariace matrix Combiig all the terms of the error covariace matrix gives C pp = iσm i + alog the diagoal, ad C pq, p q = i σb i + σ s ( r) (7) k p k q k σ b k N pn q (8) o the off-diagoal, which ca be evaluated oly whe we have call sig iformatio. Whe calculatig area-averages ad other useful quatities, it is ecessary to keep track of the ucertaities through the calculatio. For a geeric fuctio f = f(x, x...x ), where the covariace of x p ad x q is give by C p,q, the stadard expressio for the error variace of f, σf is give by σ f = p= q= ( f x p ) f C pq. (9) x q Whe f is the weighted average of grid boxes p= f = apxp () p= ap

5 where a p is the weight of grid box p ad f x p f x q = Therefore, i matrix otatio, σ f = a pa q ( l= a l) () acat (Σ l= a l) () where a is a vector of weights a = (a, a,..., a ). For calculatig area averages of the SST data, the values of a are the grid box areas, or set to zero where data are missig Ucertaities of temporal averages It is useful to kow the ucertaity of aual average values as well as mothly average values. The correlated errors that lead to o-zero covariace i the spatial fields of ucertaity will also lead to o-zero covariaces i the full space-time covariace matrix. The o-zero covariaces eed to be accouted for whe propagatig ucertaities from mothly to aual averages. There are two extreme cases: first that the mothly errors are idepedet; secod, that they are completely correlated. All other assumptios beig correct, the correct error variace will lie somewhere betwee the two correspodig estimates. It is impractical to create the covariace matrix for the whole year. For a 5 latitude 5 logitude mothly aalysis such a matrix would cotai early a billio ((7 36 ) ) elemets. Istead, the weight that each observatio takes i the global average was calculated ad the ucertaity was built up i the followig way. σ global average = (Σw ij) ( + i ( w ij) σb i (3) + j= i wijσ m i (4) j= i wijσ se ij ) (5) j= where w ij is the weight that observatio j from ship i takes i the global aual average. σ seij is the samplig error i the grid box cotaiig observatio ij. w ij depeds o how the aual average is calculated. Here, the aual global average is calculated by first takig the area-weighted average of the mothly SST fields ad the averagig the umbers to get the aual average. Therefore the weight of each observatio is w ij = area i,j gridbox moths area total,moth (6) where area i,jis the area of the grid box cotaiig observatio i, j, ad the area total,moth i the deomiator is the total area of occupied grid boxes i the field for a give moth. gridbox is the umber of observatios i the mothly grid box cotaiig observatio j from ship i ad moths is the umber of moths i the average, usually. 4. Error model - implemetatio The method was applied to data from ICOADS.5 from 85 to 6. Each year was split ito pseudomoths comprisig six five-day petads with the exceptio of August which has seve. Leap years have a extra day i the fial February petad. The data were processed followig Rayer et al. [6]. I Rayer et al. [6] the grid box averages were calculated by first takig the wisorised average of the SST aomalies withi each xxpetad grid box. The wisorised average of the xxpetad grid boxes withi each fial 5x5xmothly grid box were the calculated. Despite the averagig method differig from that assumed i the error model, equatio 9 provides a good fit to the grid box variaces. This was checked by repeatedly resamplig a grid box i the North Sea which cotais more tha observatios each moth (ot show). I this way it was possible to esure that the model fitted the data for a wide rage of ad m. The values of σm i ad σb i are take from Keedy et al. [b] for ships ad driftig buoys ad Ket ad Berry [8] for moored buoys. I Keedy et al. [b] observatios from each uique ship or driftig buoy were compared to a backgroud SST field from the Advaced Alog-Track Scaig Radiometer (AATSR) istrumet ad the differece time series were used to calculate represetative values of σm i ad σb i for ships ad driftig buoys. The differig values, show i Table, reflect the relative accuracies of ship ad buoy measuremets. Ket ad Berry [8] calculated σm i ad σb i for moored buoys by comparig their measuremets to SST fields take from the Met Office Numerical Weather Predictio system. I the followig aalysis all ships are cosidered to have the same error characteristics i.e. a fixed value of σ m ad σ b. Driftig ad moored buoys have their ow characteristic values. The values used are summarised i Table. The error model as described above assumes that it is possible to uambiguously idetify each ship, or buoy. However, this is ot always possible. Ofte the call sigs by which idividual ships ca be idetified are missig from the ICOADS reports or a geeric call sig is used istead. I other cases the ship idetifier is icomplete. For example, betwee 87 ad the 9s may observatios have call sig. May of these observatios have the same time stamp, but differet locatios so the call sig caot be a uique idetifier. The o-uique call sigs idetified were SHIP, SHIPX,, PLAT,, 58, 7, ad (i.e. o call sig). Excludig these observatios from the ucertaity calculatio would lead to a uderestimate of the ucertaities. A alterative approach might be to assume that the observatios with o-uique call sigs did ideed come from the same vessel, but this would lead to a very large overestimate of the ucertaities at those times whe may observatios caot be uiquely idetified with a particular ship. The followig sectios attempt a more reasoable estimate of the ucertaity compoet associated with the uidetifiable observatios. 4.. Grid-box average The first difficulty i implemetig the error model is i estimatig the micro-bias ucertaity term, i σb i (7) Table. Estimated radom measuremet (σ m) ad microbias (σ b ) ucertaities for ships ad driftig buoys from Keedy et al. [b] ad for moored buoys from Ket ad Berry [8]. Platform σ m( C) σ b ( C) Ship.74.7 Moorig.3. Drifter.6.9

6 whe the ships caot be idetified ad the i are therefore ukow. However, what is kow i all situatios is, the total umber of observatios i the grid box. The micro-bias term ca be estimated usig oly this iformatio. The value of Equatio 7 depeds o how the observatios are partitioed betwee the ships that made them. Assumig that the partitioig of observatios betwee ships is roughly costat for a give grid box over time, i i ca be estimated usig a fuctio of the form a b i (8) where is the total umber of observatios i the grid box. The parameters a ad b ca be estimated from those ob- for which the call sigs are kow ad for which servatios i ca therefore be evaluated. Values of a ad b were estimated for each grid box for each year usig data from alterate moths (Jauary, March, May...) for the ie year period cetred o that year. The logarithm of i was regressed o the logarithm of. log i = log(a) + b log() (9) i Ships that took hourly observatios, such as ocea weather ships, were excluded because they teded to skew i towards high values that were ot represetative of typical SST observatios. Moths where the value of i was i the upper 5% were also rejected because these were ofte domiated by a few exceptioally high values. The model parameters were verified usig the remaiig moths (February, April, Jue...). Where the correlatio of the estimate from the model with the actual value was below.5, a ad b were flagged as missig. Values of a ad b are show i Figure 4. Where observatios were available from may ships, b was close to. For example, i the case where each of m ships takes g observatios, is equal to mg ad i is equal to mg, so a = g ad b =. Where most of the observatios come from a sigle ship, b approaches. Therefore, (a) (c) i Ratio Fractio of observatios with uique callsig (b) (d) Figure. (a) Ucertaity of global mothly average SST aomaly for the error model icludig iter-grid box correlatios (grey) ad for the error model with o itergrid box correlatios (black). (b) Ratio of the two lies i pael (a). (c) as for pael (a) but icludig a estimate of the full ucertaity of the global mothly average SST aomaly (upper black lie). This is.4 times the lower black lie prior to 98 ad equal to the grey lie after 98. (d) Fractio of observatios with uique call sigs. alog the well travelled shippig laes, the observed value of b teds to be aroud. At high latitudes ad i other areas where shippig traffic is more sporadic, b teds to be higher, reflectig the fact that i these areas the grid-box average is likely to be based o observatios from a smaller umber of vessels. Whe b is close to, the value of a is geerally close to, as expected. Where it was ot possible to estimate a ad b, it was assumed that b was equal to ad a was equal to. This is the most coservative case ad reflects the fact that i areas where observatios were ot sufficietly umerous to calculate reliable estimates of a ad b, the observatios were likely to have come from a sigle ship. I the 86s there were few observatios ad most had o call sig, so may grid boxes had to be filled i this way. Oce a ad b had bee calculated, they were used to estimate the micro-bias ucertaity term thus: i σb i + a b ukowσb (3) i(id kow) All observatios with missig or geeric call sigs were assumed to come from ships ad ot from driftig or moored buoys. This iformatio ca the be combied with the samplig ad radom measuremet ucertaities, like so C pp = iσm i + ID kow i σ b i (3) + ab ukowσ b + σ s ( r) (3) alog the diagoal. 4.. Regioal average Because of the problem of missig call sigs, it is ot possible to explicitly calculate the off-diagoal elemets of Fractio of observatios with uique callsig Neff 6 4 (a) (c) (b) (d) Figure 3. (a) fractio of observatios i each moth from 85 to 6 with uique call sigs. (b) Ucertaity of the global aual average sea-surface temperature calculated from oly those observatios with uique call sigs. The upper lie shows the full error calculatio described i Sectio 4.3, ad the lower lie was calculated from the ucertaities o the mothly averages assumig that they are idepedet. (c) N eff calculated from the ucertaities show i pael (b). The horizotal lie is at eff =.5. (d) The ucertaity of the global aual average sea-surface temperature calculated usig all observatios ad assumig eff =.5.

7 9N a 994 9N b 994 9S 8 9W 9E S 8 9W 9E N a 94 9N b 94 9S 8 9W 9E S 8 9W 9E N a 884 9N b 884 9S 8 9W 9E 8 9S 8 9W 9E Figure 4. Values of a (left) ad b (right) calculated for three differet ie-year periods cetred o: 994 (top), 94 (middle) ad 894 (bottom). a ad b are dimesioless quatities. the covariace matrix (Equatio ) for all moths. To estimate the size of this effect, the off-diagoal elemets, C pq, were calculated usig oly data from ships that could be idetified uambiguously. Geeric call sigs (SHIP, SHIPX,, PLAT etc.) were ot icluded, either were observatios with o call sig. A secod estimate was made usig all the data, but assumig that all the off-diagoal elemets were set to zero. Figures (a) ad (b) show the ucertaity of the global average calculated usig these two methods ad the ratio betwee them. Also show are the fractios of observatios with uique call sigs (Figure (d)). I the 86s ad 97s there are few observatios with a uique call sig. I these cases C pq could oly be estimated i very few cases so the grey lie rus close to the black lie ad is a uderestimate of the true ucertaity. I cotrast, there are a umber of periods durig which the umber of idetified callsigs is large relative to the umber of uidetified call sigs: , 88-89, ad from aroud 98. Before 98, there is a approximately liear relatioship betwee the fractio of observatios with uique call sigs ad the ratio show i Figure (b). Extrapolatig a liear fit to the case where all observatios have uique call sigs gives a estimated ratio of.4 betwee the ucertaities based oly o the diagoal elemets ad the ucertaity from the full error model i the case whe all call sigs are kow. A estimate of the ucertaity of the global average was made by multiplyig the series of ucertaities based oly o the diagoal elemets by.4. After 98, the estimates from the full error model were used because the umber of o-uique call sigs is relatively small ad the samplig characteristics chaged markedly with the itroductio of driftig ad moored buoys. The resultig combied series is show i the upper black lie of Figure (c) which is.4 times the lower black lie util 98. The multiplier is differet for each regio, reflectig the ature of the local shippig ad observatioal coverage. Prior to 98, a factor of.4 was used for the globe ad souther hemisphere ad a factor of. was used for the tropics. A multiplier of. was more appropriate for the better observed orther hemisphere, agai, prior to Temporal average The calculatio of the ucertaity of a aual average, σ aual, is oce agai complicated by the presece of missig call sigs. This is dealt with i a similar way to the regioal

8 averages, by comparig the aual ad mothly ucertaities calculated from observatios for which the call sigs are all kow. There are two limitig cases σ mothly σ σ mothly aual (33) where σmothly are the error variaces of the mothly averages. I the first case the mothly ucertaities are assumed to be idepedet. I the latter case, they are assumed to be perfectly correlated. I practice, the truth will lie somewhere betwee these extremes. A effective umber of moths, eff, was defied such that σ σaual mothly = (34) eff To estimate eff, Equatio 5 was used to estimate the ucertaity o the global aual average, σaual, calculated oly from those observatios with uique callsigs. The sum of the error variaces of the mothly global averages was calculated from oly those observatios with uique call sigs ad divided by twelve times the error variace o the aual average to get eff. The results are show i Figure 3. Figure 3(b) shows the ucertaity of the aual average calculated oly for those observatios with call sigs ad the lower lie shows the ucertaity o the aual average assumig that the mothly errors are idepedet. Pael (c) shows the calculated value of eff for each year. The lowest values of eff are aroud.5 i the late 99s, suggestig that i this period there was sigificat correlatio betwee errors i idividual moths. At other times values as high as 7.5 were recorded. The values ca be used to estimate ucertaities o averages at aual time scales for the full data set by assumig a costat value for eff ad usig this to estimate the ucertaity o the aual average from the ucertaities o the mothly averages (Figure 3d). I this case eff was chose to be.5, as this is a reasoably coservative estimate that is also cosistet with well observed periods i the record Coverage ucertaity Whe calculatig area averages from a gridded data set there is a additioal ucertaity that arises because there are ofte large areas, ad cosequetly, may grid boxes, which cotai o observatios. Such ucertaities are referred to here as coverage ucertaities. I Broha et al. [6] coverage ucertaities were estimated by subsamplig reaalysis data. A similar method is used here. SST aomalies from the globally complete HadISST data set (Rayer et al. [3]) were used i the place of reaalysis data. For example, to calculate the ucertaity o the March 973 mothly average for the North Pacific a time series of North Pacific average SST aomalies was calculated usig HadISST from 87 to. The coverage of HadISST at all time steps was the reduced to that of HadSST3 for March 973. The North Pacific time series was recalculated from the sub-sampled data ad the stadard deviatio of the differece betwee the series from the complete ad subsampled series was used as a estimate of the ucertaity for March 973. Data from the ERSSTv3b (Smith et al. [8]) ad COBE (Ishii et al. [5]) data sets were also used i place of HadISST ad gave similar results suggestig that the ucertaities do ot deped strogly o the statistical assumptios made i creatig HadISST. Coverage ucertaities calculated usig HadISST are show i Figure Additioal ucertaities A umber of additioal sources of ucertaity are ot icluded i this aalysis. These iclude, but are ot limited to, the followig.. Ucertaities associated with the adjustmets for biases i the data are dealt with i part of this paper (Keedy et al. [a]). O larger spatial ad loger temporal scales, they are comparable to, or of greater importace tha the measuremet ad samplig errors discussed here.. No estimate has yet bee made of ucertaities i the estimate of the climatological average SST that has bee used to covert actual SST measuremets to aomalies. I sparsely observed regios, such as the Souther Ocea, the climatological ucertaity could be large. For future applicatios it might be wise to estimate climatological averages from a moder well-observed period ad make use of satellite data. There is a ope problem i how regios that have, util recetly, bee covered by sea ice should be treated. Is it meaigful to assig a sea-surface temperature aomaly to a regio that, durig the climatology period, was covered with sea ice? This questio is of particular sigificace if estimates of SST aomalies are combied with lad surface air temperature aomalies to create a estimate of global average temperature. 3. Ucertaities iheret i the estimated values for σ m, σ b, σ s ad r have ot bee icluded. For example, due to radom measuremet ad samplig errors, the estimates of r, the average correlatio of two poits withi a grid box derived i Sectio 3., might be uderestimated leadig to a samplig ucertaity that is too large. This effect has bee reduced by the use of a robust measure of correlatio. Aother compesatig factor might be that micro-bias errors lead to a slight icrease of the correlatio of separated grid boxes. It has also bee poited out (Rayer et al. [9]) that samplig withi a 5 degree grid box might ot be radom as supposed here. Morrissey ad Greee [9] developed a more geeral meas of estimatig the samplig error that takes ito accout the locatios of the observatios withi the grid boxes. Clusterig of the observatios would typically lead to a uderestimate of the samplig ucertaity, which curretly assumes that the observatios are radomly distributed. Work by Ket et al. [993] ad Ket ad Challeor [6] shows that the measuremet ucertaities of SST observatios from ships usig buckets to make their SST measuremets were differet from those made by ships usig egie itake water. They also foud variatios betwee the ships recruited by differet coutries. (a) 89s N. (c) 99s N 3N 3N Latitude Latitude 3S 3S 6S 6S. (b) 9s N. (d) s. 6N 3N 3N Latitude Latitude Figure 5. Zoal average of 5 latitude by 5 logitude grid box -sigma ucertaities ( C) for: (a) 89s, (b) 9s, (c) 99s ad (d) -6. The black lie is the ew estimate of the ucertaity ad the grey lie is the estimate from Rayer et al. (6) S 3S 6S 6S

9 ad that may observatios were oly recorded to whole or half degrees. More detailed aalysis usig metadata for differet ships could be used to further refie the aalysis i future. 4. Ucertaities arisig from outliers, or otherwise oormal deviatios, i the data have ot bee cosidered although their effects have bee quatified elsewhere (Keedy et al. [b]). 5. Results Figure 5 shows the zoal mea of the idividual grid-box ucertaities for four differet periods. Despite improvemets i data coverage associated with usig ICOADS.5 rather tha ICOADS., the grid-box ucertaities are geerally larger tha the estimates made i Rayer et al. [6] owig to the correlatio of errors withi the grid boxes, which was eglected i Rayer et al. [6]. I the 89s, the ucertaities are higher at most latitudes reflectig the sparseess of the observatios ad the small umber of platforms. I the 9s, the mid-latitude orther hemisphere was more desely observed, pricipally as a result of icreased shippig rather tha a icrease i the umber of observatios per ship, ad cosequetly there is little differece betwee the estimates. The latitude at which the lies cross is also marked by high variability as it is the latitude of the Gulf Stream ad Kuroshio. I these regios, the ucertaity of a grid-box average based o a sigle observatio is ofte higher i HadSST tha i HadSST3. However at lower latitudes ad i the souther hemisphere the effects of data sparsity ad the lack of diversity i the observig etwork are still apparet. By the 99s, the desity of observatios had icreased at most latitudes except i the souther hemisphere, southward of about 4 S, where ucertaities remai large. There is a relative icrease i the average ucertaity (betwee HadSST ad HadSST3) i the mid-latitude orther hemisphere i the 99s that was larger tha i the 9s. This arises because, although more observatios were made i the 99s, this was due maily to a icrease i the umber of observatios made per ship without a proportioate icrease i the umber of Aual Mothly Coverage Correlated measuremet ad samplig Ucorrelated measuremet ad samplig Figure 6. -sigma ucertaities of aual-average (upper pael) ad mothly-average (lower pael) global SST aomalies. The three lies i each pael correspod to the coverage ucertaity (solid black lie), the correlated measuremet ad samplig ucertaity (dashed black lie) ad the ucorrelated measuremet ad samplig ucertaity (solid grey lie). ships. Sice, the differece is more uiform ad reflects a move to fewer platforms (pricipally driftig buoys) makig more observatios. Ket ad Berry [8] saw a similar icrease i grid box ucertaities relative to those i Rayer et al. [6]. A sigificat icrease i global ad regioal ucertaities arises from the iter-grid box correlatios. The off-diagoal terms of the error covariace lead to approximately a factor of.4 icrease i the global-average ucertaity relative to the case where they are assumed to be ucorrelated betwee grid boxes. This effect ca be see i the time series of ucertaities i global, hemispheric ad tropical average sea-surface temperature aomalies show i Figure 7. The red lie, which is ot corrected for missig call sigs, is likely to be a uderestimate at times whe there are may missig call sigs. This is most apparet betwee 86 ad 87 whe the off-diagoal compoet is almost zero. Very few call sigs are associated with the observatios i this datasparse period. The sudde apparet icrease i ucertaity i 98 arises because there are far more observatios with call sigs available after 98 tha before. The estimated full error rage which accouts for the abset ad geeric call sigs is give by the upper blue lie i each diagram. Overall, there is a geeral decrease i the ucertaity of the global average SST from the mid-ieteeth cetury to the preset, iterrupted by periods of icreased global ucertaity durig the two world wars. The ucertaity is highest i the 86s, aroud.3 C, whe there are few observatios ad fewer extat metadata. The dip i souther hemisphere ucertaity i 979 is due to the temporary deploymet of may driftig buoys durig the First GARP Global experimet. The fall i total ucertaity at the very ed of the series occurred after the umber of driftig buoys i the oceas was almost doubled. This had a particularly strikig effect i the souther hemisphere. Measuremet ad samplig ucertaities are comparable to the coverage ucertaities (Figure 6) throughout much of the record with each exhibitig similar temporal correlatios. It is iterestig to compare the SST ucertaities with comparable estimates of ucertaity o global ad hemispheric averages of lad surface air temperature. I Broha et al. [6], i which lad surface air temperatures (LSAT) ad their ucertaities were calculated, ucertaities o SST (from Rayer et al. [6]) were geerally much smaller tha ucertaities o LSAT. This was due to a large extet to the higher variability of temperature aomalies over lad. Although the ew ucertaities o SST measuremets are sigificatly larger tha those i Rayer et al. [6] they are still typically lower tha hemispheric estimates of lad surface air temperature ucertaities. 6. Summary A error model that accouts for correlated (micro-bias) ad idepedet (radom measuremet) radom errors i sea-surface temperature measuremets was described. The correlated errors lead to a icrease i ucertaities of gridbox average ad regioal-average SSTs compared to previous estimates. It was ot possible to estimate the full correlated error term at all times because that requires that each cotributig ship ca be uambiguously idetified. At times this was ot possible because the call sig iformatio was ot icluded i the meteorological report. From December 7, the call sigs of GTS reports were removed from some feeds ad ecrypted o others, so this aalysis could oly be completed for the period 85 to 6. New estimates of samplig ucertaity were also calculated. The ew estimates of the grid-box ucertaities for HadSST3 are typically larger tha estimates made i

10 (a) Globe (b) Norther Hemisphere (c) Souther Hemisphere (d) Tropics Figure 7. -sigma ucertaities o mothly SSTs for (a) Global-, (b) Norther Hemisphere-, (c) Souther Hemisphere- ad (d) Tropical-averages. The blue lie shows the ucertaity calculated usig the full error model icludig iter grid box correlatios corrected for missig callsigs. The red lie shows the ucertaity calculated usig the full error model icludig iter grid box correlatios, but ot corrected for missig call sigs. The gree lie shows the ucertaity if the grid-box errors are assumed to be ucorrelated. The lower black lie is the Rayer et al. (6) estimate. Note the differet scales o each diagram. The right had colum is show o a expaded scale because the ucertaities are geerally smaller i the later period. HadSST. The differeces are largest i areas where the observig etwork was less diverse ad observatios were made by oly a small umber of ships or buoys; for example, at high latitudes, i the earliest part of the record ad durig the Secod World War. The ew estimates of the ucertaities of regioal averages are also larger owig to correlatios betwee grid boxes. At times whe reliable estimates ca be made, the ucertaities i global-average, souther-hemipshere ad tropical-average sea-surface temperatures are betwee two ad three times as large as the case where errors are cosidered to be ucorrelated. Ucertaities of orther hemisphere averages approximately double. There is oe additioal poit to ote. Ucertaities o derived quatities calculated usig the ew error model will ot ecessarily icrease i all situatios. Whe differeces betwee grid box values are take, the correlated compoet will cacel to a certai extet. Persistet biases, ad biases arisig due to geeric meas of measurig SST are dealt with i part of the paper. The gridded SST aomalies, time series ad ucertaities are available via Ackowledgmets. The authors were supported by the Joit DECC/Defra Met Office Hadley Cetre Climate Programme (GA). The ICOADS data for this study are from the Research Data Archive (RDA) which is maitaied by the Computatioal ad Iformatio Systems Laboratory (CISL) at the Natioal Ceter for Atmospheric Research (NCAR). NCAR is sposored by the Natioal Sciece Foudatio (NSF). The origial data are available from the RDA ( i dataset umber ds54.. Ia Jolliffe made helpful commets o the otatio of the error aalysis. The authors would also like to thak the three aoymous reviewers for their isightful ad costructive commets, which improved the mauscript. Refereces Broha, P., J. Keedy, I. Harris, S. Tett, ad P. Joes (6), Ucertaity estimates i regioal ad global observed temperature chages: a ew dataset from 85, Joural of Geophysical Research, (D6), doi:.9/5jd6548. Emery, W., D. Baldwi, P. Schlüssel, ad R. Reyolds (), Accuracy of i situ sea surface temperatures used to calibrate ifrared satellite measuremets, Joural of Geophysical Research, 6 (C), , doi:.9/jc46.

11 Follad, C., ad D. Parker (995), Correctio of istrumetal biases i historical sea surface temperature data, Quarterly Joural of the Royal Meteorological Society, (5), , doi:./qj Ishii, M., A. Shouji, S. Sugimoto, ad T. Matsumoto (5), Objective aalyses of sea-surface temperature ad marie meteorological variables for the th cetury usig ICOADS ad the Kobe collectio, It. J. Climatol., 5 (7), , doi:./joc.69. Joes, P., T. Osbor, ad K. Briffa (997), Estimatig samplig errors i large-scale temperature averages, Joural of Climate, (), , doi:.75/5-44(997) 548:ESEILS..CO;. Kaga, R. (966), A evaluatio of the represetativeess of precipitatio data (i Russia), Gidrometeoizdat, p. 9. Keedy, J., N. Rayer, R. Smith, M. Sauby, ad D. Parker (a), Reassessig biases ad other ucertaities i seasurface temperature observatios measured i situ sice 85, part : biases ad homogeisatio, JGR Atmospheres. Keedy, J., R. Smith, ad N. Rayer (b), Usig AATSR data to assess the quality of i situ sea surface temperature observatios for climate studies, Remote Sesig of Eviromet. Ket, E., ad D. Berry (8), Assessmet of the marie observig system (ASMOS): fial report, Tech. Rep. 3, Natioal Oceaography Cetre Southampto. Ket, E., ad P. Challeor (6), Toward estimatig climatic treds i SST. Part II: Radom errors, Joural of Atmospheric ad Oceaic Techology, 3 (3), , doi:.75/jtech844.. Ket, E., P. Taylor, B. Truscott, ad J. Hopkis (993), The accuracy of Volutary Observig Ships meteorological observatios - results of the VSOP-NA, Joural of Atmospheric ad Oceaic Techology, (4), 59 68, doi:.75/5-46(993) 59:TAOVOS..CO;. Lumpki, R., ad M. Pazos (7), Measurig surface currets with Surface Velocity Program drifters: the istrumet, its data, ad some recet results, Cambridge Uiversity Press. Morrissey, M., ad J. Greee (9), A theoretical framework for the samplig error variace for three-dimesioal climate averages of ICOADS mothly ship data, Theoretical ad Applied Climatology, 96 (3-4), 35 48, doi:.7/s Rayer, N., D. Parker, E. Horto, C. Follad, L. Alexader, D. Rowell, E. Ket, ad A. Kapla (3), Global aalyses of sea surface temperature, sea ice, ad ight marie air temperature sice the late ieteeth cetury, Joural of Geophysical Research, 8 (D4), 447, doi:.9/jd67. Rayer, N., P. Broha, D. Parker, C. Follad, J. Keedy, M. Vaicek, T. Asell, ad S. Tett (6), Improved aalyses of chages ad ucertaities i sea surface temperature measured i situ sice the mid-ieteeth cetury: the HadSST data set, Joural of Climate, 9 (3), , doi:.75/jcli Rayer, N., A. Kapla, E. Ket, R. Reyolds, P. Broha, K. Casey, J. Keedy, S. Woodruff, T. Smith, C. Dolo, L. Breivik, S. Eastwood, M. Ishii, ad T. Brado (9), Evaluatig climate variability ad chage from moder ad historical SST observatios, i Proceedigs of OceaObs 9: Sustaied Ocea Observatios ad Iformatio for Society (Vol. ), Veice, Italy, -5 September 9, edited by J. Hall, D. Harriso, ad D. Stammer, ESA Publicatio WPP- 36, doi:.57/oceaobs9.cwp.7. She, S., H. Yi, ad T. Smith (7), A estimate of the samplig error variace of the gridded GHCN mothly surface air temperature data, Joural of Climate, (), 3 33, doi:.75/jcli4.. Smith, T., ad R. Reyolds (), Bias correctios for historical sea surface temperatures based o marie air temperatures, Joural of Climate, 5 (), 73 87, doi:.75/5-44()5 73:BCFHSS..CO;. Smith, T., R. Reyolds, T. Peterso, ad J. Lawrimore (8), Improvemets to NOAA s historical merged lad-ocea surface temperature aalysis (88-6), Joural of Climate, (), 83 96, doi:.75/7jcli.. Wilcox, R. (), Fudametals of Moder Statistical Methods: Substatially Improvig Power ad Accuracy, New York: Spriger. Woodruff, S., S. Worley, S. Lubker, Z. Ji, J. Freema, D. Berry, P. Broha, E. Ket, R. Reyolds, S. Smith, ad C. Wilkiso (), ICOADS release.5: extesios ad ehacemets to the surface marie meteorological archive, Iteratioal Joural of Climatology, doi:doi:./joc.3. Yevjevich, V. (97), Probability ad statistics i hydrology, Water resources publicatios, p. 3. J. J. Keedy, Met Office Hadley Cetre, FitzRoy Road, Exeter, EX 3PB, UK. (joh.keedy@metoffice.gov.uk) D. E. Parker, Met Office Hadley Cetre, FitzRoy Road, Exeter, EX 3PB, UK. N. A. Rayer, Met Office Hadley Cetre, FitzRoy Road, Exeter, EX 3PB, UK. M. Sauby, Met Office Hadley Cetre, FitzRoy Road, Exeter, EX 3PB, UK. R. O. Smith, Ocea Physics Group, Departmet of Marie Sciece, Uiversity of Otago, Duedi, New Zealad