Homogenization of long-term monthly Spanish temperature data


INTERNATIONAL JOURNAL OF CLIMATOLOGY
Published online in Wiley InterScience

Homogenization of long-term monthly Spanish temperature data

M. Staudt,* M. J. Esteban-Parra and Y. Castro-Díez
Departamento de Física Aplicada, Universidad de Granada, Granada, Spain

* Correspondence to: M. Staudt, Departamento de Física Aplicada, Facultad de Ciencias, Campus de Fuentenueva, Universidad de Granada, Granada, Spain. E-mail: mstaudt@ugr.es

Abstract: Reliable time series are the basic ingredient when analysing climatic changes. However, the errors in real data are frequently of the same order as the signal being sought. Therefore, the available long-term monthly series of Spanish minimum and maximum temperatures, recorded from the late 19th century on, have been compiled into a high-quality data set. The series are organized into climatically homogeneous regional groups and, in each group, the detection and adjustment of inhomogeneities are based on relative homogeneity and an analysis of the stationarity of the whole set of temperature-difference series. These series are scanned with moving t, Alexandersson, and Mann-Kendall tests. The detected inhomogeneities are adjusted by weighted averages of the regional series. The method is iterative and advances in steps of detection, adjustment and updating. Individual inhomogeneous data are discarded and gaps are filled by similar weighted multiple means. For the analysis of the temperature evolution in the Iberian Peninsula, each region is finally represented by one local series and the regional average. The urban effect on minimum temperatures is adjusted by an empirical method, and for Madrid also by a correction derived from newly homogenized data. Generally, rigorous homogeneity cannot be achieved, because the initial data quality is deficient in many cases and metadata are sparse. Nevertheless, the data homogeneity and quality have been considerably enhanced: the total error margin in a series is of the order of 0.3-0.4 °C, under consideration of a worst-case error accumulation. On the other hand, the number of inhomogeneities is considerable and their average amplitude is of the order of 1 °C, reflecting the much larger error margin in the raw data. The homogenized dataset constitutes an important basis for the subsequent detection of thermal changes in Spain in the last 130 years, on a clearly higher confidence level than before.

KEY WORDS temperatures; data homogeneity; statistical tests; climate change; Spain; Iberian Peninsula

Received 14 July 2005; Accepted 9 December 2006

INTRODUCTION

Reliable data are a necessary basis for a study of the evolution of a climatic variable and the detection of changes. In many countries, systematic instrumental weather observations began in the 19th century and, since then, the availability of quantitative data has considerably improved. A time series of a climatic variable is called homogeneous when its variations have a climatic origin only (Mitchell et al., 1966). Unfortunately, a vast majority of all climate records is adversely affected by nonclimatic changes in the data. A relocation of an observatory, replacement of instruments, variations in the environment or in reading procedures, as well as human errors in data processing, are rather frequent. Under these circumstances, a series suffers artificial biases, most frequently sudden jumps or breaks, and may fail to represent the real climatic evolution. A reliable detection of climate change is hard or impossible when the error related to data quality is of the same order of magnitude as the signal being sought.

The large extent of the data quality problems is well known in the recent literature of climate research. In Chapter 12, the third assessment report of the IPCC (IPCC, 2001) states that "The quality of observed data is a vital factor. Homogeneous data series are required with careful adjustments to account for changes in observing system technologies and observing practices." Moreover, Peterson et al. (1998b) point out that "Unfortunately, most long-term climatological time series have been affected by a number of non-climatic factors that make these data unrepresentative of the actual climate variation occurring over time." Trenberth (2002) notes that "we do not have an adequate climate observing system" and that "There must be an active program of research and analysis utilizing climate data sets to ensure the data are state-of-the-art and meet requirements." Besides observational programmes for improving future data quality, undoubtedly a strong effort must also be dedicated to homogenization and quality control of the existing data.

An early introductory study on homogeneity and statistical tests was given in Mitchell et al. (1966), who described the problem of achieving absolute homogeneity. Among the more recent efforts in data homogenization, Goossens and Berger (1986) applied different statistical methods, such as the Mann-Kendall test, to the detection of changes in climatic series. Alexandersson (1986) developed the standard normal homogeneity test (SNHT), applied to the Swedish precipitation series in subsequent work (Alexandersson and Moberg, 1997; Moberg and Alexandersson, 1997). The SNHT is one of the most efficient tests for homogeneity, as Ducré-Robitaille et al. (2003) recently demonstrated. Several homogenization methods have been created and first applied to North American data. Karl and Williams (1987) developed a method that explicitly considers the metadata and detects and adjusts data changes statistically, using adjacent series; this method has been applied to a large number of North American temperature and precipitation series. Young (1993) and Rhoades and Salinger (1993) presented alternative methods, also based on similar data from highly correlated series. Peterson and Easterling (1994) and Easterling and Peterson (1995) developed a different strategy, with reference series and a Monte Carlo method, and they adjusted the data by least-squares linear regressions. The method of Vincent (1998) works with multiple regressions and is applied to daily Canadian temperature series in Vincent and Gullett (1999) and Vincent et al. (2002). In recent studies, attempts have been made to homogenize European data. Slonosky et al. (1999) have created a method with multiple comparisons and adjustments between adjacent series, but without reference series, and have applied it to long-term European pressure series. Their results prove to be similar to those of analytically more sophisticated methods, such as the statistical technique by Mestre (1999). González-Rouco et al. (2001) have homogenized south-western European precipitation series with an iterative method, extending the strategy of Hanssen-Bauer and Forland (1994). Stepanek (2003) has recently created the software AnClim, especially for the practical application of virtually all relevant homogenization methods for climate data. In a recent international effort on data quality, Wijngaard et al. (2003) analyzed the daily temperature and precipitation data of the European Climate Assessment (ECA) and found that a vast majority of the series suffer clear homogeneity problems.

Nevertheless, among the applications of the homogenization methods in the literature, there is still a lack of systematic treatment of Spanish temperature data. The present study carefully prepares these data series, seeking to achieve maximum data quality. The aim is to set a solid base for a reliable subsequent analysis of thermal changes and their confidence levels on a regional scale since the late 19th century.

DATA

The Spanish temperature data used in this study have been provided by the National Meteorology Institute (INM). The recording of monthly temperatures began sometime between 1869 and 1880 in about 20 observatories, mainly in provincial capitals (older records are rare), although at some sites the observations were not recorded until the first or second decade of the 20th century. Data quality is problematic or even poor in many cases, because of frequent site changes and data gaps, and metadata are scarce.
Figure 1 gives a schematic overview of the temporal data coverage until 1980 and shows the geographic distribution of the observatories. Definition of the regional groups of data series The Spanish monthly temperature series contain a high degree of common variability. The cross-correlations between the anomalies usually exceed 0.5, even at distances of the order of 500 km. Nonetheless, the temperature-anomaly patterns show regional distinctions, as found for the winter maxima by Frías Domínguez et al. (2002). The prior compilation of the data series into climatic groups derives from these regional differences. The basic threefold distinction separates the peninsular mode of thermal evolution that on the one hand includes, geographically, the central plains and the major part of the south, and on the other, the Mediterranean (eastern) and Cantabrian (northern) coastal areas. Furthermore, Galicia, western Andalusia, Extremadura and the Ebro valley are also treated as climatically different groups, in order not to eliminate possible regionally distinctive details of the temperature evolution. A preliminary analysis of the series from the high plains and the Mediterranean did not detect significant differences between the temperature evolutions in their northern and southern regions. The cross-correlations between the anomalies in each regional group systematically exceed 0.6 and clearly confirm the high level of regional synchronicity of the variations, an essential ingredient for the homogenization method. According to these results, the regional groups that will be homogenized separately, without mixing information between them by adjustments, are (the number of series in each group is given in parenthesis): Galicia (6), Cantabria (5), Ebro Valley (4), Mediterranean (6), central high Plains (14), western Andalusia (4) and Extremadura (2). In each of these climatic regions, all the series are homogenized and then the regional mean series (simple mean of the anomalies) is computed a-posteriori. From the homogeneity viewpoint, each individual series could represent its region, but the regional mean series is particularly valuable for subsequent analysis. Hence, each region is going to be represented by one local series and the regional mean, in order to analyze the recurrence of the results, in the sense of coherence among the two representative series: a variability feature is of high authenticity if it appears in both series.
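As a rough illustration of the regional-coherence check just described, the sketch below (Python) computes monthly anomalies for one group of stations and their pairwise cross-correlations; values consistently above about 0.6 would support treating the stations as a single climatic region, as stated above. This is not the authors' code: the input file name, its column layout and the 1961-1990 reference period are assumptions made only for the example.

```python
# Sketch: pairwise cross-correlations between monthly anomaly series of one
# regional group, as a plausibility check for the grouping described above.
# The file name, column layout and the 1961-1990 reference period are
# assumptions for illustration only.
import pandas as pd

# Monthly mean temperatures, one column per station, monthly DatetimeIndex.
temps = pd.read_csv("galicia_tmax_monthly.csv", index_col=0, parse_dates=True)

# Anomalies: subtract each calendar month's mean over the assumed reference period.
ref = temps.loc["1961":"1990"]
clim = ref.groupby(ref.index.month).mean()
anomalies = temps - clim.loc[temps.index.month].set_axis(temps.index)

# Pairwise cross-correlations (pandas uses pairwise deletion of missing months).
print(anomalies.corr().round(2))
```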

3 LONG-TERM MONTHLY SPANISH TEMPERATURE DATA Figure 1. Scheme of the temporal and spatial coverage of the Spanish maximum and minimum temperature series, between 1860 and 1980 (in more recent years, the coverage is complete, with very few exceptions). The series are: 1. La Coruña, 2. Santiago, 3. Pontevedra, 4. Orense, 5. Vigo, 6. Finisterre, 7. San Sebastián, 8. Bilbao, 9. Santander, 10. Vitoria, 11. Pamplona, 12. Oviedo, 13. Zaragoza, 14. Huesca, 15. Logroño, 16. Teruel, 17. Lérida, 18. Gerona, 19. Barcelona, 20. Castellón, 21. Valencia, 22. Alicante, 23. Murcia, 24. Almería, 25. Burgos, 26. Valladolid, 27. Salamanca, 28. Soria, 29. León, 30. Palencia, 31. Zamora, 32. Ávila, 33. Segovia, 34. Madrid, 35. Guadalajara, 36. Toledo, 37. Cuenca, 38. Albacete, 39. Ciudad Real, 40. Córdoba, 41. Seville, 42. Huelva, 43. Jerez, 44. Málaga, 45. Granada, 46. Jaén, 47. Badajoz, 48. Cáceres. The regional groups are: A) Galicia, B) Cantabria, C) Ebro valley, D) Mediterranean, E) central plains, F) western Andalusia and G) Extremadura. Discarded data due to homogeneity problems The rejection of data or intervals has not been avoided, when the homogeneity problems were too strong to permit an adjustment at an acceptable confidence level. This happens under the following circumstances: Individual data or intervals are discarded, if their difference with at least two (or three) of the other series of the region is extreme at the 95% confidence-level (in an appropriate time interval around these data). Disconnected short intervals (shorter than a decade) with many interruptions are also discarded as well as intervals where more than approximately one-third of the data are missing. Apart from all the available difference series, the anomalies of the candidate series are always thoroughly cross checked. An interval has to be discarded when the available data in a given region do not permit an adjustment of an inhomogeneous break at a satisfactory confidence level (when no other or only one more series is available).

A whole series is generally discarded when more than five discontinuous breaks (or other clear inhomogeneities to be adjusted) are found. This decision also depends on the length and overall quality of the series (one long series is maintained with six adjusted breaks; Table I). Unfortunately, the following long intervals or entire series had to be rejected:

In eastern Andalusia, the available long-term series from Jaén and Granada were discarded because their temporal data coverage was unsatisfactory.

Table I. Total numbers of data, adjustments (adj.) and rejected (rej.) data (individual data or intervals) in the maximum and minimum temperature series. For each series, the table gives the number of data, the number of adjustments and the number of rejected data, separately for the maximum and the minimum temperatures. [Numerical entries not preserved in this copy.] Series: La Coruña, Santiago, Pontevedra, Orense, Vigo, Finisterre, San Sebastián, Bilbao, Santander, Vitoria, Pamplona, Zaragoza, Huesca, Logroño, Lérida, Valencia, Gerona, Barcelona, Castellón, Alicante, Murcia, Madrid, Ávila, Burgos, León, Palencia, Salamanca, Segovia, Soria, Zamora, Guadalajara, Toledo, Albacete, Ciudad Real, Cuenca, Seville, Córdoba, Huelva, Jerez, Málaga, Badajoz, Cáceres.

The data in Galicia before 1880 and the minima in the Mediterranean before 1893 were also rejected because of severe homogeneity problems and/or lack of data. The 19th century data in western Andalusia and Extremadura could not be connected with sufficient confidence to the 20th century data and therefore were not considered. The lost information was partially recovered by defining an average series that consisted of western Andalusia, Extremadura and Málaga, where these data were connected and used. Some individual series or long intervals were rejected because of severe homogeneity problems: the maxima and minima series in Valladolid, the maximum records in Alicante and the minima in Ciudad Real, as well as the minima in Orense until 1949, the maxima in León until 1937, the minima in Guadalajara until 1970 and in Cuenca until

METHODOLOGY

Statistical properties of the monthly temperature data

Temperature records show little variation on spatial scales of hundreds of kilometres in regions with a regular orography, such as the central plains of the Iberian Peninsula and along the coasts, where the cross-correlations between the monthly series generally exceed 0.7. Nonetheless, regional differences may be crucial in studies of the temperature evolution and its significance levels. The dataset of the present study is developed for a high-confidence analysis not only of the general trends, but also of the interregional differences in Spanish temperatures. Monthly temperature series show a distinct lack of stationarity, because of frequent trends at time scales of months or several years, which are highly significant in many cases. Schönwiese and Rapp (1997) point out that "short-term trends ... become enormously unstable in all seasons, even changing their sign". This variability characteristic complicates the detection of inhomogeneities and requires high significance levels, to avoid an erroneous detection and attribution of inhomogeneities. These stationarity properties do not differ significantly among the treated regions and, therefore, the same statistical criteria are applied everywhere. The statistical distribution of the temperature data is normal to a good approximation and there is no problem in applying parametric statistics designed for Gaussian-distributed variables, such as the t-test or the SNHT.
The autocorrelations (serial correlations) in these series are rather slight (coefficients between 0.1 and 0.3), but several statistical tests require corrections (the reduced sample size for the t-test and prewhitening of the series for the Mann-Kendall test) in order to achieve realistic confidence levels.

The basic homogenization concept

The criterion of absolute homogeneity is fulfilled if a climatic series does not include any variability, except

for the real climatic evolution. However, this condition is almost never fulfilled, because of the problems in real data. Easterling et al. (1996) pointed out that "... the real homogeneity of climatic data is irretrievably lost". From the analysis of an individual series, it is generally impossible to decide at a high confidence level whether or not a certain change is inhomogeneous, and the absolute-homogeneity criterion is therefore not applied in the present study.

The concept of relative homogeneity developed here is based not on individual series, but on their differences, because the anomalies of highly correlated time series are essentially synchronous. Hence, a local inhomogeneity can be detected in the difference series, where, on the other hand, an authentic extreme anomaly tends to vanish. This detection method fails if several series suffer a simultaneous data problem (e.g. a common sudden jump); comparing as many difference series as possible minimizes this risk. The following relative homogenization method is based on multiple comparisons between the climatically similar series within each predefined climatic region. No reference series is defined, because the frequent inhomogeneities and missing data do not permit a reliable a priori reference. The whole set of difference series (differences of anomalies) is statistically tested for significant changes (see 'The scheme of the homogenization method'). Once identified, an inhomogeneous change is adjusted by a weighted mean of the highest-correlated series. The weighting factors depend on the synchronicity (cross-correlation) and the number of common data of each surrounding series of the same region, relative to the candidate. For an abrupt change, the after:before difference is replaced by this weighted average (see 'The adjustment algorithm'). The series are adjusted separately in each region, to avoid merging information. This is essential to prepare the dataset for a subsequent detection of regional differences.

The scheme of the homogenization method

1. The raw-data series are converted into anomalies, relative to the monthly mean of a given reference period (the final reference is ). The whole set of anomaly difference series is computed within each region (these are more efficient than absolute differences, because in the latter, stronger residuals of the annual cycle remain). Following the idea of multiple comparisons, in a region with n series, n(n - 1)/2 difference series are simultaneously analyzed, in order to detect (and then to adjust) the significant inhomogeneities in all series.

2. The suspicious inhomogeneities are marked (mostly abrupt changes or breaks, but also individual extreme data), with particular attention to the metadata information.

3. The largest and most obvious extreme values (outliers) are identified and discarded when the anomalies exceed a certain level (four standard deviations of a running 30-year interval, centred at each data point, although sometimes data coverage restricts the detection interval length). This search is based on the difference series, to avoid the rejection of authentic large anomalies. In this step, the criterion is severe and still preserves inhomogeneous data; it removes only the very large inhomogeneous outliers, prior to the closer analysis.

4. The set of difference series is recalculated and the possible abrupt inhomogeneities (breaks) are searched for and classified.
Then, for each feature suspected to be inhomogeneous, an appropriate base interval is individually defined for statistical detection and verification. The length of these base intervals is generally years, symmetrically around the possible break point (if possible), and must strictly avoid temporal overlapping with other inhomogeneities that would produce skewed results. Besides a reasonable sample size (at least of the order of 100), the so-called station drift must be considered: the differences between highly correlated temperature series are often not stationary, but show frequent trends of changing signs (even in the absence of site changes or other inhomogeneities; Rhoades and Neill, 1995). Therefore, the base intervals of the candidate series must be shorter if a stronger drift (less stationarity) is present, because earlier or later data are then less valid for the adjustment at a certain time.

5. The statistical tests are applied to the whole set of difference series in the base intervals that have been defined in the previous point. Moving t and SNHT (Alexandersson) tests scan the intervals, to determine the probability of a break as a function of its time (see 'The statistical detection of discontinuous inhomogeneities'). Special attention is given to the metadata, by examining first the time intervals around the incidents reported in the literature; but the metadata are scarce, and the method considers them but does not need them. The general detection criterion for an inhomogeneous break is a level significantly higher than 99% in the t-test, and at least 50% above the 95% level in the SNHT, recurrent in three difference series with highly correlated data. The local anomalies are checked in order to avoid wrong conclusions and, in doubtful cases, the results are subjected to the sequential Mann-Kendall test.

6. Once an inhomogeneous break is detected, the adjustment works with a weighted average of the highest-correlated simultaneous regional data (up to five series). The candidate's after:before difference is replaced by a weighted average of the analogous differences of the correction series. In very few particular and highly significant cases, continuous inhomogeneous features are detected and adjusted by a similar procedure. The method is similar because the detection is performed with the same statistical tests and the adjustment consists of a linear trend that is obtained

as the weighted mean of the slopes of the highly correlated nearby series. To assure the essential non-interference between the different adjustments, steps 4-6 are executed in an iterative way, although common for all series: after adjusting all the disjointed inhomogeneities in all the series of a region in the first iteration, the set of difference series is recalculated before applying the tests again in the second iteration. This iterative method is necessary because the correction intervals frequently overlap each other and, in this sense, not all breaks and their corrections are independent or disjointed from each other. Furthermore, in some cases, slighter inhomogeneities could not be detected until a large inhomogeneity was adjusted and the detection was repeated in the next iteration step (with all data updated). The iteration stops when no more significant inhomogeneities are detected after updating the series.

7. A search is made for the individual inhomogeneous data by detecting extreme values (as explained in part F) of the difference series and controlling these data points at each local series. The detected inhomogeneous data are removed.

8. The missing data are filled by weighted means of the best-correlated synchronous data (originally missing data or gaps created by removed inhomogeneous data). The filling algorithm works with up to five regional series and assumes synchronicity between these series (see 'The replacement of missing data').

9. Finally, the dataset is prepared with two time series for each climatic region, expressed as anomalies relative to the reference period: one local series and the regional average (all the local series are also available for further purposes).

The statistical detection of discontinuous inhomogeneities

A break is detected when the corresponding significance exceeds the 99% level in the t-test and exceeds the 95% level by 50% in Alexandersson's test, in a recurrent way in at least three cases (three difference series of the same candidate).

The windowed t-test. This well-known statistical test measures the significance level of a change in the mean; it is parametric and assumes normality and serial independence. It is robust against slight deviations from normality if the sample is large enough (n > 20), but significant autocorrelations cause skewed results. The test overestimates the significance when these are positive (common in temperature records). With a first-order autocorrelation coefficient ρ1, the reduced-sample-size correction replaces the sample size n by n' = n(1 - ρ1)/(1 + ρ1). This correction is valid because the memory of monthly temperature records is rather short and their autocorrelations are essentially of first order. Preliminary experiments show that this correction reduces the statistical confidence typically by 20-30%.

The SNHT (Alexandersson's standard normal homogeneity test). This test by Alexandersson (1986), initially applied to the Swedish precipitation series, is now frequently used in climatology. It detects a single abrupt change (break) in the mean value of a Gaussian time series, assuming two stationary subseries, before and after the (possible) break, against the null hypothesis of one stationary series. The 95% confidence level for a break is 9.15 for a sample size of 100, rises slightly to 10 for 400 data and to 10.5 for 800 data. As mentioned, the present study requires higher significance levels: the detection will be considered highly significant if the coefficient exceeds the 95% level by 50% (value = 15).

An example of the detection of an inhomogeneous break is given in Figure 2. The running t-test and SNHT show a similar behaviour and confirm a highly significant break at the beginning of the 1980s (the 95% levels are 2 for the t-test and 10 for the SNHT). The SNHT has a sharper peak, due to its quadratic algorithm. The t-test in the 20-year running window gives lower significance levels than in the 40-year interval, because of the smaller sample size.
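To make the detection step more concrete, the sketch below (Python; not the authors' implementation) scans a difference series with the SNHT statistic of Alexandersson (1986) and applies a two-sample t-test with the reduced sample size n' = n(1 - ρ1)/(1 + ρ1) described above. The data are synthetic, and the only thresholds used are the ones quoted in the text.

```python
# Sketch of the detection statistics described above: Alexandersson's SNHT
# statistic and a moving-window t-test with the reduced sample size
# n' = n(1 - rho1)/(1 + rho1). Illustrative only; applied to a synthetic
# difference series. Thresholds (~10 at the 95% level, ~15 for "highly
# significant") are the ones quoted in the text.
import numpy as np
from scipy import stats

def snht_statistic(diff):
    """SNHT statistic T(a) for every candidate break position a."""
    z = (diff - diff.mean()) / diff.std(ddof=1)
    n = len(z)
    T = np.full(n, np.nan)
    for a in range(10, n - 10):          # skip very short subsamples
        T[a] = a * z[:a].mean() ** 2 + (n - a) * z[a:].mean() ** 2
    return T

def corrected_ttest(x1, x2):
    """Two-sample t-test with sample sizes reduced by the lag-1 autocorrelation."""
    def n_eff(x):
        r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
        return max(3.0, len(x) * (1 - r1) / (1 + r1))
    n1, n2 = n_eff(x1), n_eff(x2)
    t = (x1.mean() - x2.mean()) / np.sqrt(x1.var(ddof=1) / n1 + x2.var(ddof=1) / n2)
    df = min(n1, n2) - 1                 # conservative choice of degrees of freedom
    return t, 2.0 * stats.t.sf(abs(t), df)

# Synthetic difference series of 480 months with a 1 degC break at month 240.
rng = np.random.default_rng(0)
diff = rng.normal(0.0, 0.8, 480)
diff[240:] += 1.0

T = snht_statistic(diff)
tau = int(np.nanargmax(T))
t_val, p_val = corrected_ttest(diff[:tau], diff[tau:])
print(f"SNHT maximum {np.nanmax(T):.1f} at month {tau}; corrected t = {t_val:.1f}, p = {p_val:.2g}")
```

Run on such a synthetic series, the SNHT maximum far exceeds the quoted detection threshold, as expected for a break of about 1 °C in a difference series of this length.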
The adjustment algorithm

After an inhomogeneity (break) in the candidate series and its time is detected, the adjustment works as follows:

Case 1. When there is a sufficiently long overlapping interval (at least 3 years) of the subseries x_t^1 and x_t^2 around the break point, after verifying the synchronicity of the evolution in both subseries and the absence of clearly inhomogeneous features, the adjustment is made as the mean difference in the overlapping period (t = 1, ..., k):

\Delta = \frac{1}{k} \sum_{t=1}^{k} (x_t^1 - x_t^2)

Case 2. When a series undergoes a break for nonclimatic reasons, usually there are no overlapping data and the adjustments are based on multiple differences between the candidate series and the highly correlated series of the same region. The cross-correlations ρ_j between the candidate series x_t and the j available series x_t^j are computed for these intervals, with corrections for the autocorrelations if necessary (use of the whitened residuals, after separating an ARIMA process from the series). Up to k = 5 series are chosen for correction under the criterion of highest cross-correlations. Given an adjustment interval of m months, with data x_t and x_t^j at each side of the break at t = τ, {x_t, x_t^j; t = τ - m + 1, ..., τ; j = 1, ..., k} and {x_t, x_t^j; t = τ + 1, ..., τ + m; j = 1, ..., k}, the mean after:before difference is computed for each j = 1, ..., k. With the indices bef = before and aft = after the break, the

partial adjustment Δ_j, given by one neighbouring series j, is

\Delta_j = (\bar{x}^j_{aft} - \bar{x}_{aft}) - (\bar{x}^j_{bef} - \bar{x}_{bef}) = \frac{1}{m} \left[ \sum_{t=\tau+1}^{\tau+m} (x_t^j - x_t) - \sum_{t=\tau-m+1}^{\tau} (x_t^j - x_t) \right]   (1)

The total adjustment term is a linear superposition of the k individual offsets Δ_j, with the squared cross-correlations ρ_j^2 and the coefficients q_j (common data fraction) as weight factors:

\Delta = \frac{\sum_{j=1}^{k} \rho_j^2 q_j \Delta_j}{\sum_{j=1}^{k} \rho_j^2 q_j}   (2)

Figure 2. (A) A difference series of maximum temperatures; (B) the coefficients of the t-test with a 20-year running window (discontinuous line) and in the whole 40-year interval (both left axis), and of the SNHT in the 40-year interval (thick line, right axis).

This adjustment is applied to the data before the break, because leaving the recent data unchanged is a practical advantage for later updates. The adjustments of the breaks are always based on the whole monthly dataset, but generally the adjusted value does not depend on the month or the season. Only a few seasonally distinct adjustments are applied, when the seasonal discrepancies are particularly large. In the literature, different types of adjustments can be found (see, for example, Peterson et al., 1998a). Inhomogeneities in climate data often depend on the month or season, because of the seasonally diverging impacts of instrumental or environmental changes. Hence, an adjustment that depends on the month of the year can theoretically be better. However, it modifies the variability, the autocorrelation structure and the annual cycles of the data, whereas the adjustments generally performed here consist of a simple additive term. Furthermore, a monthly varying adjustment must work with 12 times fewer data (for a given interval length) and the confidence margins are substantially wider. Hence, adjustments of this type become more attractive when the initial data quality is higher than in the present study.
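The following sketch (Python, with illustrative names; not the authors' implementation) spells out the weighting of Equations (1) and (2). The sign convention is chosen here so that adding the resulting value to the pre-break data aligns them with the post-break level, as the text describes; the window length m and the handling of missing values are assumptions.

```python
# Sketch of the break adjustment of Equations (1)-(2): the offset of the
# candidate relative to up to five highly correlated regional series is
# averaged with weights rho_j^2 * q_j and added to the data before the break.
# Names, the window length m and the missing-data handling are illustrative.
import numpy as np

def adjust_break(candidate, neighbours, tau, m=120):
    """candidate, neighbours: 1-D anomaly arrays on a common time axis;
    tau: index of the last month before the break; m: months per side."""
    before = slice(max(0, tau - m + 1), tau + 1)
    after = slice(tau + 1, min(len(candidate), tau + m + 1))
    offsets, weights = [], []
    for x_j in neighbours[:5]:
        common = ~np.isnan(candidate) & ~np.isnan(x_j)
        rho = np.corrcoef(candidate[common], x_j[common])[0, 1]
        q = common.mean()                                  # common-data fraction
        # Candidate-minus-neighbour jump across the break; the sign is chosen so
        # that adding delta to the pre-break data aligns them with the recent data
        # (Equation (1) writes the same differences with the opposite ordering).
        d_j = np.nanmean(candidate[after] - x_j[after]) - \
              np.nanmean(candidate[before] - x_j[before])
        offsets.append(d_j)
        weights.append(rho ** 2 * q)
    delta = float(np.dot(weights, offsets) / np.sum(weights))   # Equation (2)
    adjusted = candidate.copy()
    adjusted[:tau + 1] += delta             # recent data are left unchanged
    return adjusted, delta
```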
The detection of individual extreme anomalies

The extreme anomalies are detected relative to a symmetrically running 30-year interval centred at each data point. The detection does not work relative to a fixed reference interval, but with a moving window, to determine extreme events relative to the mean temperatures and variability of their adjacent period. All local extreme anomalies are catalogued but, as in the preceding steps, their differences are crucial for the homogeneity analysis. An anomaly is generally (with few exceptions) considered inhomogeneous when its amplitude exceeds the 2.81σ level (99.5% confidence) in at least three difference series between the candidate and the surrounding series. Once the inhomogeneous data are deleted, the gaps can be filled in, as described in the following section.

The replacement of missing data

Data gaps are frequent in almost all climate data and are an obstacle for an analysis that requires complete series. However, to reject all incomplete series would mean discarding almost all data, and therefore a filling strategy for the gaps is necessary. Missing at random (MAR; Little and Rubin, 1987) is a basic condition for missing data (usually presupposed). It is fulfilled when the occurrence probability of a gap at a certain time is independent of the variable's value at this time. This is not the case when certain data are lacking because the values were extreme and could therefore not be measured.

The first type of available information for filling gaps is the intrinsic information in the series. The ARIMA method (autoregressive integrated moving average; see, for example, Box and Jenkins, 1976) analyses stationarity and autocorrelations of a series and decomposes it into an ARIMA series and white-noise residuals. A prediction can be drawn out of the ARIMA subseries (the best predictor of a white noise is always zero). For monthly temperatures, this method has little predictive potential

because the white-noise component generally explains 70-90% of the variability. Hence, the ARIMA method is used to fill a data gap with the intrinsic information of the candidate series only when no simultaneous regional data are available.

The second type of information, the synchronous data of the adjacent series, is more efficient in replacing missing data, on account of the high cross-correlations between nearby series. The basic idea is to weight the contribution of each time series according to its confidence level. This is done by a weighted average of the anomalies of up to m = 5 related series. Several points have to be considered for a proper gap-filling algorithm:

- The most relevant information is again contained in the adjacent years, because the series memory is rather short. Hence, all involved anomalies are computed relative to a symmetric interval (usually 30 years) around the gap.
- The interpolations are based on standardized anomalies, because standard deviations may differ systematically, even between highly correlated temperature series.
- The higher the correlation with the candidate, the higher is the weight that will be given to a series' contribution. The confidence in a contribution that is based on a reduced number of common data with the candidate decreases, and so must its weighting factor.
- If the cross-correlations of the surrounding series with the candidate are weak, the amplitude of the correction is reduced. This means a more cautious gap filling (closer to zero), because in the case of complete ignorance the gap would be filled with an anomaly of zero.

Let T_c and T_j (j = 1, ..., m) be the temperature anomalies of the candidate and the correction series at a certain time, σ_c and σ_j their standard deviations, and c_j the weighting factors. Then the anomaly to fill the gap of the candidate series is

T_s = \sum_{j=1}^{m} c_j \frac{\sigma_c}{\sigma_j} T_j , \qquad \sum_{j=1}^{m} c_j = 1   (3)

The weights c_j depend on the squared cross-correlations ρ_j^2 with the candidate and on a common-data parameter q_j, so that

c_j = \frac{q_j \rho_j^2}{\sum_{j=1}^{m} q_j \rho_j^2}   (4)

Weak cross-correlations are taken into account by computing their sum of squares S. If this factor is smaller than unity, the temperature estimate in Equation (3) is multiplied by S. If S is even smaller than 0.5, this method is discarded and the data gap is filled by an ARIMA interpolation (intrinsic information in the candidate series).
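A minimal sketch of this gap-filling rule (Equations (3) and (4)) is given below, assuming the anomalies are already computed and homogenized; the variable names, the handling of the neighbour list and the ARIMA fallback are illustrative rather than the authors' implementation.

```python
# Sketch of the gap-filling rule of Equations (3)-(4): a missing monthly
# anomaly is estimated from up to five neighbouring series, weighted by
# q_j * rho_j^2, and damped when the correlations are weak. Variable names
# are illustrative and the ARIMA fallback is only indicated.
import numpy as np

def fill_gap(candidate, neighbours, t):
    """candidate, neighbours: 1-D anomaly arrays; t: index of the missing month."""
    sigma_c = np.nanstd(candidate, ddof=1)
    contributions, weights, rho2_sum = [], [], 0.0
    for x_j in neighbours[:5]:                       # up to m = 5 series
        if np.isnan(x_j[t]):
            continue
        common = ~np.isnan(candidate) & ~np.isnan(x_j)
        rho = np.corrcoef(candidate[common], x_j[common])[0, 1]
        q = common.mean()                            # common-data parameter q_j
        sigma_j = np.nanstd(x_j, ddof=1)
        contributions.append((sigma_c / sigma_j) * x_j[t])   # standardized anomaly
        weights.append(q * rho ** 2)                 # numerator of Equation (4)
        rho2_sum += rho ** 2
    if not weights or rho2_sum < 0.5:
        raise NotImplementedError("fall back to an ARIMA interpolation here")
    estimate = float(np.dot(weights, contributions) / np.sum(weights))  # Equation (3)
    if rho2_sum < 1.0:                               # damping factor S
        estimate *= rho2_sum
    return estimate
```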
After this step, the regional average is computed and the series are now almost ready for a comparative analysis in each region, with minimized homogeneity problems and without gaps. The last factor to be considered, discussed below, is the urban heat island.

THE ADJUSTMENT OF THE URBAN HEAT ISLAND

The urban heat island

The small-scale urban warming is a well-known phenomenon in climatology. Its principal causes are the heat-storage capacity of buildings and streets, the quick removal of rainwater, the heat emissions from houses, vehicles and industry, and sometimes a reduced infrared radiative heat loss due to locally increased atmospheric turbidity. Several studies at different latitudes have found significant thermal differences between urban and rural observatories (Tereshchenko and Filonov, 2001; Figuerola and Mazzeo, 1998; Shahgedanova et al., 1997; Landsberg, 1981; Oke, 1973), and the state of knowledge about the urban heat island is described in Arnfield (2003).

An adjustment of the urban effect is necessary to attain realistic results concerning the thermal evolution and changes for large cities. The urban effect is usually greatest for the minimum temperatures in the early morning and under anticyclonic conditions (Montávez et al., 2000; Unger, 1996; Colacino and Lavagnini, 1982). Consequently, it also depends on the season: Yagüe et al. (1991) found a strong urban effect in Madrid in summer and a weaker effect in spring. In the present study, the adjustment will be made on a long-term basis, without a need to discriminate between seasons or weather types. Hence, the aim is to establish a quantitative relationship between the urban population (as a measure of city size) and urban-rural temperature differences.

The empirical urban adjustment

Unfortunately, the Spanish data coverage is not sufficient for a study of this relationship on a solid statistical basis, and thus an empirical result is used. After reviewing the literature (the aforementioned studies and, moreover, Kukla et al. (1986), Colacino and Rovelli (1983), Moreno García (1994), Portman (1993), Kozuchowski et al. (1994) and Karl et al. (1988)), we adopted from the latter study the relation

\Delta T_{urb-rur} = a \cdot popul_{urb}^{0.45}   (5)

where popul_urb represents the population of the city. This result is based on a large number of data series (more than 1200) and was recently confirmed by Englehart and Douglas (2003). The most consistent results were found with a coefficient a = (2.39 ± 0.70) × 10^-3 K (±95%) for minimum temperatures (Karl et al., 1988). For maximum temperatures, the results did not differ significantly from zero. Furthermore, an urban effect on the maxima was

neither theoretically well explained nor clearly confirmed in the above works, although Philandras et al. (1999) reported an urban effect in Athens that was stronger in the maximum than in the minimum temperatures. Hence, in the present study, this empirical urban correction is applied to the local representative series of each region, as a function of its population, but only for the minimum temperatures. The urban thermal effect is generally weaker in Europe than in North America, and the corresponding adjustment factor of 0.7 (Karl et al., 1988) is applied and discussed for Spain.

An alternative adjustment for the minimum temperatures in Madrid

For Madrid, an alternative correction for the minimum temperatures is constructed with the data of Madrid Retiro and Toledo. The latter observatory is far enough from Madrid to be outside the urban area, but near enough to have almost identical climatic conditions. Segovia and Ávila are discarded because a mountain range divides these observatories from Madrid, while Guadalajara is rejected because of its limited data coverage (Table I). The differences of the minimum temperatures of Madrid–Toledo show the urban influence, with a clear and highly significant increase between 1930 and 1970 (Figure 3), roughly synchronous with the urban growth of Madrid (the growth of Toledo is considered negligible). The linear regression

\Delta T_{urb-rur} = p + q \cdot pop_{Madrid}   (6)

links the differences in minimum temperature and the population (the parameters p and q are given in Table II).

Table II. Row I: parameters of the linear regression for the differences of Madrid–Toledo in minimum temperatures, as a function of the population of Madrid (in millions). Row II: like I, but for the 8-year moving averages; ε is the total error of the estimation in the period. The coefficient in row II has a smaller error than the one in row I, due to the compensation of individual anomalies, and has been chosen to compute the adjustment. Columns: p (°C), q (°C per 10^6 inhabitants), ε (°C). [Numerical entries not preserved in this copy.]

Figure 3. Population of Madrid (dotted line, right axis) and 8-year moving averages of the differences of Madrid–Toledo in minimum temperatures (continuous line, left axis).

For the minimum temperatures in Madrid, both urban corrections can be compared (the empirical urban correction and the approach with the data of Madrid and Toledo). The resulting coefficient q in Equation (6) defines an urban correction of approximately 0.35 °C for each million inhabitants, whereas the empirical adjustment of Equation (5), applied to Madrid, is larger: even under application of the reduction factor of 0.7, the correction is about 1.27 °C for the first million inhabitants, 0.46 °C for the second and 0.35 °C for the third million.
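As a worked comparison of the two corrections, the short calculation below evaluates the empirical power law of Equation (5), with the Karl et al. coefficient and the 0.7 European reduction factor, against the linear Madrid-Toledo relation of roughly 0.35 °C per million inhabitants from Equation (6). The round population values are illustrative, and small differences from the figures quoted above arise from rounding of the coefficients.

```python
# Worked comparison of the two urban corrections for the Madrid minima: the
# empirical power law of Equation (5), with the Karl et al. (1988) coefficient
# and the 0.7 reduction factor for Europe, against the linear Madrid-Toledo
# regression of Equation (6) at about 0.35 degC per million inhabitants.
# Population values are illustrative round numbers; small differences from the
# figures quoted in the text come from rounding of the coefficients.
a = 2.39e-3          # K, Equation (5), minimum temperatures
reduction = 0.7      # adjustment factor applied for European cities
q = 0.35             # degC per 10^6 inhabitants, Equation (6) (intercept omitted)

for millions in (1, 2, 3):
    power_law = reduction * a * (millions * 1e6) ** 0.45   # Equation (5)
    regression = q * millions                               # Equation (6)
    print(f"{millions} million inhabitants: power law {power_law:.2f} degC, "
          f"regression {regression:.2f} degC")
```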

RESULTS

The compiled homogenized dataset; adjustments and rejected data

In this study, 43 monthly series of maximum and minimum temperatures, almost all the available Spanish long-term series with coverage longer than 30 years, have been organized into seven regional groups and homogenized. The analysis of data quality confirmed widespread homogeneity problems. Adjustments were necessary in almost all series, although the criteria for the detection of inhomogeneities were severe (high significance and redundancy levels). In some cases, long intervals (the maxima in León and the minima in Guadalajara and Cuenca) or entire series (the maximum temperatures in Valladolid and the minimum temperatures in Ciudad Real) were rejected because of a lack of homogeneity. On the whole, 59 (85) inhomogeneities were adjusted in the maximum (minimum) temperatures (Table II), with mean amplitudes of 1.00 °C (1.05 °C); in addition, there were many rejected intervals and individual data. On average, one adjustment was made for every 44.5 years (66 years), and a series of roughly one hundred years required an approximate mean of two adjustments.

The temperature evolution of each region was then represented by its average anomalies and one local series. This dataset maximized the confidence, because all participating series were carefully analyzed and adjusted for homogeneity. Interregional differences ( km scale) in the temperature evolution were resolved, although they were of second order compared to the common variability at the 1000 km scale. The sub-regional differences (<100 km) were of third order and impossible to resolve with these series, owing to the limited data quality and because the adjustments had intentionally mixed the data within one region.

The adjustments applied to the data series consist mainly of corrections of breaks (abrupt changes). Moreover, the series were also scanned for individual inhomogeneous and extreme data, and the gaps in the two representative series of each region were filled. Table II gives an overview of all the adjustments and, as an example, Table III lists the details of the adjustments for the maximum and the minimum temperatures in Madrid. As further examples of the results, Figures 4 and 5 show the monthly temperature anomalies before and after the homogenization process for four different series. Some effects of the homogenization are clearly visible: in La Coruña (Figure 4(A), (B)), the net warming of the minima was too large, as a consequence of an inhomogeneous break of considerable amplitude; in Seville (Figure 4(C), (D)), the 19th century data were rejected because of the lack of simultaneous regional data, an important break was adjusted and the data of three different series were unified (with adjustments); in the maxima in Madrid (Figure 5(A), (B)), the large break impedes a reasonable analysis without homogenizing; and in the minima in Madrid (Figure 5(C), (D)), the two breaks are also important. The increase in data homogeneity and quality for climate-change studies (the main goal of this study) is further investigated below in the sections 'An estimation of the error margins of raw and homogenized data' and 'A comparison of some results, based on raw and homogenized data'.

Figure 4. Monthly anomalies of the minimum temperatures in La Coruña and the maximum temperatures in Seville: raw data (left side) and homogenized data (right side; all series with distance-weighted least-squares fits).

Figure 5. Monthly anomalies of maximum and minimum temperatures in Madrid: raw data (left side) and homogenized data (right side; all series with distance-weighted least-squares fits).

An estimation of the error margins of raw and homogenized data

The instrumental error in temperature measurement was of the order of 0.1 °C (Linacre, 1992; Servicio Meteorológico Nacional, 1956) and increased to around 0.2 °C when differences between two series are concerned (assuming linear error propagation). Any linear homogeneity adjustment based on the same data type added an error of the same amplitude. A long-term series of roughly one century required an average of two adjustments, and the mean margin of this error (instrumental plus homogenization) increased to 0.3-0.4 °C, an amplitude of the order of the mean global warming of the 20th century (0.6 °C; IPCC, 2001). This comparison illustrates the crucial role of data quality. On the other hand, the mean amplitude of the adjustments (around 1 °C) defined the mean error of the inhomogeneities and the uncertainty in the raw-data series, besides the instrumental error of 0.1 °C. A large series had between one and five inhomogeneous breaks, with a statistical average of around two. The errors tended to cancel each other or to accumulate (partially or entirely). In the latter case, the total error (instrumental plus inhomogeneities) could exceed 1 °C, or sometimes even be higher than 2 °C. This hampered the detection of climate changes at any reasonable confidence level.

The critical role of data homogeneity was confirmed in an extensive analysis of 20th century surface-air temperature and precipitation data from the European Climate Assessment in Wijngaard et al. (2003). The authors organized the quality of the tested series into the classes 'useful', 'doubtful' and 'suspect'. In the period ( ), 94% (61%) of the temperature series are labelled doubtful or suspect. Referring to trends and the variability of weather extremes, the authors state that "Clearly, this type of analysis is limited by the degree of inhomogeneity of the data". To compare these statements with the data of the present study, the following paragraph summarizes some comparisons between temperature changes in raw and adjusted data.

A comparison of some results, based on raw and homogenized data

To compare the thermal changes detected with raw and homogenized data, we applied a t-test

12 M. STAUDT., M. J. ESTEBAN-PARRA AND Y. CASTRO-DÍEZ Table III. Details of the adjustments of the maximum and the minimum temperatures in Madrid. The iteration steps 1 and 3 are there because of the other series, although nothing was done in the Madrid series. The adjusted values are added to all data before the break. The symbol s/n is a signal to noise ratio : the quotient of the adjusted value and the standard deviation of the base interval. Maximum temperatures Minimum temperatures A. Adjustment of inhomogeneous breaks and rejection of inhomogeneous data Iteration step 1 Iteration step 2 break: , adjusted with data of Burgos, anomaly October 1894 rejected. Salamanca, Segovia, Soria and Albacete, base interval: , value: 1.81 C, s/n = break: , adjusted with data of Salamanca, Segovia, Soria, Toledo, Ciudad Real and Cuenca, base interval: , value: C, s/n = 0.45 Iteration step 3 Iteration step 4 break: Nov March 1937, adjusted with data of Burgos, Palencia, Salamanca, Segovia, Toledo, Cuenca; interval: Jan Dec. 1951, value: 0.65 C, s/n = anomaly August 1993 rejected. break: , adjusted with data of Burgos, Salamanca, Segovia, Soria and Albacete, base interval: , value: C, s/n = break: , adjusted with data of Burgos, Salamanca and Albacete, base interval: , value: 0.71 C, s/n = B. Filling of data gaps with data from 4 11 series (of the central plains) and base intervals of approximately 20 years around the missing data. Feb. and Dec. 1875, July 1878, July 1879, July Dec. 1875, July 1897, Oct. 1894, Sept. 1928, 1897, June 1905, June 1922, Nov. Dec. 1936, Nov. Dec. 1936, Jan Feb Apr Oct. 1937, Jan Feb April Oct. 1937, March and April Mar-Apr 1939, Aug , Aug (autocorrelation-corrected) to the temperature means of the first and last 30-year intervals of the 20th century, in six examples (Table IV). After homogenization, the regional net temperature changes were substantially more similar and more consistent between the local and the mean representative series. In several cases, even the qualitative results and their significance levels differed: in the maximum temperatures of the Cantabrian and the Ebro valley and the minima of the Mediterranean, there was a lack of consistency between the raw local and mean series, where only one series showed a highly significant change. In all cases, the degree of consistency in the homogenized data was at least similar, but was usually higher. These results confirmed the substantially larger errors of the raw series and suggested that an analysis based on the raw data in many cases may not be valid if a reasonable confidence level is requested. Furthermore, according to An estimation of the error margins of raw and homogenized data, the homogenization procedure improves the data quality by reducing the error margins (seeeliminating the error of the order of 1 C, due to the inhomogeneities) and is strongly recommended as a previous step, before analysing the data. The empirical urban correction and the approach with the data of Madrid and Toledo To test the performance of both urban corrections for Madrid, the differences between the average series of central Spain (without Madrid) and Madrid were compared (Figure 6). The average series stemmed from medium-sized towns with an average population of around , for which the urban effect was negligible, compared to Madrid. The decreasing trend of the differences without any urban correction was, at least partly, owing to the urban effect (the climatic differences in central Spain were not large). 
With the empirical correction, this trend was reversed, signifying over-adjustment of the urban effect. The series C, with the alternative adjustment (Madrid Toledo) still showed an increase, but clearly weaker and less significant than B, indicating a more realistic correction, although slightly too great. The urban effect in Madrid was smaller than it would be theoretically, following the population data and the comparison with Toledo. The Madrid data were compiled from the Retiro park observatory, located in the urban centre, but close to the edge of this green area of about 1.2 km 2. The minima at dawn were very probably lowered, thus attenuating the urban effect. García Hernández et al. (1997) stated a clear influence of


More information

James Hansen, Reto Ruedy, Makiko Sato, Ken Lo

James Hansen, Reto Ruedy, Makiko Sato, Ken Lo If It s That Warm, How Come It s So Damned Cold? James Hansen, Reto Ruedy, Makiko Sato, Ken Lo The past year, 2009, tied as the second warmest year in the 130 years of global instrumental temperature records,

More information

Robichaud K., and Gordon, M. 1

Robichaud K., and Gordon, M. 1 Robichaud K., and Gordon, M. 1 AN ASSESSMENT OF DATA COLLECTION TECHNIQUES FOR HIGHWAY AGENCIES Karen Robichaud, M.Sc.Eng, P.Eng Research Associate University of New Brunswick Fredericton, NB, Canada,

More information

Climate and Weather. This document explains where we obtain weather and climate data and how we incorporate it into metrics:

Climate and Weather. This document explains where we obtain weather and climate data and how we incorporate it into metrics: OVERVIEW Climate and Weather The climate of the area where your property is located and the annual fluctuations you experience in weather conditions can affect how much energy you need to operate your

More information

A Basic Introduction to Missing Data

A Basic Introduction to Missing Data John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item

More information

Appendix 1: Time series analysis of peak-rate years and synchrony testing.

Appendix 1: Time series analysis of peak-rate years and synchrony testing. Appendix 1: Time series analysis of peak-rate years and synchrony testing. Overview The raw data are accessible at Figshare ( Time series of global resources, DOI 10.6084/m9.figshare.929619), sources are

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

Short-Term Forecasting in Retail Energy Markets

Short-Term Forecasting in Retail Energy Markets Itron White Paper Energy Forecasting Short-Term Forecasting in Retail Energy Markets Frank A. Monforte, Ph.D Director, Itron Forecasting 2006, Itron Inc. All rights reserved. 1 Introduction 4 Forecasting

More information

Do Commodity Price Spikes Cause Long-Term Inflation?

Do Commodity Price Spikes Cause Long-Term Inflation? No. 11-1 Do Commodity Price Spikes Cause Long-Term Inflation? Geoffrey M.B. Tootell Abstract: This public policy brief examines the relationship between trend inflation and commodity price increases and

More information

El Niño-Southern Oscillation (ENSO) since A.D. 1525; evidence from tree-ring, coral and ice core records.

El Niño-Southern Oscillation (ENSO) since A.D. 1525; evidence from tree-ring, coral and ice core records. El Niño-Southern Oscillation (ENSO) since A.D. 1525; evidence from tree-ring, coral and ice core records. Karl Braganza 1 and Joëlle Gergis 2, 1 Climate Monitoring and Analysis Section, National Climate

More information

Critical Limitations of Wind Turbine Power Curve Warranties

Critical Limitations of Wind Turbine Power Curve Warranties Critical Limitations of Wind Turbine Power Curve Warranties A. Albers Deutsche WindGuard Consulting GmbH, Oldenburger Straße 65, D-26316 Varel, Germany E-mail: a.albers@windguard.de, Tel: (++49) (0)4451/9515-15,

More information

163 ANALYSIS OF THE URBAN HEAT ISLAND EFFECT COMPARISON OF GROUND-BASED AND REMOTELY SENSED TEMPERATURE OBSERVATIONS

163 ANALYSIS OF THE URBAN HEAT ISLAND EFFECT COMPARISON OF GROUND-BASED AND REMOTELY SENSED TEMPERATURE OBSERVATIONS ANALYSIS OF THE URBAN HEAT ISLAND EFFECT COMPARISON OF GROUND-BASED AND REMOTELY SENSED TEMPERATURE OBSERVATIONS Rita Pongrácz *, Judit Bartholy, Enikő Lelovics, Zsuzsanna Dezső Eötvös Loránd University,

More information

JetBlue Airways Stock Price Analysis and Prediction

JetBlue Airways Stock Price Analysis and Prediction JetBlue Airways Stock Price Analysis and Prediction Team Member: Lulu Liu, Jiaojiao Liu DSO530 Final Project JETBLUE AIRWAYS STOCK PRICE ANALYSIS AND PREDICTION 1 Motivation Started in February 2000, JetBlue

More information

Pozuelo de Alarcón is the city with the highest level of income and lowest unemployment rate of the 109 analyzed

Pozuelo de Alarcón is the city with the highest level of income and lowest unemployment rate of the 109 analyzed 30 June 2015 Urban Indicators (Urban Audit) Year 2015 Pozuelo de Alarcón is the city with the highest level of income and lowest unemployment rate of the 109 analyzed Sanlúcar de Barrameda has the highest

More information

Multiple Regression: What Is It?

Multiple Regression: What Is It? Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

More information

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1) CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

More information

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection

Stepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection Chapter 311 Introduction Often, theory and experience give only general direction as to which of a pool of candidate variables (including transformed variables) should be included in the regression model.

More information

Integrated Resource Plan

Integrated Resource Plan Integrated Resource Plan March 19, 2004 PREPARED FOR KAUA I ISLAND UTILITY COOPERATIVE LCG Consulting 4962 El Camino Real, Suite 112 Los Altos, CA 94022 650-962-9670 1 IRP 1 ELECTRIC LOAD FORECASTING 1.1

More information

http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions

http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions A Significance Test for Time Series Analysis Author(s): W. Allen Wallis and Geoffrey H. Moore Reviewed work(s): Source: Journal of the American Statistical Association, Vol. 36, No. 215 (Sep., 1941), pp.

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Time Series Analysis

Time Series Analysis Time Series Analysis Forecasting with ARIMA models Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 2012 Alonso and García-Martos (UC3M-UPM)

More information

Residential Market Report

Residential Market Report Residential Market Report Madrid City and the Metropolitan Area of Madrid A g u i r r e N e w m a n June 2014 A G E N D A 01 02 03 04 INTRODUCTION AND METHODOLOGY GEOGRAPHIC DISTRIBUTION CONCLUSIONS OF

More information

The Fundación Secretariado Gitano

The Fundación Secretariado Gitano The Fundación Secretariado Gitano Mission, values and aims The Fundación Secretariado Gitano is a nonprofit inter-cultural social organisation which provides services for the development of the Roma community

More information

CLOUD COVER IMPACT ON PHOTOVOLTAIC POWER PRODUCTION IN SOUTH AFRICA

CLOUD COVER IMPACT ON PHOTOVOLTAIC POWER PRODUCTION IN SOUTH AFRICA CLOUD COVER IMPACT ON PHOTOVOLTAIC POWER PRODUCTION IN SOUTH AFRICA Marcel Suri 1, Tomas Cebecauer 1, Artur Skoczek 1, Ronald Marais 2, Crescent Mushwana 2, Josh Reinecke 3 and Riaan Meyer 4 1 GeoModel

More information

Advanced Forecasting Techniques and Models: ARIMA

Advanced Forecasting Techniques and Models: ARIMA Advanced Forecasting Techniques and Models: ARIMA Short Examples Series using Risk Simulator For more information please visit: www.realoptionsvaluation.com or contact us at: admin@realoptionsvaluation.com

More information

Data Processing Flow Chart

Data Processing Flow Chart Legend Start V1 V2 V3 Completed Version 2 Completion date Data Processing Flow Chart Data: Download a) AVHRR: 1981-1999 b) MODIS:2000-2010 c) SPOT : 1998-2002 No Progressing Started Did not start 03/12/12

More information

Promotional Forecast Demonstration

Promotional Forecast Demonstration Exhibit 2: Promotional Forecast Demonstration Consider the problem of forecasting for a proposed promotion that will start in December 1997 and continues beyond the forecast horizon. Assume that the promotion

More information

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU

PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU PITFALLS IN TIME SERIES ANALYSIS Cliff Hurvich Stern School, NYU The t -Test If x 1,..., x n are independent and identically distributed with mean 0, and n is not too small, then t = x 0 s n has a standard

More information

Module 6: Introduction to Time Series Forecasting

Module 6: Introduction to Time Series Forecasting Using Statistical Data to Make Decisions Module 6: Introduction to Time Series Forecasting Titus Awokuse and Tom Ilvento, University of Delaware, College of Agriculture and Natural Resources, Food and

More information

Probabilistic Forecasting of Medium-Term Electricity Demand: A Comparison of Time Series Models

Probabilistic Forecasting of Medium-Term Electricity Demand: A Comparison of Time Series Models Fakultät IV Department Mathematik Probabilistic of Medium-Term Electricity Demand: A Comparison of Time Series Kevin Berk and Alfred Müller SPA 2015, Oxford July 2015 Load forecasting Probabilistic forecasting

More information

An introduction to Value-at-Risk Learning Curve September 2003

An introduction to Value-at-Risk Learning Curve September 2003 An introduction to Value-at-Risk Learning Curve September 2003 Value-at-Risk The introduction of Value-at-Risk (VaR) as an accepted methodology for quantifying market risk is part of the evolution of risk

More information

STATISTICAL ANALYSIS OF UBC FACULTY SALARIES: INVESTIGATION OF

STATISTICAL ANALYSIS OF UBC FACULTY SALARIES: INVESTIGATION OF STATISTICAL ANALYSIS OF UBC FACULTY SALARIES: INVESTIGATION OF DIFFERENCES DUE TO SEX OR VISIBLE MINORITY STATUS. Oxana Marmer and Walter Sudmant, UBC Planning and Institutional Research SUMMARY This paper

More information

Physics Lab Report Guidelines

Physics Lab Report Guidelines Physics Lab Report Guidelines Summary The following is an outline of the requirements for a physics lab report. A. Experimental Description 1. Provide a statement of the physical theory or principle observed

More information

Predicting daily incoming solar energy from weather data

Predicting daily incoming solar energy from weather data Predicting daily incoming solar energy from weather data ROMAIN JUBAN, PATRICK QUACH Stanford University - CS229 Machine Learning December 12, 2013 Being able to accurately predict the solar power hitting

More information

THE STATISTICAL TREATMENT OF EXPERIMENTAL DATA 1

THE STATISTICAL TREATMENT OF EXPERIMENTAL DATA 1 THE STATISTICAL TREATMET OF EXPERIMETAL DATA Introduction The subject of statistical data analysis is regarded as crucial by most scientists, since error-free measurement is impossible in virtually all

More information

THE DEVELOPMENT OF A NEW DATASET OF SPANISH DAILY ADJUSTED TEMPERATURE SERIES (SDATS) (1850 2003)

THE DEVELOPMENT OF A NEW DATASET OF SPANISH DAILY ADJUSTED TEMPERATURE SERIES (SDATS) (1850 2003) INTERNATIONAL JOURNAL OF CLIMATOLOGY Int. J. Climatol. 26: 1777 1802 (2006) Published online 5 May 2006 in Wiley InterScience (www.interscience.wiley.com).1338 THE DEVELOPMENT OF A NEW DATASET OF SPANISH

More information

Chapter 1 Introduction. 1.1 Introduction

Chapter 1 Introduction. 1.1 Introduction Chapter 1 Introduction 1.1 Introduction 1 1.2 What Is a Monte Carlo Study? 2 1.2.1 Simulating the Rolling of Two Dice 2 1.3 Why Is Monte Carlo Simulation Often Necessary? 4 1.4 What Are Some Typical Situations

More information

Climatography of the United States No. 20 1971-2000

Climatography of the United States No. 20 1971-2000 Climate Division: CA 6 NWS Call Sign: SAN Month (1) Min (2) Month(1) Extremes Lowest (2) Temperature ( F) Lowest Month(1) Degree s (1) Base Temp 65 Heating Cooling 100 Number of s (3) Jan 65.8 49.7 57.8

More information

AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

More information

Examining the Recent Pause in Global Warming

Examining the Recent Pause in Global Warming Examining the Recent Pause in Global Warming Global surface temperatures have warmed more slowly over the past decade than previously expected. The media has seized this warming pause in recent weeks,

More information

How To Calculate Global Radiation At Jos

How To Calculate Global Radiation At Jos IOSR Journal of Applied Physics (IOSR-JAP) e-issn: 2278-4861.Volume 7, Issue 4 Ver. I (Jul. - Aug. 2015), PP 01-06 www.iosrjournals.org Evaluation of Empirical Formulae for Estimating Global Radiation

More information

Statistics. Measurement. Scales of Measurement 7/18/2012

Statistics. Measurement. Scales of Measurement 7/18/2012 Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does

More information

Content Sheet 7-1: Overview of Quality Control for Quantitative Tests

Content Sheet 7-1: Overview of Quality Control for Quantitative Tests Content Sheet 7-1: Overview of Quality Control for Quantitative Tests Role in quality management system Quality Control (QC) is a component of process control, and is a major element of the quality management

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

Service Charter 2014-2017

Service Charter 2014-2017 State Public Employment Service Service Charter 2014-2017 GOBIERNO DE ESPAÑA MINISTERIO DE EMPLEO Y SEGURIDAD SOCIAL SERVICIO PÚBLICO DE EMPLEO ESTATAL Catálogo de publicaciones de la Administración General

More information

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple

More information

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting

More information

Development of new hybrid geoid model for Japan, GSIGEO2011. Basara MIYAHARA, Tokuro KODAMA, Yuki KUROISHI

Development of new hybrid geoid model for Japan, GSIGEO2011. Basara MIYAHARA, Tokuro KODAMA, Yuki KUROISHI Development of new hybrid geoid model for Japan, GSIGEO2011 11 Development of new hybrid geoid model for Japan, GSIGEO2011 Basara MIYAHARA, Tokuro KODAMA, Yuki KUROISHI (Published online: 26 December 2014)

More information

TEC H N I C A L R E P O R T

TEC H N I C A L R E P O R T I N S P E C T A TEC H N I C A L R E P O R T Master s Thesis Determination of Safety Factors in High-Cycle Fatigue - Limitations and Possibilities Robert Peterson Supervisors: Magnus Dahlberg Christian

More information

On Correlating Performance Metrics

On Correlating Performance Metrics On Correlating Performance Metrics Yiping Ding and Chris Thornley BMC Software, Inc. Kenneth Newman BMC Software, Inc. University of Massachusetts, Boston Performance metrics and their measurements are

More information

COMPUTING CLOUD MOTION USING A CORRELATION RELAXATION ALGORITHM Improving Estimation by Exploiting Problem Knowledge Q. X. WU

COMPUTING CLOUD MOTION USING A CORRELATION RELAXATION ALGORITHM Improving Estimation by Exploiting Problem Knowledge Q. X. WU COMPUTING CLOUD MOTION USING A CORRELATION RELAXATION ALGORITHM Improving Estimation by Exploiting Problem Knowledge Q. X. WU Image Processing Group, Landcare Research New Zealand P.O. Box 38491, Wellington

More information

Application and results of automatic validation of sewer monitoring data

Application and results of automatic validation of sewer monitoring data Application and results of automatic validation of sewer monitoring data M. van Bijnen 1,3 * and H. Korving 2,3 1 Gemeente Utrecht, P.O. Box 8375, 3503 RJ, Utrecht, The Netherlands 2 Witteveen+Bos Consulting

More information

South Africa. General Climate. UNDP Climate Change Country Profiles. A. Karmalkar 1, C. McSweeney 1, M. New 1,2 and G. Lizcano 1

South Africa. General Climate. UNDP Climate Change Country Profiles. A. Karmalkar 1, C. McSweeney 1, M. New 1,2 and G. Lizcano 1 UNDP Climate Change Country Profiles South Africa A. Karmalkar 1, C. McSweeney 1, M. New 1,2 and G. Lizcano 1 1. School of Geography and Environment, University of Oxford. 2. Tyndall Centre for Climate

More information

Statistics 2014 Scoring Guidelines

Statistics 2014 Scoring Guidelines AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

More information

Water for services advanced study. Report on Grant Agreement No 71301.2006.002-2006.471

Water for services advanced study. Report on Grant Agreement No 71301.2006.002-2006.471 Report on Grant Agreement No 71301.2006.002-2006.471 Statistics Sweden 3(21) Contents Abstract 4 Background 5 The Project 5 Objective 5 Project plan 5 Methodology 6 Phase 1: Preparation of study 6 Defining

More information

TIME SERIES ANALYSIS

TIME SERIES ANALYSIS TIME SERIES ANALYSIS L.M. BHAR AND V.K.SHARMA Indian Agricultural Statistics Research Institute Library Avenue, New Delhi-0 02 lmb@iasri.res.in. Introduction Time series (TS) data refers to observations

More information

An Assessment of Prices of Natural Gas Futures Contracts As A Predictor of Realized Spot Prices at the Henry Hub

An Assessment of Prices of Natural Gas Futures Contracts As A Predictor of Realized Spot Prices at the Henry Hub An Assessment of Prices of Natural Gas Futures Contracts As A Predictor of Realized Spot Prices at the Henry Hub This article compares realized Henry Hub spot market prices for natural gas during the three

More information

CALCULATION OF COMPOSITE LEADING INDICATORS: A COMPARISON OF TWO DIFFERENT METHODS

CALCULATION OF COMPOSITE LEADING INDICATORS: A COMPARISON OF TWO DIFFERENT METHODS Olivier Brunet OECD Secretariat Paris, France Olivier.Brunet@oecd.org Session: Macroeconomic Analysis and Forecasting B (CLI) CALCULATION OF COMPOSITE LEADING INDICATORS: A COMPARISON OF TWO DIFFERENT

More information

How To Forecast Solar Power

How To Forecast Solar Power Forecasting Solar Power with Adaptive Models A Pilot Study Dr. James W. Hall 1. Introduction Expanding the use of renewable energy sources, primarily wind and solar, has become a US national priority.

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

Climatography of the United States No. 20 1971-2000

Climatography of the United States No. 20 1971-2000 Climate Division: CA 4 NWS Call Sign: Month (1) Min (2) Month(1) Extremes Lowest (2) Temperature ( F) Lowest Month(1) Degree s (1) Base Temp 65 Heating Cooling 1 Number of s (3) Jan 59.3 41.7 5.5 79 1962

More information

ideas from RisCura s research team

ideas from RisCura s research team ideas from RisCura s research team thinknotes april 2004 A Closer Look at Risk-adjusted Performance Measures When analysing risk, we look at the factors that may cause retirement funds to fail in meeting

More information

Preparatory Paper on Focal Areas to Support a Sustainable Energy System in the Electricity Sector

Preparatory Paper on Focal Areas to Support a Sustainable Energy System in the Electricity Sector Preparatory Paper on Focal Areas to Support a Sustainable Energy System in the Electricity Sector C. Agert, Th. Vogt EWE Research Centre NEXT ENERGY, Oldenburg, Germany corresponding author: Carsten.Agert@next-energy.de

More information

The correlation coefficient

The correlation coefficient The correlation coefficient Clinical Biostatistics The correlation coefficient Martin Bland Correlation coefficients are used to measure the of the relationship or association between two quantitative

More information

A new approach for event study of private placement announcement effect: Evidence from China Yuling Zhao 1,a*

A new approach for event study of private placement announcement effect: Evidence from China Yuling Zhao 1,a* International Conference on Education, Management and Computing Technology (ICEMCT 21) A new approach for event study of private placement announcement effect: Evidence from China Yuling Zhao 1,a* 1 Experiment

More information

ENERGY STAR for Data Centers

ENERGY STAR for Data Centers ENERGY STAR for Data Centers Alexandra Sullivan US EPA, ENERGY STAR February 4, 2010 Agenda ENERGY STAR Buildings Overview Energy Performance Ratings Portfolio Manager Data Center Initiative Objective

More information

Nonparametric Tests for Randomness

Nonparametric Tests for Randomness ECE 461 PROJECT REPORT, MAY 2003 1 Nonparametric Tests for Randomness Ying Wang ECE 461 PROJECT REPORT, MAY 2003 2 Abstract To decide whether a given sequence is truely random, or independent and identically

More information

Data Quality Assurance and Control Methods for Weather Observing Networks. Cindy Luttrell University of Oklahoma Oklahoma Mesonet

Data Quality Assurance and Control Methods for Weather Observing Networks. Cindy Luttrell University of Oklahoma Oklahoma Mesonet Data Quality Assurance and Control Methods for Weather Observing Networks Cindy Luttrell University of Oklahoma Oklahoma Mesonet Quality data are Trustworthy Reliable Accurate Precise Accessible Data life

More information

Effects of CEO turnover on company performance

Effects of CEO turnover on company performance Headlight International Effects of CEO turnover on company performance CEO turnover in listed companies has increased over the past decades. This paper explores whether or not changing CEO has a significant

More information