Climate change impact on hydrological extremes along rivers and urban drainage systems in Belgium


CCI-HYDR project (contract SD/CP/3A) for: Programme SSD «Science for a Sustainable Development»

TECHNICAL REPORT, MAY 2008

Climate change impact on hydrological extremes along rivers and urban drainage systems in Belgium

III. Statistical analysis of historical rainfall, ETo and river flow series trends and cycles

Faculty of Engineering, Department of Civil Engineering, Hydraulics Division, CCI-HYDR project
Royal Meteorological Institute of Belgium, Meteorological Research and Development Department, Risk Analysis and Sustainable Development Section

Faculty of Engineering, Department of Civil Engineering, Hydraulics Section
Kasteelpark Arenberg 40, BE-3001 Leuven, Belgium
Patrick.Willems@bwk.kuleuven.be

Meteorological Research and Development Department, Risk Analysis and Sustainable Development Section
Avenue Circulaire 3, BE-1180 Brussels, Belgium
Emmanuel.Roulin@oma.be

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without indicating the reference: Ntegeka V., Willems P., 2008. Climate change impact on hydrological extremes along rivers and urban drainage systems. III. Statistical analysis of historical rainfall, ETo and river flow series trends and cycles, Belgian Science Policy SSD Research Programme, Technical report CCI-HYDR project by K.U.Leuven Hydraulics Section & Royal Meteorological Institute of Belgium, May 2008, 37 p.

CCI-HYDR III. Statistical analysis trends and cycles

Table of contents

1. Introduction
2. Statistical analysis of trends and cycles in the historical Uccle rainfall series
   2.1 Introduction
   2.2 Methodology based on extreme value analysis
       2.2.1 Peaks-Over-Threshold extremes
       2.2.2 Aggregation levels and time scales
       2.2.3 Frequency analysis of extremes
       2.2.4 Rainfall distributions
             The two-component exponential distribution
             The Weibull distribution
             Calibration of the distributions
             Random sample generations
       2.2.5 Monte Carlo confidence intervals
       2.2.6 Quantile-perturbation approach
   2.3 Results
       2.3.1 Quantile perturbations for extreme rainfall conditions
       2.3.2 Slopes of quantile perturbations versus return period
       2.3.3 Mean POT perturbations
       2.3.4 Number of events
       2.3.5 Statistical hypothesis testing on the clustering of rainfall extremes
             Slope hypothesis testing
             Hypothesis testing average perturbations
   2.4 Conclusions
3. Statistical trend analysis for evapotranspiration
4. Trend analysis of historical river flow series
   4.1 Flow perturbation analysis
       Summer flow perturbations
       Autumn flow perturbations
       Winter flow perturbations
       Spring flow perturbations
   4.2 Conclusions from the flow perturbations


1. Introduction

There is a general perception that the climate of the most recent decades has changed. In Europe, evidence shows increasing temperature trends in the late 20th and early 21st centuries (Luterbacher et al., 2004; Xoplaki et al., 2005). Various statistical methods exist for studying the variability of extremes, and such techniques provide a basis for understanding and evaluating changes within hydro-meteorological time series. While long-term temperature records are comparatively easy to interpret, patterns of precipitation are less clear because precipitation is inherently intermittent. Frequency and perturbation methods can be combined to reveal trends and cycles that would otherwise be missed. This section covers the statistical approach for studying rainfall extremes in the historical Uccle precipitation series. The approach combines aspects of frequency analysis and perturbation analysis and thus provides an insightful temporal assessment of the trends and cycles of the extremes.

2. Statistical analysis of trends and cycles in the historical Uccle rainfall series

2.1 Introduction

Climate changes can be detected using physically based methods and/or empirical methods. Physically based methods detect changes using climate models, while empirical methods detect changes using statistical techniques. This section covers the empirical statistical analysis of the long-term high-frequency homogeneous rainfall series at the climatological station of the Royal Meteorological Institute of Belgium at Uccle, which starts in 1898 and is continued to date (Demarée et al., 1998; Demarée, 2003). Vaes et al. (2002) earlier carried out a preliminary trend analysis on the rainfall extremes in this long-term Uccle series; the most recent period (the last 7 years) was, however, not included. Blanckaert and Willems (2006) conducted a spectral analysis based on Fast and Windowed Fourier Analysis and a wavelet analysis of the hourly series. This research is based on the extended long-term historical series of 10 minutes rainfall intensities for the period 1898-2004 and makes use of an alternative method based on quantile perturbations. The method allows a consistency check of the results from the climate model simulations against the recent empirical trends. It is noteworthy that the data, provided by the Royal Meteorological Institute of Belgium, is of high quality. The general aim of the statistical analysis is to investigate whether the recent historical changes in frequency and amplitude of the rainfall extremes can be considered statistically significant in comparison with the natural temporal variability of rainfall intensities (as observed in the full available series since 1898). The objective is not to predict future trends, but to detect trends and cycles in retrospect. The analysis is carried out for different aggregation levels (time spans over which the rainfall intensities are averaged) spread over the range of concentration times of Belgian rural and urban catchments. These are the relevant time scales of the rainfall that determine the peak flow downstream of a catchment, and for rural and urban hydrology in Belgium they need to cover the range from 10 minutes to the seasonal scale. The trend analysis also takes into account the effect of clustering in time on the temporal variability of the frequency and amplitude of the rainfall extremes. The statistical analysis is based on the application of frequency and perturbation techniques.
While frequency techniques focus on how often an event may occur, perturbation techniques determine the relative magnitudes of events with respect to a certain baseline. The frequency- or quantile-perturbation analysis combines the two concepts, thereby making it possible to study the changes in extremes.

2.2 Methodology based on extreme value analysis

2.2.1 Peaks-Over-Threshold extremes

Extreme value theory has long been applied in the analysis of extremes. Extremes are selected from a series by applying a threshold, which means that the analysis is valid only for values above a certain return period. The selection of the threshold is, however, subjective: there is no universal technique for selecting it. Lang et al. (1999) proposed that the selection of the threshold should be based on the distribution of the Peaks-Over-Threshold values and on the independence hypothesis. For this research the threshold is taken as the value above which a distribution can be reasonably fitted, given the intended use of the distributions in the Monte Carlo calculations. The identification of extremes also requires an independence criterion. Extreme value theory assumes full statistical independence of the sampled extremes (Willems, 2000), thereby providing a theoretical basis for distribution fitting. The independence criterion for extracting the Peaks-Over-Threshold (POT) values for rainfall is similar to that for extracting Peaks-Over-Threshold values for discharges. The independence criterion for discharge events states that two consecutive events are independent if the occurrence of one event does not affect the occurrence of the other. The separation of two consecutive flood peaks is subjective due to the uncertainty associated with their physical independence, because the occurrence of the later peak may be partially explained by the occurrence of the previous peak (Lang, 1999). Rainfall independence is less uncertain, however, because two consecutive rainfall peaks are independent if the rainfall between them drops to zero. Thus the main criterion for peak extraction from rainfall series is the inter-event time.
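As a minimal sketch of such a peak extraction (the function names and the numerical values are illustrative assumptions, not taken from the report's own software), the moving-average aggregation and the inter-event independence criterion could look like:

```python
import numpy as np

def moving_average(series, window):
    """Aggregate a rainfall series by a simple moving average."""
    return np.convolve(series, np.ones(window) / window, mode="valid")

def extract_pot(values, threshold, inter_event_steps):
    """Peaks-Over-Threshold extraction: exceedances closer together
    than `inter_event_steps` are treated as one event, of which only
    the largest peak is kept."""
    peaks = []          # retained (index, value) pairs
    last_idx = None
    for i, v in enumerate(values):
        if v <= threshold:
            continue
        if last_idx is not None and i - last_idx < inter_event_steps:
            if v > peaks[-1][1]:          # same event: keep the larger peak
                peaks[-1] = (i, v)
                last_idx = i
            continue
        peaks.append((i, v))
        last_idx = i
    return [v for _, v in peaks]
```

For the 10 minutes series, for instance, a 12-hour inter-event time would correspond to inter_event_steps = 72.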
Willems (2000) proposed a minimum inter-event time of 12 hours, because two events occurring within the same day or night are considered as one event. The Peaks-Over-Threshold extremes are extracted from the simple moving average series. Moving average series are preferred to the original series for trend analysis because they capture intrinsic trends within a series that would otherwise go unnoticed. They also allow the rainfall intensities to be investigated at the time scales that are relevant for the hydrological applications. The time span of the moving window is called the aggregation level and covers the range of concentration times of the river and sewer catchments of interest. The sampling of the Peaks-Over-Threshold values is based on a 12-hour independence criterion. For the different aggregation levels, the independence inter-event time is taken equal to the aggregation level for those levels above the minimum (i.e., 12 hours).

2.2.2 Aggregation levels and time scales

Different aggregation levels and time scales are used in the statistical analysis. Aggregation levels of 10 minutes, 60 minutes, 180 minutes and 1440 minutes, as well as monthly seasonal volumes, are used. In addition to the aggregations, the data is also grouped in blocks of years ranging from 5 to 15 years. The analysis is therefore based on a particular aggregation level for a particular block of years. For instance, given an aggregation level of 10 minutes and 10-year blocks, the analysis involves studying the statistical properties of the 10 minutes POT extremes grouped in 10-year blocks for the period 1898-2004. Note that the total number of non-overlapping blocks of years for a given period can easily be calculated if the initial year and final year are known. For the period 1898-2004 the decades are 1898-1907, 1908-1917, ..., 1988-1997, plus the final block 1995-2004. These decades alone are, however, not sufficient for a complete temporal analysis. Therefore a sliding window is used in the analysis.
The sliding window shifts the blocks one year to the right, which leads to a new set of 10-year blocks: 1899-1908, 1909-1918, ..., 1989-1998, 1995-2004. Note that the last block does not shift, because 2004 is the last available year in the series. The sliding window is applied n times, where n is the number of years in a block. For the previous example, the shift of one year is applied 10 times (for a 10-year block). The last shift includes the blocks 1907-1916, 1917-1926, ..., 1987-1996, 1995-2004. Table 1 gives an overview of the complete set of year blocks considered. It is on the basis of the Peaks-Over-Threshold values grouped according to these blocks that the statistical trend and cycle analysis is applied.
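A minimal sketch of this sliding-window enumeration (assuming the 1898-2004 series; the function name is illustrative):

```python
def year_blocks(first_year=1898, last_year=2004, block_len=10):
    """Enumerate the sliding blocks described above: non-overlapping
    blocks starting at `first_year`, shifted one year at a time for
    `block_len` shifts; a block that would run past `last_year` is
    replaced by the fixed final block ending in `last_year`."""
    blocks = set()
    for shift in range(block_len):
        start = first_year + shift
        while start + block_len - 1 <= last_year:
            blocks.add((start, start + block_len - 1))
            start += block_len
        blocks.add((last_year - block_len + 1, last_year))  # fixed last block
    return sorted(blocks)
```

With the default arguments this yields (1898, 1907), (1899, 1908), ... up to the fixed final block (1995, 2004).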

Table 1: year blocks considered in the analysis (summer and winter windows).

The analysis is also carried out for the four seasons: winter (December, January, February), spring (March, April, May), summer (June, July, August) and autumn (September, October, November).

2.2.3 Frequency analysis of extremes

Frequency analysis deals with how often an event occurs; based on frequency analysis, extremes may thus be identified. The decision on what makes an event extreme depends on the intended use in design or future planning. The extremes for this study partially conform to the definition offered by Pickands (1975), who stated that the extremes extracted from a series after applying a threshold can be fitted to a Generalized Pareto distribution (GPD). The threshold needs to be high enough to ensure that the extremes can be fitted to a distribution. Due to the nature of the study, the selection of an optimal threshold (the threshold that most accurately fits the distribution) is not pursued. Applying an optimal threshold would in some cases reduce the number of events and thus affect the number of events in the extracted blocks of years. Also, considering the number of series to be fitted, selecting an optimal threshold for each series would involve computational constraints: for the full period there are on the order of a hundred block series for each season. Fitting each series as accurately as possible would require a different threshold per series, whereas a constant threshold for all series greatly simplifies the distribution fitting computation. An initial analysis reveals that although the fits are not accurate for some series, they are reasonable approximations as long as the threshold is selected high enough. However, the definition of what constitutes an extreme event is still debated.
An extreme event may be selected based on frequency, intensity or threshold exceedance, and on its expected physical impacts. The thresholds used in this study are selected using the criterion of having at least 5 Peaks-Over-Threshold values per year in a particular season, while also giving a reasonable distribution fit.

2.2.4 Rainfall distributions

Distribution fitting is an essential component of this study. It is on the basis of the distribution that the hypothesis testing on the significance of the historical trends and variations (based on Monte Carlo calculations) is done. For Monte Carlo calculations it is crucial that the variate can be generated through inversion of the distribution. The Normal and Gamma distributions are among the distributions without analytic inverse transforms (Charles, 1986). Fortunately, this

study is based on rainfall extremes, which according to previous studies can be fitted by Weibull and exponential distributions, for which analytic inverse transforms exist. The probability distribution of point precipitation intensities has been examined for the Uccle series in many previous studies, e.g. Demarée (1985), Willems (1998, 2000) and Mohymont (2005). Willems (1998) presented a systematic methodology which derived the type of the distribution and the optimal threshold. The exponential distribution has been suggested as a good approximation of the underlying precipitation process: more specifically, a two-component distribution to represent storms of two different types (air mass thunderstorms and cyclonic/frontal storms). This was done for durations in the range from 10 minutes to 15 days.

The two-component exponential distribution

The two-component exponential distribution is defined as:

G(x) = p_a G_a(x) + (1 - p_a) G_b(x)   (1)

in which G_a(x) and G_b(x) are two different exponential distributions, with subscripts a and b representing the thunderstorms and the frontal storms respectively:

G_a(x) = 1 - exp(-(x - x_t)/β_a)   (2)

G_b(x) = 1 - exp(-(x - x_t)/β_b)   (3)

Equations (2) and (3) represent cumulative probability distributions, where β_a and β_b are the scale parameters, x the rainfall variable and x_t the threshold. The probability distribution G(x) is the combined distribution of two exponentially distributed populations a and b, in which p_a represents the proportion of population a. The two distributions arise from the fact that there are two different types of storms: thunderstorms in summer, and cyclonic and frontal storms. The first storm type is associated with population a (the largest scale parameter β_a), because the extreme precipitation intensities are known to be, on average, larger for this storm type.
The parameters p_a, β_a and β_b were determined for the Uccle rainfall intensities in the range of aggregation levels between 10 minutes and 15 days by Willems (2000).

The Weibull distribution

Willems (2000) found that the two-component exponential distribution was valid for aggregation levels up to about 2 days, while a one-component distribution was valid up to 15 days. However, Willems (2000) based his analysis on aggregated values up to 15 days only. One of the temporal scales included in this study is the monthly scale. Due to the independence criterion of a minimum inter-event time of at least one month, the number of monthly extracted Peaks-Over-Threshold data would be limited, leading to more uncertainty. Therefore, for the monthly scale, the aggregation and the independence criterion are not applied; only a threshold is applied to the series after calculating the cumulative monthly volumes. With this adjustment, the fitted exponential distribution suggested by Willems (2000) is graphically found to be suspect. This could be explained by the fact that his analysis is based on aggregated values for time scales varying from 10 minutes to 15 days. The Weibull distribution is found to fit the monthly seasonal volumes better than the exponential distribution. The Weibull 3-parameter distribution is characterised by the following equation:

F(x) = 1 - exp(-((x - x_t)/β)^α)   (4)

where F(x) is the cumulative distribution function, x the rainfall volume, x_t the threshold, β the scale parameter and α the shape parameter. Note that Equation (2) is indeed a special case of Equation (4) with α equal to 1. Since the distribution can easily be transformed, as will be shown later, the fitting is consequently less complicated, allowing for easier manipulation in the Monte Carlo calculations.

Calibration of the distributions

One of the prerequisites for performing uncertainty analysis by Monte Carlo simulation is distribution fitting. As previously stated, the distribution should have an inverse transform which can easily be computed; some probability distributions are computationally expensive because they require numerical integration. Fortunately, this study is based on two easily transformable distributions, namely the Weibull 3-parameter distribution and the two-component exponential distribution. Various methods exist for fitting distributions, including least squares estimation, the method of maximum likelihood and the method of moments. The fitting here is primarily based on linear and non-linear least squares techniques: the two-component exponential distribution is fitted using the non-linear approach, while the Weibull distribution is fitted using the linear approach. For the two-component exponential distribution, Equations (1), (2) and (3) can be combined to form Equation (5):

1 - G(x) = p_a exp(-(x - x_t)/β_a) + (1 - p_a) exp(-(x - x_t)/β_b)   (5)

The probability of exceedance 1 - G(x) can also be calculated using order statistics based on certain plotting position formulas. The Weibull plotting position formula is preferred in most cases because of its minimum variance (Ghosh, 1999):

1 - G(x) = i/(n + 1)   (6)

where i is the rank of the series sorted in descending order and n is the number of data points in the series. Thus, by minimising the mean square error while adjusting β_a, β_b and p_a, it is possible to calibrate the parameters of the two-component exponential distribution. Note that since Equation (5) has not been linearised, non-linear least squares are used, which usually requires initial estimates for the parameters.
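A minimal sketch of this calibration, in which a crude grid search stands in for the report's non-linear least-squares routine (the function name, the grids and the constraint that population a keeps the larger scale are illustrative assumptions):

```python
import numpy as np
from itertools import product

def fit_two_comp_exp(pot, x_t, beta_grid, p_grid):
    """Calibrate p_a, beta_a, beta_b of the two-component exponential
    by minimising the mean square error between the empirical
    exceedance probabilities i/(n+1) (ranks in descending order) and
    the model 1 - G(x) = p_a*exp(-(x-x_t)/beta_a)
                         + (1-p_a)*exp(-(x-x_t)/beta_b)."""
    x = np.sort(np.asarray(pot, dtype=float))[::-1]       # descending
    emp = np.arange(1, len(x) + 1) / (len(x) + 1.0)       # i/(n+1)
    best = None
    for p_a, b_a, b_b in product(p_grid, beta_grid, beta_grid):
        if b_a <= b_b:            # population a has the larger scale
            continue
        model = (p_a * np.exp(-(x - x_t) / b_a)
                 + (1 - p_a) * np.exp(-(x - x_t) / b_b))
        mse = np.mean((model - emp) ** 2)
        if best is None or mse < best[0]:
            best = (mse, p_a, b_a, b_b)
    return best[1:]
```

In practice a proper non-linear least-squares solver seeded with the initial estimates of Equations (7) and (8) would replace the grid search; the sketch only illustrates the objective being minimised.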
Willems (2000) developed regression relations for estimating initial values of the parameters of the two-component exponential distribution as linear functions of the aggregation level D (the regression coefficients a_0, a_1, b_0 and b_1 below stand for the values calibrated by Willems (2000)):

log(β_a [mm/h]) = a_0 + a_1 log(D [days])   (7)

log(β_b [mm/h]) = b_0 + b_1 log(D [days])   (8)

Based on these relations, initial guesses can be made for β_a and β_b. Note that the threshold x_t may be considered constant and that p_a is a value between 0 and 1. Starting from the initial guesses, the parameters with the minimum mean square error of the residuals between the empirical and the theoretical distribution are found; these represent the calibrated distribution for the series. The approach for calibrating the Weibull distribution is similar to that of the two-component exponential distribution, although it is based on linear regression techniques. Equation (4) can be transformed linearly by taking logarithms twice and rearranging, giving:

ln(-ln(1 - F(x))) = α ln(x - x_t) - α ln(β)   (9)

By replacing 1 - F(x) with Equation (6), Equation (9) changes to:

ln(-ln(i/(n + 1))) = α ln(x - x_t) - α ln(β)   (10)

Both α and β can now be estimated by fitting a straight line using least squares regression, with the ordinate taken as ln(-ln(i/(n + 1))) and the abscissa taken as ln(x - x_t). The slope of the line of best fit gives the shape parameter α, from which the scale parameter β is estimated via the intercept. The regression, however, requires a prior estimate of the threshold x_t. Without such a prior estimate, replacing ln(x - x_t) by ln(x) in the previous equations, the shape parameter α can be

determined as the asymptotic slope towards the higher observations in a plot with the ordinate taken as ln(-ln(i/(n + 1))) and the abscissa taken as ln(x).

Random sample generations

The approach adopted in this study for confidence interval calculation (see section 2.2.5) requires random samples (or a number of random values) to be generated from the rainfall distributions. A random value is usually thought of as a value selected such that each value in the population has an equal chance of being selected (Charles, 1986). A random value can be selected from any probability distribution as long as the values are independent of each other. Random numbers can be generated for distributions by making use of the fact that the cumulative probability of any continuous variate is uniformly distributed over the interval from 0 to 1. Therefore, with a continuous uniform variable U on [0, 1] and an invertible distribution F, the random variable X = F^(-1)(U) can be generated. This is called inverse transform sampling: a random number is selected from the interval [0, 1], followed by computation of the variate from the inverse cumulative distribution.

2.2.5 Monte Carlo confidence intervals

Climate change relates to statistically significant variations from the natural variability that persist for a long period, typically decades or longer. It involves shifts in the frequency and magnitude of sporadic weather events (IPCC, 2001). Confidence intervals can be used to define a region of natural variability or randomness. Thus they are also used for testing hypotheses of significant deviations under the hypothesis of no trend or temporal clustering of rainfall extremes.
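The inverse transform sampling described in the previous subsection can be sketched as follows for the two rainfall distributions used here (the function names are illustrative; sampling the mixture by first choosing the storm population is a standard composition trick, not a routine spelled out in the report):

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_weibull(n, x_t, beta, alpha):
    """Inverse transform sampling of F(x) = 1 - exp(-((x - x_t)/beta)**alpha):
    with U uniform on [0, 1), X = x_t + beta * (-ln(1 - U))**(1/alpha)."""
    u = rng.random(n)
    return x_t + beta * (-np.log1p(-u)) ** (1.0 / alpha)

def sample_two_comp_exp(n, x_t, p_a, beta_a, beta_b):
    """Sampling of the two-component exponential mixture by composition:
    choose population a with probability p_a, then invert the selected
    exponential component."""
    u = rng.random(n)
    scale = np.where(rng.random(n) < p_a, beta_a, beta_b)
    return x_t + scale * (-np.log1p(-u))
```

Setting alpha = 1 in sample_weibull reproduces a single exponential component, consistent with Equation (2) being a special case of Equation (4).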
Since a confidence interval defines a region of expected variability, any value outside the confidence bounds is considered statistically significant (the hypothesis of no significant trends and oscillations is rejected), while values within the bounds are statistically insignificant (the hypothesis is accepted). Using this criterion, one can ascertain the statistical significance of a given hypothesis. Monte Carlo methods are statistical sampling methods used for generating many random outcomes from the derived distributions: given a distribution, a Monte Carlo algorithm derives a large number of random outcomes from which a statistic is calculated. The approach used in this study is based on the parametric bootstrap method, which requires that a sample is first fitted to a distribution, after which random samples are generated from that distribution. One of the advantages of bootstrapping is that it allows confidence intervals to be estimated. The approach was originally introduced for independent data (Efron, 1979) but has evolved over time to allow the analysis of dependent data by means of block bootstrapping (Vogel and Shallcross, 1996). Block bootstrapping groups data in blocks from which the resampling is made, thus preserving the time structure within the series. Here, however, since the Peaks-Over-Threshold values are selected using an independence criterion, the data is assumed to be independent. Even though frequency analysis eliminates the time aspect within a particular block, the use of several blocks restores it, since the statistics (e.g. 95% confidence intervals) can be calculated for each block; by connecting the statistics of the successive blocks, the temporal evolution of the statistic is obtained. The analysis uses the Peaks-Over-Threshold (POT) series, as the major focus of this study is on the extremes. The parametric bootstrap Monte Carlo procedure is described below:
1. POT series for the entire period are extracted from the available series.
2. The series are then further separated into seasonal blocks for different block lengths.
3. For a particular block, a distribution is fitted to the POT data and the parameters of the distribution are stored, together with the total number of POT values n.
4. Using the Monte Carlo methodology of random number generation, p samples are generated from the fitted distribution, each sample containing n POT values.

5. Each of the p samples is ranked in descending or ascending order, and the confidence interval can be calculated for each rank number. The rank numbers can easily be converted to return periods using empirical plotting positions. Note that for each rank number or return period there are p possible values. Based on these values, the confidence interval is estimated from the rank range [p·α/2, p·(1 - α/2)], where α is the significance level. For example, for p = 1000 and a 95% confidence interval (α = 0.05), the confidence interval is given by the 25th and 975th values after ranking.

2.2.6 Quantile-perturbation approach

The proposed method investigates the historical changes in the ranked extremes. The method combines aspects of frequency, used in extreme value analysis, and perturbation, used in climate change impact studies. The technique is analogous to the frequency-perturbation approach applied by Harrold et al. (2005) and Chiew (2006) for deriving climate change scenarios from climate models. For climate change impact analysis on a daily rainfall series, instead of applying one factor (e.g., a monthly change) to the entire daily time series (e.g., for the same month), they applied different factors based on the ranked daily values. The perturbation factors were calculated as ratios of two similarly ranked values obtained from the future (climate model scenario) and the observed time series. The method proposed here, however, is solely based on historical data. Since the perturbation is a relative change, it requires two series. For the climate-model based approach, one of the series is taken as the reference or baseline series while the other is a future scenario series. In the present study, one of the series is derived from the long-term historical distribution while the other series is taken from a particular block (subseries) of interest.
For example, given a particular block of 10 years, one of the series contains the actual POT values within the block, while the other series is derived from the distribution of long-term historical values (from the entire period of 107 years). The POT values within the block are ranked (with i the rank of each POT event), such that they can be related to empirical return periods L/i for a block series of L years length. After ranking, the POT values correspond with quantiles x(L), x(L/2), ..., x(L/i), ..., where x(L/i) is the quantile with empirical return period L/i. The same procedure is applied to the full 107 years series, leading to quantiles x_g(107), x_g(107/2), ..., x_g(107/i), ... The perturbation factors then correspond to the ratios x(L)/x_g(L), x(L/2)/x_g(L/2), ... It is clear that the return periods L, L/2, ... do not necessarily coincide with the empirical return periods of the POT events of the full 107 years series; in that case the x_g(L/i) values are derived by linear interpolation between the closest (higher and lower) POT events. Figure 1 illustrates the estimation of the first 3 values in the reference series for a 10-year summer block. The curve contains all the summer POT values in the 107 years period.
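The ranking, return-period matching and interpolation just described can be sketched as follows (the function and argument names are illustrative assumptions):

```python
import numpy as np

def quantile_perturbations(block_pot, longterm_pot, block_years, total_years):
    """Perturbation factors x(L/i) / x_g(L/i) of a block of L years
    against the long-term series. Long-term quantiles at the block's
    empirical return periods are obtained by linear interpolation
    between the closest long-term POT events."""
    xb = np.sort(np.asarray(block_pot, dtype=float))[::-1]     # descending
    xg = np.sort(np.asarray(longterm_pot, dtype=float))[::-1]
    t_block = block_years / np.arange(1, len(xb) + 1)          # L/i
    t_long = total_years / np.arange(1, len(xg) + 1)
    # np.interp needs an increasing abscissa, hence the reversals
    xg_at_block_t = np.interp(t_block, t_long[::-1], xg[::-1])
    return xb / xg_at_block_t
```

Factors above 1 indicate a block with heavier extremes than the long-term baseline; factors below 1 indicate a relatively quiet block.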

Figure 1: Precipitation quantiles (rainfall intensity versus return period in years) in the long-term baseline calculation.

2.3 Results

2.3.1 Quantile perturbations for extreme rainfall conditions

The calculation of the confidence intervals is based on the historical aggregated Uccle 10 minutes rainfall series (1898-2004). The selection of the number of samples for the confidence intervals depends on the available computing resources, the available time for analysis and the required level of accuracy. This study opts for 100 samples, given the many blocks of years that are investigated: seasonal blocks of 5 to 15 years (winter, spring, summer and autumn) over 1898-2004 are required for the analysis, which involves a large volume of data generation. For any given block length of seasonal series there are on the order of a hundred series, each with a different number of POT extremes. The POT extraction criterion is that each year has at least 5 POT values. Each block is assigned a confidence interval using the parametric bootstrap Monte Carlo technique. After fitting a distribution (two-component exponential, or Weibull for the monthly time scale) to the data of a particular block, 100 random samples are generated, each containing the same number of events as the parent data. Different statistics can then be obtained for each sample, and it is from this ensemble of statistics that a confidence interval is defined. For the perturbations, for example, average quantile perturbations are calculated (averaged over the higher return periods) for each sample separately. This gives 100 perturbation factors, from which the 3rd and 98th ranked values define the 95% confidence interval. The same procedure is repeated for all blocks, each time defining the confidence interval. The upper and lower confidence interval points are then superimposed on the same plot with the historical factors. The resulting plot shows the factors and the confidence intervals, which can then be used to check the hypothesis.
A similar procedure is also used for the other confidence intervals, albeit with a few alterations depending on the statistic. Figure 2 and Figure 3 show the 10 minutes perturbations for each separate rank number for the summer period and all 10-year blocks (decades). The perturbations represent the changes with respect to the long-term historical data: the perturbation is taken as the ratio of similarly ranked data from the two series, where the base or reference series is the long-term expected series and the other series is the actual series within a particular block. A single perturbation factor for a particular block of years is calculated as the average of all the perturbations above a particular threshold. The threshold selection is based on a criterion which

for this study is taken as having 5 events per year. This means that the perturbation can be seen as a quantile perturbation for extreme rainfall conditions. The mean perturbation is assigned to a year approximately in the middle of the block. Repeating the averaging over the different blocks assigns one factor to each block, which eventually leads to a temporal variation of the perturbation factor. Figure 4 shows the temporal variation obtained from Figure 2 and Figure 3, with each point representing the centre of a 10-year block. After evaluating the temporal evolution of the perturbation, the confidence interval is also evaluated and superimposed on the same plot. It is then graphically possible to identify periods that depict significant variations under the hypothesis of no trend or temporal clustering of rainfall extremes (see Figure 8 and section 2.3.5).
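The block-wise confidence interval construction described above can be sketched as follows (the function names are illustrative; p = 100 and the 3rd/98th ranked bounds follow the text's example, and the exponential sampler in the usage note merely stands in for a fitted rainfall distribution):

```python
import numpy as np

rng = np.random.default_rng(7)

def bootstrap_ci(sampler, n_events, statistic, p=100, level=0.95):
    """Parametric bootstrap for one block: draw p synthetic samples of
    n_events values from the fitted distribution, evaluate the chosen
    statistic on each, and take the empirical (alpha/2, 1 - alpha/2)
    quantiles -- the 3rd and 98th ranked values for p = 100 and a 95%
    interval."""
    stats = np.sort([statistic(sampler(n_events)) for _ in range(p)])
    lo_idx = int(np.floor(p * (1.0 - level) / 2.0))          # 0-based
    hi_idx = int(np.ceil(p * (1.0 + level) / 2.0)) - 1
    return stats[lo_idx], stats[hi_idx]
```

For instance, bootstrap_ci(lambda n: rng.exponential(2.0, n), 250, np.mean) would bracket the mean of the fitted distribution; repeating this per block and connecting the bounds yields the temporal confidence band superimposed on the perturbation plot.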

Figure 2: Quantile perturbations for 10-minute rainfall extremes and 10-year blocks for summer periods (panels plotted against exceedance probability [-]).

Figure 3: Quantile perturbations for 10-minute rainfall extremes and 10-year blocks for summer periods (cont'd).

Figure 4: Estimates of average quantile perturbations for 10-minute rainfall extremes and 10-year blocks for summer periods.

2.3.2 Slopes of quantile perturbations versus return period

Calculation of the average quantile perturbations assumes these perturbations to be independent of the return period; in other words, the slope of the quantile perturbations versus the return period or exceedance probability is assumed to be not significantly different from zero. To test this, the slope is explicitly calculated and analyzed. Figure 5 shows some of the expected outcomes of the slopes for selected periods. There is a positive slope for two of the blocks (among them 1995-2004) and a negative slope for two others (among them 1898-1907). The slope tests the hypothesis that the perturbation factor does not vary with exceedance probability, i.e. with the severity of the extremes. In other words, if the slope is significantly different from zero, there is a trend in the perturbations, which implies that the higher extreme events have significantly different perturbations from the lower extreme events. Conversely, if the slope is not significantly different from zero (it is nearly horizontal), the perturbations for the low and high extremes are the same. Again, the confidence interval aids the analysis, as it defines zones of significance. If the zero reference lies within the confidence interval for a particular period (year 93 in Figure 5), the slope is not significantly different from zero and the perturbation can be assumed constant for that period. However, if the zero lies outside the confidence interval, the slope is significantly different from zero (year 943 in Figure 5) and the perturbation cannot be assumed constant for the whole range of extremes.
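The slope test can be sketched as below. One illustrative assumption is made, since the report does not specify the fitting details: an ordinary least-squares fit of the perturbation factors against the empirical exceedance probability on a log scale.

```python
import numpy as np

def perturbation_slope(perturbations):
    """Least-squares slope of the perturbation factor versus log10 of
    the empirical exceedance probability.

    perturbations are ordered from the largest extreme (rank 1) down;
    rank i out of n gets empirical exceedance probability i / (n + 1).
    A slope not significantly different from zero supports the use of
    a constant perturbation over the whole range of extremes."""
    n = len(perturbations)
    prob = np.arange(1, n + 1) / (n + 1)
    slope, _intercept = np.polyfit(np.log10(prob), perturbations, deg=1)
    return slope

flat = perturbation_slope(np.full(50, 1.1))              # constant factors
varying = perturbation_slope(np.linspace(1.4, 0.9, 50))  # amplified high extremes
```

A constant set of factors yields a slope of (numerically) zero, while factors that grow towards the rarest extremes yield a negative slope against exceedance probability; the bootstrap confidence interval around zero then decides significance, as in Figure 5.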

Figure 5: Hypothesis testing for slopes of quantile perturbations versus return period.

2.3.3 Mean POT perturbations

The mean POT represents the average Peaks-Over-Threshold value for a given dataset. In this study, the mean POT is calculated from the values within a block. The calculation is done for all the blocks of years, which eventually leads to an evolution of mean POT values over the long-term period. The confidence interval for the mean POT is based on the mean POT values calculated from the randomly generated samples of the parent data for a particular block of years. From, say, 100 samples for a particular block, the mean POT is calculated for each sample. From the resulting mean POT values it is possible to define the 95% confidence interval. The confidence interval is used to test the hypothesis that the long-term mean POT does not vary significantly from the reference level. The reference level is taken as the long-term mean POT value over the entire period, e.g. over all the summers in the long-term series.

2.3.4 Number of events

There is a general perception that the frequency of extreme events has increased in recent years (Prudhomme et al., 2003; Beniston et al., 2004). Thanks to the availability of the long-term series, it is now possible to examine this perception using statistical hypothesis testing. The hypothesis that the recent number of events is significantly higher than expected can be tested for the most recent years. The significance of the number of events can be tested using the non-parametric bootstrapping Monte Carlo technique. The technique differs in some respects from the previous bootstrapping applications. One notable difference is that there is no distribution fitting.
Instead, the parent data is randomly redistributed in time. For instance, for the summer period, all the summer POT values are randomly redistributed in time, each POT value being assigned one of the possible summer dates in the period 1898-2004. Thus, for each sampling, the number of events above a particular threshold for a particular block of years is noted. The counting of events is repeated for all the random time samples. Each block therefore has 100 possible numbers of events, from which the 95% confidence interval can be estimated. The analysis of mean POT perturbations (section 2.3.3) and of the number of events (this section) allows the quantile perturbations to be further explained by their contributing trends and variations in both the number of events (time frequency of rainfall events) and the amplitude of each event.

2.3.5 Statistical hypothesis testing on the clustering of rainfall extremes

The statistical hypothesis testing is based on the 95% confidence interval. The interval defines a region of acceptance and indicates the boundary of expected randomness. Outside this boundary the hypothesis is rejected. Based on the hypothesis tests, periods of statistically significant behaviour can be identified. As discussed before, this statistical investigation is based on Peaks-Over-Threshold extremes for different aggregation levels and different block lengths, using the sliding window technique for block lengths ranging from 5 to 15 years. The investigation covers aggregation levels of 10 minutes (no aggregation), 60 minutes (hourly), 1440 minutes (daily) and 10080 minutes (weekly), together with monthly seasonal volumes. However, the discussion has been limited to a 10-year block length, aggregation levels of 10 minutes and 1440 minutes, and monthly seasonal volumes. For this assessment, monthly volumes are preferred to aggregated monthly values because of a limitation of the POT extraction method, which produces few monthly extremes for the analysis. With varying block lengths, periods of clustering of extremes may be identified.
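The no-clustering null hypothesis for the number of events can be sketched as follows. This is illustrative only: uniform random redistribution of the event dates and 100 resamples are assumptions, and the names and synthetic record are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def event_count_ci(event_years, record, block, n_samples=100):
    """Non-parametric bootstrap for the number of POT events in a block.

    The observed events are redistributed uniformly at random over the
    whole record; for every resample the count falling inside the block
    is noted, and the 3rd and 98th ranked counts give an approximate
    95% confidence interval under the no-clustering hypothesis."""
    event_years = np.asarray(event_years, dtype=float)
    t0, t1 = record
    b0, b1 = block
    counts = np.sort([
        np.sum((sample >= b0) & (sample < b1))
        for sample in (rng.uniform(t0, t1, len(event_years))
                       for _ in range(n_samples))
    ])
    observed = int(np.sum((event_years >= b0) & (event_years < b1)))
    return observed, int(counts[2]), int(counts[97])

# synthetic record: 500 events spread evenly over 1898-2004
events = np.linspace(1898.0, 2004.0, 500)
obs, lo, hi = event_count_ci(events, (1898.0, 2004.0), (1990.0, 2000.0))
```

An observed count above the upper bound for a recent block would support the perception of an increased event frequency; a count inside the interval is consistent with random timing.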
For instance, if a particular period shows an indication of higher perturbations (suggesting clustering of rainfall extremes) for a particular block length, then other block lengths can be used to check the persistence of the perturbation. If the period with high perturbation is consistent across all the block lengths, then clustering of events for that period is plausible. The block length may also be linked to the objective of the analysis. For example, the Intergovernmental Panel on Climate Change (IPCC, 2001) uses decadal analysis for climate change. The 15-year block length may be used to test the perception that the most recent years, since the 1990s, have experienced stronger climate change effects than the previous periods. The variability of the perturbations of the historical precipitation shows some attributes of trends and oscillations. The temporal variability of the perturbation is obtained by using average perturbations for each block of years. In other words, one perturbation factor represents the mean of the perturbations within a particular block of years and above a rainfall threshold (thus for all rainfall extremes). Table 2 contains the thresholds. The thresholds are selected using a criterion of having at least 5 POT values per block.

Table 2: Thresholds above which average quantile perturbations were derived.

  Statistic               Season   10 min   1440 min   Month
  Threshold (mm)          Summer
                          Winter
  Return period (years)   Summer
                          Winter

The hypothesis of a constant perturbation above these thresholds can be tested using statistical hypothesis testing on the slope of the perturbation-exceedance probability plots. If the zero value lies within the confidence interval, the slope is not significantly different from zero, i.e. it is nearly horizontal, and the assumption of a constant perturbation factor above a threshold is justified.
Note that each slope is centred in the middle of each block (there are 7 blocks for the entire period).

Slope hypothesis testing

Figure 6 shows the slopes of the perturbations above the selected thresholds for the different aggregation levels and for the monthly seasonal volumes.

The hypothesis that the slope is not significantly different from zero can generally be accepted. The slope is significantly different from zero only for short periods, e.g. the 1940s, 1960s and 1990s for 10-minute rainfall in summer, and the 1900s, 1960s and 1970s for 10-minute rainfall in winter. Note that the alternative hypothesis of the slope being significantly different from zero (the zero value lies outside the confidence interval) holds for these significant periods. However, there is still a 5% chance that the slope is in fact almost horizontal. Since these periods are short, this may be a reasonable compromise. On the other hand, significant slopes are indications of high variability of the extremes, which matches the clustering of rainfall extremes (see section 2.3.5).

Figure 6: Perturbation-exceedance probability slope estimates for 10-minute, daily and monthly rainfall extremes and 10-year blocks for summer and winter periods, together with 95% confidence intervals.

Hypothesis testing average perturbations

The perturbation time series analysis aims to investigate whether the most recent changes can be considered statistically significant in comparison with the natural temporal variability. Identifying statistically significant trends and oscillations enables one to assess the likelihood of climate change effects during the most recent periods. Figure 7, Figure 8 and Figure 9 summarize the results obtained for the 5-, 10- and 15-year blocks. These figures show the results for 10-minute, daily (1440 minutes aggregated from 10 minutes) and monthly volumes.
The perturbations, number of events and mean Peaks-Over-Threshold values have been included for both summer and winter.

Figure 7: Estimates of average quantile perturbations, number of events and mean POT values for 10-minute, daily and monthly rainfall extremes and 5-year blocks for summer and winter periods, together with 95% confidence intervals.

Figure 8: Estimates of average quantile perturbations, number of events and mean POT values for 10-minute, daily and monthly rainfall extremes and 10-year blocks for summer and winter periods, together with 95% confidence intervals.


More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions

http://www.jstor.org This content downloaded on Tue, 19 Feb 2013 17:28:43 PM All use subject to JSTOR Terms and Conditions A Significance Test for Time Series Analysis Author(s): W. Allen Wallis and Geoffrey H. Moore Reviewed work(s): Source: Journal of the American Statistical Association, Vol. 36, No. 215 (Sep., 1941), pp.

More information

Review of Transpower s. electricity demand. forecasting methods. Professor Rob J Hyndman. B.Sc. (Hons), Ph.D., A.Stat. Contact details: Report for

Review of Transpower s. electricity demand. forecasting methods. Professor Rob J Hyndman. B.Sc. (Hons), Ph.D., A.Stat. Contact details: Report for Review of Transpower s electricity demand forecasting methods Professor Rob J Hyndman B.Sc. (Hons), Ph.D., A.Stat. Contact details: Telephone: 0458 903 204 Email: robjhyndman@gmail.com Web: robjhyndman.com

More information

Implications of Alternative Operational Risk Modeling Techniques *

Implications of Alternative Operational Risk Modeling Techniques * Implications of Alternative Operational Risk Modeling Techniques * Patrick de Fontnouvelle, Eric Rosengren Federal Reserve Bank of Boston John Jordan FitchRisk June, 2004 Abstract Quantification of operational

More information

Standard Deviation Estimator

Standard Deviation Estimator CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

More information

LOGISTIC REGRESSION ANALYSIS

LOGISTIC REGRESSION ANALYSIS LOGISTIC REGRESSION ANALYSIS C. Mitchell Dayton Department of Measurement, Statistics & Evaluation Room 1230D Benjamin Building University of Maryland September 1992 1. Introduction and Model Logistic

More information

by Maria Heiden, Berenberg Bank

by Maria Heiden, Berenberg Bank Dynamic hedging of equity price risk with an equity protect overlay: reduce losses and exploit opportunities by Maria Heiden, Berenberg Bank As part of the distortions on the international stock markets

More information

Measurement with Ratios

Measurement with Ratios Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

LDA at Work: Deutsche Bank s Approach to Quantifying Operational Risk

LDA at Work: Deutsche Bank s Approach to Quantifying Operational Risk LDA at Work: Deutsche Bank s Approach to Quantifying Operational Risk Workshop on Financial Risk and Banking Regulation Office of the Comptroller of the Currency, Washington DC, 5 Feb 2009 Michael Kalkbrener

More information

The Rational Method. David B. Thompson Civil Engineering Deptartment Texas Tech University. Draft: 20 September 2006

The Rational Method. David B. Thompson Civil Engineering Deptartment Texas Tech University. Draft: 20 September 2006 The David B. Thompson Civil Engineering Deptartment Texas Tech University Draft: 20 September 2006 1. Introduction For hydraulic designs on very small watersheds, a complete hydrograph of runoff is not

More information

**BEGINNING OF EXAMINATION** The annual number of claims for an insured has probability function: , 0 < q < 1.

**BEGINNING OF EXAMINATION** The annual number of claims for an insured has probability function: , 0 < q < 1. **BEGINNING OF EXAMINATION** 1. You are given: (i) The annual number of claims for an insured has probability function: 3 p x q q x x ( ) = ( 1 ) 3 x, x = 0,1,, 3 (ii) The prior density is π ( q) = q,

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Chapter G08 Nonparametric Statistics

Chapter G08 Nonparametric Statistics G08 Nonparametric Statistics Chapter G08 Nonparametric Statistics Contents 1 Scope of the Chapter 2 2 Background to the Problems 2 2.1 Parametric and Nonparametric Hypothesis Testing......................

More information

Interpretation of Somers D under four simple models

Interpretation of Somers D under four simple models Interpretation of Somers D under four simple models Roger B. Newson 03 September, 04 Introduction Somers D is an ordinal measure of association introduced by Somers (96)[9]. It can be defined in terms

More information

Introduction to time series analysis

Introduction to time series analysis Introduction to time series analysis Margherita Gerolimetto November 3, 2010 1 What is a time series? A time series is a collection of observations ordered following a parameter that for us is time. Examples

More information

Quality Assurance for Hydrometric Network Data as a Basis for Integrated River Basin Management

Quality Assurance for Hydrometric Network Data as a Basis for Integrated River Basin Management Quality Assurance for Hydrometric Network Data as a Basis for Integrated River Basin Management FRANK SCHLAEGER 1, MICHAEL NATSCHKE 1 & DANIEL WITHAM 2 1 Kisters AG, Charlottenburger Allee 5, 52068 Aachen,

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy

The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy BMI Paper The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy Faculty of Sciences VU University Amsterdam De Boelelaan 1081 1081 HV Amsterdam Netherlands Author: R.D.R.

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Measurement and Modelling of Internet Traffic at Access Networks

Measurement and Modelling of Internet Traffic at Access Networks Measurement and Modelling of Internet Traffic at Access Networks Johannes Färber, Stefan Bodamer, Joachim Charzinski 2 University of Stuttgart, Institute of Communication Networks and Computer Engineering,

More information

Regression Modeling Strategies

Regression Modeling Strategies Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions

More information

Global Seasonal Phase Lag between Solar Heating and Surface Temperature

Global Seasonal Phase Lag between Solar Heating and Surface Temperature Global Seasonal Phase Lag between Solar Heating and Surface Temperature Summer REU Program Professor Tom Witten By Abstract There is a seasonal phase lag between solar heating from the sun and the surface

More information

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance 2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS

OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS CLARKE, Stephen R. Swinburne University of Technology Australia One way of examining forecasting methods via assignments

More information

9. Model Sensitivity and Uncertainty Analysis

9. Model Sensitivity and Uncertainty Analysis 9. Model Sensitivity and Uncertainty Analysis 1. Introduction 255 2. Issues, Concerns and Terminology 256 3. Variability and Uncertainty In Model Output 258 3.1. Natural Variability 259 3.2. Knowledge

More information

Climate Extremes Research: Recent Findings and New Direc8ons

Climate Extremes Research: Recent Findings and New Direc8ons Climate Extremes Research: Recent Findings and New Direc8ons Kenneth Kunkel NOAA Cooperative Institute for Climate and Satellites North Carolina State University and National Climatic Data Center h#p://assessment.globalchange.gov

More information

Rainfall generator for the Meuse basin

Rainfall generator for the Meuse basin KNMI publication; 196 - IV Rainall generator or the Meuse basin Description o 20 000-year simulations R. Leander and T.A. Buishand De Bilt, 2008 KNMI publication = KNMI publicatie; 196 - IV De Bilt, 2008

More information

Forecasting in supply chains

Forecasting in supply chains 1 Forecasting in supply chains Role of demand forecasting Effective transportation system or supply chain design is predicated on the availability of accurate inputs to the modeling process. One of the

More information

AP Physics 1 and 2 Lab Investigations

AP Physics 1 and 2 Lab Investigations AP Physics 1 and 2 Lab Investigations Student Guide to Data Analysis New York, NY. College Board, Advanced Placement, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks

More information

Simple Linear Regression

Simple Linear Regression STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze

More information

Chapter 11 Monte Carlo Simulation

Chapter 11 Monte Carlo Simulation Chapter 11 Monte Carlo Simulation 11.1 Introduction The basic idea of simulation is to build an experimental device, or simulator, that will act like (simulate) the system of interest in certain important

More information

SAS Certificate Applied Statistics and SAS Programming

SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and Advanced SAS Programming Brigham Young University Department of Statistics offers an Applied Statistics and

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

Time series Forecasting using Holt-Winters Exponential Smoothing

Time series Forecasting using Holt-Winters Exponential Smoothing Time series Forecasting using Holt-Winters Exponential Smoothing Prajakta S. Kalekar(04329008) Kanwal Rekhi School of Information Technology Under the guidance of Prof. Bernard December 6, 2004 Abstract

More information

The Variability of P-Values. Summary

The Variability of P-Values. Summary The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report

More information

Comparison of resampling method applied to censored data

Comparison of resampling method applied to censored data International Journal of Advanced Statistics and Probability, 2 (2) (2014) 48-55 c Science Publishing Corporation www.sciencepubco.com/index.php/ijasp doi: 10.14419/ijasp.v2i2.2291 Research Paper Comparison

More information