Statistical methods to expect extreme values: Application of POT approach to CAC40 return index

A. Zoglat (1), S. El Adlouni (2), E. Ezzahid (3), A. Amar (1), C. G. Okou (1) and F. Badaoui (1)

(1) Laboratoire de Mathématiques Appliquées, Département de Mathématiques, Faculté des Sciences, Université Mohammed V-Agdal, Rabat, Morocco. zoglat@fsr.ac.ma
(2) Département de Mathématiques et Statistique, Université de Moncton, Moncton, New Brunswick, Canada.
(3) Département d'Economie, Université Mohammed V-Agdal, Rabat, Morocco.

ABSTRACT

Over the past twenty years, extreme value theory has seen important developments, especially around the Peaks Over Threshold (POT) approach. This approach, based on the analysis of the data exceeding a sufficiently high threshold, aims to improve the efficiency of extreme quantile estimators. Selecting an appropriate threshold is one of the central concerns of the POT approach. Various threshold selection methods, namely the Square Error Method (SEM), the Automated Threshold Selection Method (ATSM), and the Multiple Threshold Method (MTM), have been developed. Such approaches avoid the subjective drawbacks of empirical and graphical methods for optimal threshold selection. The main objective of the present study is to compare the performances of these methods for financial risk estimation related to market turmoil. The main focus of this paper is to assess the performance of the POT approach, combined with the maximum likelihood and moment methods for parameter estimation. Results show that the MTM outperforms the ATSM and the SEM. It is confirmed that the inverse of the CAC40 return index has a Fréchet-type tail behavior, and that the parameters are better estimated by the method of moments.

Keywords: Peaks Over Thresholds, Generalized Pareto Distribution, Square Error Method, Automated Threshold Selection Method, Multiple Threshold Method.

Journal of Economic Literature Classification Numbers: C10, C13, C46.

1 Introduction

Booms and stock market crashes are among the most surprising financial phenomena; they affect investors, economic institutions and the whole financial system. The profusion of financial databases and the advent of computers have made possible all kinds of studies of financial markets. However, most empirical studies and models concern only the standard properties of financial assets, and relatively little attention has been paid to extreme movements, although they are of considerable importance. Indeed, they are related to the default risk of investors, the bankruptcy risk of financial institutions, and the spread of difficulties from one financial entity to all institutions (systemic risk).

In the last two decades, there has been an increasing interest in building statistical models for estimating the probability of rare and extreme events. These models, involving extreme value theory, are of great interest in environmental sciences, engineering, finance and insurance, and many other disciplines (see Beirlant et al. (1996), Embrechts et al. (1997), Coles (2001), Beirlant et al. (2004), Reiss and Thomas (2005), Gilli and Këllezi (2006)). In finance in particular, an extreme price movement of a financial asset or a market index can be defined as the lowest or highest value over an observed period. Extreme value theory shows that the asymptotic minimum and maximum returns have a definite shape that is independent of the return process itself.

Extreme value theory deals with the probabilistic description of the extremes of a stochastic sequence. The fundamental results of Fisher and Tippett (1928) constitute the backbone of the classical theory. The fundamental theorem states that maxima of independent and identically distributed (iid) random variables have one of three extreme value distributions: the Fréchet distribution, with an infinite upper endpoint and a heavy tail; the Gumbel distribution, whose upper endpoint is also infinite but whose tail is lighter than the Fréchet's; and the Weibull distribution, with a finite upper endpoint.

This classical approach, called Block Component Wise, has been strongly criticized because estimating the distribution from extracted block maxima involves a loss of information. An alternative to the Block Component Wise method is the Peaks-Over-Threshold (POT) model. In this approach, instead of modeling the maxima, one considers the stochastic structure of the random exceedances over a high threshold value. The POT method, essentially related to the results of Pickands (1975) and Balkema and de Haan (1974), is widely used (see Davison and Smith (1990), Dupuis (1999), Guillou and Willems (2006), Süveges and Davison (2010)). The Balkema-de Haan-Pickands theorem states that, under some regularity conditions, the limiting distribution of the exceedances is a Generalized Pareto Distribution (GPD) (see Coles (2001), Zhang (2007)). The main steps of a POT implementation are:

1. Test the independent and identically distributed (iid) hypothesis: the data should be a sequence of iid random variables.

2. Select an appropriate threshold level.

3. Estimate the parameters using the most appropriate method for the considered excess dataset.

Note that in step 2, the threshold level should satisfy a bias-variance trade-off: a relatively low threshold yields biased estimators, while a too high threshold reduces the number of extreme observations and thus inflates the variance of the estimators. By setting a relatively low threshold, the risk is to introduce some central observations into the series of extremes; the tail index (shape) is in this case more accurate (less variance), but biased. Conversely, a relatively high threshold yields a less biased, but less robust, tail index.

2 Methodology

Let $F$ be the distribution function of a non-negative random variable $X$. The distribution function $F_u$ of $X$ above a certain threshold $u$, called the conditional excess distribution function, is defined by

$$\forall x \geq u, \quad F_u(x) = P\{X \leq x \mid X > u\} = 1 - \frac{1 - F(x)}{1 - F(u)}. \qquad (2.1)$$

The functions $F$ and $F_u$ are related by

$$\forall x \geq u, \quad F(x) = (1 - \zeta_u) + \zeta_u F_u(x), \qquad (2.2)$$

where $\zeta_u = 1 - F(u)$ is the probability of observing an exceedance over $u$. The Balkema-de Haan-Pickands theorem (Balkema and de Haan (1974), Pickands (1975)) states that, for a suitably large $u$, $F_u$ is well approximated by a Generalized Pareto Distribution (GPD) function. More precisely,

$$F_u(x) \approx F(x;\, \alpha_u, \xi) = \begin{cases} 1 - \left(1 + \xi\, \dfrac{x - u}{\alpha_u}\right)^{-1/\xi}, & \xi \neq 0; \\[1ex] 1 - \exp\left(-\dfrac{x - u}{\alpha_u}\right), & \xi = 0, \end{cases} \qquad (2.3)$$

where $\xi$ is the shape parameter, $u$ is the threshold, and $\alpha_u$ is the scale parameter. The shape parameter controls the tail behavior of the distribution and its tendency to produce heavy extremes, while the scale parameter stretches or contracts the distribution.

The difficulty lies in finding the optimal threshold for the GPD fit. Various approaches have been suggested and applied (Davison and Smith (1990), Smith (1985), Lang et al. (1999), Dupuis (1999), Choulakian and Stephens (2001), Neves and Fraga Alves (2004), Thompson et al. (2009), Xiangxian and Wenlei (2009)) to detect the appropriate threshold. Some of these methods are graphical, some are numerical, and others combine graphical and numerical techniques. Graphical methods, used to set candidate thresholds, rely on expert judgment and thus involve a great deal of subjectivity; they can however provide pertinent sets of candidate thresholds, from which optimal values can then be chosen on the basis of objective criteria. Among the numerical methods we consider the Square Error Method (SEM), the Automated Threshold Selection Method (ATSM), and the Multiple Threshold Method (MTM). All three are based on mathematical criteria, so they help the user choose an adequate threshold on fairly objective grounds.
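To make Eq. (2.3) concrete, here is a minimal sketch of how the excesses over a threshold can be fitted by maximum likelihood, assuming the SciPy stack is available; note that scipy.stats.genpareto's shape c plays the role of $\xi$ and its scale the role of $\alpha_u$. The function name and the simulated data are illustrative, not from the paper.

```python
import numpy as np
from scipy.stats import genpareto

def fit_gpd_excesses(data, u):
    """Fit a GPD to the excesses X - u of the observations above threshold u.

    Returns (xi, alpha_u, N_u): shape, scale and number of exceedances.
    """
    excesses = data[data > u] - u
    # loc is pinned to 0 because Eq. (2.3) models the excess X - u directly;
    # scipy's shape c corresponds to xi and its scale to alpha_u.
    xi, _, alpha_u = genpareto.fit(excesses, floc=0)
    return xi, alpha_u, excesses.size

# Illustration on simulated Pareto-type data, for which xi > 0 is expected.
rng = np.random.default_rng(0)
data = rng.pareto(2.0, size=5000)
xi_hat, alpha_hat, n_u = fit_gpd_excesses(data, np.quantile(data, 0.95))
```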

2.1 Mean Residual Life Plot (MRL plot)

The MRL plot, also known as the mean excess plot, is one of the most commonly used graphical methods. It has been used to analyze daily rainfall data (Coles (2001)), to model large claims in non-life insurance (Beirlant et al. (2002)), and to explore pulse rate data in a flexible extreme value mixture model (MacDonald et al. (2011)). The theoretical justification of this approach is that when the distribution of exceedances over a threshold $u_1$ is a GPD, the distribution of exceedances over any threshold $u_2 > u_1$ is also a GPD with the same shape parameter $\xi$. Moreover, from Coles (2001), the corresponding scale parameters $\alpha_{u_1}$ and $\alpha_{u_2}$ satisfy

$$\alpha_{u_2} = \alpha_{u_1} + \xi (u_2 - u_1). \qquad (2.4)$$

The MRL plot is a representation of the empirical estimate of the conditional expectation $E(X - u \mid X > u)$ as a function of $u$. More precisely, the MRL plot represents the points

$$\left\{ \left(u,\ \frac{1}{n_u} \sum_{i \in I_u} (X_i - u)\right) : u \leq \max_{j=1,\dots,n} X_j \right\},$$

where $I_u = \{i : X_i > u\}$ and $n_u$ is its cardinality.

For an optimal threshold $u^*$, the underlying distribution function of the exceedances is a GPD, and the conditional mean excess is given, for $u > u^*$, by

$$E(X - u \mid X > u) = \frac{\alpha_{u^*} + \xi (u - u^*)}{1 - \xi},$$

which is linear in $u$. Hence, a good GPD fit occurs when the MRL plot is roughly linear. In practice, however, the use of an MRL plot is not always simple, and detecting the linearity is a subjective task. The range of linearity of the graph can then be explored using a numerical approach to select the optimal threshold.
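The empirical mean excess function underlying the MRL plot is straightforward to compute. The sketch below, under the same assumptions as the previous sketch (illustrative names, and reusing the simulated data array), evaluates it over a grid of candidate thresholds.

```python
import numpy as np

def mean_residual_life(data, thresholds):
    """Empirical mean excess e(u) = mean(X - u | X > u) for each candidate u."""
    return np.array([np.mean(data[data > u] - u) for u in thresholds])

# An approximately linear stretch of the plot (u, e(u)) with positive slope
# suggests a GPD regime with xi > 0 above the start of that stretch.
grid = np.linspace(0.0, np.quantile(data, 0.99), 200)
mrl = mean_residual_life(data, grid)
```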

2.2 Square Error Method (SEM)

Beirlant et al. (1996) suggested choosing the threshold that minimizes the mean square error (MSE) of the Hill estimator of the tail index. A comparative study of the different tail index estimators was conducted by Beirlant et al. (2005). The mean square error is useful for comparing several estimators, especially when one of them is biased. It is therefore natural to take as optimal threshold the value that minimizes the MSE of an estimator based on the exceedances (Guillou and Willems (2006), Xiangxian and Wenlei (2009)). In this paper, we suggest an algorithm inspired by Beirlant's work. Its main steps are summarized hereafter (a sketch in code follows the list).

Let $u_1, \dots, u_n$ be $n$ equally spaced increasing candidate thresholds (obtained, for instance, from some graphical approach). For $j = 1, \dots, n$, let $\sigma_{u_j}$ and $\xi_{u_j}$ be estimators of the scale and shape parameters based on the exceedances over the threshold $u_j$.

Step 1. Find $N_{u_j}$, the number of exceedances over $u_j$.

Step 2. Simulate $\nu$ independent samples of size $N_{u_j}$ from the GPD with parameters $\sigma_{u_j}$ and $\xi_{u_j}$. The number $\nu$ of samples to simulate is fixed by the user according to estimation needs.

Step 3. For each $\alpha \in A = \{0.05, 0.1, 0.15, \dots, 0.95\}$ and each $i = 1, \dots, \nu$, calculate the quantile $q^i_{(\alpha, u_j)}$ of the $i$-th simulated sample, and compute

$$q^{sim}_{(\alpha, u_j)} = \frac{1}{\nu} \sum_{i=1}^{\nu} q^i_{(\alpha, u_j)}.$$

Step 4. For $j = 1, \dots, n$, calculate the square error

$$SE_{u_j} = \sum_{\alpha \in A} \left( q^{sim}_{(\alpha, u_j)} - q^{obs}_{(\alpha, u_j)} \right)^2,$$

where $q^{obs}_{(\alpha, u_j)}$ is the observed analogue of $q^{sim}_{(\alpha, u_j)}$. The optimal threshold is the value $u^*$ such that $SE_{u^*} = \min_j SE_{u_j}$.
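A compact sketch of Steps 1-4, under the same assumptions as the previous sketches (maximum likelihood fits via SciPy; $\nu = 1000$ as used later in the case study); the candidate grid here is illustrative.

```python
import numpy as np
from scipy.stats import genpareto

def sem_square_error(data, u, nu=1000, seed=0):
    """SE_u of the SEM: squared distance between averaged simulated quantiles
    and observed quantiles of the excesses over u (Steps 1-4)."""
    rng = np.random.default_rng(seed)
    alphas = np.arange(0.05, 1.0, 0.05)              # the grid A of Step 3
    excesses = data[data > u] - u                    # Step 1: N_u exceedances
    xi, _, scale = genpareto.fit(excesses, floc=0)
    sims = genpareto.rvs(xi, scale=scale,            # Step 2: nu GPD samples
                         size=(nu, excesses.size), random_state=rng)
    q_sim = np.quantile(sims, alphas, axis=1).mean(axis=1)   # Step 3
    q_obs = np.quantile(excesses, alphas)
    return np.sum((q_sim - q_obs) ** 2)              # Step 4

# The SEM optimum is the candidate threshold with the smallest SE_u.
candidates = np.quantile(data, np.linspace(0.80, 0.98, 25))
u_sem = min(candidates, key=lambda u: sem_square_error(data, u))
```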

2.3 Automated Threshold Selection Method (ATSM)

This is a pragmatic, simple, and computationally inexpensive threshold selection method developed by Thompson et al. (2009). Using simulated data, they showed the effectiveness of their method and compared it to another approach (used in the JOINSEA software). For the reader's convenience, we sketch the steps of the ATSM algorithm as described in Thompson et al. (2009).

Step 1. Identify suitable values of equally spaced candidate thresholds $u_1 < u_2 < \dots < u_n$. For example, one can take $u_1$ as the median of the data and $u_n$ as their 98% quantile. The sample of exceedances above $u_1$ should be large enough to ensure reliable estimation. For $j = 1, \dots, n$, compute $\sigma_{u_j}$ and $\xi_{u_j}$, the maximum likelihood estimators of the scale and shape parameters obtained from the exceedances above the threshold $u_j$.

Step 2. It is shown in Thompson et al. (2009) that if $u$ is a suitable threshold, then for any $u_\nu \geq u_{\nu-1} \geq u$ the difference $\tau_{u_\nu} - \tau_{u_{\nu-1}}$, where $\tau_{u_j} = \sigma_{u_j} - \xi_{u_j} u_j$, is approximately normally distributed with mean 0. Consider $u = u_1$ and test the hypothesis that the sequence $(\tau_{u_\nu} - \tau_{u_{\nu-1}})_{2 \leq \nu \leq n}$ comes from a normal distribution with mean 0. If this hypothesis is not rejected, then $u_1$ is a suitable threshold. Otherwise consider $u = u_2$, remove the first term of the sequence, and conduct the test on the remaining sequence. If the hypothesis is rejected, repeat this procedure with the next candidate threshold.

Step 3. Step 2 is repeated until the test indicates that the remaining sequence of differences is consistent with a normal distribution with mean 0. The authors mention that this algorithm might fail to converge, but that this should rarely happen.

2.4 Multiple Threshold Method (MTM)

This method was developed by Deidda (2010) to infer the parameters of the GPD underlying the exceedances of daily rainfall records over a wide range of thresholds. The motivation for this method resides in the need for an appropriate technique to overcome the difficulties arising from irregularly discretized rainfall records or the site-to-site variability of the exceedance distribution parameters. It is shown that the MTM, based on the concept of parameter threshold-invariance, is particularly suitable for regional analyses where optimum thresholds may depend on the data collection site. As expected, we found it also appropriate in our case study, where the data are subject to different sources of perturbation.

For the sake of clarity, we recall the equations established in Deidda (2010) and the concept of parameter threshold-invariance. Suppose that for the baseline threshold $u_0 = 0$, the exceedance distribution $F_0(\cdot)$ is given by Eq. (2.3). From Eq. (2.4), we have $\alpha_0 = \alpha_u - \xi u$. Substituting $u$ for $x$ and $0$ for $u$ in Eq. (2.2), we obtain

$$\forall u \geq 0, \quad F(u) = (1 - \zeta_0) + \zeta_0 F_0(u),$$

and thus

$$\forall u \geq 0, \quad \zeta_u = \zeta_0 \left[ 1 - F_0(u) \right]. \qquad (2.5)$$

Substituting $F_0(u)$ from Eq. (2.3), we get

$$\forall u \geq 0, \quad \zeta_0 = \begin{cases} \zeta_u \left(1 + \xi_u \dfrac{u}{\alpha_0}\right)^{1/\xi_u} = \zeta_u \left(1 - \xi_u \dfrac{u}{\alpha_u}\right)^{-1/\xi_u}, & \xi_u \neq 0; \\[1ex] \zeta_u \exp\left(\dfrac{u}{\alpha_0}\right) = \zeta_u \exp\left(\dfrac{u}{\alpha_u}\right), & \xi_u = 0. \end{cases} \qquad (2.6)$$

This last equation states that the $\zeta_0$ reparameterization is threshold-invariant, even though the probability $\zeta_u$ of exceeding $u$ obviously decreases as $u$ increases (see Deidda (2010)). The MTM can be summarized by the following hierarchical steps (a sketch in code follows the list):

Step 1 ($\xi^M$ estimate). Identify suitable values of equally spaced candidate thresholds $u_1 < u_2 < \dots < u_n$. Take as MTM estimate $\xi^M$ of the shape parameter the median of the $\xi$ estimates over the suggested range of thresholds.

Step 2 ($\alpha_0^M$ estimate). In order to filter out the variability of the $\alpha_0$ estimates driven by the fluctuations of $\xi$, the $\alpha_u$ values are re-estimated conditionally on the $\xi^M$ estimate obtained at Step 1, and the reparameterization is applied again with the new $\alpha_u$ estimates and $\xi = \xi^M$ held constant. The results are denoted $\alpha_0^c$ to indicate that they are conditioned on $\xi^M$. Finally, the MTM estimate $\alpha_0^M$ of the scale parameter is the median of the $\alpha_0^c$ estimates within the range of thresholds.

Step 3 ($\zeta_0^M$ estimate). In a similar way, we can reduce the variability of $\zeta_0$ by combining the $\zeta_u$ estimates with the MTM estimates $\xi^M$ and $\alpha_0^M$ (obtained at Steps 1 and 2). The results are denoted $\zeta_0^c$ to emphasize that they are conditioned on $\xi^M$ and $\alpha_0^M$. Finally, the MTM estimate $\zeta_0^M$ is the median of the $\zeta_0^c$ estimates within the range of thresholds.
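A sketch of the three MTM steps with $u_0 = 0$, under the same SciPy assumptions as before (and assuming $\xi^M \neq 0$). The conditional scale re-estimation of Step 2 is done here by a one-dimensional likelihood maximization, which is one reasonable reading of the procedure rather than Deidda's exact implementation.

```python
import numpy as np
from scipy.stats import genpareto
from scipy.optimize import minimize_scalar

def mtm_estimates(data, thresholds):
    """MTM Steps 1-3 with baseline threshold u0 = 0: medians of the
    threshold-invariant reparameterizations xi, alpha_0 and zeta_0."""
    n = data.size
    # Step 1: xi^M is the median of the shape estimates over all thresholds.
    xi_m = np.median([genpareto.fit(data[data > u] - u, floc=0)[0]
                      for u in thresholds])
    # Step 2: re-estimate each scale with xi fixed at xi^M, map it to alpha_0
    # through alpha_0 = alpha_u - xi * u (Eq. 2.4), and take the median.
    alpha0_c = []
    for u in thresholds:
        exc = data[data > u] - u
        nll = lambda a: -genpareto.logpdf(exc, xi_m, scale=a).sum()
        a_u = minimize_scalar(nll, bounds=(1e-9, 100 * exc.mean()),
                              method="bounded").x
        alpha0_c.append(a_u - xi_m * u)
    alpha0_m = np.median(alpha0_c)
    # Step 3: zeta_0 estimates conditioned on xi^M and alpha_0^M (Eq. 2.6).
    zeta0_c = [(np.count_nonzero(data > u) / n)
               * (1.0 + xi_m * u / alpha0_m) ** (1.0 / xi_m)
               for u in thresholds]
    return xi_m, alpha0_m, np.median(zeta0_c)
```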

3 Case Study

3.1 Dataset overview

In this article, extreme value theory is applied to model extreme events of the CAC40 index. Our aim is to control and measure the volatility risk associated with an index widely present in managers' portfolios. We apply extreme value theory to take into account rare events such as stock market crashes, crises and bubbles.

The CAC40 is a benchmark French stock market index. It gives an idea of French market trends, as it is a capitalization-weighted measure of the 40 most significant stocks among the 100 largest market capitalizations on the Paris Bourse (now Euronext Paris). The CAC40 was officially launched on June 15, 1988, following the crash of 1987, which led to reforms of the trading monopoly. The value of the CAC40 has experienced drastic changes since its creation. It reached its highest peak on September 4, 2000, at 6944.77 points, and its sharpest decline came in the stock market crash of 2008, when the CAC40 lost more than 43.5 percent of its value. Our analysis of the daily CAC40 stock index covers the period from March 3, 1990 to December 20, 2010, which represents 5222 observations.

Figure 1: Evolution of the CAC40 index from March 1, 1990 to December 20, 2010.

Figure 1 highlights the volatile nature of the CAC40 index, which justifies a study of the extreme values of these data. Since 2003, the index had steadily increased; on January 1, 2007 it rose above 5600 points, a level not reached since May 2001. As an attempt to explain the volatile nature of the CAC40 index, here are some notable dates in its evolution:

- October 1987: stock market crash
- August 1990: energy crisis
- September 1998: Russian crisis
- End 1999/2000: Internet bubble
- September 2001: September 11 attacks
- March 2003: outbreak of the war in Iraq
- End 2003 to November 2007: four consecutive years of increases

In order to apply the Balkema-de Haan-Pickands theorem to model the tail of the CAC40 stock index distribution, we must test the iid hypothesis. Since the data distribution is unknown, we used nonparametric tests of independence and homogeneity. Application of the turning point and Mann-Whitney tests shows that the original data (the CAC40 index) do not satisfy the independence and homogeneity conditions. To meet the theoretical requirements, we considered the inverse of the CAC40 return index; according to the same tests, the corresponding data are homogeneous and independent (a sketch of these tests follows). The large coefficient of skewness (24.42) shows that the data distribution is spread to the right, and the kurtosis is larger than 3, indicating a clearly leptokurtic distribution. We can thus say that the distribution of the inverse CAC40 return index is positively skewed. There are therefore good reasons to believe that it can be fitted by a Fréchet-type distribution.
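The paper does not spell out the test mechanics. Below is a minimal sketch of the two nonparametric tests named above, assuming the standard turning point statistic, which under independence satisfies $T \sim N\!\left(2(n-2)/3,\ (16n-29)/90\right)$ approximately, and a split-half Mann-Whitney comparison as a simple homogeneity check (the split-half design is an assumption, not stated in the paper).

```python
import numpy as np
from scipy.stats import mannwhitneyu, norm

def turning_point_test(x):
    """Two-sided p-value of the turning point test of independence; under
    iid data, T is approximately N(2(n-2)/3, (16n-29)/90)."""
    x = np.asarray(x)
    n = x.size
    mid = x[1:-1]
    t = np.sum(((mid > x[:-2]) & (mid > x[2:])) |
               ((mid < x[:-2]) & (mid < x[2:])))
    z = (t - 2.0 * (n - 2) / 3.0) / np.sqrt((16.0 * n - 29.0) / 90.0)
    return 2.0 * norm.sf(abs(z))

def split_half_homogeneity(x):
    """Mann-Whitney p-value comparing the first and second halves of x."""
    half = len(x) // 2
    return mannwhitneyu(x[:half], x[half:]).pvalue
```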

3.2 Results

3.2.1 The MRL plot of the inverse CAC40 return index

To identify the range of candidate thresholds we use the MRL plot (Figure 2). The graph is approximately linear on the interval [0, 3000]. For thresholds larger than 3000, the graph shows a lot of instability; this is due to the small size of the dataset there, as only 64 of the initial 5222 observations are larger than 3000.

Figure 2: Mean residual life plot of the inverse CAC40 return index.

The positive slope indicates that the tail index is positive, and it is therefore expected that the exceedances fit a Fréchet-type distribution. The optimal threshold lies in the range of linearity, i.e. the interval [0, 3000]. In order to apply the ATSM, MTM or SEM to detect the optimal threshold, we discretize this range by considering the set $\{u_0 = 0,\ u_1 = \Delta,\ u_2 = 2\Delta,\ u_3 = 3\Delta,\ \dots,\ u_n = n\Delta = 3000\}$, where $\Delta = 0.1$, $0.01$ or $0.001$.
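Assuming the series is held in an array inv_returns (a hypothetical name), the discretization above and a quick check of the linearity range can be written as follows, reusing mean_residual_life from the Section 2.1 sketch.

```python
import numpy as np

delta = 0.01                      # also tried in the paper: 0.1 and 0.001
candidates = np.arange(0.0, 3000.0 + delta, delta)

# Coarse sanity check of the linear stretch of the MRL plot on [0, 3000].
mrl = mean_residual_life(inv_returns, np.linspace(0.0, 3000.0, 301))
```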

3.2.2 Detection of the optimal threshold by SEM

For each $u_j$, we calculate the $\xi_{u_j}$ and $\alpha_{u_j}$ estimates from the excesses above the threshold $u_j$, then we simulate 1000 samples of size $N_{u_j}$ (defined in Section 2.2). The minimum square error

$$SE_{u_j} = \left(q^{sim}_{(0.05, u_j)} - q^{obs}_{(0.05, u_j)}\right)^2 + \dots + \left(q^{sim}_{(0.95, u_j)} - q^{obs}_{(0.95, u_j)}\right)^2$$

is achieved for $u_j \approx 713$. We note that for the SEM, the maximum likelihood and moment estimation methods lead to the same optimal threshold value. For the scale and the shape, we suggest retaining the moment estimators, because when the sample is small or contaminated by spurious data, maximum likelihood can yield unrealistic estimates. Using the SEM and moment estimation, we can fit the following distribution:

$$F_{713}(x) = F(x;\ 3774.27,\ 0.44) = 1 - \left(1 + 0.44\, \frac{x - 713}{3774.27}\right)^{-1/0.44}.$$

The SEM optimal threshold value leads us to restrict the range of candidate thresholds: in the sequel, we use the range [700, 3000] instead of [0, 3000].

3.2.3 Detection of the optimal threshold by ATSM

Going through the whole range of candidate thresholds $u_\nu$, we find that the distribution of the differences $\tau_{u_\nu} - \tau_{u_{\nu-1}}$ (see the methodology section) does not fit a normal distribution with mean 0, and the ATSM returns a warning. Although Thompson et al. (2009) mention that this situation (the ATSM warning) occurs rarely, we encountered it in our case study.

3.2.4 Detection of the optimal threshold by MTM

3.2.4.a Estimation by the maximum likelihood method

We obtain the $\xi_u$ estimates from the excesses above each threshold in the range [700, 3000]. Figure 3 displays the estimate of $\xi$ as a function of the threshold $u$.

Figure 3: Estimates of $\xi_u$ as a function of the threshold $u$ in the range [700, 3000].

The horizontal line in the figure marks $\xi^M$, the median of the $\xi_u$ estimates. The starting point of the stabilization of the shape parameter suggests $u \approx 1150$ as an optimum threshold.
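A sketch reproducing the content of Figure 3 (the $\xi_u$ trace with its median line), again with SciPy maximum likelihood fits and the hypothetical inv_returns array.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import genpareto

us = np.linspace(700.0, 3000.0, 100)
xis = [genpareto.fit(inv_returns[inv_returns > u] - u, floc=0)[0] for u in us]

plt.plot(us, xis, label=r"$\hat{\xi}_u$")
plt.axhline(np.median(xis), linestyle="--", label=r"$\xi^M$ (median)")
plt.xlabel("threshold $u$")
plt.ylabel("shape estimate")
plt.legend()
plt.show()
```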

After determining the optimal threshold value provided by the MTM ($u = 1156$), we can fit the following distribution:

$$F_{1156}(x) = F(x;\ 6066.41,\ 0.43) = 1 - \left(1 + 0.43\, \frac{x - 1156}{6066.41}\right)^{-1/0.43}.$$

3.2.4.b Estimation by the method of moments

For the method of moments, we observe a stabilization of the $\xi_u$ estimates for thresholds larger than $u \approx 1000$.

Figure 4: Estimates of $\xi_u$ as a function of the threshold $u$ in the range [700, 3000].

Based on the scale and shape moment estimators corresponding to that optimal threshold, we can fit the following distribution:

$$F_{1000}(x) = F(x;\ 2443.19,\ 0.45) = 1 - \left(1 + 0.45\, \frac{x - 1000}{2443.19}\right)^{-1/0.45}.$$

The optimal thresholds produced by the proposed methods for modeling the excesses of the inverse CAC40 return index are different. The Q-Q plot (Figure 5) shows that the distribution obtained by the MTM is the most appropriate for the studied dataset.

Figure 5: Q-Q plots for the different retained methods.

For both estimation methods, the MTM indicates that the excesses of the inverse CAC40 return index follow a heavy-tailed distribution. These results are corroborated by statistical characteristics such as the skewness and kurtosis coefficients. Figure 5 also shows that, for the MTM, the method of moments is more efficient than the maximum likelihood method. To confirm these graphical results, we use the Adup test (the supremum-class version of the upper-tail Anderson-Darling test) to test the null hypothesis "the GPD is adequate" against the alternative "the GPD is not adequate". We retain the model corresponding to the largest p-value. The results of the Adup test are given below:

Approach   Estimation method      Optimum threshold        p-value
MTM        Moment                 u_{MTM,M} = 1000         0.14
MTM        Maximum likelihood     u_{MTM,ML} = 1156        0.06
SEM        Moment                 u_{SEM,M} = 713          < 2.2 × 10^{-16}

According to the Adup test, the retained model for the excesses of the inverse CAC40 return index is the Fréchet-type distribution obtained by the MTM with moment estimation:

$$F_{1000}(x) = F(x;\ 2443.19,\ 0.45) = 1 - \left(1 + 0.45\, \frac{x - 1000}{2443.19}\right)^{-1/0.45}.$$

This finding confirms the dominant view on the asymptotic distributions of financial series. In addition to the skewness, leptokurtosis and lack of normality noted for the CAC40 return index series, we note the following facts for the inverse of the CAC40 return index: first, it is in the Fréchet maximum domain of attraction; second, it has a high propensity for exceeding relatively small values.
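Once a model such as $F_{1000}$ is retained, extreme quantiles (e.g. Value-at-Risk levels) follow from the standard POT quantile formula $x_p = u + (\alpha_u/\xi)\left[\left((1-p)/\zeta_u\right)^{-\xi} - 1\right]$ (see e.g. Coles (2001)). This formula is not spelled out in the paper, so the sketch below is an illustration, with $\zeta_u$ set to a hypothetical exceedance rate.

```python
def pot_quantile(p, u, alpha_u, xi, zeta_u):
    """p-quantile of X implied by the fitted exceedance model (xi != 0):
    x_p = u + (alpha_u / xi) * (((1 - p) / zeta_u) ** (-xi) - 1)."""
    return u + (alpha_u / xi) * (((1.0 - p) / zeta_u) ** (-xi) - 1.0)

# Retained MTM/moment fit: u = 1000, alpha_u = 2443.19, xi = 0.45.
# zeta_u should be the observed exceedance rate N_u / 5222 (hypothetical here).
x_999 = pot_quantile(0.999, 1000.0, 2443.19, 0.45, zeta_u=0.02)
```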

4 Conclusion

The purpose of this paper is the practical implementation of the peaks over threshold (POT) method to estimate extreme value distributions. The main focus has been on the performance of the POT approach in combination with various threshold detection and estimation methods. Our case study (the CAC40 stock return index) illustrates these approaches. In this case the MTM gives satisfactory results for both the moment and maximum likelihood estimation methods.

References

Balkema, A. and de Haan, L. 1974. Residual life time at great age. Annals of Probability, 2(5), 792-804.

Beirlant, J., Dierckx, G. and Guillou, A. 2005. Estimation of the extreme value index and regression on generalized quantile plots. Bernoulli, 11(6), 949-970.

Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. 2004. Statistics of Extremes: Theory and Applications. Chichester: Wiley.

Beirlant, J., Dierckx, G., Guillou, A. and Starica, C. 2002. On exponential representations of log-spacings of extreme order statistics. Extremes, 5(2), 157-180.

Beirlant, J., Teugels, J. L. and Vynckier, P. 1996. Practical Analysis of Extreme Values. Leuven: Leuven University Press.

Choulakian, V. and Stephens, M. A. 2001. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics, 43, 478-484.

Coles, S. G. 2001. An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag.

Davison, A. C. and Smith, R. L. 1990. Models for exceedances over high thresholds. Journal of the Royal Statistical Society B, 52, 393-442.

Deidda, R. 2010. A multiple threshold method for fitting the generalized Pareto distribution to rainfall time series. Hydrology and Earth System Sciences, 14, 4957-4994.

Dupuis, D. J. 1999. Exceedances over high thresholds: a guide to threshold selection. Extremes, 1(3), 251-261.

Embrechts, P., Klüppelberg, C. and Mikosch, T. 1997. Modelling Extremal Events for Insurance and Finance. Berlin: Springer-Verlag.

Fisher, R. and Tippett, L. 1928. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society, 24, 180-190.

Gilli, M. and Këllezi, E. 2006. An application of extreme value theory for measuring financial risk. Computational Economics, 27(1), 1-23.

Guillou, A. and Willems, P. 2006. Application de la théorie des valeurs extrêmes en hydrologie. Revue de Statistique Appliquée, LIV(2), 5-31.

Lang, M., Ouarda, T. B. M. J. and Bobée, B. 1999. Towards operational guidelines for over-threshold modeling. Journal of Hydrology, 225, 103-117.

MacDonald, A., Scarrott, C. J., Lee, D., Darlow, B., Reale, M. and Russell, G. 2011. A flexible extreme value mixture model. Computational Statistics and Data Analysis, 55(6), 2137-2157.

Neves, C. and Fraga Alves, M. I. 2004. Reiss and Thomas' automatic selection of the number of extremes. Computational Statistics and Data Analysis, 47, 689-704.

Pickands, J. 1975. Statistical inference using extreme order statistics. Annals of Statistics, 3(1), 119-131.

Reiss, R.-D. and Thomas, M. 2005. Statistical Analysis of Extreme Values (with Applications to Insurance, Finance, Hydrology and Other Fields). 3rd rev. edn. Basel: Birkhäuser.

Smith, R. L. 1985. Statistics of extreme values. Bulletin of the International Statistical Institute, Proceedings of the 45th Session (Amsterdam), Book 4, 1-17.

Süveges, M. and Davison, A. C. 2010. Model misspecification in peaks over threshold analysis. The Annals of Applied Statistics, 4(1), 203-221.

Thompson, P., Cai, Y., Reeve, D. and Stander, J. 2009. Automated threshold selection methods for extreme wave analysis. Coastal Engineering, 56, 1013-1021.

Xiangxian, Z. and Wenlei, G. 2009. A new method to choose the threshold in the POT model. First International Conference on Information Science and Engineering (ICISE), 750-753.

Zhang, J. 2007. Likelihood moment estimation for the generalized Pareto distribution. Australian & New Zealand Journal of Statistics, 49(1), 69-77.