Which Extreme Values Are Really Extreme?


JESÚS GONZALO
Universidad Carlos III de Madrid

JOSÉ OLMO
Universidad Carlos III de Madrid

Abstract
We define the extreme values of any random sample of size $n$ from a distribution function $F$ as the observations exceeding a threshold and following a type of generalized Pareto distribution (GPD) involving the tail index of $F$. The threshold is the order statistic that minimizes a Kolmogorov-Smirnov statistic between the empirical distribution of the corresponding largest observations and the corresponding GPD. To formalize the definition we use a semiparametric bootstrap to test the corresponding GPD approximation. Finally, we use our methodology to estimate the tail index and value at risk (VaR) of some financial indexes of major stock markets.

Keywords: bootstrap, extreme values, goodness-of-fit test, Hill estimator, Pickands theorem, VaR

We thank participants at the Conference on Extremal Events in Finance, New Frontiers in Financial Volatility Modelling, Winter Meetings of the North American Econometric Society, and the Department of Statistics, University of North Carolina Chapel Hill, especially Ross Leadbetter and Richard Smith, for excellent comments. We are also deeply grateful to Christopher Geczy, Alfonso Novales, Michael Wolf, and two anonymous referees for valuable suggestions and comments. Financial support provided by a DGCYT grant (SEC ) is gratefully acknowledged. Address correspondence to Jesús Gonzalo, Department of Economics, Universidad Carlos III de Madrid, 28903, Getafe, Madrid, Spain.

Journal of Financial Econometrics, Vol. 2, No. 3, © Oxford University Press 2004; all rights reserved. DOI: /jjfinec/nbh014

Risk management is one of the most important innovations of the 20th century in economics. During the last decade financial markets have realized the importance of monitoring risk. The question one would like to answer is: If things go wrong, how wrong can they go? The variance used as a risk measure is unable to answer this question. Alternative measures regarding possible values out of the range of available information need to be defined and calculated. Extreme value theory (EVT) provides the tools to model the asymptotic distribution of the maximum of a sequence of random variables $\{X_n\}$, and in this sense this theory can be very helpful in order to obtain a first impression about how wrong things
can go. A deeper insight into EVT allows us to know not only the order of convergence of the maximum, but also the limiting distribution of the largest observations of the sequence. These observations are the main ingredients of more informative risk measures that have been recently introduced, like value at risk (VaR) or expected shortfall. These measures are functions of extreme quantiles of the data distribution. Attempting to model the tails of these distributions is troublesome, and standard methodologies such as historical simulation or the gaussian distribution do not provide reliable approximations at very high quantiles. On the other hand, the methodology derived from EVT covers this gap and produces a parametric framework to derive the VaR or any function of this extreme quantile.

It is clear that the first task is to identify which values are really extreme values. In practice this is done by graphical methods such as the QQ plot, the sample mean excess plot, or by other ad hoc methods that impose an arbitrary threshold (5%, 10%, ...) [see Embrechts, Klüppelberg, and Mikosch (1997)]. These methods do not propose any formal computable method, and moreover, they only give very rough estimates of the set of extreme values.

In this article we propose a formal way of identifying and estimating the extreme values of any random sample of size $n$ coming from a distribution function, say $F$. These values are going to be defined as the exceedances of a threshold sequence $\{u_n\}$ following a type of generalized Pareto distribution (GPD). The selection of this threshold plays a central role in this definition and in estimating the parameters of the GPD. The sequence of extreme values depends on the length of the data sequence through the choice of $\{u_n\}$. Therefore we need to introduce an appropriate test to assess statistically whether the distribution function of the set of extremes given by the threshold really satisfies the weak convergence to the GPD or not, with parameters driven by $F$. In order to achieve this task, we propose a semiparametric bootstrap test and study its asymptotic as well as its finite sample performance.

The final purpose of our methodology is to achieve a reliable approximation of $F$, paying special attention to its tails. Our tail estimate provides accurate approximations of the extreme quantiles of $F$, and from them it is straightforward to calculate the risk measures introduced in the financial literature.

The article is structured as follows. In Section 1 we present some general results of extreme value theory, focusing on the weak convergence of the largest observations of a random sequence. Section 2 introduces different approaches to select the threshold sequence and gives a brief review of estimation methods for the parameters of the GPD. Some simulations show the performance of our approach in terms of tail index estimation. The complete definition of the sequence of extreme values is given in Section 3 by means of a bootstrap hypothesis test. Monte Carlo simulations provide the finite sample performance of our proposed test. Section 4 presents an empirical application where the risk of financial indexes of major stock markets is analyzed via the tail index and VaR. Finally, Section 5 presents some concluding remarks. Proofs are presented in the appendix.
1 REVIEW OF EXTREME VALUE THEORY RESULTS

The purpose of this section is to briefly introduce the set of results of the so-called extreme value theory necessary to develop the theory of the article. The departing point is the study of the weak convergence for the sample maximum of a sequence of random variables $\{X_n\}$ with distribution function $F$. Our intention is to use the limiting distribution of this statistic to derive the weak convergence of the largest observations of a random sequence, imposing a minimum set of assumptions on the distribution function $F$.

Let $M_n = \max\{X_1, \ldots, X_n\}$ be the sample maximum of the sequence and let $F$ be the common distribution function for $\{X_n\}$. Our first goal is to introduce the conditions under which $M_n$ converges weakly to a nondegenerate distribution function.

Result 1 Let $\{X_n\}$ be an independent and identically distributed (i.i.d.) sequence. Let $0 \le \tau \le \infty$ and suppose that $\{u_n\}$ is a sequence of real numbers such that

\[ n\,(1 - F(u_n)) \to \tau \quad \text{as } n \to \infty. \tag{1} \]

Then

\[ P\{M_n \le u_n\} \to e^{-\tau} \quad \text{as } n \to \infty. \tag{2} \]

Conversely, if Equation (2) holds for some $\tau$, $0 \le \tau \le \infty$, then so does Equation (1).

The proof of this result is immediately derived from

\[ P\{M_n \le u_n\} = F^n(u_n) = \left(1 - \frac{n\,(1 - F(u_n))}{n}\right)^{n}. \tag{3} \]

However, this result does not guarantee the existence of a nondegenerate distribution for $M_n$. Define the right endpoint of a distribution function as $x_F = \sup\{x \mid F(x) < 1\} \le +\infty$. It is clear that $M_n \to x_F$ with probability 1 as $n \to \infty$. Suppose now that $F$ has a jump at $x_F$ with $x_F < \infty$ (i.e., $F(x_F^-) < 1$ with $F(x_F^-) = \lim_{x \uparrow x_F} F(x)$), and consider a sequence $\{u_n\}$ satisfying Equation (2) with $0 \le \tau \le \infty$. Then either $u_n < x_F$ for infinitely many values of $n$ and $n(1 - F(u_n)) \to \infty$, or $u_n > x_F$ and $n(1 - F(u_n)) = 0$. Therefore we also need some regularity condition on the tail of $F$ to avoid the existence of such jumps.

Result 2 Let $F$ be a distribution function with right endpoint $x_F$ such that

\[ \lim_{x \uparrow x_F} \frac{1 - F(x)}{1 - F(x^-)} = 1, \tag{4} \]

and let $\{u_n\}$ be a sequence with $u_n < x_F$ and $n(1 - F(u_n)) \to \tau$. Then $0 < \tau < \infty$.
We will assume hereafter these regularity conditions as our minimum set of assumptions on the distribution function $F$.

The choice of the sequence $\{u_n\}$ determines the value of $\tau$. Suppose $v_n > u_n$ and Equation (2) holds; then $n(1 - F(v_n)) \to \tau'$ with $\tau' < \tau$. We can write Equation (2) as $P\{M_n \le u_n(x)\} \to e^{-\tau(x)}$, with $u_n$ depending on $x$. Moreover, there exist some scaling sequences $a_n$, $b_n$ varying according to $F$ such that

\[ P\{a_n^{-1}(M_n - b_n) \le x\} \to G(x) \quad \text{as } n \to \infty, \tag{5} \]

with $u_n(x) = a_n x + b_n$ and $G(x) = e^{-\tau(x)}$ a distribution function. This function has been fully characterized by Gnedenko (1943) or de Haan (1976) via the analysis of domains of attraction for the maximum, and it can be summarized as follows:

Result 3 The distribution function $G(x)$ derived in Equation (5) can only take three different forms,

\[ \text{Type I (Gumbel):}\quad G(x) = e^{-e^{-x}}, \qquad -\infty < x < \infty, \]

\[ \text{Type II (Fréchet):}\quad G(x) = \begin{cases} 0, & x \le 0, \\ e^{-x^{-1/\xi}}, & x > 0,\ \xi > 0, \end{cases} \]

\[ \text{Type III (Weibull):}\quad G(x) = \begin{cases} e^{-(-x)^{-1/\xi}}, & x < 0,\ \xi < 0, \\ 1, & x \ge 0. \end{cases} \]

The parameter $\xi$ is the tail index of $F$ and characterizes the tail behavior of the distribution function. The three types can be gathered in the so-called generalised extreme value distribution, first proposed by von Mises (1936),

\[ G(x) = \exp\left\{-\left(1 + \xi\,\frac{x - \mu}{\sigma}\right)^{-1/\xi}\right\}, \tag{6} \]

where $\mu$ is a location parameter, $\sigma$ a scale parameter, and $\xi \ne 0$. This expression boils down to $G(x) = \exp\{-e^{-(x-\mu)/\sigma}\}$ when $\xi = 0$.

Clearly $\tau(x) = \left(1 + \xi\,\frac{x-\mu}{\sigma}\right)^{-1/\xi}$ in Equation (5), and hence $n(1 - F(u_n(x))) \to \left(1 + \xi\,\frac{x-\mu}{\sigma}\right)^{-1/\xi}$ for all $x$, where $a_n$, $b_n$ are suitable constants. This is the result we exploit in order to derive the weak convergence of the largest observations determined by a threshold sequence $u_{on} = a_n \mu + b_n$, with $\mu$ satisfying $-\log G(\mu) = 1$. By doing that,

\[ \frac{1 - F(u_n(x))}{1 - F(u_{on})} \to \left(1 + \xi\,\frac{x - \mu}{\sigma}\right)^{-1/\xi}, \quad \text{as } n \to \infty. \tag{7} \]

This expression can be rewritten as

\[ \frac{F(u_n(x)) - F(u_{on})}{1 - F(u_{on})} \to 1 - \left(1 + \xi\,\frac{x - \mu}{\sigma}\right)^{-1/\xi}, \tag{8} \]

for all continuity points $x > \mu$. The threshold sequence satisfies $u_n(x) = u_{on} + a_n(x - \mu)$, and we can define

\[ F_{u_{on}}(a_n(x - \mu)) = \frac{F(u_{on} + a_n(x - \mu)) - F(u_{on})}{1 - F(u_{on})}, \tag{9} \]

as the conditional excess distribution function given $u_{on}$ with $x > \mu$. This takes us directly to the following result:

Result 4 Let $y = a_n(x - \mu)$; then

\[ \lim_{u_{on} \to x_F}\ \sup_{0 \le y < \infty} \left| F_{u_{on}}(y) - \mathrm{GPD}_{\xi,\sigma(u_{on})}(y) \right| = 0, \tag{10} \]
with

\[ \mathrm{GPD}_{\xi,\sigma(u_{on})}(y) = \begin{cases} 1 - \left(1 + \xi\,\dfrac{y}{\sigma(u_{on})}\right)^{-1/\xi} & \text{if } \xi \ne 0, \\[2mm] 1 - e^{-y/\sigma(u_{on})} & \text{if } \xi = 0, \end{cases} \tag{11} \]

the generalized Pareto distribution and $\sigma(u_{on}) = \sigma a_n$.

This result is known as Pickands (1975) theorem. Pickands proposed a sequence $u_{on}$ taken in the interval $[b_n, b_{n+1}]$, with $b_n$ the suitable sequence in Equation (5).

This approximation for the distribution of the largest observations, regarded as the exceedances of a threshold sequence, can be improved when the tail of $F$ decays at a polynomial rate. Suppose $1 - F(x) = x^{-1/\xi} L(x)$ with $L(tx)/L(x) \to 1$ as $x \to x_F$ and $\xi > 0$; then the distribution function $F$ satisfies

\[ \lim_{x \uparrow x_F} \frac{1 - F(tx)}{1 - F(x)} = t^{-1/\xi}, \qquad t > 0. \tag{12} \]

This type of distribution function is regularly varying at a rate $-1/\xi$, and the domain of attraction of the sample maximum is the Fréchet distribution [see Resnick (1987) or de Haan (1976)]. The function $L(x)$ is said to be slowly varying and is introduced to include the deviations of $F$ from the Pareto probability law. When these departures from the polynomial law are small, $F_{u_{on}}(y)$ is better approximated by the Pareto distribution function.

Consider a sequence $u_n(x) = u_{on}\,x$, where $u_{on} = u_n(1)$ is the threshold sequence that satisfies $1 - F(u_{on}) = u_{on}^{-1/\xi} L(u_{on})$. The conditional excess distribution function defined by $u_{on}$ as $F_{u_{on}}(u_n(x)) = \frac{F(u_n(x)) - F(u_{on})}{1 - F(u_{on})}$ satisfies

\[ F_{u_{on}}(u_n(x)) \to 1 - \left(\frac{u_n(x)}{u_{on}}\right)^{-1/\xi}, \quad \text{as } n \to \infty, \tag{13} \]

for $u_n(x) \ge u_{on}$ or equivalently for $x \ge 1$. This convergence holds for all continuity points of $F$, and therefore for this case we can rewrite the previous result as

\[ \lim_{u_{on} \to x_F}\ \sup_{u_{on} \le y < \infty} \left| F_{u_{on}}(y) - \mathrm{PD}_{\xi}(y) \right| = 0, \tag{14} \]

with $y = u_n(x)$ and $\mathrm{PD}_{\xi}(y) = 1 - \left(\frac{y}{u_{on}}\right)^{-1/\xi}$.

Finally, the choice of the threshold sequence also has an effect on the error made by the approximations claimed in Pickands theorem. This error arises from the asymptotic relation $n(1 - F(u_n)) \to \tau$ and from the approximation of $F^n(u_n)$ by the exponential distribution. The latter approximation is of order $n^{-1}$ since $0 \le e^{-x} - \left(1 - \frac{x}{n}\right)^{n} \le 0.3\,(n-1)^{-1}$, for $0 \le x \le n$ [see, e.g., Leadbetter, Lindgren, and Rootzén (1983)].
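The size of the exponential approximation error can be checked numerically. The sketch below assumes the bound reads $0.3/(n-1)$ for $0 \le x \le n$ and $n \ge 2$, and evaluates the gap $e^{-x} - (1 - x/n)^n$ on a grid. The function names are illustrative and are not taken from the article.

```python
import math


def exp_approx_gap(x: float, n: int) -> float:
    """Gap e^{-x} - (1 - x/n)^n between the exponential limit and its finite-n form."""
    return math.exp(-x) - (1.0 - x / n) ** n


def max_gap(n: int, grid: int = 2000) -> float:
    """Largest gap over an evenly spaced grid of x values in [0, n]."""
    return max(exp_approx_gap(n * i / grid, n) for i in range(grid + 1))


if __name__ == "__main__":
    # Compare the observed maximum gap with the assumed bound 0.3 / (n - 1).
    for n in (2, 3, 10, 100, 1000):
        print(n, max_gap(n), 0.3 / (n - 1))
```

On this grid the largest gap stays below the assumed bound for every $n$ tried, and $n$ times the gap settles near $2e^{-2}$, which is consistent with an error of order $n^{-1}$.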
Nevertheless, if $F$ is continuous one can always obtain an equality in Equation (2) by taking $u_n = F^{-1}(e^{-\tau/n})$ and making the approximation errors vanish. However, sequences of type $u_n(x) = a_n x + b_n$, with $a_n$, $b_n$ suitable constants, are more appropriate to study the weak convergence of $M_n$. In these cases, the equality or any uniform bound for all $x$ are not usually feasible in Equation (5).

2 THRESHOLD CHOICES TO DEFINE THE EXTREME VALUES

The last section has focused on finding the asymptotic laws that rule the largest observations of a random sequence from a distribution function $F$. This set of observations is defined by means of a threshold sequence and the tail index $\xi$ that characterizes the corresponding generalized Pareto or Pareto distribution. The choice of this sequence is troublesome since $u_{on} \to x_F$ when $n \to \infty$, but at an appropriate rate. This order of convergence depends on $F$, represented by the sequences $a_n$ and $b_n$ when $u_n(x)$ is of the form $u_n(x) = a_n x + b_n$. Hence the threshold sequence $u_{on}$ can be defined by the scaling sequences $a_n$, $b_n$ and the value of $x$ satisfying the condition $-\log G(x) = 1$, or equivalently $n(1 - F(u_{on})) \to 1$. For ease of notation we will use hereafter $u_n$ instead of $u_{on}$ to denote the threshold sequence satisfying these conditions.

This sequence is immediately derived by direct calculations when $F$ is known. Consider as an example the case $F(x) = 1 - e^{-x}$. By continuity of $F$ we can choose $u_n(x) = F^{-1}\left(1 - \frac{\tau(x)}{n}\right)$ with $\tau(x) > 0$, and hence $u_n(x) = -\log \tau(x) + \log n$. Equation (2) is written as $P\{M_n \le -\log \tau(x) + \log n\} \to e^{-\tau(x)}$, and then $P\{M_n - \log n \le x\} \to e^{-e^{-x}}$, with $\tau(x) = e^{-x}$ for all $x \in \mathbb{R}$. The scaling constants are $a_n = 1$, $b_n = \log n$, and hence the threshold sequence is $u_n = \log n$, since $-\log G(0) = 1$. More examples can be found in Leadbetter, Lindgren, and Rootzén (1983).

In general, $F$ is unknown, and in this setting neither the theoretical derivation nor the direct comparison of different threshold choices is possible. This comparison is undertaken by analyzing the properties of the tail index estimator of $F$, as most of these estimators for $\xi$ are tied to a threshold choice. Therefore their biases and variances are influenced by the effect of the selection of $u_n$.
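The exponential example can be verified without simulation, because $P\{M_n - \log n \le x\} = F^n(x + \log n)$ is available in closed form. The following sketch (function names are illustrative) compares that exact probability with the Gumbel limit and evaluates $n(1 - F(u_n))$ at the threshold $u_n = \log n$.

```python
import math


def exp_survival(x: float) -> float:
    """1 - F(x) for the unit exponential F(x) = 1 - e^{-x}."""
    return math.exp(-x) if x > 0 else 1.0


def normalised_max_cdf(x: float, n: int) -> float:
    """Exact P{M_n - log n <= x} = F(x + log n)^n for i.i.d. unit exponentials."""
    return (1.0 - exp_survival(x + math.log(n))) ** n


def gumbel_cdf(x: float) -> float:
    """Type I limit G(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        # Columns: n, n(1 - F(log n)), exact probability at x = 0.5, Gumbel limit at x = 0.5.
        print(n, n * exp_survival(math.log(n)),
              normalised_max_cdf(0.5, n), gumbel_cdf(0.5))
```

The second column equals 1 up to rounding for every $n$, which is the condition $n(1 - F(u_n)) \to 1$ that defines the threshold, and the exact probability approaches the Gumbel value as $n$ grows.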
There is a large amount of literature on tail index estimation [chapter VI of Embrechts, Klüppelberg, and Mikosch (1997) gives an excellent review]. Among these estimators, the most popular are Hill's estimator (1975) and Pickands's estimator (1975). The former is given by

\[ \hat\xi_n^{(H)}(u_n) = \frac{1}{k} \sum_{i=n-k+1}^{n} \log \frac{x_{(i)}}{x_{(n-k)}}, \tag{15} \]

with $u_n = x_{(n-k)}$, $x_{(n-k+1)} \le \cdots \le x_{(n)}$ denoting the increasing order statistics and $k$ an integer value in $[1, n]$. Pickands's estimator for the tail index is

\[ \hat\xi_n^{(P)}(u_n) = \frac{1}{\log 2}\, \log \frac{x_{(n-k+1)} - x_{(n-2k+1)}}{x_{(n-2k+1)} - x_{(n-4k+1)}}, \tag{16} \]

and

\[ \hat\sigma_n^{(P)}(u_n) = \frac{x_{(n-2k+1)} - x_{(n-4k+1)}}{\int_0^{\log 2} e^{\hat\xi_n^{(P)}(x_{(n-4k+1)})\,t}\,dt}, \tag{17} \]

for the scale parameter, with $u_n = x_{(n-4k+1)}$ and $k = 1, \ldots, n/4$.

There are some features of both estimators that are worth mentioning. These estimators are heavily dependent on the threshold choice $u_n$, and both of them can be derived under the assumption that $F_{u_n}$ is exactly Pareto with parameter $\xi$ or generalized Pareto with parameters $\xi$ and $\sigma(u_n)$. Moreover, if $F_{u_n} = \mathrm{PD}_{\xi}$, Hill's estimator is the maximum-likelihood estimator of $\xi$, inheriting the corresponding asymptotic properties: consistency and asymptotic normality. This approach is only valid for regularly varying distribution functions, that is, $\xi > 0$; otherwise the asymptotic properties of this estimator vary according to $F$ [see Davis and Resnick (1984)]. Pickands's estimator for the tail index is obtained assuming $F_{u_n} = \mathrm{GPD}_{\xi,\sigma(u_n)}$ and taking the inverse of the parametric GPD. This estimator is consistent and also converges to a normal distribution, but it is very sensitive to the choice of $u_n$.

Alternatively, under the latter parametric assumption on $F_{u_n}$ we can obtain the maximum-likelihood estimators for the parameters $\xi$ and $\sigma(u_n)$ of the GPD. In this case there is no closed-form expression for the maximum-likelihood estimators of these parameters, and we have to rely on numerical procedures [see Press (1992)]. The maximum-likelihood estimator for the tail index is consistent and asymptotically normal for $\xi > -\frac{1}{2}$, as is discussed in Smith (1985).

The threshold selection is carried out by studying the mean-squared error of these $\xi$ estimators as $u_n$ varies. However, some explicit form is required for the distribution function $F$. Under the assumption

\[ 1 - F(x) = C x^{-1/\xi}\left[1 + D x^{-\beta} + o(x^{-\beta})\right], \tag{18} \]

where $\xi > 0$, $C > 0$, $\beta > 0$, and $D$ is a real number, Hall (1982) proposed estimators for the tail index based on an optimal choice of intermediate order statistics as candidates for the threshold sequence.
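Equations (15) to (17) translate directly into code once the 1-based order statistics are mapped to 0-based list indices. The sketch below is one possible reading of those formulas; the function names are illustrative, and the integral in Equation (17) is evaluated in closed form as $(2^{\hat\xi} - 1)/\hat\xi$ (or $\log 2$ when $\hat\xi = 0$).

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of xi from the k largest observations, threshold x_(n-k)."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = xs[n - k - 1]  # x_(n-k) in 1-based order-statistic notation
    if threshold <= 0:
        raise ValueError("the Hill estimator needs a positive threshold")
    return sum(math.log(x / threshold) for x in xs[n - k:]) / k


def pickands_estimator(sample, k):
    """Pickands estimates (xi, sigma) with threshold x_(n-4k+1)."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= 4 * k <= n:
        raise ValueError("k must satisfy 1 <= 4k <= n")
    top, mid, low = xs[n - k], xs[n - 2 * k], xs[n - 4 * k]
    xi = math.log((top - mid) / (mid - low)) / math.log(2)
    # Closed form of the integral of exp(xi * t) over [0, log 2].
    integral = math.log(2) if abs(xi) < 1e-12 else (2.0 ** xi - 1.0) / xi
    return xi, (mid - low) / integral


if __name__ == "__main__":
    random.seed(0)
    # Pareto sample with shape 2, that is, tail index xi = 0.5.
    data = [random.paretovariate(2.0) for _ in range(2000)]
    print(hill_estimator(data, 200), pickands_estimator(data, 100))
```

Both estimates change with `k`, which is the threshold dependence discussed above.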
Nevertheless, the pioneering work for threshold selection is Pickands (1975), where $F$ satisfies the regularity conditions of Result 2, but not necessarily Equation (18). The estimation of the tail index and the threshold selection are done in a single step. Pickands proposed as a candidate for the threshold the order statistic of a sample $\{x_n\}$ that minimizes the distance $d_\infty$ involving the distribution functions $F_{u_n,n}$ and $\mathrm{GPD}_{\hat\xi_n^{(P)}(u_n),\hat\sigma_n^{(P)}(u_n)}$. The empirical conditional excess distribution function $F_{u_n,n}(x)$ with $x > u_n$ is defined by

\[ F_{u_n,n}(x) = \frac{\sum_{i=1}^{n} 1_{\{u_n < x_i \le x\}}}{\sum_{j=1}^{n} 1_{\{x_j > u_n\}}}, \tag{19} \]

or equivalently, via the transformation $y = a_n(x - u_n) > 0$, by

\[ F_{u_n,n}(y) = \frac{\sum_{i=1}^{n} 1_{\{0 < y_i \le y\}}}{\sum_{j=1}^{n} 1_{\{y_j > 0\}}}. \tag{20} \]

The distance $d_\infty$ can be written as a function of a variable $u$, once $n$ is given, as

\[ d_\infty\left(F_{u,n}, \mathrm{GPD}_{\hat\xi_n^{(P)}(u),\hat\sigma_n^{(P)}(u)}\right) = \sup_{0 \le y < \infty} \left| F_{u,n}(y) - \mathrm{GPD}_{\hat\xi_n^{(P)}(u),\hat\sigma_n^{(P)}(u)}(y) \right|. \tag{21} \]

The optimal threshold is then

\[ u_n^{(P)} = \arg\min_{u}\ d_\infty\left(F_{u,n}, \mathrm{GPD}_{\hat\xi_n^{(P)}(u),\hat\sigma_n^{(P)}(u)}\right), \tag{22} \]

with $u$ taking values along the ordered sample $x_{(3n/4)} \le \cdots \le x_{(n)}$. More specifically, $u_n^{(P)} = x_{(n-k)}$ with $k \to \infty$, $n \to \infty$, and $k = o(n)$ to benefit from an increase in the sample size.

Alternatively we propose a version of the distance $d_\infty$ where the number of tail observations is weighted differently. This new approach accounts for the estimation pitfalls that derive from the lack of observations when $u$ gets close to $x_F$.

Definition 1 Let $F_{u,n}$ be the empirical version of $F_u$ and $\mathrm{GPD}_{\hat\xi_n^{(Ml)}(u),\hat\sigma_n^{(Ml)}(u)}$ the distribution function of the largest observations with parameters estimated by maximum likelihood (Ml). Define the weighted Pickands distance $d_{WP}$ as

\[ d_{WP}\left(F_{u,n}, \mathrm{GPD}_{\hat\xi_n^{(Ml)}(u),\hat\sigma_n^{(Ml)}(u)}\right) = k^{\varepsilon} \sup_{0 \le y < \infty} \left| F_{u,n}(y) - \mathrm{GPD}_{\hat\xi_n^{(Ml)}(u),\hat\sigma_n^{(Ml)}(u)}(y) \right|, \tag{23} \]

with $0 \le \varepsilon \le \frac{1}{2}$ and $k = \sum_{j=1}^{n} 1_{\{x_j > u\}}$.

The parameter $\varepsilon$ determines the weight assigned by the distance $d_{WP}$ to the tail observations defined by the corresponding $u$. Notice that this distance is the one used by Pickands when $\varepsilon = 0$, and the Kolmogorov-Smirnov (KS) statistic [Kolmogorov (1933)] when $\varepsilon = \frac{1}{2}$. The corresponding threshold choice is the order statistic that minimizes the distance,

\[ u_n^{(WP)} = \arg\min_{u}\ d_{WP}\left(F_{u,n}, \mathrm{GPD}_{\hat\xi_n^{(Ml)}(u),\hat\sigma_n^{(Ml)}(u)}\right), \tag{24} \]

with $u$ taking values along the ordered sample $x_{(1)} \le \cdots \le x_{(n)}$. The parameter $\varepsilon$ can be useful to study the effect of different weighting schemes in the threshold selection; however, this is far beyond the scope of this article, where we will only focus on the value $\varepsilon = \frac{1}{2}$ (KS statistic).

It is clear that threshold values far from $x_F$ produce biased estimates of the tail index. On the other hand, $u$ close to the right endpoint will result in inefficient estimates of $\xi$.
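The minimization in Equation (24) can be sketched as a scan over candidate order statistics. To stay within the Python standard library, the sketch below fits the Pareto tail $\mathrm{PD}_\xi$ with the Hill estimate instead of fitting the GPD by maximum likelihood, so it is a simplified heavy-tailed variant and not Definition 1 itself. The names `weighted_distance` and `select_threshold` and the lower bound `min_k` on the number of exceedances are assumptions made for illustration. The data are assumed to contain no ties.

```python
import math
import random


def weighted_distance(xs_sorted, k, eps=0.5):
    """Return (k^eps * sup|F_{u,n} - PD_xi|, xi_hat) for the threshold u = x_(n-k)."""
    n = len(xs_sorted)
    u = xs_sorted[n - k - 1]
    tail = xs_sorted[n - k:]  # the k observations above u
    xi = sum(math.log(x / u) for x in tail) / k  # Hill estimate of xi
    if xi <= 0:
        return math.inf, xi
    sup = 0.0
    for i, x in enumerate(tail, start=1):
        model = 1.0 - (x / u) ** (-1.0 / xi)
        # The empirical excess distribution jumps from (i-1)/k to i/k at x.
        sup = max(sup, abs(i / k - model), abs((i - 1) / k - model))
    return k ** eps * sup, xi


def select_threshold(sample, eps=0.5, min_k=10):
    """Scan the order statistics and keep the one with the smallest distance."""
    xs = sorted(sample)
    n = len(xs)
    best = None
    for k in range(min_k, n):
        u = xs[n - k - 1]
        if u <= 0:  # the Pareto tail needs a positive threshold
            break
        dist, xi = weighted_distance(xs, k, eps)
        if best is None or dist < best[3]:
            best = (u, k, xi, dist)
    return best  # (threshold, exceedances, tail index estimate, distance)


if __name__ == "__main__":
    random.seed(0)
    data = [random.paretovariate(2.0) for _ in range(500)]
    print(select_threshold(data))
```

Setting `eps=0.0` gives the unweighted distance used by Pickands, and `eps=0.5` gives the KS weighting the article focuses on.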
Goldie and Smith (1987) and Smith (1987) derive the asymptotic distribution functions of both the maximum-likelihood and Hill estimators of the tail index for a class of distribution functions such that $1 - F(x) = x^{-1/\xi} L(x)$, where $L(x)$ are slowly varying functions of different types. They also discuss in detail asymptotic bias and variance for these estimators and find that departures of $F$ from a Pareto distribution function lead to biased and inefficient estimates of the tail index for both estimators. As a result, a right choice of the threshold sequence turns out to be of critical importance in order to minimize the mean-squared error (MSE). Hall (1982) derives an analytical expression for the MSE of Hill's estimator when $F$ satisfies Equation (18).

All these results are achieved for determined classes of distribution functions. In contrast, under the regularity conditions of Result 2 it is not possible to derive analytically the MSE expression for the tail index estimator. Therefore we propose bootstrap confidence intervals in order to measure the bias and uncertainty of the different tail index estimators we considered. The naïve nonparametric bootstrap is consistent since the empirical distribution function $F_n$ is a consistent estimator of $F$ and $\sqrt{k}\,(\hat\xi_n^{(i)}(u_n^{(l)}) - \xi)$, $i = H, Ml, P$, and $l = P, WP, Ah$ (ad hoc), converges weakly to a normal distribution, with $k$ being the number of exceedances over $u_n$. Then the bootstrap approximation $J_n(x, F_n)$ to the true sampling distribution function $J_n(x, F)$ of this statistic can be used to produce confidence regions, at the $1 - \alpha$ level, in the following way,

\[ \xi \in \left[\hat\xi_n^{(i)}(u_n^{(l)}) - \frac{1}{\sqrt{k}}\, J_n^{-1}\!\left(1 - \frac{\alpha}{2}, F_n\right),\ \hat\xi_n^{(i)}(u_n^{(l)}) - \frac{1}{\sqrt{k}}\, J_n^{-1}\!\left(\frac{\alpha}{2}, F_n\right)\right], \tag{25} \]

where $J_n^{-1}(1 - \alpha, F_n)$ is the $1 - \alpha$ bootstrap quantile. To implement Equation (25), the bootstrap approximation is estimated by

\[ \hat J_n(x, F_n) = \frac{1}{B} \sum_{j=1}^{B} 1_{\left\{\sqrt{k}\,\left(\hat\xi_{j,n}^{(i)*}(u_{j,n}^{(l)*}) - \hat\xi_n^{(i)}(u_n^{(l)})\right) \le x\right\}}, \tag{26} \]

where $B$ is the number of bootstrap iterations, $\hat\xi_{j,n}^{(i)*}(u_{j,n}^{(l)*})$ the corresponding estimator for the bootstrap sample $j$, and $u_{j,n}^{(l)*}$ the corresponding threshold choice.

The finite sample performance of the different estimators is analyzed in Table 1. The threshold $u_n$ is chosen by both methods, Pickands and weighted Pickands with $\varepsilon = \frac{1}{2}$. To emphasize the importance of the threshold selection to estimating the tail index, an ad hoc threshold ($u_n^{(Ah)} = x_{(\ )}$) is also included in the analysis. The simulation experiment of Table 1 is done for different Student $t$-distributions, where the tail index $\xi$ is well approximated by the inverse of the degrees of freedom [see chapter III of Embrechts, Klüppelberg, and Mikosch (1997)].

Before discussing the results of Table 1 it is important to notice that although $F$ is known, we replace it with $F_n$ to calculate the bootstrap approximation $J_n(x, F_n)$. The reason for doing this is that the bootstrap procedure works even when $F$ is unknown and we only have a realization from the random sequence $\{X_n\}$.

Table 1 Bootstrap confidence intervals I.

Estimator                          $t_1$ ($\xi \approx 1$)   $t_5$ ($\xi \approx 0.2$)   $t_{10}$ ($\xi \approx 0.1$)   $t_{30}$ ($\xi \approx 0$)
$\hat\xi_n^{(Ml)}(u_n^{(WP)})$     [0.70, 1.69]              [-0.17, 0.24]               [-0.28, 0.39]                  [-0.43, 0.68]
$\hat\xi_n^{(P)}(u_n^{(P)})$       [0.29, 1.06]              [-0.39, 0.08]               [-0.63, 0.06]                  [-0.64, 0.17]
$\hat\xi_n^{(Ml)}(u_n^{(Ah)})$     [0.34, 1.75]              [0.19, 0.91]                [-0.26, 0.33]                  [-0.28, 0.57]

Bootstrap confidence intervals at a significance level $\alpha = 0.05$ for different estimators of the tail index: $\hat\xi_n^{(Ml)}(u_n)$ with $u_n$ estimated by $d_{WP}$ and by $u_n^{(Ah)} = x_{(\ )}$; and $\hat\xi_n^{(P)}(u_n^{(P)})$ with $u_n$ estimated by $d_\infty$. $B = 1000$ bootstrap samples of size $n = 1000$ are drawn from a single sequence generated from $t_\nu$, with $\nu = 1, 5, 10$, and $30$.

Table 2 Bootstrap confidence intervals II.

Estimator                          $t_1$ ($\xi \approx 1$)   $t_5$ ($\xi \approx 0.2$)   $t_{10}$ ($\xi \approx 0.1$)   $t_{30}$ ($\xi \approx 0$)
$\hat\xi_n^{(Ml)}(u_n^{(WP)})$     [0.70, 1.69]              [-0.17, 0.24]               [-0.28, 0.39]                  [-0.43, 0.68]
$\hat\xi_n^{(H)}(u_n^{(WP)})$      [0.82, 1.23]              [0.08, 0.37]                [-0.42, 0.23]                  [0.04, 0.20]

Bootstrap confidence intervals at a significance level $\alpha = 0.05$ for different estimators of the tail index when $u_n^{(WP)}$ is obtained from $\mathrm{GPD}_{\xi,\sigma(u)}$ and from $\mathrm{PD}_{\xi}$, respectively. Note $\hat\xi_n^{(Ml)}(u_n^{(WP)})$ is $\hat\xi_n^{(H)}(u_n^{(WP)})$ for the $\mathrm{PD}_{\xi}$ case. $B = 1000$ bootstrap samples of size $n = 1000$ are drawn from a single sequence generated from $t_\nu$, with $\nu = 1, 5, 10$, and $30$.

There are two clear results from Table 1: First, the confidence intervals for our estimator contain the true tail index, something that does not occur for Pickands's method; and second, the confidence intervals estimated from the ad hoc threshold are wider than the ones derived from our method when $\xi$ is significantly greater than zero.

Table 2 analyzes in more detail the advantages of the weighted Pickands method for selecting $u_n$ when the data come from heavy-tailed distributions. In this case the $\mathrm{GPD}_{\xi,\sigma(u)}$ is replaced by the $\mathrm{PD}_{\xi}$ in Definition 1 and Equation (24). From Table 2 we conclude that when we are dealing with heavy-tailed distributions ($\xi > 0$), our method is more efficient with PD than with GPD. These simulation results are in line with the theoretical findings derived in Smith (1987).

3 HYPOTHESIS TESTING

Different threshold choices define different sets of possible extreme values of a particular sequence $\{X_n\}$. In this article the observations exceeding a certain threshold are considered extreme values only if they are distributed as a $\mathrm{GPD}_{\xi,\sigma(u_n)}$, with $\xi$ the tail index of $F$.
In order to check this condition we propose a goodness-of-fit test for the following hypothesis:

$H_{n,0}$: the sample $\{(x_1 - u_n)^+, \ldots, (x_n - u_n)^+\}$ is distributed as $\mathrm{GPD}_{\xi,\sigma(u_n)}$

versus a general alternative of the form

$H_{n,1}$: the sample $\{(x_1 - u_n)^+, \ldots, (x_n - u_n)^+\}$ is not distributed as $\mathrm{GPD}_{\xi,\sigma(u_n)}$

with $u_n \in \mathbb{R}$, $\xi$ the tail index of $F$ and $(x)^+ = \max(x, 0)$. A natural goodness-of-fit test statistic is the KS statistic [for other goodness-of-fit criteria see Anderson and Darling (1952)],

\[ R_k(y; \xi, \sigma(u_n)) = \sqrt{k}\, \sup_{0 \le y < \infty} \left| P_k(y) - \mathrm{GPD}_{\xi,\sigma(u_n)}(y) \right|, \tag{27} \]

with $k = \sum_{j=1}^{n} 1_{\{x_j > u_n\}}$ and $P_k$ the empirical distribution function of the observations exceeding $u_n$. When the parameters are known, the asymptotic distribution of this test statistic is tabulated and the critical values can be derived. If the parameters are unknown, but consistently estimated, the bootstrap distribution function is a reliable approximation of the true sampling distribution of $R_k(y; \xi, \sigma(u_n))$. In this case it can be proved [see Romano (1988)] that the bootstrap critical values are consistent estimates of the actual ones.

Our interest, however, does not lie in the definition of the extreme values of a particular sequence $\{X_n\}$, but in the definition of the extreme values of any sequence of length $n$ with distribution function $F$. In this case a different hypothesis test is needed to determine whether the selected threshold is a good candidate to define the extremes of $F$ given the sample size. More formally, the testing problem under consideration is

$H_0$: $F_{u_n} = \mathrm{GPD}_{\xi,\sigma(u_n)}$

versus a general alternative

$H_1$: $F_{u_n} \ne \mathrm{GPD}_{\xi,\sigma(u_n)}$,

with $\xi$ being the tail index of $F$. Now we can formally define the set of extreme values of any sequence with distribution function $F$.

Definition 2 Let $\{X_n\}$ be any sequence of a distribution function $F$. The extreme values of any sequence of length $n$ from this distribution are given by the observations exceeding the threshold $u_n$, and satisfying $F_{u_n} = \mathrm{GPD}_{\xi,\sigma(u_n)}$.

The test statistic in this case is a version of the family of KS test statistics,

\[ T_n(y_n; \xi, \sigma(u_n)) = \sqrt{n}\, \sup_{0 \le y < \infty} \left| F_{u_n,n}(y) - \mathrm{GPD}_{\xi,\sigma(u_n)}(y) \right|, \tag{28} \]

with $y_i = (x_i - u_n)^+$, $i = 1, \ldots, n$. This statistic depends on $u_n$, $\xi$, and $\sigma(u_n)$. In order to derive the asymptotic distribution of Equation (28) and to assess the bootstrap approximation, the following results are required. Let

\[ U_\lambda(t) = \frac{P\{\lambda < T \le t\}}{P\{T > \lambda\}} \tag{29} \]

be the conditional excess distribution function, with parameter $\lambda$ on $[0, 1]$, of a uniform $[0, 1]$ random variable $T$. Its empirical counterpart

\[ U_{\lambda,n}(t) = \frac{\sum_{i=1}^{n} 1_{\{\lambda < t_i \le t\}}}{\sum_{j=1}^{n} 1_{\{t_j > \lambda\}}}, \tag{30} \]

with $t_1, \ldots, t_n$ and $t \in [0, 1]$, defines an empirical process $B_n(t) = \sqrt{n}\,(U_{\lambda,n}(t) - U_\lambda(t))$ similar to the uniform empirical process $\sqrt{n}\,(U_n(t) - U(t))$. It is well known that the latter converges weakly to the distribution of a mean-zero gaussian process $Z_U(\cdot)$ [see chapter V of Pollard (1984)]. By analogous reasoning, it is immediate to derive the probability law of the process $S_n(y) = \sqrt{n}\,(F_{u_n,n}(y) - F_{u_n}(y))$, where the threshold $u_n$ plays the role of the parameter $\lambda$.
Theorem 1 Consider a continuous and strictly increasing distribution function $F$ and a threshold $u_n$, with $u_n < x_F$. The empirical process $S_n(y)$ converges weakly to the distribution of a mean-zero gaussian process $Z_{F_{u_n}}(\cdot)$ with covariance function

\[ \mathrm{cov}\left(Z_{F_{u_n}}(y_1), Z_{F_{u_n}}(y_2)\right) = \frac{\left(F(\min(y_1, y_2)) - F(u_n)\right) - \left(F(y_1) - F(u_n)\right)\left(F(y_2) - F(u_n)\right)}{(1 - F(u_n))^2}, \tag{31} \]

with $y_1, y_2 \in \mathbb{R}$. Moreover, under the null hypothesis $H_0$, this empirical process takes the form $\sqrt{n}\,(F_{u_n,n}(y) - \mathrm{GPD}_{\xi,\sigma(u_n)}(y))$ and the covariance function becomes

\[ \mathrm{cov}\left(Z_{F_{u_n}}(y_1), Z_{F_{u_n}}(y_2)\right) = \frac{\mathrm{GPD}_{\xi,\sigma(u_n)}(\min(y_1, y_2))}{1 - F(u_n)} - \mathrm{GPD}_{\xi,\sigma(u_n)}(y_1)\,\mathrm{GPD}_{\xi,\sigma(u_n)}(y_2). \tag{32} \]

By the continuous mapping theorem, the limiting distribution function, denoted by $L(x, F)$, of the test statistic $T_n$ is the distribution of the supremum of a mean-zero gaussian process with the covariance function of Equation (32).

The proof is in the appendix.

In order to test $H_0$, we should be using the following rejection criterion:

\[ \left\{ T_n(y_n; \xi, \sigma(u_n)) > L_n^{-1}(1 - \alpha, F) \right\}, \tag{33} \]

where $L_n^{-1}(1 - \alpha, F)$ is the $1 - \alpha$ quantile of the exact finite sample distribution $L_n(x, F)$ of the statistic $T_n$. This distribution $L_n$ is clearly unknown, and in practice has to be approximated by the asymptotic distribution $L(x, F)$. This limiting distribution takes a complicated form and depends on the knowledge of $F$, on the parameters of the GPD, as well as on the threshold $u_n$. This dependence on nuisance parameters forces us to look for an alternative method to approximate the distribution $L_n(x, F)$.

3.1 Bootstrap Approximation

Let $L_n(x, Q_n)$ be the bootstrap distribution that approximates $L_n(x, F)$, and $L_n^{-1}(1 - \alpha, Q_n)$ the bootstrap quantile that approximates the corresponding finite sample distribution quantile $L_n^{-1}(1 - \alpha, F)$. In order for the bootstrap to be consistent, $Q_n$ has to satisfy certain conditions.

Lemma 1 Let $Q_n$ be an estimator of $F$ based on $\{x_1, \ldots, x_n\}$ that satisfies $\sup_{x \in \mathbb{R}} |Q_n(x) - F(x)| \to_p 0$ whenever $F \in H_0$, and let $L(x, F)$, the limiting distribution of the test statistic $T_n$, be continuous and strictly increasing. Then

\[ P\left\{ T_n > L_n^{-1}(1 - \alpha, Q_n) \right\} \to \alpha, \quad \text{as } n \to \infty. \tag{34} \]

The naïve nonparametric bootstrap from $Q_n = F_n$ fails to produce consistent estimates of a distribution function under $H_0$ if $F$ does not belong to the null. On the other hand, the parametric bootstrap from the $\mathrm{GPD}_{\xi,\sigma(u_n)}$ [see Equation (27)] fails to capture the structure of $F$ for the observations smaller than the threshold $u_n$.
To fulfill the conditions of Lemma 1 corresponding to $Q_n$ and therefore to solve the two previously mentioned problems, a semiparametric bootstrap methodology is introduced. Define

\[ Q_n(x) = \begin{cases} F_n(x) & x \le u_n, \\ \mathrm{GPD}_{\xi,\sigma(u_n)}(x - u_n) + F_n(u_n)\left(1 - \mathrm{GPD}_{\xi,\sigma(u_n)}(x - u_n)\right) & x > u_n. \end{cases} \tag{35} \]

This distribution function is derived from the conditional probability theorem, since

\[ P\{X \le x\} = P\{X \le u_n\}\,P\{X \le x \mid X \le u_n\} + P\{X > u_n\}\,P\{X \le x \mid X > u_n\}, \tag{36} \]

where $P\{X \le u_n\}$ is consistently approximated by $F_n(u_n)$, and under the null $P\{X \le x \mid X > u_n\} = \mathrm{GPD}_{\xi,\sigma(u_n)}(y)$ with $y = x - u_n$.

Denote by $\{x_n^*\}$ a bootstrap sample obtained from $Q_n$ and consider the transformed bootstrap sample $y_i^* = x_i^* - u_n$ with $i = 1, \ldots, n$. The value of the test statistic is $t_n^*(y_1^*, \ldots, y_n^*; \xi, \sigma(u_n))$ and for the sake of notation is denoted as $t_n^*(y_n^*; \xi, \sigma(u_n))$. The bootstrap approximation $L_n(x, Q_n)$ is then estimated by the empirical distribution of the $B$ (number of bootstrap samples) values of $T_n^*$,

\[ \hat L_n(x, Q_n) = \frac{1}{B} \sum_{j=1}^{B} 1_{\left\{ t_{n,j}^*(y_n^*; \xi, \sigma(u_n)) \le x \right\}}. \tag{37} \]

The $1 - \alpha$ quantile of $\hat L_n(x, Q_n)$ is the order statistic $t_{n,(\lceil (1-\alpha)B \rceil)}^*(y_n^*; \xi, \sigma(u_n))$ of the sequence $\{t_{n,j}^*(y_n^*; \xi, \sigma(u_n))\}$ of $B$ elements, where $\lceil x \rceil$ is the upper integer part of $x$. The rejection criterion of Equation (33) is replaced now by

\[ \left\{ T_n(y_n; \xi, \sigma(u_n)) > t_{n,(\lceil (1-\alpha)B \rceil)}^*(y_n^*; \xi, \sigma(u_n)) \right\}, \tag{38} \]

and hence for a sample $\{x_n\}$, the null hypothesis is rejected if $t_n(y_1, \ldots, y_n; \xi, \sigma(u_n))$ is in this rejection region. This means that the conditional excess distribution function defined by $u_n$ is not a $\mathrm{GPD}_{\xi,\sigma(u_n)}$, and according to our definition these candidates for extreme observations are not really extreme.

Recall that until now we have assumed the parameters to be known. Nevertheless this condition is rarely satisfied in practice.
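The known-parameter version of the procedure in Equations (35) to (38) can be sketched as follows. A draw from $Q_n$ is a mixture: with probability $F_n(u_n)$ it resamples an observation at or below the threshold, and otherwise it adds a GPD excess to $u_n$. All function names, the default of 200 bootstrap samples, and the convention used when a sample has no exceedances are assumptions of this sketch and are not specified in the article.

```python
import math
import random


def gpd_cdf(y, xi, sigma):
    """GPD distribution function of Equation (11), evaluated at an excess y."""
    if y <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    if base <= 0:  # past the upper endpoint, possible only when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def gpd_quantile(p, xi, sigma):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if xi == 0:
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


def ks_excess_statistic(sample, u, xi, sigma):
    """sqrt(n) * sup_y |F_{u,n}(y) - GPD(y)|, following Equation (28)."""
    n = len(sample)
    excesses = sorted(x - u for x in sample if x > u)
    k = len(excesses)
    if k == 0:  # convention of this sketch: no exceedances counts as distance 1
        return math.sqrt(n)
    sup = 0.0
    for i, y in enumerate(excesses, start=1):
        model = gpd_cdf(y, xi, sigma)
        sup = max(sup, abs(i / k - model), abs((i - 1) / k - model))
    return math.sqrt(n) * sup


def draw_from_q(sample, u, xi, sigma, rng):
    """One bootstrap sample from Q_n: empirical body below u, GPD tail above u."""
    body = [x for x in sample if x <= u]
    p_body = len(body) / len(sample)  # F_n(u)
    draws = []
    for _ in sample:
        if rng.random() < p_body:
            draws.append(rng.choice(body))
        else:
            draws.append(u + gpd_quantile(rng.random(), xi, sigma))
    return draws


def bootstrap_gpd_test(sample, u, xi, sigma, alpha=0.05, n_boot=200, seed=0):
    """Return (observed statistic, bootstrap critical value, reject flag)."""
    rng = random.Random(seed)
    t_obs = ks_excess_statistic(sample, u, xi, sigma)
    t_star = sorted(
        ks_excess_statistic(draw_from_q(sample, u, xi, sigma, rng), u, xi, sigma)
        for _ in range(n_boot)
    )
    critical = t_star[math.ceil((1.0 - alpha) * n_boot) - 1]
    return t_obs, critical, t_obs > critical


if __name__ == "__main__":
    rng = random.Random(1)
    # Body uniform on [0, 1), tail GPD(xi = 0.25, sigma = 1) above u = 1.
    data = [rng.random() for _ in range(300)]
    data += [1.0 + gpd_quantile(rng.random(), 0.25, 1.0) for _ in range(100)]
    print(bootstrap_gpd_test(data, u=1.0, xi=0.25, sigma=1.0))
```

In the demonstration the tail is generated from the GPD that is being tested, so the null hypothesis holds by construction; the printed flag reports whether this particular sample happens to fall in the rejection region.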
To make our test operational, we replace these parameters with their maximum-likelihood estimators, and instead of $Q_n$, we define its counterpart distribution function $\hat Q_n$:

\[ \hat Q_n(x) = \begin{cases} F_n(x) & x \le u_n, \\ \mathrm{GPD}_{\hat\xi^{(Ml)}(u_n),\hat\sigma^{(Ml)}(u_n)}(x - u_n) + F_n(u_n)\left(1 - \mathrm{GPD}_{\hat\xi^{(Ml)}(u_n),\hat\sigma^{(Ml)}(u_n)}(x - u_n)\right) & x > u_n. \end{cases} \tag{39} \]

Notice that the new bootstrap distribution function $L_n(x, \hat Q_n)$ boils down to $L_n(x, Q_n)$ for $x \le u_n$, and for $x > u_n$, the former $\sqrt{k}$-converges to the latter, where $k$ is the number of observations of the tail defined by $u_n$. Moreover, if $F$ belongs to the
Reassessig biases ad other ucertaities i seasurface temperature observatios measured i situ sice 85, part : measuremet ad samplig ucertaities J. J. Keedy, N. A. Rayer, R. O. Smith, D. E. Parker, ad M.
More informationTeaching Bayesian Reasoning in Less Than Two Hours
Joural of Experimetal Psychology: Geeral 21, Vol., No. 3, 4 Copyright 21 by the America Psychological Associatio, Ic. 963445/1/S5. DOI: 1.7//963445..3. Teachig Bayesia Reasoig i Less Tha Two Hours Peter
More informationCrowds: Anonymity for Web Transactions
Crowds: Aoymity for Web Trasactios Michael K. Reiter ad Aviel D. Rubi AT&T Labs Research I this paper we itroduce a system called Crowds for protectig users aoymity o the worldwideweb. Crowds, amed for
More informationType Less, Find More: Fast Autocompletion Search with a Succinct Index
Type Less, Fid More: Fast Autocompletio Search with a Succict Idex Holger Bast MaxPlackIstitut für Iformatik Saarbrücke, Germay bast@mpiif.mpg.de Igmar Weber MaxPlackIstitut für Iformatik Saarbrücke,
More informationA Kernel TwoSample Test
Joural of Machie Learig Research 3 0) 73773 Subitted 4/08; Revised /; Published 3/ Arthur Gretto MPI for Itelliget Systes Speastrasse 38 7076 Tübige, Geray A Kerel TwoSaple Test Karste M. Borgwardt Machie
More informationSignal Reconstruction from Noisy Random Projections
Sigal Recostructio from Noisy Radom Projectios Jarvis Haut ad Robert Nowak Deartmet of Electrical ad Comuter Egieerig Uiversity of WiscosiMadiso March, 005; Revised February, 006 Abstract Recet results
More informationTurning Brownfields into Greenspaces: Examining Incentives and Barriers to Revitalization
Turig Browfields ito Greespaces: Examiig Icetives ad Barriers to Revitalizatio Juha Siikamäki Resources for the Future Kris Werstedt Virgiia Tech Uiversity Abstract This study employs iterviews, documet
More informationare new doctors safe to practise?
Be prepared: are ew doctors safe to practise? Cotets What we foud 02 Why we ve writte this report 04 What is preparedess ad how ca it be measured? 06 How well prepared are medical graduates? 08 How has
More informationNo Eigenvalues Outside the Support of the Limiting Spectral Distribution of Large Dimensional Sample Covariance Matrices
No igevalues Outside the Support of the Limitig Spectral Distributio of Large Dimesioal Sample Covariace Matrices By Z.D. Bai ad Jack W. Silverstei 2 Natioal Uiversity of Sigapore ad North Carolia State
More informationSpinout Companies. A Researcher s Guide
Spiout Compaies A Researcher s Guide Cotets Itroductio 2 Sectio 1 Why create a spiout compay? 4 Sectio 2 Itellectual Property 10 Sectio 3 Compay Structure 15 Sectio 4 Shareholders ad Directors 19 Sectio
More informationON THE EVOLUTION OF RANDOM GRAPHS by P. ERDŐS and A. RÉNYI. Introduction
ON THE EVOLUTION OF RANDOM GRAPHS by P. ERDŐS ad A. RÉNYI Itroductio Dedicated to Professor P. Turá at his 50th birthday. Our aim is to study the probable structure of a radom graph r N which has give
More informationLeadership Can Be Learned, But How Is It Measured?
Maagemet Scieces for Health NO. 8 (2008) O C C A S I O N A L PA P E R S Leadership Ca Be Leared, But How Is It Measured? How does leadership developmet cotribute to measurable chages i orgaizatioal performace,
More informationBOUNDED GAPS BETWEEN PRIMES
BOUNDED GAPS BETWEEN PRIMES ANDREW GRANVILLE Abstract. Recetly, Yitag Zhag proved the existece of a fiite boud B such that there are ifiitely may pairs p, p of cosecutive primes for which p p B. This ca
More informationAdverse Health Care Events Reporting System: What have we learned?
Adverse Health Care Evets Reportig System: What have we leared? 5year REVIEW Jauary 2009 For More Iformatio: Miesota Departmet of Health Divisio of Health Policy P.O. Box 64882 85 East Seveth Place, Suite
More information4. Trees. 4.1 Basics. Definition: A graph having no cycles is said to be acyclic. A forest is an acyclic graph.
4. Trees Oe of the importat classes of graphs is the trees. The importace of trees is evidet from their applicatios i various areas, especially theoretical computer sciece ad molecular evolutio. 4.1 Basics
More informationDryad: Distributed DataParallel Programs from Sequential Building Blocks
Dryad: Distributed DataParallel Programs from Sequetial uildig locks Michael Isard Microsoft esearch, Silico Valley drew irrell Microsoft esearch, Silico Valley Mihai udiu Microsoft esearch, Silico Valley
More informationCatalogue no. 62557XPB Your Guide to the Consumer Price Index
Catalogue o. 62557XPB Your Guide to the Cosumer Price Idex (Texte fraçais au verso) Statistics Caada Statistique Caada Data i may forms Statistics Caada dissemiates data i a variety of forms. I additio
More informationNo One Benefits. How teacher pension systems are failing BOTH teachers and taxpayers
No Oe Beefits How teacher pesio systems are failig BOTH teachers ad taxpayers Authors Kathry M. Doherty, Sadi Jacobs ad Trisha M. Madde Pricipal Fudig The Bill ad Melida Gates Foudatio ad the Joyce Foudatio.
More informationNational Association of Community Health Centers
Natioal Associatio of Commuity Health Ceters Parterships betwee Federally Qualified Health Ceters ad Local Health Departmets for Egagig i the Developmet of a CommuityBased System of Care Prepared by Feldesma
More informationWhen the People Draw the Lines
Whe the People Draw the Lies A Examiatio of the Califoria Citizes redistrictig Commissio by Raphael J. Soeshei with Geerous Support from The James Irvie Foudatio Whe the People Draw the Lies A Examiatio
More information