Everything You Always Wanted to Know about Copula Modeling but Were Afraid to Ask


Christian Genest [1] and Anne-Catherine Favre [2]

Abstract: This paper presents an introduction to inference for copula models, based on rank methods. By working out in detail a small, fictitious numerical example, the writers exhibit the various steps involved in investigating the dependence between two random variables and in modeling it using copulas. Simple graphical tools and numerical techniques are presented for selecting an appropriate model, estimating its parameters, and checking its goodness-of-fit. A larger, realistic application of the methodology to hydrological data is then presented.

DOI: 10.1061/(ASCE)1084-0699(2007)12:4(347)

CE Database subject headings: Frequency analysis; Distribution functions; Risk management; Statistical models.

Introduction

Hydrological phenomena are often multidimensional and hence require the joint modeling of several random variables. Traditionally, the pairwise dependence between variables such as depth, volume, and duration of flows has been described using classical families of bivariate distributions. Perhaps the most common models occurring in this context are the bivariate normal, lognormal, gamma, and extreme-value distributions. The main limitation of this approach is that the individual behavior of the two variables (or transformations thereof) must then be characterized by the same parametric family of univariate distributions. Copula models, which avoid this restriction, are just beginning to make their way into the hydrological literature; see, e.g., De Michele and Salvadori (2002), Favre et al. (2004), Salvadori and De Michele (2004), and De Michele et al.

Restricting attention to the bivariate case for the sake of simplicity, the copula approach to dependence modeling is rooted in a representation theorem due to Sklar (1959). The latter states that the joint cumulative distribution function (c.d.f.) $H(x,y)$ of any pair $(X,Y)$ of continuous random variables may be written in the form

$$H(x,y) = C\{F(x), G(y)\}, \quad x, y \in \mathbb{R} \tag{1}$$

where $F(x)$ and $G(y)$ are the marginal distributions, and $C:[0,1]^2 \to [0,1]$ is a copula.
While Sklar (1959) showed that $C$, $F$, and $G$ are uniquely determined when $H$ is known, a valid model for $(X,Y)$ arises from Eq. (1) whenever the three ingredients are chosen from given parametric families of distributions, viz.

$$F \in (F_\delta), \quad G \in (G_\eta), \quad C \in (C_\theta)$$

Thus, for example, $F$ might be normal with bivariate parameter $\delta = (\mu, \sigma^2)$; $G$ might be gamma with parameter $\eta = (\alpha, \lambda)$; and $C$ might be taken from the Farlie-Gumbel-Morgenstern family of copulas, defined for each $\theta \in [-1,1]$ by

$$C_\theta(u,v) = uv + \theta uv(1-u)(1-v), \quad u, v \in [0,1] \tag{2}$$

The main advantage provided to the hydrologist by this approach is that the selection of an appropriate model for the dependence between $X$ and $Y$, represented by the copula, can then proceed independently from the choice of the marginal distributions.

[1] Professor, Dépt. de mathématiques et de statistique, Univ. Laval, Québec QC, Canada G1K 7P4.
[2] Professor, Chaire en Hydrologie Statistique, INRS, Eau, Terre et Environnement, Québec QC, Canada G1K 9A9.
Note. Discussion open until December [date missing]. Separate discussions must be submitted for individual papers. To extend the closing date by one month, a written request must be filed with the ASCE Managing Editor. The manuscript for this paper was submitted for review and possible publication on August 29, 2006; approved on August 29, [year missing]. This paper is part of the Journal of Hydrologic Engineering, Vol. 12, No. 4, July [date missing]. ASCE, ISSN [garbled in source].

For an introduction to the theory of copulas and a large selection of related models, the reader may refer, e.g., to the monographs by Joe (1997) and Nelsen (1999), or to reviews such as Frees and Valdez (1998) and Cherubini et al. (2004), in which actuarial and financial applications are considered. While the theoretical properties of these objects are now fairly well understood, inference for copula models is, to an extent, still under development. The literature on the subject is yet to be collated, and most of it is not written with the end user in mind, making it difficult to decipher except for the most mathematically inclined.
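The construction in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. The function names, the choice of a standard normal margin for $F$, and the exponential margin for $G$ (a gamma distribution with shape 1) are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def fgm_copula(u, v, theta):
    """Farlie-Gumbel-Morgenstern copula C_theta(u, v) of Eq. (2)."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    return u * v + theta * u * v * (1.0 - u) * (1.0 - v)

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal c.d.f., written with the error function (assumed margin F)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def exponential_cdf(y, rate=1.0):
    """Exponential c.d.f., i.e. gamma with shape 1 (assumed margin G)."""
    return 1.0 - math.exp(-rate * y) if y > 0.0 else 0.0

def joint_cdf(x, y, theta):
    """H(x, y) = C_theta{F(x), G(y)}, as in Eq. (1)."""
    return fgm_copula(normal_cdf(x), exponential_cdf(y), theta)

# With theta = 0 the copula is the independence copula, so H = F * G.
print(joint_cdf(0.0, 1.0, 0.0), normal_cdf(0.0) * exponential_cdf(1.0))
# A positive theta increases the joint probability at this point.
print(joint_cdf(0.0, 1.0, 0.5))
```

Changing either margin leaves `fgm_copula` untouched, which is the separation between dependence and marginal behavior that the text describes.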
The aim of this paper is to present, in the simplest terms possible, the successive steps required to build a copula model for hydrological purposes. To this end, a fictitious data set of very small size will be used to illustrate the diagnostic and inferential tools currently available. Although intuition will be given for the various techniques to be presented, emphasis will be put on their implementation, rather than on their theoretical foundation. Therefore, computations will be presented in more detail than usual, at the expense of exhaustive mathematical exposition, for which the reader will only be given appropriate references.

The pedagogical data set to be used throughout the paper is introduced in the Dependence and Ranks section, where it will be explained why statistical inference concerning dependence structures should always be based on ranks. This will lead, in the Measuring Dependence section, to the description of classical nonparametric measures of dependence and tests of independence. Exploratory tools for uncovering dependence and measuring it will be reviewed in the Additional Graphical Tools for Detecting Dependence section. Point and interval estimation for dependence parameters from copula models will then be presented in the Estimation section. Recent goodness-of-fit techniques will be illustrated in the Goodness-of-Fit Tests section. The Application section will discuss in detail a concrete hydrological implementation of this methodology. This will lead to the consideration of additional tools for the treatment of extreme-value dependence structures in the Graphical Diagnostics for Bivariate Extreme-Value Copulas section. Final remarks will then be made in the Conclusion section.

JOURNAL OF HYDROLOGIC ENGINEERING, ASCE / JULY/AUGUST 2007 / 347

Dependence and Ranks

Suppose that a random sample $(X_1,Y_1),\ldots,(X_n,Y_n)$ is given from some pair $(X,Y)$ of continuous variables, and that it is desired to identify the bivariate distribution $H(x,y)$ that characterizes their joint behavior. In view of Sklar's representation theorem, there exists a unique copula $C$ for which identity, Eq. (1), holds. Therefore, just as $F(x)$ and $G(y)$ give an exhaustive description of $X$ and $Y$ taken separately, the joint dependence between these variables is fully and uniquely characterized by $C$. It is easy to see, for example, that $X$ and $Y$ are stochastically independent if and only if $C = \Pi$, where $\Pi(u,v) = uv$ for all $u, v \in [0,1]$.

At the other extreme, it can also be shown that in order for $Y$ to be a deterministic function of $X$, $C$ must be either one of the two copulas

$$W(u,v) = \max(0, u+v-1) \quad \text{or} \quad M(u,v) = \min(u,v)$$

which are usually referred to as the Fréchet-Hoeffding bounds in the statistical literature; see, e.g., Fréchet (1951) or Nelsen (1999, p. 9). When $C = W$, $Y$ is a decreasing function of $X$, while $Y$ is monotone increasing in $X$ when $C = M$. More generally, any copula $C$ represents a model of dependence that lies somewhere between these two extremes, a fact that translates into the inequalities

$$W(u,v) \le C(u,v) \le M(u,v), \quad u, v \in [0,1]$$

To get a feeling for the dependence between $X$ and $Y$, it is traditional to look at the scatter plot of the pairs $(X_1,Y_1),\ldots,(X_n,Y_n)$. Such a representation is given in Fig. 1(a) for the following fictitious random sample of size $n = 6$ from the bivariate standard normal distribution with zero correlation. This example will be used for illustration purposes throughout the paper.

Fig. 1. (a) Conventional scatter plot of the pairs $(X_i, Y_i)$; (b) corresponding scatter plot of the pairs $(Z_i, T_i) = (e^{X_i}, e^{3Y_i})$

Learning Data Set

Table 1 shows six independent pairs of mutually independent observations $X_i$, $Y_i$ generated from the standard $N(0,1)$ distribution using the statistical freeware R (R Development Core Team [year missing]). For simplicity, and without loss of generality, the pairs were labeled in such a way that $X_1 < \cdots < X_6$.

Table 1. Learning Data Set (columns: $i$, $X_i$, $Y_i$; the numerical entries are missing from this transcription)

While there is nothing fundamentally wrong with looking at the pattern of the pairs $(X_i, Y_i)$, for example, to look for linear association, it must be realized that this picture does not only incorporate information about the dependence between $X$ and $Y$, but also about their marginal behavior. To drive this point home, consider the transformed pairs

$$Z_i = \exp(X_i), \quad T_i = \exp(3Y_i), \quad 1 \le i \le 6$$

whose scatter plot, shown in Fig. 1(b), is drastically different from the original one. In effect, both pictures are distortions of the dependence between the pairs $(X,Y)$ and $(Z,T)$, which is characterized by the same copula, $C$, whatever it may be.

More generally, if $\phi$ and $\psi$ are two increasing transformations with inverses $\phi^{-1}$ and $\psi^{-1}$, then the copula of the pair $(Z,T)$ with $Z = \phi(X)$ and $T = \psi(Y)$ is the same as that of $(X,Y)$. To see this, let

$$H^*(z,t) = C^*\{F^*(z), G^*(t)\} \tag{3}$$

be the Sklar representation of the joint distribution of the pair $(Z,T)$. Since the marginal distributions of $Z$ and $T$ are given by

$$F^*(z) = P(Z \le z) = P\{X \le \phi^{-1}(z)\} = F\{\phi^{-1}(z)\}$$

and

$$G^*(t) = P(T \le t) = P\{Y \le \psi^{-1}(t)\} = G\{\psi^{-1}(t)\}$$

one has

$$H^*(z,t) = P(Z \le z, T \le t) = P\{X \le \phi^{-1}(z), Y \le \psi^{-1}(t)\} = H\{\phi^{-1}(z), \psi^{-1}(t)\} = C[F\{\phi^{-1}(z)\}, G\{\psi^{-1}(t)\}] = C\{F^*(z), G^*(t)\} \tag{4}$$

for all choices of $z, t \in \mathbb{R}$. It follows at once from the comparison of Eqs. (3) and (4) that $C^* = C$.

Expressed in different terms, the above development means that the unique copula associated with a random pair $(X,Y)$ is invariant by monotone increasing transformations of the marginals. Since the dependence between $X$ and $Y$ is characterized by this copula, a faithful graphical representation of dependence should exhibit the same invariance property.

Among functions of the data that meet this requirement, it can be seen easily that the pairs of ranks $(R_1,S_1),\ldots,(R_n,S_n)$ associated with the sample are the statistics that retain the greatest amount of information; see, e.g., Oakes (1982). Here, $R_i$ stands for the rank of $X_i$ among $X_1,\ldots,X_n$, and $S_i$ stands for the rank of $Y_i$ among $Y_1,\ldots,Y_n$. These ranks are unambiguously defined, because ties occur with probability zero under the assumption of continuity for $X$ and $Y$. Pairs of ranks corresponding to the learning data set are given in Table 2.

Table 2. Ranks for the Learning Data Set of Table 1 (columns: $i$, $R_i$, $S_i$; the numerical entries are missing from this transcription)

Displayed in Fig. 2(a) is the scatter plot of the pairs $(R_i, S_i)$ corresponding to these $(X_i, Y_i)$. Fig. 2(b) shows the graph of the pairs $(R_i^*, S_i^*)$ associated with the $(Z_i, T_i)$. The result is obviously the same. It is the most judicious representation of the copula $C$ that one could hope for.

Upon rescaling of the axes by a factor of $1/(n+1)$, one gets a set of points in the unit square $[0,1]^2$, which form the domain of the so-called empirical copula (Deheuvels 1979), formally defined by

$$C_n(u,v) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}\left(\frac{R_i}{n+1} \le u, \frac{S_i}{n+1} \le v\right)$$

with $\mathbf{1}(A)$ denoting the indicator function of set $A$. For any given pair $(u,v)$, it may be shown that $C_n(u,v)$ is a rank-based estimator of the unknown quantity $C(u,v)$ whose large-sample distribution is centered at $C(u,v)$ and normal.

Measuring Dependence

It was argued above that the empirical copula $C_n$ is the best sample-based representation of the copula $C$, which is itself a characterization of the dependence in a pair $(X,Y)$.
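The invariance of the ranks and the definition of $C_n$ can be sketched in a few lines of Python. Because the entries of Tables 1 and 2 did not survive in this transcription, the sample below is hypothetical; the helper names `ranks` and `empirical_copula` are likewise assumptions made for illustration.

```python
import math

def ranks(values):
    """Rank of each value within its sample (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def empirical_copula(u, v, R, S):
    """C_n(u, v): share of rescaled rank pairs falling in [0, u] x [0, v]."""
    n = len(R)
    return sum(1 for r, s in zip(R, S)
               if r / (n + 1) <= u and s / (n + 1) <= v) / n

# Hypothetical sample of size 6, sorted by x (NOT the entries of Table 1).
x = [-1.3, -0.6, -0.2, 0.1, 0.7, 1.5]
y = [-0.4, 0.9, 0.0, 1.2, -1.1, 0.3]
R, S = ranks(x), ranks(y)

# Increasing transformations, as in Z = exp(X), T = exp(3Y).
z = [math.exp(a) for a in x]
t = [math.exp(3 * b) for b in y]

print(R, S)                              # [1, 2, 3, 4, 5, 6] [2, 5, 3, 6, 1, 4]
print(ranks(z) == R and ranks(t) == S)   # True: the ranks are unchanged
print(empirical_copula(0.5, 0.5, R, S))  # 2 of the 6 points, i.e. 1/3
```

The scatter plots of $(x, y)$ and $(z, t)$ would look very different, yet every rank-based quantity computed from them coincides.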
It would make sense, therefore, to measure dependence, both empirically and theoretically, using $C_n$ and $C$, respectively. It will now be explained how this leads to two well-known nonparametric measures of dependence, namely Spearman's rho and Kendall's tau.

Fig. 2. Displayed in (a) is a scatter plot of the pairs $(R_i, S_i)$ of ranks derived from the learning data set $(X_i, Y_i)$, $1 \le i \le 6$. As for (b), it shows a scatter plot of the pairs $(R_i^*, S_i^*)$ of ranks derived from the transformed data $(Z_i, T_i) = (\exp(X_i), \exp(3Y_i))$, $1 \le i \le 6$. For obvious reasons, the two graphs are actually identical.

Spearman's Rho

Mimicking the familiar approach of Pearson to the measurement of dependence, a natural idea is to compute the correlation between the pairs $(R_i, S_i)$ of ranks, or equivalently between the points $(R_i/(n+1), S_i/(n+1))$ forming the support of $C_n$. This leads directly to Spearman's rho, viz.

$$\rho_n = \frac{\sum_{i=1}^n (R_i - \bar R)(S_i - \bar S)}{\sqrt{\sum_{i=1}^n (R_i - \bar R)^2 \sum_{i=1}^n (S_i - \bar S)^2}} \in [-1, 1]$$

where

$$\bar R = \frac{1}{n}\sum_{i=1}^n R_i = \frac{n+1}{2} = \frac{1}{n}\sum_{i=1}^n S_i = \bar S$$

This coefficient, which may be expressed more conveniently in the form

$$\rho_n = \frac{12}{n(n+1)(n-1)}\sum_{i=1}^n R_i S_i - 3\,\frac{n+1}{n-1}$$

shares with Pearson's classical correlation coefficient, $r_n$, the property that its expectation vanishes when the variables are independent. However, $\rho_n$ is theoretically far superior to $r_n$, in that
1. $E(\rho_n) = \pm 1$ occurs if and only if $X$ and $Y$ are functionally dependent, i.e., whenever their underlying copula is one of the two Fréchet-Hoeffding bounds, $M$ or $W$;
2. In contrast, $E(r_n) = \pm 1$ if and only if $X$ and $Y$ are linear functions of one another, which is much more restrictive; and
3. $\rho_n$ estimates a population parameter that is always well defined, whereas there are heavy-tailed distributions (such as the Cauchy, for example) for which a theoretical value of Pearson's correlation does not exist.
For additional discussion on these points, see, e.g., Embrechts et al. [year missing].

As it turns out, $\rho_n$ is an asymptotically unbiased estimator of

$$\rho = 12\int_{[0,1]^2} uv\,dC(u,v) - 3 = 12\int_{[0,1]^2} C(u,v)\,dv\,du - 3$$

where the second equality is an identity originally proven by Hoeffding (1940) and extended by Quesada-Molina (1992). To show this, one may use the fact that

$$12\int_{[0,1]^2} uv\,dC_n(u,v) - 3 = \frac{12}{n}\sum_{i=1}^n \frac{R_i}{n+1}\,\frac{S_i}{n+1} - 3 = \frac{n-1}{n+1}\,\rho_n$$

and that $C_n \to C$ as $n \to \infty$. For more precise conditions under which this result holds, see, e.g., Hoeffding (1948). Note in passing that under the null hypothesis $H_0: C = \Pi$ of independence between $X$ and $Y$, the distribution of $\rho_n$ is close to normal with zero mean and variance $1/(n-1)$, so that one may reject $H_0$ at approximate level $\alpha = 5\%$, for instance, if $\sqrt{n-1}\,|\rho_n| > z_{\alpha/2} = 1.96$.

Example

For the observations from the learning data set, a simple calculation yields $\rho_n = 1/35 \approx 0.029$, while $r_n =$ [value missing]. Here, there is no reason to reject the null hypothesis of independence. For, if $Z$ is a standard normal random variable, the P-value of the test based on $\rho_n$ is $2\Pr(Z > \sqrt{5}/35) = 94.9\%$.

Given a family $(C_\theta)$ of copulas indexed by a real parameter, the theoretical value of $\rho$ is, typically, a monotone increasing function of $\theta$. A sufficient condition for this is that the copulas be ordered by positive quadrant dependence (PQD), which means that the implication $\theta < \theta' \Rightarrow C_\theta(u,v) \le C_{\theta'}(u,v)$ is true for all $u, v \in [0,1]$.
The original definition of PQD as a concept of dependence goes back to Lehmann (1966); the same ordering, rediscovered by Dhaene and Goovaerts (1996) in an actuarial context, is often referred to as the correlation or concordance ordering in that field.

In the Farlie-Gumbel-Morgenstern model, for example, one has

$$\int_{[0,1]^2} uv\,dC_\theta(u,v) = \int_0^1\!\!\int_0^1 uv\,c_\theta(u,v)\,dv\,du$$

where

$$c_\theta(u,v) = \frac{\partial^2}{\partial u\,\partial v} C_\theta(u,v) = 1 + \theta(1-2u)(1-2v)$$

since $C_\theta$ is absolutely continuous in this case. A simple calculation then yields

$$\int_0^1\!\!\int_0^1 uv\,c_\theta(u,v)\,dv\,du = \frac{1}{4} + \frac{\theta}{36}$$

and, hence, $\rho = \theta/3$, as initially shown by Schucany et al. [year missing].

As a second example, if $(X,Y)$ follows a bivariate normal distribution with correlation $r$, a somewhat intricate calculation to be found, e.g., in Kruskal (1958), shows that

$$\rho = 12\int_{\mathbb{R}^2} F(x)G(y)\,dH(x,y) - 3 = \frac{6}{\pi}\arcsin\left(\frac{r}{2}\right)$$

For those people accustomed to thinking in terms of $r$, the above formula may suggest that a serious effort would be required to think of correlation in terms of Spearman's rho in the traditional bivariate normal model. As shown in Fig. 3(a), however, the difference between $\rho$ and $r$ is minimal in this context.

Fig. 3. Spearman's rho (a) and Kendall's tau (b) as a function of Pearson's correlation in the bivariate normal model
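The computations of this subsection can be retraced with a short Python sketch. The ranks used below are hypothetical: they were chosen so that they reproduce the value $\rho_n = 1/35$ quoted in the example, but they are not the entries of Table 2, which are missing from this transcription. All function names are assumptions.

```python
import math

def spearman_rho(R, S):
    """Correlation between the ranks (the defining formula for rho_n)."""
    n = len(R)
    mean = (n + 1) / 2
    num = sum((r - mean) * (s - mean) for r, s in zip(R, S))
    den = math.sqrt(sum((r - mean) ** 2 for r in R)
                    * sum((s - mean) ** 2 for s in S))
    return num / den

def spearman_rho_short(R, S):
    """Equivalent shortcut: 12 sum(R_i S_i) / {n(n+1)(n-1)} - 3(n+1)/(n-1)."""
    n = len(R)
    cross = sum(r * s for r, s in zip(R, S))
    return 12 * cross / (n * (n + 1) * (n - 1)) - 3 * (n + 1) / (n - 1)

def two_sided_p_value(z):
    """2 P(Z > |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))

def rho_fgm_numeric(theta, m=200):
    """12 * (double integral of u v c_theta(u, v)) - 3, by the midpoint rule."""
    h = 1.0 / m
    total = 0.0
    for a in range(m):
        u = (a + 0.5) * h
        for b in range(m):
            v = (b + 0.5) * h
            total += u * v * (1 + theta * (1 - 2 * u) * (1 - 2 * v))
    return 12 * total * h * h - 3

def rho_normal(r):
    """Population Spearman's rho under the bivariate normal model."""
    return 6 / math.pi * math.asin(r / 2)

R = [1, 2, 3, 4, 5, 6]
S = [2, 5, 3, 6, 1, 4]          # hypothetical ranks
rho_n = spearman_rho(R, S)      # 1/35, about 0.0286
p_value = two_sided_p_value(math.sqrt(len(R) - 1) * rho_n)  # about 0.949
print(rho_n, spearman_rho_short(R, S), p_value)
print(rho_fgm_numeric(0.6), 0.6 / 3)  # numerical check of rho = theta / 3
```

The midpoint rule is used here only as a crude numerical check of the closed form $\rho = \theta/3$; any quadrature routine would do.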

Kendall's Tau

A second, well-known measure of dependence based on ranks is Kendall's tau, whose empirical version is given by

$$\tau_n = \frac{P_n - Q_n}{\binom{n}{2}} = \frac{4}{n(n-1)}\,P_n - 1 \tag{5}$$

where $P_n$ and $Q_n$ are the number of concordant and discordant pairs, respectively. Here, two pairs $(X_i, Y_i)$, $(X_j, Y_j)$ are said to be concordant when $(X_i - X_j)(Y_i - Y_j) > 0$, and discordant when $(X_i - X_j)(Y_i - Y_j) < 0$. One need not worry about ties, since the borderline case $(X_i - X_j)(Y_i - Y_j) = 0$ occurs with probability zero under the assumption that $X$ and $Y$ are continuous. The characteristic patterns of concordant and discordant pairs are displayed in Fig. 4.

Fig. 4. Two pairs of concordant (a) and discordant (b) observations

It is obvious that $\tau_n$ is a function of the ranks of the observations only, since $(X_i - X_j)(Y_i - Y_j) > 0$ if and only if $(R_i - R_j)(S_i - S_j) > 0$. Accordingly, $\tau_n$ is also a function of $C_n$. To make the connection, introduce

$$I_{ij} = \begin{cases} 1 & \text{if } X_j < X_i,\ Y_j < Y_i \\ 0 & \text{otherwise} \end{cases}$$

for arbitrary $i \ne j$, and let $I_{ii} = 1$ for all $i \in \{1,\ldots,n\}$. Observe that

$$P_n = \frac{1}{2}\sum_{i=1}^n \sum_{j \ne i} (I_{ij} + I_{ji}) = \sum_{i=1}^n \sum_{j \ne i} I_{ij} = -n + \sum_{i=1}^n \sum_{j=1}^n I_{ij}$$

since $I_{ij} + I_{ji} = 1$ if and only if the pairs $(X_i, Y_i)$ and $(X_j, Y_j)$ are concordant. Now write

$$W_i = \frac{1}{n}\sum_{j=1}^n I_{ij} = \frac{1}{n}\,\#\{j : X_j \le X_i,\ Y_j \le Y_i\}$$

so that if $\bar W = (W_1 + \cdots + W_n)/n$, then $P_n = -n + n^2 \bar W$ and

$$\tau_n = 4\,\frac{n}{n-1}\,\bar W - \frac{n+3}{n-1} \tag{6}$$

The connection with $C_n$ then comes from the fact that by definition

$$W_i = C_n\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right)$$

hence

$$\bar W = \int_{[0,1]^2} C_n(u,v)\,dC_n(u,v)$$

Using Eq. (6) and the fact that under suitable regularity conditions, $C_n \to C$ as $n \to \infty$, one can conclude with Hoeffding (1948) that $\tau_n$ is an asymptotically unbiased estimator of the population version of Kendall's tau, given by

$$\tau = 4\int_{[0,1]^2} C(u,v)\,dC(u,v) - 1$$

An alternative test of independence can be based on $\tau_n$, since under $H_0$, this statistic is close to normal with zero mean and variance $2(2n+5)/\{9n(n-1)\}$. Thus, $H_0$ would be rejected at approximate level $\alpha = 5\%$ if

$$\sqrt{\frac{9n(n-1)}{2(2n+5)}}\;|\tau_n| > 1.96$$

Example (Continued)

For the observations from the learning data set, a simple calculation yields $\tau_n = 1/15 \approx 0.067$. Here, there is no reason to reject the null hypothesis of independence. For, if $Z$ is a standard normal random variable, the P-value of the test based on $\tau_n$ is $2\Pr(Z > 0.188) = 85.1\%$.
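A minimal Python sketch of the two routes to $\tau_n$, by counting concordant pairs as in Eq. (5) and through $\bar W$ as in Eq. (6), is given below. The ranks are hypothetical (Table 2 is missing from this transcription) but were chosen to reproduce the value $\tau_n = 1/15$ of the example; the function names are assumptions.

```python
import math
from itertools import combinations

def kendall_tau(x, y):
    """Eq. (5): (concordant - discordant) / (n choose 2); assumes no ties."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

def kendall_tau_from_w(x, y):
    """Eq. (6): tau_n = 4 n / (n - 1) * W_bar - (n + 3) / (n - 1)."""
    n = len(x)
    w = [sum(1 for j in range(n) if x[j] <= x[i] and y[j] <= y[i]) / n
         for i in range(n)]
    w_bar = sum(w) / n
    return 4 * n / (n - 1) * w_bar - (n + 3) / (n - 1)

def independence_p_value(tau, n):
    """Two-sided P-value from the normal approximation under H0."""
    z = math.sqrt(9 * n * (n - 1) / (2 * (2 * n + 5))) * abs(tau)
    return math.erfc(z / math.sqrt(2))

R = [1, 2, 3, 4, 5, 6]
S = [2, 5, 3, 6, 1, 4]                    # hypothetical ranks
tau_n = kendall_tau(R, S)                 # 8 concordant, 7 discordant: 1/15
print(tau_n, kendall_tau_from_w(R, S))
print(independence_p_value(tau_n, 6))     # about 0.851
```

Since only the signs of differences matter, the function can be fed either the raw observations or their ranks.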
As for Spearman's rho, the theoretical value of Kendall's tau is a monotone increasing function of the real parameter $\theta$ whenever a family $(C_\theta)$ of copulas is ordered by positive quadrant dependence. In the Farlie-Gumbel-Morgenstern model, for example, one has

$$\int_{[0,1]^2} C_\theta(u,v)\,dC_\theta(u,v) = \int_0^1\!\!\int_0^1 C_\theta(u,v)\,c_\theta(u,v)\,dv\,du$$

which reduces to $\theta/18 + 1/4$, hence $\tau = 2\theta/9$, as per Schucany et al. [year missing]. For the bivariate normal model with correlation $r$, Kruskal (1958) has shown that

$$\tau = 4\int_{\mathbb{R}^2} H(x,y)\,dH(x,y) - 1 = \frac{2}{\pi}\arcsin(r)$$

As shown in Fig. 3(b), $\tau$ is also nearly a linear function of $r$ in this special case.

Other Measures and Tests of Dependence

Although Spearman's rho and Kendall's tau are the two most common statistics with which dependence is measured and tested, many alternative rank-based procedures have been proposed in the statistical literature. Most of them are based on expressions of the form

$$\int_{[0,1]^2} J(u,v)\,dC_n(u,v)$$

where $J$ is some suitably regular score function. Thus, while $J(u,v) = uv$ is the basis of Spearman's statistic, as seen earlier, the choice $J(u,v) = \Phi^{-1}(u)\,\Phi^{-1}(v)$, e.g., yields the van der Waerden statistic, $\Phi^{-1}$ being the standard normal quantile function. Genest and Verret (2005), who review this literature, explain how each $J$ should be chosen so as to yield the most powerful testing procedure against a specific class of copula alternatives.

In the absence of privileged information about the suspected departure from independence, however, omnibus procedures such as those based on $\rho_n$ and $\tau_n$ usually perform well. See Deheuvels (1981) or Genest and Rémillard (2004) for other general tests based on the empirical copula process $\mathbb{C}_n = \sqrt{n}\,(C_n - C)$.

Additional Graphical Tools for Detecting Dependence

Besides the scatter plot of ranks, two graphical tools for detecting dependence have recently been proposed in the literature, namely, chi-plots and K-plots. These will be briefly described in turn.

Chi-Plots

Chi-plots were originally proposed by Fisher and Switzer (1985) and more fully illustrated in Fisher and Switzer (2001). Their construction is inspired from control charts and based on the chi-square statistic for independence in a two-way table. Specifically, introduce

$$H_i = \frac{1}{n-1}\,\#\{j \ne i : X_j \le X_i,\ Y_j \le Y_i\} = \frac{nW_i - 1}{n-1}$$

$$F_i = \frac{1}{n-1}\,\#\{j \ne i : X_j \le X_i\}$$

and

$$G_i = \frac{1}{n-1}\,\#\{j \ne i : Y_j \le Y_i\}$$

Noting that these quantities depend exclusively on the ranks of the observations, Fisher and Switzer propose to plot the pairs $(\lambda_i, \chi_i)$, where

$$\chi_i = \frac{H_i - F_i G_i}{\sqrt{F_i(1-F_i)\,G_i(1-G_i)}}$$

and

$$\lambda_i = 4\,\mathrm{sign}(\tilde F_i \tilde G_i)\,\max(\tilde F_i^2, \tilde G_i^2)$$

where $\tilde F_i = F_i - 1/2$, $\tilde G_i = G_i - 1/2$ for $i \in \{1,\ldots,n\}$. To avoid outliers, they recommend that what should be plotted are only the pairs for which

$$|\lambda_i| \le 4\left(\frac{1}{n-1} - \frac{1}{2}\right)^2$$

Fig. 5 shows the resulting graph for the learning data set of Tables 1 and 2. The coordinates of the points and the intermediate calculations that lead to them are summarized in Table 3. Note that, in general, between two and four points may be lost due to division by zero; such is the case here for three points. Given that the original data set consisted of six observations only, this leaves only $6 - 3 = 3$ points on the graph, which is obviously not particularly revealing. However, the real-life applications considered in the Application section and by Fisher and Switzer (1985, 2001) provide more convincing evidence of the usefulness of this tool.

Fisher and Switzer (1985, 2001) argue that $\lambda_i, \chi_i \in [-1, 1]$.
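The chi-plot coordinates defined above can be computed as in the following Python sketch. The ranks are hypothetical (Tables 1 to 3 are missing from this transcription), so the output does not reproduce Table 3; with these particular ranks four points, rather than three, are lost to division by zero. The function name and the small tolerance used in the cutoff comparison are assumptions.

```python
import math

def chi_plot_points(x, y):
    """Return (i, lambda_i, chi_i) for every pair that can be plotted."""
    n = len(x)
    cutoff = 4 * (1 / (n - 1) - 0.5) ** 2
    points = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        H = sum(1 for j in others if x[j] <= x[i] and y[j] <= y[i]) / (n - 1)
        F = sum(1 for j in others if x[j] <= x[i]) / (n - 1)
        G = sum(1 for j in others if y[j] <= y[i]) / (n - 1)
        denom = F * (1 - F) * G * (1 - G)
        if denom <= 0:
            continue  # chi_i is undefined (division by zero)
        chi = (H - F * G) / math.sqrt(denom)
        f_tilde, g_tilde = F - 0.5, G - 0.5
        product = f_tilde * g_tilde
        sign = (product > 0) - (product < 0)
        lam = 4 * sign * max(f_tilde ** 2, g_tilde ** 2)
        if abs(lam) <= cutoff + 1e-12:  # tolerance guards against rounding
            points.append((i + 1, lam, chi))
    return points

R = [1, 2, 3, 4, 5, 6]
S = [2, 5, 3, 6, 1, 4]  # hypothetical ranks
for label, lam, chi in chi_plot_points(R, S):
    print(label, round(lam, 4), round(chi, 4))
```

For a sample of realistic size, the returned pairs would be drawn together with horizontal control limits at $\pm c_p/\sqrt{n}$.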
While $\lambda_i$ is a measure of distance between the pair $(X_i, Y_i)$ and the center of the scatter plot, $\chi_i$ is the signed square root of the traditional chi-square test statistic for independence in the two-way table generated by counting points in the four regions delineated by the lines $x = X_i$ and $y = Y_i$.

Since one would expect $H_i \approx F_i \times G_i$ for all $i$ under independence, values of $\chi_i$ that fall too far from zero are indicative of departures from that hypothesis. To help identify such departures, Fisher and Switzer (1985, 2001) suggest that control limits be drawn at $\pm c_p/\sqrt{n}$, where $c_p$ is selected so that approximately $100p\%$ of the pairs $(\lambda_i, \chi_i)$ lie between the lines. Through simulations, they found that the $c_p$ values 1.54, 1.78, and 2.18 correspond to $p = 0.9$, 0.95, and 0.99, respectively.

Fig. 5. Chi-plot for the learning data set

Table 3. Computations Required for Drawing the Chi-Plot Associated with the Learning Data Set of Table 1 (columns: $i$, $H_i$, $F_i$, $G_i$, $\chi_i$, $\lambda_i$; the numerical entries are missing from this transcription)

K-Plots

Another rank-based graphical tool for visualizing dependence was recently proposed by Genest and Boies (2003). It is inspired by the familiar notion of QQ-plot. Specifically, their technique consists in plotting the pairs $(W_{i:n}, H_{(i)})$ for $i \in \{1,\ldots,n\}$, where $H_{(1)} < \cdots < H_{(n)}$ are the order statistics associated with the quantities $H_1,\ldots,H_n$ introduced in the Chi-Plots subsection. As for $W_{i:n}$, it is the expected value of the $i$th order statistic from a random sample of size $n$ from the random variable $W = C(U,V) = H(X,Y)$ under the null hypothesis of independence between $U$ and $V$ (or between $X$ and $Y$, which is the same). The latter is given by

$$W_{i:n} = n\binom{n-1}{i-1}\int_0^1 w\,k_0(w)\,\{K_0(w)\}^{i-1}\{1 - K_0(w)\}^{n-i}\,dw$$

where

$$K_0(w) = P(UV \le w) = \int_0^1 P\left(U \le \frac{w}{v}\right)dv = \int_0^w 1\,dv + \int_w^1 \frac{w}{v}\,dv = w - w\log(w)$$

and $k_0$ is the corresponding density. The values of $W_{1:6},\ldots,W_{6:6}$ required to produce Fig. 6 can be readily computed using any symbolic calculator, such as Maple. They are given in Table 4.

Table 4. Coordinates of Points Displayed on the K-Plot Associated with the Learning Data Set of Table 1 (columns: $i$, $W_{i:n}$, $H_{(i)}$; the numerical entries are missing from this transcription)

Fig. 6. K-plot for the learning data set. Superimposed on the graph are a straight line corresponding to the case of independence and a smooth curve $K_0(w)$ associated with perfect positive dependence.

The interpretation of K-plots is similar to that of QQ-plots: just as curvature is problematic, e.g., in a normal QQ-plot, any deviation from the main diagonal is a sign of dependence in K-plots. Positive or negative dependence may be suspected in the data, depending whether the curve is located above or below the line $y = x$. Roughly speaking, the further the distance, the greater the dependence. In this construction, perfect negative dependence (i.e., $C = W$) would translate into a string of data points aligned on the x-axis. As for perfect positive dependence (i.e., $C = M$), it would materialize into data aligned on the curve $K_0(w)$ shown on the graph.

As for the chi-plot, the linearity (or lack thereof) in the K-plot displayed in Fig. 6 is hard to detect, given the extremely small size of the learning data set. However, see the Application section and Genest and Boies (2003) for more compelling illustrations of K-plots.

Estimation

Now suppose that a parametric family $(C_\theta)$ of copulas is being considered as a model for the dependence between two random variables $X$ and $Y$. Given a random sample $(X_1,Y_1),\ldots,(X_n,Y_n)$ from $(X,Y)$, how should $\theta$ be estimated? This section reviews different nonparametric strategies for tackling this problem, depending on whether $\theta$ is real or multidimensional.

Only rank-based estimators are considered in the sequel. This methodological choice is justified by the fact, highlighted earlier, that the dependence structure captured by a copula has nothing to do with the individual behavior of the variables. A fortiori, any inference about the parameter indexing a family of copulas should thus rely only on the ranks of the observations, which are the best summary of the joint behavior of the random pairs.

Estimate Based on Kendall's Tau

To fix ideas, suppose that the underlying dependence structure of a random pair $(X,Y)$ is appropriately modeled by the Farlie-Gumbel-Morgenstern family $(C_\theta)$ defined in Eq. (2). In this case, $\theta$ is real and, as seen in the Kendall's Tau subsection, there exists an immediate relation in this model between the parameter $\theta$ and the population value $\tau$ of Kendall's tau, namely

$$\tau = \frac{2}{9}\,\theta$$

Given a sample value $\tau_n$ of $\tau$ computed from Eq. (5) or (6), a simple and intuitive approach to estimating $\theta$ would then consist of taking

$$\breve\theta_n = \frac{9}{2}\,\tau_n$$

Since $\tau_n$ is rank-based, this estimation strategy may be construed as a nonparametric adaptation of the celebrated method of moments.

More generally, if $\theta = g(\tau)$ for some smooth function $g$, then $\breve\theta_n = g(\tau_n)$ may be referred to as the Kendall-based estimator of $\theta$. A small adaptation of Proposition 3.1 of Genest and Rivest (1993) implies that

$$\sqrt{n}\,\frac{\tau_n - \tau}{4S} \approx N(0, 1)$$

where

$$S^2 = \frac{1}{n}\sum_{i=1}^n (W_i + \tilde W_i - 2\bar W)^2$$

and

$$\tilde W_i = \frac{1}{n}\sum_{j=1}^n I_{ji} = \frac{1}{n}\,\#\{j : X_i \le X_j,\ Y_i \le Y_j\}$$

Therefore, an application of Slutsky's theorem, in the form known as the Delta method, implies that as $n \to \infty$,

$$\breve\theta_n \approx N\left[\theta, \frac{1}{n}\{4S\,g'(\tau_n)\}^2\right]$$

Accordingly, an approximate $100(1-\alpha)\%$ confidence interval for $\theta$ is given by

$$\breve\theta_n \pm z_{\alpha/2}\,\frac{1}{\sqrt{n}}\,4S\,|g'(\tau_n)|$$

For an alternative consistent estimator of the asymptotic variance of $\tau_n$, see, for instance, Samara and Randles (1988).
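The Kendall-based estimate and its approximate confidence interval can be sketched as follows for the Farlie-Gumbel-Morgenstern family, where $g(\tau) = 9\tau/2$. The ranks are hypothetical (they reproduce $\tau_n = 1/15$ but not the value of $S^2$ reported in the paper, since Table 2 is missing from this transcription), and clipping the interval to the parameter space $[-1, 1]$ is an assumption about how the interval is reported.

```python
import math

def kendall_estimate_fgm(x, y, z=1.96):
    """Return (tau_n, theta_n, S^2, half-width) for the FGM family."""
    n = len(x)
    # W_i = #{j: X_j <= X_i, Y_j <= Y_i} / n
    w = [sum(1 for j in range(n) if x[j] <= x[i] and y[j] <= y[i]) / n
         for i in range(n)]
    # W~_i = #{j: X_i <= X_j, Y_i <= Y_j} / n
    w_tilde = [sum(1 for j in range(n) if x[i] <= x[j] and y[i] <= y[j]) / n
               for i in range(n)]
    w_bar = sum(w) / n
    tau = 4 * n / (n - 1) * w_bar - (n + 3) / (n - 1)      # Eq. (6)
    s2 = sum((a + b - 2 * w_bar) ** 2 for a, b in zip(w, w_tilde)) / n
    g_prime = 9 / 2                                        # g(tau) = 9 tau / 2
    theta = g_prime * tau
    half_width = z * 4 * math.sqrt(s2) * abs(g_prime) / math.sqrt(n)
    return tau, theta, s2, half_width

R = [1, 2, 3, 4, 5, 6]
S = [2, 5, 3, 6, 1, 4]  # hypothetical ranks
tau_n, theta_n, s2, half = kendall_estimate_fgm(R, S)
lower = max(-1.0, theta_n - half)  # clip to the FGM parameter space
upper = min(1.0, theta_n + half)
print(tau_n, theta_n, s2, half)
print(lower, upper)                # the interval covers all of [-1, 1]
```

With only six observations the half-width exceeds the width of the parameter space, which matches the qualitative conclusion of the example that follows.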

Table 5. Intermediate Values Required for the Computation of the Standard Error Associated with Kendall's Tau (columns: $i$, $W_i$, $\tilde W_i$; the numerical entries are missing from this transcription)

Example (Continued)

For the learning data set of Table 1, it was seen earlier that $\tau_n = 1/15$, hence $\breve\theta_n = 0.3$. Using the intermediate quantities summarized in Table 5, one finds $S^2 = 0.043$, hence an approximate 95% confidence interval for this estimation is $[-1, 1]$, since $g'(\tau) \equiv 9/2$, and hence $1.96 \times 4S\,g'(\tau_n)/\sqrt{n} = 2.99$, so that the interval $0.3 \pm 2.99$ covers the whole parameter space. While the size of the standard error may appear exceedingly conservative, this result is not surprising, considering that the sample size is $n = 6$.

The popularity of $\breve\theta_n$ as an estimator of the dependence parameter stems in part from the fact that closed-form expressions for the population value of Kendall's tau are available for many common parametric copula models. Such is the case, in particular, for several Archimedean families of copulas, e.g., those of Ali et al. (1978), Clayton (1978), Frank (1979), Gumbel-Hougaard (Gumbel 1960), etc. Specifically, a copula $C$ is said to be Archimedean if there exists a convex, decreasing function $\varphi:(0,1] \to [0,\infty)$ such that $\varphi(1) = 0$ and

$$C(u,v) = \varphi^{-1}\{\varphi(u) + \varphi(v)\}$$

is valid for all $u, v \in (0,1)$. As shown by Genest and MacKay (1986)

$$\tau = 1 + 4\int_0^1 \frac{\varphi(t)}{\varphi'(t)}\,dt \tag{7}$$

Table 6 gives the generator and an expression for $\tau$ for the three most common Archimedean models. Algebraically closed formulas are available for various other dependence models, e.g., extreme-value or Archimax copulas. See, for example, Ghoudi et al. (1998) or Capéraà et al. [year missing].

Estimate Based on Spearman's Rho

When the dependence parameter $\theta$ is real, an alternative rank-based estimator that remains in the spirit of the method of moments consists of taking

$$\tilde\theta_n = h(\rho_n)$$

where $\theta = h(\rho)$ represents the relationship between the parameter and the population value of Spearman's rho. In the context of the Farlie-Gumbel-Morgenstern family of copulas, for example, it was seen earlier that $\rho = \theta/3$, so that $\tilde\theta_n = 3\rho_n$ would be an alternative nonparametric estimator to $\breve\theta_n = 9\tau_n/2$.
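Eq. (7) lends itself to a quick numerical check. The Python sketch below evaluates the integral by the midpoint rule for the Clayton and Gumbel-Hougaard generators in their standard forms, and inverts the resulting closed-form expressions $\tau = \theta/(\theta+2)$ and $\tau = 1 - 1/\theta$ to obtain Kendall-based estimates. The function names and the choice of quadrature are assumptions made for illustration.

```python
import math

def tau_from_generator(phi, dphi, m=20000):
    """Eq. (7): tau = 1 + 4 * integral over (0, 1) of phi(t) / phi'(t)."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        t = (k + 0.5) * h  # midpoints avoid the endpoints 0 and 1
        total += phi(t) / dphi(t)
    return 1.0 + 4.0 * h * total

def clayton(theta):
    """Generator (t^-theta - 1) / theta and its derivative."""
    return (lambda t: (t ** -theta - 1) / theta,
            lambda t: -t ** (-theta - 1))

def gumbel(theta):
    """Generator |log t|^theta and its derivative."""
    return (lambda t: (-math.log(t)) ** theta,
            lambda t: -theta * (-math.log(t)) ** (theta - 1) / t)

def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2)."""
    return 2 * tau / (1 - tau)

def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1 / theta."""
    return 1 / (1 - tau)

print(tau_from_generator(*clayton(2.0)), 2.0 / (2.0 + 2.0))  # both near 0.5
print(tau_from_generator(*gumbel(2.0)), 1 - 1 / 2.0)         # both near 0.5
print(clayton_theta_from_tau(1 / 15), gumbel_theta_from_tau(1 / 15))
```

With $\tau_n = 1/15$ as in the running example, the Kendall-based estimates would be $1/7$ under the Clayton model and $15/14$ under the Gumbel-Hougaard model.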
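Now it follows from standard convergence results about empirical processes to be found, e.g., in Chapter 5 of Gaenssler and Stute (1987), that

$$\sqrt{n}\,(\rho_n - \rho) \approx N(0, \sigma_\rho^2)$$

where the asymptotic variance $\sigma_\rho^2$ depends on the underlying copula $C$ in a way that has been described in detail by Borkowf (2002). Arguing along the same lines as in the Estimate Based on Kendall's Tau subsection, it can then be seen that under suitable regularity conditions on $h$

$$\tilde\theta_n \approx N\left[\theta, \frac{1}{n}\{\sigma_n h'(\rho_n)\}^2\right]$$

where $\sigma_n^2$ is a suitable estimator of $\sigma_\rho^2$. An approximate $100(1-\alpha)\%$ confidence interval for $\theta$ is then given by

$$\tilde\theta_n \pm z_{\alpha/2}\,\frac{1}{\sqrt{n}}\,\sigma_n\,|h'(\rho_n)|$$

Substituting $C_n$ for $C$ in the expressions reported by Borkowf (2002), a very natural, consistent estimate for $\sigma_\rho^2$ is given by

$$\sigma_n^2 = 144\,(-9A_n^2 + B_n + 2C_n + 2D_n + 2E_n)$$

where

$$A_n = \frac{1}{n}\sum_{i=1}^n \frac{R_i}{n+1}\,\frac{S_i}{n+1}$$

$$B_n = \frac{1}{n}\sum_{i=1}^n \left(\frac{R_i}{n+1}\right)^2 \left(\frac{S_i}{n+1}\right)^2$$

$$C_n = \frac{1}{n^3}\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^n \frac{R_i}{n+1}\,\frac{S_i}{n+1}\,\mathbf{1}(R_k \le R_i,\ S_k \le S_j) + \frac{1}{4} - A_n$$

$$D_n = \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n \frac{S_i}{n+1}\,\frac{S_j}{n+1}\,\max\left(\frac{R_i}{n+1}, \frac{R_j}{n+1}\right)$$

and

$$E_n = \frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n \frac{R_i}{n+1}\,\frac{R_j}{n+1}\,\max\left(\frac{S_i}{n+1}, \frac{S_j}{n+1}\right)$$

Table 6. Three Common Families of Archimedean Copulas, Their Generator, Their Parameter Space, and an Expression for the Population Value of Kendall's Tau

Family | Generator $\varphi(t)$ | Parameter | Kendall's tau
Clayton | $(t^{-\theta} - 1)/\theta$ | $\theta \ge -1$ | $\theta/(\theta+2)$
Frank | $-\log\{(e^{-\theta t} - 1)/(e^{-\theta} - 1)\}$ | $\theta \in \mathbb{R}$ | $1 - 4/\theta + 4D_1(\theta)/\theta$
Gumbel-Hougaard | $|\log t|^{\theta}$ | $\theta \ge 1$ | $1 - 1/\theta$

Note: Here, $D_1(\theta) = \int_0^\theta (x/\theta)/(e^x - 1)\,dx$ is the first Debye function.

Example (Continued)

For the learning data set of Table 1, it was seen earlier that $\rho_n = 1/35$, hence $\tilde\theta_n = 3/35$. Burdensome but simple calculations yield $\sigma_n = 7.77$, hence an approximate 95% confidence interval for this estimation is $[-1, 1]$, since $h' \equiv 3$, and hence $1.96\,\sigma_n h'(\rho_n)/\sqrt{n} = 18.66$. Here again, the size of the standard error is quite large, as might be expected given that $n = 6$.

Maximum Pseudolikelihood Estimator

In classical statistics, maximum likelihood estimation is a well-known alternative to the method of moments that is usually more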

efficient, particularly when $\theta$ is multidimensional. In the present context, an adaptation of this approach to estimation is required if inference concerning dependence parameters is to be based exclusively on ranks. Such an adaptation was described in broad terms by Oakes (1994) and was later formalized and studied by Genest et al. (1995) and by Shih and Louis (1995).

The method of maximum pseudolikelihood, which requires that $C_\theta$ be absolutely continuous with density $c_\theta$, simply involves maximizing a rank-based, log-likelihood of the form

$$\ell(\theta) = \sum_{i=1}^n \log c_\theta\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right) \tag{8}$$

The latter is exactly the expression one gets when the unknown marginal distributions $F$ and $G$ in the classical log-likelihood

$$\ell(\theta) = \sum_{i=1}^n \log c_\theta\{F(X_i), G(Y_i)\}$$

are replaced by rescaled versions of their empirical counterparts, i.e.

$$F_n(x) = \frac{1}{n+1}\sum_{i=1}^n \mathbf{1}(X_i \le x)$$

and

$$G_n(y) = \frac{1}{n+1}\sum_{i=1}^n \mathbf{1}(Y_i \le y)$$

That this substitution yields formula (8) is immediate, once it is realized that $F_n(X_i) = R_i/(n+1)$ and $G_n(Y_i) = S_i/(n+1)$ for all $i \in \{1,\ldots,n\}$.

This method may seem superficially less attractive than the inversion of Kendall's tau or Spearman's rho, both because it involves numerical work and requires the existence of a density $c_\theta$. At the same time, however, it is much more generally applicable than the other methods, since it does not require the dependence parameter to be real. The procedure for estimating a multivariate $\theta$ and computing associated approximate confidence region is described by Genest et al. (1995). For simplicity, it is only presented here in the case where $\theta$ is real; however, see the Application section for the bivariate case.

Letting $\dot c_\theta(u,v) = \partial c_\theta(u,v)/\partial\theta$, Genest et al. (1995) show under mild regularity conditions that the root $\hat\theta_n$ of the equation

$$\dot\ell(\theta) = \frac{\partial}{\partial\theta}\,\ell(\theta) = \sum_{i=1}^n \frac{\dot c_\theta\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right)}{c_\theta\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right)} = 0$$

is unique. Furthermore

$$\hat\theta_n \approx N\left(\theta, \frac{\nu^2}{n}\right)$$

where $\nu^2$ depends exclusively on the true underlying copula $C_\theta$ as per Proposition 2.1 of Genest et al. (1995). As mentioned by these authors, a consistent estimate of $\nu^2$ is given by $\hat\nu_n^2 = \hat\sigma_n^2/\hat\beta_n^2$, where

$$\hat\sigma_n^2 = \frac{1}{n}\sum_{i=1}^n (M_i - \bar M)^2$$

and

$$\hat\beta_n^2 = \frac{1}{n}\sum_{i=1}^n (N_i - \bar N)^2$$

are sample variances computed from two sets of pseudo-observations with means $\bar M = (M_1 + \cdots + M_n)/n$ and $\bar N = (N_1 + \cdots + N_n)/n$, respectively.
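To compute the pseudo-observations $M_i$ and $N_i$, one should proceed as follows:

Step 1: Relabel the original data $(X_1,Y_1),\ldots,(X_n,Y_n)$ in such a way that $X_1 < \cdots < X_n$; as a consequence one then has $R_1 = 1,\ldots,R_n = n$.

Step 2: Write $L(\theta,u,v) = \log c_\theta(u,v)$ and compute $L_\theta$, $L_u$, and $L_v$, which are the derivatives of $L$ with respect to $\theta$, $u$, and $v$, respectively.

Step 3: For $i \in \{1,\ldots,n\}$, set

$$N_i = L_\theta\left(\hat\theta_n, \frac{i}{n+1}, \frac{S_i}{n+1}\right)$$

Step 4: For $i \in \{1,\ldots,n\}$, let also

$$M_i = N_i - \frac{1}{n}\sum_{j=i}^n L_\theta\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right) L_u\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right) - \frac{1}{n}\sum_{S_j \ge S_i} L_\theta\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right) L_v\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right)$$

Example (Continued)

Suppose that a Farlie-Gumbel-Morgenstern copula model is being considered for the learning data set of Table 1. In this case

$$c_\theta(u,v) = 1 + \theta(1-2u)(1-2v)$$

and

$$\frac{\dot c_\theta(u,v)}{c_\theta(u,v)} = \frac{(1-2u)(1-2v)}{1 + \theta(1-2u)(1-2v)}$$

Accordingly, the log-pseudolikelihood associated with this model is given by

$$\ell(\theta) = \sum_{i=1}^n \log\left\{1 + \theta\left(1 - \frac{2R_i}{n+1}\right)\left(1 - \frac{2S_i}{n+1}\right)\right\}$$

and the corresponding pseudoscore function is

$$\dot\ell(\theta) = \sum_{i=1}^n \frac{\left(1 - \frac{2R_i}{n+1}\right)\left(1 - \frac{2S_i}{n+1}\right)}{1 + \theta\left(1 - \frac{2R_i}{n+1}\right)\left(1 - \frac{2S_i}{n+1}\right)} = \sum_{i=1}^n \frac{(n+1-2R_i)(n+1-2S_i)}{(n+1)^2 + \theta(n+1-2R_i)(n+1-2S_i)}$$

These two functions are plotted in Fig. 7 with $n = 6$ and the values of $R_i$ and $S_i$ given in Table 2. Upon substitution, one gets $\hat\theta_n =$ [value missing] as the unique root of the equation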

$$\dot\ell(\theta) = \sum_{i=1}^{6} \frac{(7 - 2R_i)(7 - 2S_i)}{49 + \theta(7 - 2R_i)(7 - 2S_i)} = 0$$

Fig. 7. Graphs of (a) $\ell(\theta)$ and (b) $\dot\ell(\theta)$ for the learning data set of Table 1 when the assumed model is the Farlie-Gumbel-Morgenstern family of copulas

In the present case

$$L_\theta(\theta,u,v) = \frac{(1-2u)(1-2v)}{1 + \theta(1-2u)(1-2v)}$$

$$L_u(\theta,u,v) = \frac{-2\theta(1-2v)}{1 + \theta(1-2u)(1-2v)}$$

$$L_v(\theta,u,v) = \frac{-2\theta(1-2u)}{1 + \theta(1-2u)(1-2v)}$$

Table 7. Values of the Constants $N_i$ and $M_i$ Required to Compute an Approximate Confidence Interval for the Maximum Pseudolikelihood Estimator $\hat\theta_n$ (columns: $i$, $N_i$, $M_i$; the numerical entries are missing from this transcription)

Using the intermediate calculations summarized in Table 7, one gets $\hat\nu_n^2 = 0.0677/0.0707 = 0.958$ and $1.96\,\hat\nu_n/\sqrt{n} =$ [value missing]. The confidence interval for the maximum pseudolikelihood estimator is given by $[-0.684, 1]$.

Other Estimation Methods

Although they are the most common, estimators based on the maximization of the pseudolikelihood and on the inversion of either Kendall's tau or Spearman's rho are not the only rank-based procedures available for selecting appropriate values of dependence parameters in a copula-based model. Tsukahara (2005), for example, recently investigated the behavior and performance of two new classes of estimators derived from minimum-distance criteria and an estimating-equation approach. In his simulations, however, the maximum pseudolikelihood estimator turned out to have the smallest mean-squared error. Circumstances under which the latter approach is asymptotically semiparametrically efficient were delineated by Klaassen and Wellner (1997) and by Genest and Werker [year missing]. See Biau and Wegkamp (2005) for another rank-based, minimum-distance method for dependence parameter estimation.

In all fairness, it should be mentioned that the exclusive reliance on ranks for copula parameter estimation advocated here does not meet with complete consensus in the statistical community. In his book, Joe (1997, Chap. 10) recommends a parametric two-step procedure often referred to as the inference from margins or IFM method.
As in the pseudolikelihood approach described above, the estimate of $\theta$ is obtained through the maximization of a function of the form

$$\ell(\theta) = \sum_{i=1}^{n} \log\left[c_\theta\{\hat F(X_i), \hat G(Y_i)\}\right].$$

However, while the rank-based method takes $\hat F = F_n$ and $\hat G = G_n$, Joe (1997) substitutes $\hat F = F_{\delta_n}$ and $\hat G = G_{\eta_n}$, where $F_\delta$ and $G_\eta$ are suitable parametric families for the margins, and $\delta_n$ and $\eta_n$ are standard maximum likelihood estimates of their parameters, derived from the observed values of $X$ and $Y$, respectively. Cherubini et al. (2004, Section 5.3) point out that the IFM method may be viewed as a special case of the generalized method of moments with an identity weight matrix. Joe (2005) quantifies the asymptotic efficiency of the approach in different circumstances.

Although they usually perform well, the estimates of the association parameters derived by the IFM technique clearly depend on the choice of $F_\delta$ and $G_\eta$, and thus always run the risk of being unduly affected if the models selected for the margins turn out to be inappropriate (see, e.g., Kim et al.).

For completeness, it may be worth mentioning that another developing body of literature proposes the use of kernel methods to derive a smooth estimate of a copula or its density, without assuming any specific parametric form for it. See, e.g., Gijbels and Mielniczuk (1990) or Fermanian and Scaillet.
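The contrast between the rank-based and IFM approaches can be sketched numerically for the Farlie–Gumbel–Morgenstern family. The snippet below is only an illustration under stated assumptions: the six data points are hypothetical (they are not the learning data set of Table 1), the exponential margins used for the IFM variant are an arbitrary choice, and the function names are ours. It relies on the fact that the pseudoscore is decreasing in $\theta$, so that its root can be located by bisection on $[-1, 1]$.

```python
import math

def ranks(values):
    """Ranks 1..n of the observations (ties are assumed absent)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, index in enumerate(order, start=1):
        r[index] = position
    return r

def fgm_score(theta, u, v):
    """Derivative in theta of sum_i log{1 + theta (1 - 2u_i)(1 - 2v_i)}."""
    total = 0.0
    for ui, vi in zip(u, v):
        a = (1 - 2 * ui) * (1 - 2 * vi)
        total += a / (1 + theta * a)
    return total

def fgm_fit(u, v, tol=1e-10):
    """Maximize the FGM log-likelihood over [-1, 1] by bisection on the score."""
    lo, hi = -1.0, 1.0
    if fgm_score(lo, u, v) <= 0:   # maximum attained at the lower boundary
        return lo
    if fgm_score(hi, u, v) >= 0:   # maximum attained at the upper boundary
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fgm_score(mid, u, v) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rank_based_estimate(x, y):
    """Maximum pseudolikelihood: margins replaced by rescaled ranks."""
    n = len(x)
    u = [r / (n + 1) for r in ranks(x)]
    v = [s / (n + 1) for s in ranks(y)]
    return fgm_fit(u, v)

def ifm_estimate_exponential(x, y):
    """IFM with exponential margins (an assumption made for illustration)."""
    rate_x = len(x) / sum(x)       # maximum likelihood estimate of the rate
    rate_y = len(y) / sum(y)
    u = [1 - math.exp(-rate_x * xi) for xi in x]
    v = [1 - math.exp(-rate_y * yi) for yi in y]
    return fgm_fit(u, v)

# Hypothetical positive-valued sample of size 6.
x = [0.1, 0.3, 0.7, 1.2, 1.9, 2.5]
y = [0.8, 3.1, 0.2, 1.0, 0.5, 1.4]

theta_rank = rank_based_estimate(x, y)
theta_ifm = ifm_estimate_exponential(x, y)
# A monotone transformation of X leaves the ranks, hence the estimate, unchanged.
theta_rank_log = rank_based_estimate([math.log(t) for t in x], y)
```

The last line illustrates the invariance that motivates rank-based inference: transforming a margin monotonically changes the IFM estimate in general, but not the rank-based one.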

Goodness-of-Fit Tests

In typical modeling exercises, the user has a choice between several different dependence structures for the data at hand. To keep things simple, suppose that two copulas $C_{\theta_n}$ and $D_{\xi_n}$ were fitted by some arbitrary method. It is then natural to ask which of the two models provides the best fit to the observations. Both informal and formal ways of tackling this question will be discussed in turn.

Graphical Diagnostics

When dealing with bivariate data, possibly the most natural way of checking the adequacy of a copula model would be to compare a scatter plot of the pairs $(R_i/(n+1), S_i/(n+1))$ (i.e., the support of the empirical copula $C_n$) with an artificial data set of the same size generated from $C_{\theta_n}$. To avoid arbitrariness induced by sampling variability, however, a better strategy consists of generating a large sample from $C_{\theta_n}$, which effectively amounts to portraying the associated copula density in two dimensions. Simple simulation algorithms are available for most copula models; see, e.g., Devroye (1986, Chap. 11), or Whelan (2004) for Archimedean copulas. In the bivariate case, a good strategy for generating a pair $(U,V)$ from a copula $C$ proceeds as follows:

Step 1: Generate $U$ from a uniform distribution on the interval $(0,1)$.

Step 2: Given $U = u$, generate $V$ from the conditional distribution

$$Q_u(v) = P(V \le v \mid U = u) = \frac{\partial}{\partial u} C(u,v)$$

by setting $V = Q_u^{-1}(U^*)$, where $U^*$ is another observation from the uniform distribution on the interval $(0,1)$.

When an explicit formula does not exist for $Q_u^{-1}$, the value $v = Q_u^{-1}(u^*)$ can be determined by trial and error or, more effectively, using the bisection method; see Devroye (1986, Chap. 2). Thus, for the Farlie–Gumbel–Morgenstern family of copulas, one finds

$$Q_u(v) = v + \theta v(1-v)(1-2u)$$

for all $u, v \in [0,1]$, and hence

$$Q_u^{-1}(u^*) = \begin{cases} u^* & \text{if } b = \theta(1-2u) = 0, \\ \dfrac{(b+1) - \sqrt{(b+1)^2 - 4bu^*}}{2b} & \text{if } b = \theta(1-2u) \ne 0. \end{cases}$$

Fig. 8(a) displays 100 pairs $(U_i,V_i)$ generated with this algorithm, taking $\theta = \hat\theta_n$ = […] as deduced from the method of maximum pseudolikelihood. The six points of the learning data set, represented by crosses, are superimposed.
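The two-step algorithm above can be sketched as follows for the Farlie–Gumbel–Morgenstern family, using the closed-form inverse of $Q_u$. The function names, the seed, and the parameter value 0.5 are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def fgm_conditional_cdf(v, u, theta):
    """Q_u(v) = v + theta * v * (1 - v) * (1 - 2u)."""
    return v + theta * v * (1 - v) * (1 - 2 * u)

def fgm_conditional_quantile(u_star, u, theta):
    """Closed-form inverse of Q_u, with b = theta * (1 - 2u)."""
    b = theta * (1 - 2 * u)
    if abs(b) < 1e-9:
        # Q_u is (numerically) the identity when b vanishes.
        return u_star
    v = ((b + 1) - math.sqrt((b + 1) ** 2 - 4 * b * u_star)) / (2 * b)
    # Guard against rounding slightly outside the unit interval.
    return min(1.0, max(0.0, v))

def simulate_fgm(n, theta, rng):
    """Step 1: draw U; Step 2: draw U* and set V = Q_u^{-1}(U*)."""
    pairs = []
    for _ in range(n):
        u = rng.random()
        u_star = rng.random()
        pairs.append((u, fgm_conditional_quantile(u_star, u, theta)))
    return pairs

rng = random.Random(2007)               # fixed seed for reproducibility
sample = simulate_fgm(100, 0.5, rng)    # 100 pairs, hypothetical theta = 0.5
```

When no closed form is available for the inverse, the call to `fgm_conditional_quantile` would be replaced by a bisection search on the conditional distribution, as the text indicates.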
Given the small size of the data set, it is hard to tell from Fig. 8(a) whether the selected model accurately reproduces the dependence structure revealed by the six observations. To show the effectiveness of the procedure, the same exercise was repeated in Fig. 8(b), using a Clayton copula with $\theta = 10$. Here, the inappropriateness of the model is apparent, as might have been expected from the fact that $\tau = 5/6$ for this copula, while $\tau_n = 1/15$.

Fig. 8. (a) Scatter plot of 100 pairs $(U_i,V_i)$ simulated from the Farlie–Gumbel–Morgenstern copula with parameter $\theta$ = […]. (b) Similar plot, generated from the Clayton copula with $\tau = 5/6$. On both graphs, the six points of the learning data set are indicated with a cross.

Another option, which is related to K-plots, consists of comparing the empirical distribution $K_n$ of the variables $W_1,\ldots,W_n$ introduced previously with $K_{\theta_n}$, i.e., the theoretical distribution of $W = C_{\theta_n}(U,V)$, where the pair $(U,V)$ is drawn from $C_{\theta_n}$. One possibility is to plot $K_n$ and $K_{\theta_n}$ on the same graph to see how well they agree. Alternatively, a QQ-plot can be derived from the order statistics $W_{(1)} \le \cdots \le W_{(n)}$ by plotting the pairs $(W_{i:n}, W_{(i)})$ for $i \in \{1,\ldots,n\}$. In this case, however, $W_{i:n}$ is the expected value of the $i$th order statistic from a random sample of size $n$ from $K_{\theta_n}$, rather than from $K_0$, as was the case in the K-plot. In other words

$$W_{i:n} = n\binom{n-1}{i-1}\int_0^1 w\,k_{\theta_n}(w)\,\{K_{\theta_n}(w)\}^{i-1}\{1 - K_{\theta_n}(w)\}^{n-i}\,dw \qquad (9)$$

where $K_\theta(w) = P\{C_\theta(U,V) \le w\}$ and $k_\theta = dK_\theta(w)/dw$.

These two graphs are presented in Fig. 9 for the learning data set and Clayton's copula with parameter $\theta = \hat\theta_n = 0.449$, obtained by the method of maximum pseudolikelihood. As implied by the data in Table 5, $K_n$ is a step function with steps of height 1/3 at $w = 1/6$, $2/6$, and $4/6$. This is portrayed in dotted lines in Fig. 9(a). The solid line which is superimposed is

$$K_{\theta_n}(w) = w + \frac{w}{\theta_n}\left(1 - w^{\theta_n}\right), \quad w \in (0,1) \qquad (10)$$

Since $K_n \to K$ and $K_{\theta_n} \to K_\theta$ as $n \to \infty$, as shown by Genest and Rivest (1993), the two curves should look very similar when the data are sufficiently abundant and the model is good, i.e., when $K = K_\theta$. More generally, see Barbe et al. (1996) for a study of the large-sample behavior of the empirical process $\sqrt{n}\,(K_n - K)$.

Fig. 9. (a) Graphs of $K_n$ and $K_{\theta_n}$ for the learning data set and Clayton's copula with $\theta = \hat\theta_n = 0.449$. (b) Generalized K-plot providing a visual check of the goodness-of-fit of the same model on these data.

In the present case, the formula for $K_\theta$ is easily deduced from the fact, established by Genest and Rivest (1993), that if $C$ is an Archimedean copula with generator $\varphi$, the distribution function of $W = C(U,V) = H(X,Y)$, called the bivariate probability integral transform (BIPIT), is given by

$$K(w) = w - \frac{\varphi(w)}{\varphi'(w)}, \quad w \in (0,1]$$

It may be observed in passing that identity (7) is a straightforward consequence of this result and the fact that $E(W) = (\tau+1)/4$. For additional information about the BIPIT and its properties and applications, refer to Genest and Rivest (2001) and Nelsen et al.

Example (Continued)

Fig. 9(b) shows a QQ-plot for visual assessment of the adequacy of the Clayton model for the learning data set. The coordinates of the points on the graph are given in Table 8. The y-coordinates were obtained by numerical integration, upon substitution of the specific choice of $K_{\theta_n}$ given in Eq. (10) into the general formula (9). By construction, this generalized K-plot is designed to yield an approximate straight line when the model is adequate and the data are sufficiently numerous to make a visual assessment. The effectiveness of the two diagnostic tools described above will be demonstrated more convincingly in the Application section.

Table 8. Coordinates of the QQ-Plot Displayed in Fig. 9(b)
$i$: 1, 2, 3, 4, 5, 6
$W_{i:n}$: […]
$W_{(i)}$: 1/6, 1/6, 2/6, 2/6, 4/6, 4/6

Formal Tests of Goodness-of-Fit

Formal methodology for testing the goodness-of-fit of copula models is just emerging. To the writers' knowledge, the first serious effort to develop such a procedure was made by Wang and Wells (2000) in the context of Archimedean models. Inspired by Genest and Rivest (1993), these authors proposed to compute a Cramér–von Mises statistic of the form

$$S_{\xi n} = \int_\xi^1 \{K_n(w) - K_{\theta_n}(w)\}^2\,dw$$

where $\xi \in (0,1)$ is an arbitrary cutoff point. While Theorem 3 in their paper identifies the limiting distribution of $S_{\xi n}$, the latter is analytically unwieldy. Furthermore, the bootstrap procedure they propose as a replacement is, by their own admission, ineffective. As a result, P-values for the statistic cannot be computed. When faced with a choice between several copulas, Wang and Wells (2000) thus end up recommending that the model yielding the smallest value of $S_{\xi n}$ be selected.

Recently, Genest et al. introduced two variants of the $S_{\xi n}$ statistic and of the bootstrap procedure of Wang and Wells (2000) that overcome these limitations. In addition to being much simpler to compute than $S_{\xi n}$ and independent of the choice of $\xi$, the statistics proposed by Genest et al. can be used to test the adequacy of any copula model, whether Archimedean or not. More importantly still, P-values associated with these statistics are relatively easy to obtain by bootstrapping.

Specifically, the statistics considered by Genest et al. are of the form

$$S_n = \int_0^1 |\mathbb{K}_n(w)|^2\,k_{\theta_n}(w)\,dw \quad \text{and} \quad T_n = \sup_{0 \le w \le 1} |\mathbb{K}_n(w)|$$

where $\mathbb{K}_n(w) = \sqrt{n}\,\{K_n(w) - K_{\theta_n}(w)\}$. Although prima facie these expressions seem just as complicated as $S_{\xi n}$, Genest et al. show that, in fact, they can be easily computed as follows:

$$S_n = \frac{n}{3} + n\sum_{j=1}^{n-1} K_n^2\left(\frac{j}{n}\right)\left\{K_{\theta_n}\left(\frac{j+1}{n}\right) - K_{\theta_n}\left(\frac{j}{n}\right)\right\} - n\sum_{j=1}^{n-1} K_n\left(\frac{j}{n}\right)\left\{K_{\theta_n}^2\left(\frac{j+1}{n}\right) - K_{\theta_n}^2\left(\frac{j}{n}\right)\right\}$$

and

$$T_n = \sqrt{n}\max_{i=0,1;\;0 \le j \le n-1}\left|K_n\left(\frac{j}{n}\right) - K_{\theta_n}\left(\frac{j+i}{n}\right)\right|$$

The bootstrap methodology required to compute associated P-values proceeds as follows, say in the case of $S_n$:

Step 1: Estimate $\theta$ by a consistent estimator $\theta_n$.

Step 2: Generate $N$ random samples of size $n$ from $C_{\theta_n}$ and, for each of these samples, estimate $\theta$ by the same method as before and determine the value of the test statistic.

Step 3: If $S^*_{1:N} \le \cdots \le S^*_{N:N}$ denote the ordered values of the test statistics calculated in Step 2, an estimate of the critical value of the test at level $\alpha$ based on $S_n$ is given by

$$S^*_{\lfloor(1-\alpha)N\rfloor:N}$$

and

$$\frac{1}{N}\,\#\{j : S^*_j \ge S_n\}$$

yields an estimate of the P-value associated with the observed value $S_n$ of the statistic. Here, $\lfloor x \rfloor$ simply refers to the integer part of $x \in \mathbb{R}$.

Obviously, the larger $N$, the better. In practice, $N = 10{,}000$ seems perfectly adequate, although one could certainly get by with less, if limited in time or computing power. An additional complication occurs when $K_\theta$ cannot be written in algebraic form. In that case, a double bootstrap procedure must be called upon, for which the reader is referred to Genest and Rémillard.

Table 9. P-Values Estimated by Parametric Bootstrap for Testing the Goodness-of-Fit of the Clayton Copula Model on the Learning Data Set Using the Cramér–von Mises and the Kolmogorov–Smirnov Statistics $S_n$ and $T_n$ (columns: Statistic; P-value based on a run of $N = 100{,}000$, $N = 10{,}000$, $N = 100$, $N = 100$; rows: $S_n$, $T_n$)

Example (Continued)

Suppose that Clayton's copula model has been fitted to the learning data set using some consistent estimator $\theta_n$. To test the adequacy of this dependence structure, one could then compute the distance between $K_n$ and

$$K_{\theta_n}(w) = w + \frac{w}{\theta_n}\left(1 - w^{\theta_n}\right)$$

using either $S_n$ or $T_n$. The corresponding P-values could then be found via the parametric bootstrap procedure described above. In order to get valid results, however, note that the same estimation method must be used at every iteration of this numerical algorithm.
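The three bootstrap steps above can be written as a small generic driver. This is a sketch only: the callables `simulate`, `estimate`, and `statistic` are hypothetical placeholders to be supplied by the user for the copula family at hand, and the P-value is computed as the proportion of replicates at least as large as the observed statistic, which is our reading of the formula in Step 3.

```python
import math
import random

def parametric_bootstrap(observed, theta_n, simulate, estimate, statistic,
                         n, N=1000, alpha=0.05, seed=0):
    """Return (critical value, P-value) estimated from N bootstrap samples.

    simulate(theta, n, rng) -> sample of size n from the fitted copula
    estimate(sample)        -> parameter estimate (same method as for the data)
    statistic(sample, th)   -> value of the test statistic (e.g., S_n or T_n)
    """
    rng = random.Random(seed)
    replicates = []
    for _ in range(N):
        sample = simulate(theta_n, n, rng)      # Step 2: simulate under the fit
        theta_star = estimate(sample)           # re-estimate on each sample
        replicates.append(statistic(sample, theta_star))
    replicates.sort()                           # Step 3: ordered statistics
    k = max(1, math.floor((1 - alpha) * N))     # integer part of (1 - alpha) N
    critical_value = replicates[k - 1]          # k-th smallest replicate
    p_value = sum(1 for s in replicates if s >= observed) / N
    return critical_value, p_value
```

Re-estimating the parameter inside the loop, rather than reusing `theta_n`, is what makes the procedure valid: as the example stresses, the same estimation method must be applied at every iteration.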
To reduce the intensity of the computing effort, the estimator obtained through the inversion of Kendall's tau is often the most convenient choice, particularly for Archimedean models. When the dependence parameter of Clayton's model is estimated in this fashion, one gets $\theta_n = 2\tau_n/(1-\tau_n) = 0.143$. The observed values of these statistics are then easily found to be

$$S_n = 0.272, \quad T_n = 1.053$$

Table 9 reports the simulated P-values obtained via parametric bootstrapping for one run of $N = 100{,}000$, one run of $N = 10{,}000$, and two runs of $N = 100$. The discrepancy observed between P-values derived from the two runs at $N = 100$ illustrates the importance of taking $N$ large enough to ensure reliable conclusions. As can be seen from Table 9, taking $N = 100{,}000$ instead of $N = 10{,}000$ did not change the estimated P-values much, which is reassuring. Notwithstanding these differences, neither of the two tests leads to the rejection of Clayton's model. Given the sample size, this is of course unsurprising.

One drawback of this general strategy to goodness-of-fit testing is that as the number of variables increases, the univariate summary represented by the probability integral transformation

$$W = C(U_1,\ldots,U_d) = H(X_1,\ldots,X_d)$$

and its distribution function $K(w)$ is less and less representative of the multivariate dependence structure embodied in $C$. For bivariate or trivariate applications such as are common in hydrology, there is, however, another more serious difficulty associated with a test based on $S_{\xi n}$, $S_n$, or $T_n$. This arises from the fact that a given theoretical distribution $K$ can sometimes correspond to two different copulas. In other words, it may happen that $K$ is not only the distribution function of $W = C(U,V)$ but also that of $W^* = C^*(U^*,V^*)$, where $(U^*,V^*)$ is distributed as $C^*$. In fact, Nelsen et al. show that unless $C$ belongs to the Bertino family of copulas (Bertino 1977; Fredricks and Nelsen 2002), there always exists $C^*$ in that class such that $K = K^*$ and $C \ne C^*$.
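The figures quoted in the example for Clayton's model can be checked with a short computation. The sketch below uses only quantities stated in the text: $\tau_n = 1/15$, the six values $W_{(i)}$ listed in Table 8, the Clayton form of $K_\theta$, and the closed-form expressions for $S_n$ and $T_n$. The relation $\tau = \theta/(\theta+2)$ for Clayton's copula is a standard result, consistent with $\tau = 5/6$ at $\theta = 10$. The function names are ours, and the computation assumes that $K_n$ only jumps at multiples of $1/n$, as is the case here.

```python
import math

def clayton_tau(theta):
    """Kendall's tau of Clayton's copula: theta / (theta + 2)."""
    return theta / (theta + 2.0)

def clayton_theta_from_tau(tau):
    """Inversion of Kendall's tau for Clayton's copula."""
    return 2.0 * tau / (1.0 - tau)

def clayton_K(w, theta):
    """K_theta(w) = w + w (1 - w**theta) / theta for theta > 0."""
    if w <= 0.0:
        return 0.0
    return w + w * (1.0 - w ** theta) / theta

def gof_statistics(W, K_theta):
    """Closed-form S_n and T_n from the pseudo-observations W_1, ..., W_n."""
    n = len(W)
    # Empirical distribution K_n evaluated on the grid j/n, j = 0, ..., n.
    Kn = [sum(1 for w in W if w <= j / n + 1e-12) / n for j in range(n + 1)]
    Kt = [K_theta(j / n) for j in range(n + 1)]
    S = n / 3.0
    for j in range(1, n):
        S += n * Kn[j] ** 2 * (Kt[j + 1] - Kt[j])
        S -= n * Kn[j] * (Kt[j + 1] ** 2 - Kt[j] ** 2)
    T = math.sqrt(n) * max(abs(Kn[j] - Kt[j + i])
                           for j in range(n) for i in (0, 1))
    return S, T

theta_n = clayton_theta_from_tau(1 / 15)        # about 0.143
W = [1 / 6, 1 / 6, 2 / 6, 2 / 6, 4 / 6, 4 / 6]  # values of W_(i) in Table 8
S_n, T_n = gof_statistics(W, lambda w: clayton_K(w, theta_n))
```

Under these assumptions the computed values agree, up to rounding in the third decimal, with the $S_n$ and $T_n$ reported in the example.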
To illustrate the difficulties associated with the lack of uniqueness of $K$, consider the class of bivariate extreme-value copulas, which are of the form

$$C(u,v) = \exp\left[\log(uv)\,A\left\{\frac{\log(u)}{\log(uv)}\right\}\right] \qquad (11)$$

where $A: [0,1] \to [1/2,1]$ is some convex mapping such that $A(t) \ge \max(t, 1-t)$ for all $t \in [0,1]$. See, e.g., Geoffroy (1958), Sibuya (1960), or Ghoudi et al. (1998). The population value of Spearman's rho for this class of copulas can be written as

$$\rho_A = 12\int_0^1 \frac{1}{\{A(w)+1\}^2}\,dw - 3$$

Also, as shown by Ghoudi et al. (1998), the distribution function of $W = C(U,V)$ for $C$ in this class is given by

$$K_A(w) = w - (1-\tau_A)\,w\log(w)$$

where

$$\tau_A = \int_0^1 \frac{w(1-w)}{A(w)}\,A''(w)\,dw$$

whenever the second derivative of $A$ is continuous. In particular, note that $K_A$ does not depend on the whole function $A$, but only on the population value of Kendall's tau induced by $A$. For this reason, formal and informal goodness-of-fit procedures based on a comparison of $K_n$ and $K_{\theta_n}$ could not possibly distinguish, e.g., between two extreme-value copulas whose real-valued parameters would be estimated through inversion of Kendall's tau. In statistical parlance, the above-mentioned tests are not consistent.

As already mentioned by Fermanian (2005) and by Genest et al. (2006), an obvious way to circumvent the consistency issue would be to base a goodness-of-fit test directly on the distance between $C_n$ and $C_{\theta_n}$. Since the limiting distribution of the process $\sqrt{n}\,(C_n - C_{\theta_n})$ is very complex, however, this strategy could only be implemented through an intensive use of the parametric bootstrap. For additional information in this regard, refer to the Application section and to Genest and Rémillard.

The only other general solution available to date involves kernel estimation of the copula density, as developed in Fermanian. An advantage of his statistic is that it has a standard chi-square distribution in the limit. The implementation of the procedure, however, involves arbitrary choices of a kernel, its window, and a weight function. As a result, some objectivity is lost.

Finally, since extreme-value copula models are likely to be useful in frequency analysis, diagnostic and selection tools specifically suited to that case will be discussed in the context of the hydrological application to be considered next.

Application

The Harricana watershed is located in the northwest region of the province of Québec. The Harricana River originates from several lakes near Val d'Or and empties into James Bay about 553 km north. The name of the river takes its origin from the Algonquin word "Nanikana," meaning "the main way."
The daily discharges of the Harricana River at Amos (measured at Environment Canada Station Number 04NA001) have been used several times in the hydrology literature since the data are available from 1914 to present; see, e.g., Bobée and Ashkar (1991) and Bâ et al. The main characteristics of the watershed are the following: drainage area of 3,680 km² at the gauging station, mean altitude 380 m, 23% of lakes and swamp, and 72% of forest. Spring represents the high flow season due to the contribution of seasonal snowmelt to river runoff. Generally, a combination of snowmelt and rainfall events generates the annual floods.

Fig. 10. QQ-plots showing the fit of marginal models for peak (a) and volume (b) for the Harricana River data

Data

The data considered for the application consist of the maximum annual flow $X$ (in m³/s) and the corresponding volume $Y$ (in hm³) for $n = 85$ consecutive years, starting in 1915 and ending in 1999. Using standard univariate modeling techniques, the present writers came to the conclusion that the annual flow $X$ could be appropriately modeled by a Gumbel extreme-value distribution $\hat F$ with mean 89 m³/s and standard error 5.5 m³/s. As for volume $Y$, it is faithfully described by a gamma distribution $\hat G$ with mean […] hm³ and standard error […] hm³. Fig. 10(a and b) shows QQ-plots attesting to the good fit of these marginal distributions to the observed values of $X$ and $Y$, respectively.

Since the focus of the present study is on modeling the dependence between the two variables, nothing further will be said about the fit of their marginal distributions. As previously emphasized, the choice of margins is immaterial anyway, at least insofar as inference on the dependence structure of the data is based on ranks.

Assessment of Dependence

Before a copula model for the pair $(X,Y)$ was sought, visual tools were used to check for the presence of dependence. The scatter plot of normalized ranks shown in Fig. 11 suggests the presence of positive association between peak flow and volume, as might be expected. This is confirmed by the chi-plot and the K-plot, reproduced in Fig. 12(a and b), respectively.
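The rank-based quantities involved in such an assessment of dependence can be sketched as follows. The normal approximations $\sqrt{n-1}\,\rho_n$ and $\sqrt{9n(n-1)/\{2(2n+5)\}}\,\tau_n$ used below are the standard large-sample test statistics for independence; it is assumed here, not confirmed by the surviving text, that they coincide with the ones used in the paper. The function names are ours, and ties are assumed absent.

```python
import math

def ranks(values):
    """Ranks 1..n of the observations (ties are assumed absent)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for position, index in enumerate(order, start=1):
        r[index] = position
    return r

def normalized_ranks(values):
    """Pseudo-observations R_i / (n + 1), as plotted in a rank scatter plot."""
    n = len(values)
    return [r / (n + 1) for r in ranks(values)]

def spearman_rho(x, y):
    """Spearman's rho via the no-ties formula 1 - 6 sum(d^2) / {n (n^2 - 1)}."""
    n = len(x)
    d2 = sum((ri - si) ** 2 for ri, si in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def kendall_tau(x, y):
    """(concordant - discordant pairs) / (number of pairs)."""
    n = len(x)
    balance = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                balance += 1
            elif product < 0:
                balance -= 1
    return balance / (n * (n - 1) / 2)

def two_sided_p_value(z):
    """2 Pr(Z > |z|) for a standard normal variate Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def independence_tests(x, y):
    """Approximate P-values of the tests based on rho_n and tau_n."""
    n = len(x)
    z_rho = math.sqrt(n - 1) * spearman_rho(x, y)
    z_tau = math.sqrt(9.0 * n * (n - 1) / (2.0 * (2 * n + 5))) * kendall_tau(x, y)
    return two_sided_p_value(z_rho), two_sided_p_value(z_tau)
```

For instance, `two_sided_p_value(6.38)` returns a value below $10^{-9}$, which is why a standardized statistic of that size is reported as a P-value of essentially 0%.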
As can be seen, most of the points fall outside the confidence band of the chi-plot. An obvious curvature is also apparent in the K-plot. Both graphs point to the existence of a positive relationship between the two variables. To quantify the degree of dependence in the pair $(X,Y)$, sample values of Spearman's rho and Kendall's tau were

computed, along with the P-values of the associated tests of independence. For $\rho_n$, one finds $\rho_n$ = […] and $\sqrt{n-1}\,|\rho_n| = 6.38$, so that the P-value of the test is $2\Pr(Z > \sqrt{n-1}\,|\rho_n|) = 2\Pr(Z > 6.38) \approx 0\%$, where $Z$ continues to denote a normal random variate with zero mean and unit variance. For $\tau_n$, one gets $\tau_n$ = […] and $\sqrt{9n(n-1)/\{2(2n+5)\}}\,|\tau_n|$ = […], so that the P-value of this test is even smaller.

Fig. 11. Scatter plot of the ranks for the Harricana River data

Fig. 12. Chi-plot (a) and K-plot (b) for the annual peaks and corresponding volumes of the Harricana River

Choice of Models

In order to model the dependence between the annual peak $X$ and the volume $Y$ of the Harricana River, some 20 families of copulas were considered, which could be classified into four broad categories:

1. One-, two-, and three-parameter Archimedean models, including the traditional Ali–Mikhail–Haq, Clayton, Frank (Nelsen 1986; Genest 1987), and Gumbel–Hougaard families of copulas and their extension described by Genest et al. (1998), but also the system of Kimeldorf and Sampson (1975), the class of Joe (1993), and the BB1–BB3 and BB6–BB7 classes described in the book of Joe (1997);
2. Extreme-value copulas, including (besides the Gumbel–Hougaard system mentioned just above) Joe's BB5 family and the classes of copulas introduced by Galambos (1975), Hüsler and Reiss (1989), and Tawn (1988);
3. Meta-elliptical copulas described, e.g., in Fang et al. or Abdous et al. (2005), most notably the normal, the Student, and the Cauchy copulas; and
4. Other miscellaneous families of copulas, such as those of Farlie–Gumbel–Morgenstern and Plackett (1965).

Some of these families of copulas could be eliminated offhand, given that the degrees of dependence they span were insufficient to account for the association observed in the data set. This was the case, e.g., for the Ali–Mikhail–Haq and Farlie–Gumbel–Morgenstern systems. To help sieve through the remaining models, use was made of the tools described in the Graphical Diagnostics subsection.
Given a family $(C_\theta)$ of copulas, an estimate $\hat\theta_n$ of its parameter was first obtained by the method of maximum pseudolikelihood, and then 10,000 pairs of points were generated from $C_{\hat\theta_n}$. Fig. 13 shows the five best contenders along with the traditional bivariate normal model.

As a further graphical check, the margins of the 10,000 random pairs $(U_i,V_i)$ from each of the six estimated copula models $C_{\hat\theta_n}$ were transformed back into the original units using the marginal distributions $\hat F$ and $\hat G$ identified in the Data subsection for peak and volume. The resulting scatter plots of pairs $(X_i,Y_i) = (\hat F^{-1}(U_i), \hat G^{-1}(V_i))$ are displayed in Fig. 14, along with the actual observations.

While Fig. 13 provides a graphical test of the goodness-of-fit of the dependence structure taken in isolation, Fig. 14 makes it possible to judge globally the viability of the complete model for frequency analysis. Keeping in mind the predictive ability that the final model must have, it was decided to discard the bivariate normal copula structure, due to the obvious lack of fit of the resulting model in the upper part of the distribution.

Hence, of the five dependence structures retained for the finer analysis, four were extreme-value copulas, i.e., the BB5, Galambos, Gumbel–Hougaard, and Hüsler–Reiss families. The fifth was the two-parameter BB1 Archimedean model. As can be seen from Table 10, the Gumbel–Hougaard family obtains as a

special case of the BB1 system when $\theta_1 \to 0$, while setting $\theta_2 = 1$ actually yields the Kimeldorf–Sampson family. Likewise, the Galambos and Gumbel–Hougaard distributions are special cases of Family BB5 corresponding, respectively, to $\theta_1 = 1$ and $\theta_2 \to 0$.

Fig. 13. Simulated random sample of size 10,000 from six chosen families $(C_\theta)$ of copulas with parameter estimated by the method of maximum pseudolikelihood using the peak–volume Harricana River data, whose pairs of ranks are indicated by an X

Fig. 14. Same data as in Fig. 13, upon transformation of the marginal distributions as per the selected models for the peak and the volume of the Harricana River data, whose pairs of observations are indicated by an X

Estimation

Table 11 gives parameter values for each of the five models, based on maximum pseudolikelihood. For one-parameter models, 95% confidence intervals were computed as explained above. For

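$q$-parameter models with $q \ge 2$, the determination of the confidence regions relies on an estimation of the limiting variance–covariance matrix $B^{-1}\Sigma B^{-1}$ of the estimator $\hat\theta_n = (\hat\theta_{1n},\ldots,\hat\theta_{qn})$ of $\theta = (\theta_1,\ldots,\theta_q)$. Following Genest et al. (1995), the estimate $B_n$ of $B$ is simply the empirical $q \times q$ variance–covariance matrix of the variables $N_1,\ldots,N_q$, for which a set of pseudo-observations is available, namely

$$N_{pi} = L_{\theta_p}\left(\hat\theta_n, \frac{i}{n+1}, \frac{S_i}{n+1}\right), \quad i \in \{1,\ldots,n\}$$

where $L_{\theta_p}$ denotes the derivative of $L(\theta,u,v) = \log c_\theta(u,v)$ with respect to $\theta_p$. Here, it is assumed that the original data have been relabeled so that $X_1 < \cdots < X_n$. Likewise, $\Sigma_n$ is the $q \times q$ variance–covariance matrix of the variables $M_1,\ldots,M_q$, for which the pseudo-observations are

$$M_{pi} = N_{pi} - \frac{1}{n}\sum_{j=i}^{n} L_{\theta_p}\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right) L_u\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right) - \frac{1}{n}\sum_{S_j \ge S_i} L_{\theta_p}\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right) L_v\left(\hat\theta_n, \frac{j}{n+1}, \frac{S_j}{n+1}\right)$$

for $i \in \{1,\ldots,n\}$.

An alternative, possibly more efficient way of estimating the information matrix $B$ is given by the Hessian matrix associated with $L(\theta,u,v)$ at $\hat\theta_n$, namely, the $q \times q$ matrix whose $(p,r)$ entry is given by

$$-\frac{1}{n}\sum_{i=1}^{n} L_{\theta_p\theta_r}\left(\hat\theta_n, \frac{i}{n+1}, \frac{S_i}{n+1}\right)$$

where $L_{\theta_p\theta_r}$ stands for the cross derivative of $L(\theta,u,v)$ with respect to both $\theta_p$ and $\theta_r$. In Table 11, the confidence intervals for Models BB1 and BB5 were derived using the latter approach, as it produced somewhat narrower intervals.

Table 10. Definition of the Five Chosen Families of Copulas with Their Parameter Space
Copula | $C_\theta(u,v)$ | Parameter(s)
Gumbel–Hougaard | $\exp\{-(\tilde u^{\theta} + \tilde v^{\theta})^{1/\theta}\}$ | $\theta \ge 1$
Galambos | $uv\,\exp\{(\tilde u^{-\theta} + \tilde v^{-\theta})^{-1/\theta}\}$ | $\theta \ge 0$
Hüsler–Reiss | $\exp\left[-\tilde u\,\Phi\left\{\frac{1}{\theta} + \frac{\theta}{2}\log\left(\frac{\tilde u}{\tilde v}\right)\right\} - \tilde v\,\Phi\left\{\frac{1}{\theta} + \frac{\theta}{2}\log\left(\frac{\tilde v}{\tilde u}\right)\right\}\right]$ | $\theta \ge 0$
BB1 | $\left[1 + \{(u^{-\theta_1} - 1)^{\theta_2} + (v^{-\theta_1} - 1)^{\theta_2}\}^{1/\theta_2}\right]^{-1/\theta_1}$ | $\theta_1 > 0$, $\theta_2 \ge 1$
BB5 | $\exp\left[-\{\tilde u^{\theta_1} + \tilde v^{\theta_1} - (\tilde u^{-\theta_1\theta_2} + \tilde v^{-\theta_1\theta_2})^{-1/\theta_2}\}^{1/\theta_1}\right]$ | $\theta_1 \ge 1$, $\theta_2 > 0$
Note: With $\tilde u = -\log(u)$, $\tilde v = -\log(v)$, and $\Phi$ standing for the cumulative distribution function of the standard normal.

Goodness-of-Fit Testing

As a second step towards model selection, one should look at the generalized K-plot corresponding to the five families under consideration. The graphs corresponding to the BB1 model are displayed in Fig. 15. For reasons given in the Graphical Diagnostics subsection, the graphs corresponding to the BB5, Gumbel–Hougaard, Galambos, and Hüsler–Reiss copulas are identical, since they are all extreme-value dependence structures. The graphs appear in Fig. 16. The plots displayed in Figs.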
15 and 16 suggest that both the BB1 and extreme-value copula structures are adequate for the data at hand. A similar conclusion is drawn from the formal goodness-of-fit tests based on $S_n$ and $T_n$, as indicated in Table 12. Again, the generalized K-plot and the formal goodness-of-fit tests corresponding to the Galambos, Hüsler–Reiss, and BB5 extreme-value copula models yield exactly the same results, as evidenced in Table 12.

Table 11. Maximum Pseudolikelihood Parameter Estimates and Corresponding 95% Confidence Interval for Five Families of Copulas, Based on the Harricana River Data
Copula | Estimates | 95% confidence interval (CI)
Gumbel–Hougaard | $\hat\theta_n = 2.161$ | CI = [1.867, 2.455]
Galambos | $\hat\theta_n = 1.464$ | CI = [1.162, 1.766]
Hüsler–Reiss | $\hat\theta_n = 2.027$ | CI = [1.778, 2.275]
BB1 | $\hat\theta_{1n} = 0.48$, $\hat\theta_{2n} = 1.835$ | CI = [0.022, […]] × [[…], 2.25]
BB5 | $\hat\theta_{1n} = 1.034$, $\hat\theta_{2n} = 1.244$ | CI = [1.000, […]] × [[…], 1.294]

In an attempt to distinguish between the extreme-value copula structures, a consistent goodness-of-fit test could be constructed from the process $\sqrt{n}\,(C_n - C_{\theta_n})$, as evoked but dismissed by Fermanian (2005), on account of the unwieldy nature of its limit. However, this difficulty can be overcome easily with the use of a parametric or double parametric bootstrap, whose validity in this context has recently been established by Genest and Rémillard. The bootstrap procedure is exactly the same as described in the Formal Tests of Goodness-of-Fit section, but with $S_n$ replaced by the Cramér–von Mises statistic

$$CM_n = n\sum_{i=1}^{n}\left\{C_n\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right) - C_{\theta_n}\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right)\right\}^2 = n\sum_{i=1}^{n}\left\{W_i - C_{\theta_n}\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right)\right\}^2$$

This bootstrap-based goodness-of-fit test was applied for each

of the five families of copulas still under consideration. The results are summarized in Table 13. As it turned out, none of the models could be rejected on this basis either.

Fig. 15. (a) Graphs of $K_n$ and $K_{\hat\theta_n}$ for the BB1 copula with $\hat\theta_{1n} = 0.48$, $\hat\theta_{2n} = 1.835$, based on the Harricana River data. (b) Generalized K-plot providing a visual check of the goodness-of-fit of the same model for these data.

Fig. 16. (a) Graphs of $K_n$ and $K_{\hat\theta_n}$ for the Gumbel–Hougaard copula with $\theta = \hat\theta_n = 2.161$, based on the Harricana River data. (b) Generalized K-plot providing a visual check of the goodness-of-fit of the same model for these data.

Since the Gumbel–Hougaard and Galambos copulas are embedded in the two-parameter models BB1 and BB5, yet another option for choosing between them would be to call on a pseudolikelihood ratio test procedure recently introduced by Chen and Fan (2005). Their approach, inspired by a semiparametric adaptation of the Akaike Information Criterion, makes it possible to measure the trade-off between goodness-of-fit and model parsimony.

Suppose it is desired to compare two nested copula models, say $C = (C_{\theta,\lambda})$ and $D = (C_{\theta,\lambda_0})$. Let $(\hat\theta_n, \hat\lambda_n)$ represent the maximum pseudolikelihood estimator of $(\theta,\lambda) \in \mathbb{R}^2$ under model $C$, and write $\bar\theta_n$ for the maximum pseudolikelihood estimator of $\theta \in \mathbb{R}$ under the submodel $D$. The test statistic proposed by Chen and Fan (2005) then rejects the null hypothesis $H_0: \lambda = \lambda_0$ that model $D$ is preferable to model $C$ whenever

$$CF_n = 2\sum_{i=1}^{n}\log\left\{\frac{c_{\bar\theta_n,\lambda_0}\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right)}{c_{\hat\theta_n,\hat\lambda_n}\left(\frac{R_i}{n+1}, \frac{S_i}{n+1}\right)}\right\}$$

is sufficiently small. To determine a P-value for this test, one must resort to a nonparametric bootstrap procedure, which proceeds as follows. For some large integer $N$ and each $k \in \{1,\ldots,N\}$, do the following:

Step 1: Draw a bootstrap random sample $(X_1^*,Y_1^*),\ldots,(X_n^*,Y_n^*)$ with replacement from $(X_1,Y_1),\ldots,(X_n,Y_n)$.

Step 2: Use the method of maximum pseudolikelihood to determine estimators $(\hat\theta_n^*, \hat\lambda_n^*)$ and $\bar\theta_n^*$ of $(\theta,\lambda)$ and $(\theta,\lambda_0)$ under models $C$ and $D$, respectively.

Step 3: Compute the Hessian matrices $B^*$ and $B_2^*$ associated with $\log c_{\theta,\lambda}(u,v)$ and $\log c_{\theta,\lambda_0}(u,v)$ at $(\hat\theta_n^*, \hat\lambda_n^*)$ and $\bar\theta_n^*$, respectively.

Step 4: Determine the value of

$$CF^*_{n,k} = B_2^*\,(\bar\theta_n^* - \bar\theta_n)^2 - (\hat\theta_n^* - \hat\theta_n, \hat\lambda_n^* - \hat\lambda_n)\,B^*\,(\hat\theta_n^* - \hat\theta_n, \hat\lambda_n^* - \hat\lambda_n)^{\top}$$

The P-value associated with the test of Chen and Fan (2005) is then given by

$$\frac{1}{N}\sum_{k=1}^{N}\mathbf{1}\left(CF^*_{n,k} \le CF_n\right)$$

The conclusions drawn from this analysis (not reported here) are consistent with Table 11, which indicates that the interval estimates for the BB5 family are compatible with the Galambos model (since $\theta_1 = 1$ is a possible value) but not with the Gumbel–Hougaard (because $\theta_2 = 0$ is excluded from its 95% confidence interval). Likewise, the parameter intervals for Family BB1 suggest that neither the Gumbel–Hougaard nor the Kimeldorf–Sampson families are adequate for the data at hand. Additional tools that may help to distinguish between bivariate extreme-value models will be presented in the next section.

Table 12. Results of the Bootstrap Based on the Cramér–von Mises Statistic $S_n$ and Kolmogorov–Smirnov Statistic $T_n$: Observed Statistic, Critical Value $q(95\%)$ Corresponding to $\alpha = 5\%$ and Approximate P-Value, Based on $N = 10{,}000$ Parametric Bootstrap Samples (columns: Copula, $S_n$, $q(95\%)$, P-value, $T_n$, $q(95\%)$, P-value; rows: Gumbel–Hougaard, Galambos, Hüsler–Reiss, BB1, BB5)

Graphical Diagnostics for Bivariate Extreme-Value Copulas

In the bivariate case, extreme-value copulas are characterized by the dependence function $A$, as in Eq. (11). When the marginal distributions $F$ and $G$ of $H$ are known, a consistent estimator $A_n$ of $A$ has been proposed by Capéraà et al. It is given by

$$A_n(t) = \exp\left\{\int_0^t \frac{H_n(z) - z}{z(1-z)}\,dz\right\}, \quad t \in [0,1]$$

where

$$H_n(t) = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}(Z_i \le t)$$

is the empirical distribution function of the random sample $Z_1,\ldots,Z_n$ with $Z_i = \log\{F(X_i)\}/\log\{F(X_i)G(Y_i)\}$ for $i \in \{1,\ldots,n\}$.

Table 13.
Results of the Bootstrap Based on the Cramér–von Mises Statistic $CM_n$: Observed Statistic, Critical Value $q(95\%)$ Corresponding to $\alpha = 5\%$ and Approximate P-Value, Based on $N = 10{,}000$ Parametric Bootstrap Samples (columns: Copula, $CM_n$, $q(95\%)$, P-value; rows: Gumbel–Hougaard, Galambos, Hüsler–Reiss, BB1, BB5)

These authors showed that if $Z_{(1)} \le \cdots \le Z_{(n)}$ are the associated order statistics, then $A_n$ can be written in closed form as

$$A_n(t) = \begin{cases} (1-t)\,Q_n^{1-p(t)} & \text{if } 0 \le t \le Z_{(1)}, \\ t^{i/n}(1-t)^{1-i/n}\,Q_n^{1-p(t)}\,Q_i^{-1} & \text{if } Z_{(i)} \le t \le Z_{(i+1)}, \\ t\,Q_n^{-p(t)} & \text{if } Z_{(n)} \le t \le 1, \end{cases}$$

in terms of a weight function $p$ such that $p(0) = 1 - p(1) = 1$ and quantities

$$Q_i = \left\{\prod_{k=1}^{i}\frac{Z_{(k)}}{1 - Z_{(k)}}\right\}^{1/n}, \quad i \in \{1,\ldots,n\}$$

The asymptotic behavior of the process $\sqrt{n}\,(\log A_n - \log A)$ is given by Capéraà et al. (1997) under mild regularity conditions, and could be used to perform a goodness-of-fit test, say, using the Cramér–von Mises statistic

$$n\int_0^1 \left[\log\{A_n(t)/A_{\theta_n}(t)\}\right]^2 dt$$

When the margins are unknown, however, as is most often the case in practice, it would seem reasonable to use a variant $\hat A_n$ of the same estimator, with $Z_i$ replaced by the pseudo-observation

$$\hat Z_i = \log\left(\frac{R_i}{n+1}\right)\Big/\log\left(\frac{R_i}{n+1}\times\frac{S_i}{n+1}\right), \quad i \in \{1,\ldots,n\}$$

Before a proper test can be developed, it will be necessary to examine the asymptotic behavior of the process

$$\sqrt{n}\,\{\log\hat A_n(t) - \log A(t)\}$$

This may be the object of future work. For additional discussion on this general theme, refer to Abdous and Ghoudi.

For the time being, a useful graphical diagnostic tool for extreme-value copulas may still consist of drawing $\hat A_n$ and $A_{\theta_n}$ on the same plot. Fig. 17 shows such a plot for the four families of extreme-value copulas retained for this study. Here, the weight function used was $p(t) = 1 - t$. The reason for which no goodness-of-fit test could distinguish between these models is obvious from the graph: the generators of the four families are not only fairly close to $\hat A_n$, they are practically identical.

Conclusion

Using both a learning data set and 85 annual records of volume and peak from the Harricana watershed, this paper has illustrated the various issues involved in characterizing, measuring, and modeling dependence through copulas. The main emphasis was

put on inference and testing procedures, many of which have just been developed. Although the presentation was limited to the case of two variables, most of the tools described here extend to the multidimensional case. As the number of variables increases, of course, the intricacies of modeling become more complex, and even the construction of appropriate copula models still poses serious difficulties.

Fig. 7. Plot of A (solid lines) and A (dashed line) for the following families of extreme-value copulas: Gumbel-Hougaard, Galambos, Hüsler-Reiss, and BB5

From the analysis presented here, it would appear that several copula families provide acceptable models of the dependence in the Harricana River data. Not surprisingly, several of them are of the extreme-value type, and it is unlikely that choosing between them would make any serious difference for prediction purposes. If forced to express a preference, an analyst would probably want to call on additional criteria, such as model parsimony, etc.

As evidenced by the material in the previous section, the theory surrounding goodness-of-fit testing for this particular class of copulas is still incomplete. For the data at hand, Fig. 7 suggests that an asymmetric extreme-value copula model might possibly provide a somewhat better fit. Examples of such models mentioned by Tawn (1988) are the asymmetric mixed and logistic models. In the former, $A(t) = \phi t^3 + \theta t^2 - (\theta + \phi) t + 1$ with $0 \le \min(\theta, \theta + 3\phi)$ and $\max(\theta + \phi, \theta + 2\phi) \le 1$; in the latter, $A(t) = \{\theta^r (1 - t)^r + \phi^r t^r\}^{1/r} + (\theta - \phi) t + 1 - \theta$, where $0 \le \theta, \phi \le 1$ and $r \ge 1$. Khoudraji's device, described among others in Genest et al. (1998), could be used to generate other asymmetric copula models, whether extreme-value or not.

The problem of fitting an asymmetric copula to the Harricana River data is left to the reader as a knowledge integration activity, and the data set is available from the writers for that purpose. Users should keep in mind, however, that in copula matters as in any other statistical modeling exercise, the pursuit of perfection is illusory and a balance should always be struck between fit and parsimony.

The statistical literature on copula modeling is still growing rapidly. In recent years, numerous successful applications of this evolving methodology have been made, most notably in survival analysis, actuarial science, and finance, but also quite recently in hydrology; see, e.g., De Michele and Salvadori (2002), Favre et al. (2004), Salvadori and De Michele (2004), De Michele et al. (2005), and some of the papers in the current issue of the Journal of Hydrologic Engineering. It is hoped that this special issue, and the present paper in particular, can help foster the use of copula methodology in this field of science.

Acknowledgments

Partial funding in support of this work was provided by the Natural Sciences and Engineering Research Council of Canada, the Fonds québécois de la recherche sur la nature et les technologies, the Institut de finance mathématique de Montréal, and Hydro-Québec.

References

Abdous, B., Genest, C., and Rémillard, B. Dependence properties of meta-elliptical distributions. Statistical modeling and analysis for complex data problems, Springer, New York, 5.
Abdous, B., and Ghoudi, K. Non-parametric estimators of multivariate extreme dependence functions. J. Nonparam. Stat., 78.
Ali, M. M., Mikhail, N. N., and Haq, M. S. A class of bivariate distributions including the bivariate logistic. J. Multivariate Anal., 83.
Bâ, K. M., Díaz-Delgado, C., and Cârsteanu, A. Confidence intervals of quantile in hydrology computed by an analytical method. Natural Hazards, 24, 2.
Barbe, P., Genest, C., Ghoudi, K., and Rémillard, B. On Kendall's process. J. Multivariate Anal., 582.
Bertino, S. Sulla dissomiglianza tra mutabili cicliche. Metron, 35 2.
Biau, G., and Wegkamp, M. H. A note on minimum distance estimation of copula densities. Stat. Probab. Lett., 732.
Bobée, B., and Ashkar, F. 1991. The gamma family and derived distributions applied in hydrology, Water Resources, Littleton, Colo.
Borkowf, C. B. Computing the nonnull asymptotic variance and the asymptotic relative efficiency of Spearman's rank correlation. Comput. Stat. Data Anal., 393.
Capéraà, P., Fougères, A.-L., and Genest, C. A nonparametric estimation procedure for bivariate extreme value copulas. Biometrika, 843.
Capéraà, P., Fougères, A.-L., and Genest, C. Bivariate distributions with given extreme value attractor. J. Multivariate Anal., 72.
Chen, X., and Fan, Y. Pseudo-likelihood ratio tests for semiparametric multivariate copula model selection. Can. J. Stat., 333.
Cherubini, U., Luciano, E., and Vecchiato, W. Copula methods in finance, Wiley, New York.
Clayton, D. G. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65, 4 5.
De Michele, C., and Salvadori, G. A generalized Pareto intensity duration model of storm rainfall exploiting 2-copulas. J. Geophys. Res., 08D2.
De Michele, C., Salvadori, G., Canossi, M., Petaccia, A., and Rosso, R. Bivariate statistical approach to check adequacy of dam spillway. J. Hydrol. Eng., 0.
Deheuvels, P. La fonction de dépendance empirique et ses propriétés: Un test non paramétrique d'indépendance. Bull. Cl. Sci., Acad. R. Belg., 656.

JOURNAL OF HYDROLOGIC ENGINEERING ASCE / JULY/AUGUST 2007
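As a possible starting point for the exercise left to the reader in the Conclusion, the two asymmetric dependence functions attributed there to Tawn can be evaluated numerically. The sketch below is only an illustration and not the writers' code: the parameter names `theta`, `phi`, and `r`, the helper names, and the grid-based check are all assumptions. The check uses the standard requirements for a Pickands dependence function, namely that A be convex on [0, 1] and satisfy max(t, 1 - t) <= A(t) <= 1.

```python
def asymmetric_mixed(t, theta, phi):
    """Asymmetric mixed model: A(t) = phi*t^3 + theta*t^2 - (theta + phi)*t + 1."""
    return phi * t**3 + theta * t**2 - (theta + phi) * t + 1.0


def asymmetric_logistic(t, theta, phi, r):
    """Asymmetric logistic model:
    A(t) = {theta^r (1-t)^r + phi^r t^r}^(1/r) + (theta - phi)*t + 1 - theta."""
    core = ((theta * (1.0 - t)) ** r + (phi * t) ** r) ** (1.0 / r)
    return core + (theta - phi) * t + 1.0 - theta


def is_valid_dependence_function(A, n_grid=200, tol=1e-9):
    """Grid check (hypothetical helper) that A is convex on [0, 1]
    and lies between max(t, 1 - t) and 1."""
    grid = [i / n_grid for i in range(n_grid + 1)]
    values = [A(t) for t in grid]
    in_bounds = all(
        max(t, 1.0 - t) - tol <= a <= 1.0 + tol for t, a in zip(grid, values)
    )
    # Convexity: all discrete second differences must be non-negative.
    convex = all(
        values[i - 1] - 2.0 * values[i] + values[i + 1] >= -tol
        for i in range(1, n_grid)
    )
    return in_bounds and convex


def asymmetry(A, n_grid=200):
    """Largest gap |A(t) - A(1 - t)| on the grid; zero for a symmetric model."""
    return max(abs(A(i / n_grid) - A(1.0 - i / n_grid)) for i in range(n_grid + 1))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not fitted to any data.
    mixed = lambda t: asymmetric_mixed(t, theta=0.5, phi=0.2)
    logistic = lambda t: asymmetric_logistic(t, theta=0.8, phi=0.4, r=2.5)
    for name, A in [("mixed", mixed), ("logistic", logistic)]:
        print(name, is_valid_dependence_function(A), asymmetry(A))
```

Setting `phi = 0` in the mixed model, or `theta = phi` in the logistic model, gives a symmetric A, so the `asymmetry` value offers a quick way to see how far a candidate departs from the symmetric families compared in Fig. 7. Fitting the parameters to data would require an estimation step that is not shown here.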


More information

Data Analysis and Statistical Behaviors of Stock Market Fluctuations

Data Analysis and Statistical Behaviors of Stock Market Fluctuations 44 JOURNAL OF COMPUTERS, VOL. 3, NO. 0, OCTOBER 2008 Data Aalysis ad Statistical Behaviors of Stock Market Fluctuatios Ju Wag Departmet of Mathematics, Beijig Jiaotog Uiversity, Beijig 00044, Chia Email:

More information

A Mathematical Perspective on Gambling

A Mathematical Perspective on Gambling A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

More information

Chapter 5: Inner Product Spaces

Chapter 5: Inner Product Spaces Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples

More information

Lesson 15 ANOVA (analysis of variance)

Lesson 15 ANOVA (analysis of variance) Outlie Variability -betwee group variability -withi group variability -total variability -F-ratio Computatio -sums of squares (betwee/withi/total -degrees of freedom (betwee/withi/total -mea square (betwee/withi

More information

Research Article Sign Data Derivative Recovery

Research Article Sign Data Derivative Recovery Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 63070, 7 pages doi:0.540/0/63070 Research Article Sig Data Derivative Recovery L. M. Housto, G. A. Glass, ad A. D. Dymikov

More information

AP Calculus BC 2003 Scoring Guidelines Form B

AP Calculus BC 2003 Scoring Guidelines Form B AP Calculus BC Scorig Guidelies Form B The materials icluded i these files are iteded for use by AP teachers for course ad exam preparatio; permissio for ay other use must be sought from the Advaced Placemet

More information

One-sample test of proportions

One-sample test of proportions Oe-sample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:

More information

7. Concepts in Probability, Statistics and Stochastic Modelling

7. Concepts in Probability, Statistics and Stochastic Modelling 7. Cocepts i Probability, Statistics ad Stochastic Modellig 1. Itroductio 169. Probability Cocepts ad Methods 170.1. Radom Variables ad Distributios 170.. Expectatio 173.3. Quatiles, Momets ad Their Estimators

More information