SOME HYPOTHESIS TESTS FOR THE COVARIANCE MATRIX WHEN THE DIMENSION IS LARGE COMPARED TO THE SAMPLE SIZE


The Annals of Statistics, 2002, Vol. 30, No. 4

BY OLIVIER LEDOIT AND MICHAEL WOLF$^1$

UCLA and Credit Suisse First Boston, and Universitat Pompeu Fabra

This paper analyzes whether standard covariance matrix tests work when dimensionality is large, and in particular larger than sample size. In the latter case, the singularity of the sample covariance matrix makes likelihood ratio tests degenerate, but other tests based on quadratic forms of sample covariance matrix eigenvalues remain well-defined. We study the consistency property and limiting distribution of these tests as dimensionality and sample size go to infinity together, with their ratio converging to a finite nonzero limit. We find that the existing test for sphericity is robust against high dimensionality, but not the test for equality of the covariance matrix to a given matrix. For the latter test, we develop a new correction to the existing test statistic that makes it robust against high dimensionality.

1. Introduction. Many empirical problems involve large-dimensional covariance matrices. Sometimes the dimensionality $p$ is even larger than the sample size $n$, which makes the sample covariance matrix $S$ singular. How to conduct statistical inference in this case? For concreteness, we focus on two common testing problems in this paper: (1) the covariance matrix $\Sigma$ is proportional to the identity $I$ (sphericity); (2) the covariance matrix $\Sigma$ is equal to the identity $I$. The identity can be replaced with any other matrix $\Sigma_0$ by multiplying the data by $\Sigma_0^{-1/2}$. Following much of the literature, we assume normality. For both hypotheses the likelihood ratio test statistic is degenerate when $p$ exceeds $n$; see, for example, Muirhead (1982), Sections 8.3 and 8.4, or Anderson (1984), Section 10.7. This steers us toward other test statistics that do not degenerate, such as

$$U = \frac{1}{p}\operatorname{tr}\!\left[\left(\frac{S}{(1/p)\operatorname{tr}(S)} - I\right)^{2}\right] \quad\text{and}\quad V = \frac{1}{p}\operatorname{tr}\!\left[(S - I)^{2}\right] \tag{1}$$

where $\operatorname{tr}$ denotes the trace.
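As an illustration of the two statistics defined in the introduction, here is a minimal NumPy sketch. The function names (`sample_covariance`, `john_U`, `nagao_V`) and the convention that rows of the data matrix are observations are assumptions made for this example, not notation from the paper.

```python
import numpy as np

def sample_covariance(X):
    """Unbiased sample covariance of an (n+1) x p data matrix (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0] - 1                    # degrees of freedom
    Xc = X - X.mean(axis=0)               # center each variable
    return Xc.T @ Xc / n

def john_U(S):
    """U = (1/p) tr[(S / ((1/p) tr S) - I)^2]."""
    p = S.shape[0]
    D = S / (np.trace(S) / p) - np.eye(p)
    return np.trace(D @ D) / p

def nagao_V(S):
    """V = (1/p) tr[(S - I)^2]."""
    p = S.shape[0]
    D = S - np.eye(p)
    return np.trace(D @ D) / p

# Both statistics stay finite when p exceeds n and S is singular.
rng = np.random.default_rng(0)
X = rng.standard_normal((11, 30))         # n = 10 observations' worth of d.o.f., p = 30
S = sample_covariance(X)
print(john_U(S), nagao_V(S))
```

Neither function inverts $S$ or takes a determinant, which is why they remain usable in the singular case, unlike the likelihood ratio statistic.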
John (1971) proves that the test based on $U$ is the locally most powerful invariant test for sphericity, and Nagao (1973) derives $V$ as the equivalent of $U$ for the test of $\Sigma = I$. The asymptotic framework where $U$ and $V$ have been studied assumes that $n$ goes to infinity while $p$ remains fixed. It treats terms of order $p/n$ like terms of order $1/n$, which is inappropriate if $p$ is of the same order of magnitude as $n$. The robustness of tests based on $U$ and $V$ against high dimensionality is heretofore unknown.

Received May 1998; revised November.
$^1$ Supported by DGES Grant BEC.
AMS 2000 subject classifications. Primary 62H15; secondary 62E20.
Key words and phrases. Concentration asymptotics, equality test, sphericity test.

We study the asymptotic behavior of $U$ and $V$ as $p$ and $n$ go to infinity together with the ratio $p/n$ converging to a limit $c \in (0, +\infty)$ called the concentration. The singular case corresponds to a concentration above one. The robustness issue boils down to power and size: is the test still consistent? Is the $n$-limiting distribution under the null still a good approximation? Surprisingly, we find opposite answers for $U$ and $V$. The power and the size of the sphericity test based on $U$ turn out to be robust against $p$ large, and even larger than $n$. But the test of $\Sigma = I$ based on $V$ is not consistent against every alternative when $p$ goes to infinity with $n$, and its $n$-limiting distribution differs from its $(n, p)$-limiting distribution under the null. This prompts us to introduce the modified statistic

$$W = \frac{1}{p}\operatorname{tr}\!\left[(S - I)^{2}\right] - \frac{p}{n}\left[\frac{1}{p}\operatorname{tr}(S)\right]^{2} + \frac{p}{n}. \tag{2}$$

$W$ has the same $n$-asymptotic properties as $V$: it is $n$-consistent and has the same $n$-limiting distribution as $V$ under the null. We show that, contrary to $V$, the power and the size of the test based on $W$ are robust against $p$ large, and even larger than $n$.

The contributions of this paper are: (i) developing a method to check the robustness of covariance matrix tests against high dimensionality; and (ii) finding two statistics (one old and one new) for commonly used covariance matrix tests that can be used when the sample covariance matrix is singular. Our results rest on a large and important body of literature on the asymptotics for eigenvalues of random matrices, such as Arharov (1971), Bai (1993), Girko (1979, 1988), Jonsson (1982), Narayanaswamy and Raghavarao (1991), Serdobol'skii (1985, 1995, 1999), Silverstein (1986), Silverstein and Combettes (1992), Wachter (1976, 1978) and Yin and Krishnaiah (1983), among others.
Also, we are adding to a substantial list of papers dealing with statistical tests using results on large random matrices, such as Alalouf (1978), Bai, Krishnaiah and Zhao (1989), Bai and Saranadasa (1996), Dempster (1958, 1960), Läuter (1996), Saranadasa (1993), Wilson and Kshirsagar (1980) and Zhao, Krishnaiah and Bai (1986a, 1986b).

The remainder of the paper is organized as follows. Section 2 compiles preliminary results. Section 3 shows that the test statistic $U$ for sphericity is robust against large dimensionality. Section 4 shows that the test of $\Sigma = I$ based on $V$ is not. Section 5 introduces a new statistic $W$ that can be used when $p$ is large. Section 6 reports evidence from Monte Carlo simulations. Section 7 addresses some possible concerns. Section 8 contains the conclusions. Proofs are deferred to the Appendix.

2. Preliminaries. The exact sense in which sample size and dimensionality go to infinity together is defined by the following assumptions.
ASSUMPTION 1 (Asymptotics). Dimensionality and sample size are two increasing integer functions $p = p_k$ and $n = n_k$ of an index $k = 1, 2, \ldots$ such that $\lim_{k\to\infty} p_k = +\infty$, $\lim_{k\to\infty} n_k = +\infty$ and there exists $c \in (0, +\infty)$ such that $\lim_{k\to\infty} p_k / n_k = c$.

The case where the sample covariance matrix is singular corresponds to a concentration $c$ higher than one. In this paper, we refer to concentration asymptotics or $(n, p)$-asymptotics. Another term sometimes used for the same concept is "increasing dimension asymptotics (i.d.a.)"; for example, see Serdobol'skii (1999).

ASSUMPTION 2 (Data-generating process). For each positive integer $k$, $X_k$ is an $(n_k + 1) \times p_k$ matrix of $n_k + 1$ i.i.d. observations on a system of $p_k$ random variables that are jointly normally distributed with mean vector $\mu_k$ and covariance matrix $\Sigma_k$. Let $\lambda_{1,k}, \ldots, \lambda_{p_k,k}$ denote the eigenvalues of the covariance matrix $\Sigma_k$. We suppose that their average $\alpha = \sum_{i=1}^{p_k} \lambda_{i,k} / p_k$ and their dispersion $\delta^2 = \sum_{i=1}^{p_k} (\lambda_{i,k} - \alpha)^2 / p_k$ are independent of the index $k$. Furthermore, we require $\alpha > 0$.

$S_k$ is the sample covariance matrix with entries

$$s_{ij,k} = \frac{1}{n_k} \sum_{l=1}^{n_k+1} (x_{il,k} - m_{i,k})(x_{jl,k} - m_{j,k}) \quad\text{where}\quad m_{i,k} = \frac{1}{n_k + 1} \sum_{l=1}^{n_k+1} x_{il,k}.$$

The null hypothesis of sphericity can be stated as $\delta^2 = 0$, and the null $\Sigma = I$ can be stated as $\delta^2 = 0$ and $\alpha = 1$. We need one more assumption to obtain convergence results under the alternative.

ASSUMPTION 3 (Higher moments). The averages of the third and fourth moments of the eigenvalues of the population covariance matrix

$$\sum_{i=1}^{p_k} \frac{(\lambda_{i,k})^{j}}{p_k} \qquad (j = 3, 4)$$

converge to finite limits, respectively.

Dependence on $k$ will be omitted when no ambiguity is possible. Much of the mathematical groundwork has already been laid out by research in the spectral theory of large-dimensional random matrices. The fundamental results of interest to us are as follows.

PROPOSITION 1 (Law of large numbers). Under Assumptions 1–3,

$$\frac{1}{p}\operatorname{tr}(S) \xrightarrow{P} \alpha, \tag{3}$$

$$\frac{1}{p}\operatorname{tr}(S^2) \xrightarrow{P} (1 + c)\alpha^2 + \delta^2 \tag{4}$$

where $\xrightarrow{P}$ denotes convergence in probability.
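The law of large numbers in Proposition 1 can be checked informally by simulation. The sketch below is only an illustration under stated assumptions: it uses a diagonal population covariance matrix (harmless here, since the traces depend only on the eigenvalues under normality), a single draw with $n = 400$ and $p = 200$, and a helper name `trace_moments` invented for this example.

```python
import numpy as np

def trace_moments(n, p, eigenvalues, seed=0):
    """Return ((1/p) tr S, (1/p) tr S^2) for one simulated normal sample."""
    rng = np.random.default_rng(seed)
    lam = np.asarray(eigenvalues, dtype=float)
    # n + 1 observations with population covariance diag(lam) and mean zero
    X = rng.standard_normal((n + 1, p)) * np.sqrt(lam)
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    # S is symmetric, so tr(S^2) equals the sum of its squared entries
    return np.trace(S) / p, np.sum(S * S) / p

n, p = 400, 200
lam = np.r_[np.ones(p // 2), 0.5 * np.ones(p // 2)]
alpha = lam.mean()                 # average eigenvalue
delta2 = lam.var()                 # dispersion of the eigenvalues (divides by p)
c = p / n                          # concentration, approximated by p/n

m1, m2 = trace_moments(n, p, lam)
limit1 = alpha
limit2 = (1 + c) * alpha**2 + delta2
print(m1, limit1)
print(m2, limit2)
```

The second comparison is the informative one: the probability limit of $(1/p)\operatorname{tr}(S^2)$ carries the extra term $c\alpha^2$, which would be absent if $p$ were held fixed.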
All proofs are in the Appendix. This law of large numbers will help us establish whether or not a given test is consistent against every alternative as $n$ and $p$ go to infinity together. The distribution of the test statistic under the null will be found by using the following central limit theorem.

PROPOSITION 2 (Central limit theorem). Under Assumptions 1–2, if $\delta^2 = 0$, then

$$n \begin{bmatrix} \dfrac{1}{p}\operatorname{tr}(S) - \alpha \\[2ex] \dfrac{1}{p}\operatorname{tr}(S^2) - \dfrac{n + p + 1}{n}\,\alpha^2 \end{bmatrix} \xrightarrow{D} N\!\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \dfrac{2\alpha^2}{c} & 4\left(1 + \dfrac{1}{c}\right)\alpha^3 \\[2ex] 4\left(1 + \dfrac{1}{c}\right)\alpha^3 & 4\left(\dfrac{2}{c} + 5 + 2c\right)\alpha^4 \end{bmatrix} \right) \tag{5}$$

where $\xrightarrow{D}$ denotes convergence in distribution and $N$ the normal distribution.

3. Sphericity test. It is well known that the sphericity test based on $U$ is $n$-consistent. As for $(n, p)$-consistency, Proposition 1 implies that, under Assumptions 1–3,

$$U = \frac{(1/p)\operatorname{tr}(S^2)}{[(1/p)\operatorname{tr}(S)]^2} - 1 \xrightarrow{P} \frac{(1 + c)\alpha^2 + \delta^2}{\alpha^2} - 1 = c + \frac{\delta^2}{\alpha^2}. \tag{6}$$

Since $c$ can be approximated by the known quantity $p/n$, the power of this test to separate the null hypothesis of sphericity $\delta^2/\alpha^2 = 0$ from the alternative $\delta^2/\alpha^2 > 0$ converges to one as $n$ and $p$ go to infinity together: this constitutes an $(n, p)$-consistent test.

John (1972) shows that, as $n$ goes to infinity while $p$ remains fixed, the limiting distribution of $U$ under the null is given by

$$\frac{np}{2}\,U \xrightarrow{D} Y_{p(p+1)/2 - 1} \tag{7}$$

or, equivalently,

$$nU - p \xrightarrow{D} \frac{2}{p}\,Y_{p(p+1)/2 - 1} - p \tag{8}$$

where $Y_d$ denotes a random variable distributed as a $\chi^2$ with $d$ degrees of freedom. It will become apparent after Proposition 4 why we choose to rewrite equation (7) as (8). This approximation may or may not remain accurate under $(n, p)$-asymptotics, depending on whether it omits terms of order $p/n$. To find out, let us start by deriving the $(n, p)$-limiting distribution of $U$ under the null hypothesis $\delta^2/\alpha^2 = 0$.
PROPOSITION 3. Under the assumptions of Proposition 2,

$$nU - p \xrightarrow{D} N(1, 4). \tag{9}$$

Now we can compare equations (8) and (9).

PROPOSITION 4. Suppose that, for every $k$, the random variable $Y_{p_k(p_k+1)/2 + a}$ is distributed as a $\chi^2$ with $p_k(p_k + 1)/2 + a$ degrees of freedom, where $a$ is a constant integer. Then its limiting distribution under Assumption 1 satisfies

$$\frac{2}{p_k}\,Y_{p_k(p_k+1)/2 + a} - p_k \xrightarrow{D} N(1, 4). \tag{10}$$

Using Proposition 4 with $a = -1$ shows that the $n$-limiting distribution given by equation (8) is still correct under $(n, p)$-asymptotics.

The conclusion of our analysis of the sphericity test based on $U$ is the following: the existing $n$-asymptotic theory (where $p$ is fixed) remains valid if $p$ goes to infinity with $n$, even for the case $p > n$.

4. Test that a covariance matrix is the identity. As $n$ goes to infinity with $p$ fixed, $S \xrightarrow{P} \Sigma$, therefore $V \xrightarrow{P} \frac{1}{p}\operatorname{tr}[(\Sigma - I)^2]$. This shows that the test of $\Sigma = I$ based on $V$ is $n$-consistent. As for $(n, p)$-consistency, Proposition 1 implies that, under Assumptions 1–3,

$$V = \frac{1}{p}\operatorname{tr}(S^2) - \frac{2}{p}\operatorname{tr}(S) + 1 \xrightarrow{P} (1 + c)\alpha^2 + \delta^2 - 2\alpha + 1 = c\alpha^2 + (\alpha - 1)^2 + \delta^2. \tag{11}$$

Since $\frac{1}{p}\operatorname{tr}[(\Sigma - I)^2] = (\alpha - 1)^2 + \delta^2$ is a squared measure of distance between the population covariance matrix and the identity, the null hypothesis can be rewritten as $(\alpha - 1)^2 + \delta^2 = 0$, and the alternative as $(\alpha - 1)^2 + \delta^2 > 0$. The problem is that the probability limit of the test statistic $V$ is not directly a function of $(\alpha - 1)^2 + \delta^2$: it involves another term, $c\alpha^2$, which contains the nuisance parameter $\alpha^2$. Therefore the test based on $V$ may sometimes be powerless to separate the null from the alternative. More specifically, when the triplet $(c, \alpha, \delta)$ satisfies

$$c\alpha^2 + (\alpha - 1)^2 + \delta^2 = c, \tag{12}$$

the test statistic $V$ has the same probability limit under the null as under the alternative. The clearest counterexamples are those where $\delta^2 = 0$, because Proposition 2 allows us to compute the limit of the power of the test against such alternatives. When $\delta^2 = 0$ the solution to equation (12) is $\alpha = \frac{1 - c}{1 + c}$.
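For readers who want the intermediate step, the solution of equation (12) in the case $\delta^2 = 0$ can be worked out as follows; this is a routine expansion added for convenience rather than a passage from the paper. Setting $\delta^2 = 0$ in (12) and collecting powers of $\alpha$ gives

$$c\alpha^2 + \alpha^2 - 2\alpha + 1 - c = 0 \iff (1 + c)\alpha^2 - 2\alpha + (1 - c) = 0,$$

whose roots are

$$\alpha = \frac{2 \pm \sqrt{4 - 4(1 + c)(1 - c)}}{2(1 + c)} = \frac{1 \pm c}{1 + c}.$$

The root $\alpha = 1$ is the null hypothesis itself. The other root, $\alpha = \frac{1 - c}{1 + c}$, is an alternative only when it is positive, that is, when $c < 1$, which matches the requirement $\alpha > 0$ in Assumption 2.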
PROPOSITION 5. Under Assumptions 1–2, if $c \in (0, 1)$ and there exists a finite $d$ such that $\frac{p}{n} = c + \frac{d}{n} + o\!\left(\frac{1}{n}\right)$, then the power of the test of any positive significance level based on $V$ to reject the null $\Sigma = I$ when the alternative $\Sigma = \frac{1 - c}{1 + c} I$ is true converges to a limit strictly below one.

We see that the $n$-consistency of the test based on $V$ does not extend to $(n, p)$-asymptotics.

Nagao (1973) shows that, as $n$ goes to infinity while $p$ remains fixed, the limiting distribution of $V$ under the null is given by

$$\frac{np}{2}\,V \xrightarrow{D} Y_{p(p+1)/2} \tag{13}$$

or, equivalently,

$$nV - p \xrightarrow{D} \frac{2}{p}\,Y_{p(p+1)/2} - p \tag{14}$$

where, as before, $Y_d$ denotes a random variable distributed as a $\chi^2$ with $d$ degrees of freedom. It is not immediately apparent whether this approximation remains accurate under $(n, p)$-asymptotics. The $(n, p)$-limiting distribution of $V$ under the null hypothesis $(\alpha - 1)^2 + \delta^2 = 0$ is derived in equation (38) in the Appendix as part of the proof of Proposition 5:

$$nV - p \xrightarrow{D} N(1, 4 + 8c). \tag{15}$$

Using Proposition 4 with $a = 0$ shows that the $n$-limiting distribution given by equation (14) is incorrect under $(n, p)$-asymptotics: its variance converges to 4, not to $4 + 8c$.

The conclusion of our analysis of the test of $\Sigma = I$ based on $V$ is the following: the existing $n$-asymptotic theory (where $p$ is fixed) breaks down when $p$ goes to infinity with $n$, including the case $p > n$.

5. Test that a covariance matrix is the identity: new statistic. The ideal would be to find a simple modification of $V$ that has the same $n$-asymptotic properties and better $(n, p)$-asymptotic properties (in the spirit of $U$). This is why we introduce the new statistic

$$W = \frac{1}{p}\operatorname{tr}\!\left[(S - I)^{2}\right] - \frac{p}{n}\left[\frac{1}{p}\operatorname{tr}(S)\right]^{2} + \frac{p}{n}. \tag{16}$$

As $n$ goes to infinity with $p$ fixed, $W \xrightarrow{P} \frac{1}{p}\operatorname{tr}[(\Sigma - I)^2]$, therefore the test of $\Sigma = I$ based on $W$ is $n$-consistent. As for $(n, p)$-consistency, Proposition 1 implies that, under Assumptions 1–3,

$$W \xrightarrow{P} c\alpha^2 + (\alpha - 1)^2 + \delta^2 - c\alpha^2 + c = c + (\alpha - 1)^2 + \delta^2. \tag{17}$$
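The modified statistic in equation (16) is a two-term correction of $V$, which the following minimal sketch makes explicit. The function name `modified_W` and its argument order are assumptions of this example; `n` is the number of degrees of freedom used to form the unbiased $S$.

```python
import numpy as np

def modified_W(S, n):
    """W = (1/p) tr[(S - I)^2] - (p/n) [(1/p) tr S]^2 + p/n."""
    S = np.asarray(S, dtype=float)
    p = S.shape[0]
    D = S - np.eye(p)
    V = np.trace(D @ D) / p                       # Nagao's statistic
    return V - (p / n) * (np.trace(S) / p) ** 2 + p / n

# When S equals the identity exactly, the two correction terms cancel and W = 0.
print(modified_W(np.eye(3), n=5))
```

The correction subtracts an estimate of the nuisance term $c\alpha^2$ and adds back $p/n$, so the centering of $W$ under the null matches that of $V$.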
Since $c$ can be approximated by the known quantity $p/n$, the power of the test based on $W$ to separate the null hypothesis $(\alpha - 1)^2 + \delta^2 = 0$ from the alternative $(\alpha - 1)^2 + \delta^2 > 0$ converges to one as $n$ and $p$ go to infinity together: the test based on $W$ is $(n, p)$-consistent.

The following proposition shows that $W$ has the same $n$-limiting distribution as $V$ under the null.

PROPOSITION 6. As $n$ goes to infinity with $p$ fixed, the limiting distribution of $W$ under the null hypothesis $(\alpha - 1)^2 + \delta^2 = 0$ is the same as for $V$:

$$\frac{np}{2}\,W \xrightarrow{D} Y_{p(p+1)/2} \tag{18}$$

or, equivalently,

$$nW - p \xrightarrow{D} \frac{2}{p}\,Y_{p(p+1)/2} - p \tag{19}$$

where $Y_d$ denotes a random variable distributed as a $\chi^2$ with $d$ degrees of freedom.

To find out whether this approximation remains accurate under $(n, p)$-asymptotics, we derive the $(n, p)$-limiting distribution of $W$ under the null.

PROPOSITION 7. Under Assumptions 1–2, if $(\alpha - 1)^2 + \delta^2 = 0$ then

$$nW - p \xrightarrow{D} N(1, 4). \tag{20}$$

Using Proposition 4 with $a = 0$ shows that the $n$-limiting distribution given by equation (19) is still correct under $(n, p)$-asymptotics.

The conclusion of our analysis of the test of $\Sigma = I$ based on $W$ is the following: the $n$-asymptotic theory developed for $V$ is directly applicable to $W$, and it remains valid (for $W$ but not $V$) if $p$ goes to infinity with $n$, even in the case $p > n$.

6. Monte Carlo simulations. So far, little is known about the finite-sample behavior of these tests. In particular the question of whether they are unbiased in finite sample is not readily tractable. Yet some light can be shed on finite-sample behavior through Monte Carlo simulations.

Monte Carlo simulations are used to find the size and power of the test statistics $U$, $V$ and $W$ for $p, n = 4, 8, \ldots, 256$. In each case we run 10,000 simulations. The alternative against which power is computed has to be "scalable" in the sense that it can be represented by population covariance matrices of any dimension $p = 4, 8, \ldots, 256$. The simplest alternative we can think of is to set half of the population eigenvalues equal to 1, and the other ones equal to 0.5.

Table 1 reports the size of the sphericity test based on $U$.
The test is carried out by computing the 95% cutoff point from the $\chi^2$ limiting distribution in equation (8). We see that the quality of this approximation does not get worse when $p$ gets large: it can be relied upon even when $p > n$. This is what we expected given Proposition 4.

TABLE 1. Size of sphericity test based on $U$. The null hypothesis is rejected when the test statistic exceeds the 95% cutoff point obtained from the $\chi^2$ approximation. Actual size converges to nominal size as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.

Table 2 shows the power of the sphericity test based on $U$ against the alternative described above. We see that the power does not become lower when $p$ gets large: power stays high even when $p > n$. This confirms the $(n, p)$-consistency result derived from equation (6). The table indicates that the power seems to depend predominantly on $n$. For fixed sample size, the power of the test is often increasing in $p$, which is somewhat surprising. We do not have any simple explanation of this phenomenon but will address it in future research focusing on the analysis of power.

TABLE 2. Power of sphericity test based on $U$. The null hypothesis is rejected when the test statistic exceeds the 95% cutoff point obtained from the $\chi^2$ approximation. Data are generated under the alternative where half of the population eigenvalues are equal to 1, and the other ones are equal to 0.5. Power converges to one as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.
Using the same methodology as in Table 1, we report in Table 3 the size of the test for $\Sigma = I$ based on $V$. We see that the $\chi^2$ limiting distribution under the null in equation (14) is a poor approximation for large $p$. This is what we expected given the discussion surrounding equation (15).

TABLE 3. Size of equality test based on $V$. The null hypothesis is rejected when the test statistic exceeds the 95% cutoff point obtained from the $\chi^2$ approximation. Actual size does not converge to nominal size as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.

Using the same methodology as in Table 2, we report in Table 4 the power of the test based on $V$ against the alternative described above. Given the discussion surrounding equation (12), we anticipate that this test will not be powerful when $c = [(\alpha - 1)^2 + \delta^2]/(1 - \alpha^2) = 2/7$. Indeed we observe that, in the cells where $p/n$ exceeds the critical value $2/7$, this test does not have much power to reject the null.

TABLE 4. Power of equality test based on $V$. The null hypothesis is rejected when the test statistic exceeds the 95% cutoff point obtained from the $\chi^2$ approximation. Data are generated under the alternative where half of the population eigenvalues are equal to 1, and the other ones are equal to 0.5. Power does not converge to one as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.
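The critical concentration $2/7$ quoted for the simulated alternative follows from the eigenvalue average and dispersion of that alternative. A small exact-arithmetic check, using Python's standard `fractions` module (the variable names are chosen for this example only):

```python
from fractions import Fraction

# Alternative used in the simulations: half the eigenvalues are 1, half are 1/2.
eigs = [Fraction(1), Fraction(1, 2)]

alpha = sum(eigs) / len(eigs)                                # average eigenvalue
delta2 = sum((lam - alpha) ** 2 for lam in eigs) / len(eigs) # dispersion
dist2 = (alpha - 1) ** 2 + delta2                            # squared distance from identity

# Concentration at which V has the same probability limit under null and alternative
c_crit = dist2 / (1 - alpha ** 2)

# Probability limit of V at that concentration; it coincides with c_crit itself,
# which is the null limit of V.
limit_V = c_crit * alpha ** 2 + dist2

print(alpha, delta2, c_crit, limit_V)
```

At $c = 2/7$ the limit of $V$ under this alternative equals its limit under the null, so the two hypotheses cannot be told apart from the probability limit alone.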
Using the same methodology as in Table 1, we report in Table 5 the size of the test for $\Sigma = I$ based on $W$. We see that the $\chi^2$ approximation in equation (19) for the null distribution does not get worse when $p$ gets large: it can be relied upon even when $p > n$. This is what we expected given the discussion surrounding equation (15).

TABLE 5. Size of equality test based on $W$. The null hypothesis is rejected when the test statistic exceeds the 95% cutoff point obtained from the $\chi^2$ approximation. Actual size converges to nominal size as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.

Using the same methodology as in Table 2, we report in Table 6 the power of the test based on $W$ against the alternative described above. We see that the power does not become lower when $p$ gets large: power stays high even when $p > n$. This confirms the $(n, p)$-consistency result derived from equation (17). As with $U$, the table indicates that the power seems to depend predominantly on $n$, and to be increasing in $p$ for fixed $n$.

TABLE 6. Power of equality test based on $W$. The null hypothesis is rejected when the test statistic exceeds the 95% cutoff point obtained from the $\chi^2$ approximation. Data are generated under the alternative where half of the population eigenvalues are equal to 1, and the other eigenvalues are equal to 0.5. Power converges to one as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.
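A scaled-down version of this kind of experiment can be sketched in a few lines. The sketch departs from the procedure described above in ways that should be read as assumptions of the example: it uses the normal limit $N(1, 4)$ from equation (20) for the cutoff instead of the $\chi^2$ approximation in equation (19), a few hundred replications rather than 10,000, a single $(n, p)$ pair with $p > n$, and diagonal population covariance matrices. The names `W_stat` and `rejection_rate` are invented here.

```python
import numpy as np
from statistics import NormalDist

def W_stat(X):
    """Modified statistic W computed from an (n+1) x p data matrix."""
    n, p = X.shape[0] - 1, X.shape[1]
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    m1 = np.trace(S) / p                 # (1/p) tr S
    m2 = np.sum(S * S) / p               # (1/p) tr S^2, valid because S is symmetric
    V = m2 - 2 * m1 + 1                  # (1/p) tr[(S - I)^2]
    return V - (p / n) * m1 ** 2 + p / n

def rejection_rate(n, p, eigenvalues, reps=400, level=0.05, seed=0):
    """Share of replications in which n*W - p exceeds the N(1, 4) cutoff."""
    rng = np.random.default_rng(seed)
    cutoff = 1 + 2 * NormalDist().inv_cdf(1 - level)   # mean 1, standard deviation 2
    scale = np.sqrt(np.asarray(eigenvalues, dtype=float))
    hits = 0
    for _ in range(reps):
        X = rng.standard_normal((n + 1, p)) * scale
        hits += int(n * W_stat(X) - p > cutoff)
    return hits / reps

n, p = 40, 80                                            # dimension exceeds sample size
null_eigs = np.ones(p)
alt_eigs = np.r_[np.ones(p // 2), 0.5 * np.ones(p // 2)]
size = rejection_rate(n, p, null_eigs)
power = rejection_rate(n, p, alt_eigs, seed=1)
print(size, power)
```

With so few replications the estimates are noisy, so the output should be read qualitatively: a rejection rate near the nominal level under the null and a clearly higher one under the alternative.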
Overall, these Monte Carlo simulations confirm the finite-sample relevance of the asymptotic results obtained in Sections 3, 4 and 5.

7. Possible concerns. For the discussion that follows, recall the definition of the $r$th mean of a collection of $p$ nonnegative reals, $\{s_1, \ldots, s_p\}$, given by

$$M(r) = \begin{cases} \left(\dfrac{1}{p}\displaystyle\sum_{i=1}^{p} s_i^{\,r}\right)^{1/r}, & \text{if } r \neq 0, \\[3ex] \left(\displaystyle\prod_{i=1}^{p} s_i\right)^{1/p}, & \text{if } r = 0. \end{cases}$$

A possible concern is the use of John's statistic $U$ for testing sphericity, since it is based on the ratio of the first and second means [i.e., $M(1)$ and $M(2)$] of the sample eigenvalues. The likelihood ratio (LR) test statistic, on the other hand, is based on the ratio of the geometric mean [i.e., $M(0)$] to the first mean of the sample eigenvalues; for example, see Muirhead (1982), Section 8.3. It has long been known that the LR test has the desirable property of being unbiased; see Gleser (1966) and Marshall and Olkin (1979). Also, for the related problem of testing homogeneity of variances, it has long been established that certain tests based on ratios of the type $M(r)/M(t)$ with $r \geq 0$ and $t \leq 0$ are unbiased; see Cohen and Strawderman (1971). No unbiasedness properties are known for tests based on ratios of the type $M(r)/M(t)$ with both $r > 0$ and $t > 0$.

Still, we advocate the use of John's statistic $U$ over the LR statistic for testing sphericity when $p$ is large compared to $n$. First, the LR test statistic is degenerate when $p > n$ (though one might try to define an alternative statistic using the nonzero sample eigenvalues only in this case). Second, when $p$ is less than or equal to $n$ but close to $n$, some of the sample eigenvalues will be very close to zero, causing the LR statistic to be nearly degenerate; this should affect the finite-sample performance of the LR test. (Obviously, this also questions the strategy of constructing an LR-like statistic based on the nonzero sample eigenvalues only when $p > n$.) Our intuition is that tests whose statistic involves a mean $M(r)$ with $r \leq 0$ will misbehave when $p$ becomes close to $n$.
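The contrast between means of order $r \leq 0$ and $r > 0$ can be made concrete with a short sketch. The function name `power_mean` and the toy eigenvalue vector are assumptions of this example; the inputs are taken to be strictly positive so that the geometric mean is well defined.

```python
import numpy as np

def power_mean(s, r):
    """r-th mean M(r) of positive reals; r = 0 gives the geometric mean."""
    s = np.asarray(s, dtype=float)
    if r == 0:
        return float(np.exp(np.mean(np.log(s))))
    return float(np.mean(s ** r) ** (1.0 / r))

# Nine eigenvalues equal to one and a single eigenvalue very close to zero.
eigs = np.array([1.0] * 9 + [1e-6])

lr_type = power_mean(eigs, 0) / power_mean(eigs, 1)     # geometric over arithmetic mean
john_type = power_mean(eigs, 2) / power_mean(eigs, 1)   # second over first mean
print(lr_type, john_type)
```

One near-zero eigenvalue pulls the geometric-to-arithmetic ratio far below one, while the ratio of the second to the first mean stays close to one.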
The reason is that they give too much importance to the sample eigenvalues close to zero, which contain information not on the true covariance matrix but on the ratio $p/n$; see Figure 1 for an illustration.

FIG. 1. Sample versus true eigenvalues. The solid line represents the distribution of the eigenvalues of the sample covariance matrix based on the asymptotic formula proven by Marčenko and Pastur (1967). Eigenvalues are sorted from largest to smallest, then plotted against their rank. In this case, the true covariance matrix is the identity, that is, the true eigenvalues are all equal to one. The distribution of the true eigenvalues is plotted as a dashed horizontal line at one. Distributions are obtained in the limit as the number of observations $n$ and the number of variables $p$ both go to infinity with the ratio $p/n$ converging to a finite positive limit, the concentration $c$. The four plots correspond to different values of the concentration.

To check this intuition, we run a Monte Carlo on the LR test for sphericity for the case $p \leq n$. Critical values are obtained from the $\chi^2$ approximation under the null; for example, see Muirhead (1982), Section 8.3. The simulation setup is identical to that of Section 6. Table 7 reports the simulated size of the LR test, and severe size distortions for large values of $p$ compared to $n$ are obvious.

TABLE 7. Size of sphericity test based on LR test statistic. The null hypothesis is rejected when the test statistic exceeds the 95% cutoff point obtained from the $\chi^2$ approximation. Actual size does not converge to nominal size as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.

Next we compute the power of the LR test in a way that enables direct comparison with Table 2: we use the distribution of the LR test statistic simulated under the null to find the cutoff points corresponding to the realized sizes in Table 1 (most of them are equal to the nominal size of 0.05, but for small values of $p$ and $n$ they are lower). Using these cutoff points for the LR test statistic generates a test with exactly the same size as the test based on John's statistic $U$, so we can directly compare the power of the two tests. Table 8 is the equivalent of Table 2 except it uses the LR test statistic for $p \leq n$. We can see that the LR test is slightly more powerful than John's test (by one percent or less) when $p$ is small compared to $n$, but is substantially less powerful when $p$ gets close to $n$. Hence, both in terms of size and power, the test based on $U$ is preferable to the LR test when $p$ is large compared to $n$, and this is the scenario of interest of the paper.

TABLE 8. Power of sphericity test based on LR test statistic. The null hypothesis is rejected when the test statistic exceeds the 95% size-adjusted cutoff point (to enable direct comparison with Table 2) obtained from the $\chi^2$ approximation. Data are generated under the alternative where half of the population eigenvalues are equal to 1, and the other ones are equal to 0.5. Power does not converge to one as dimensionality $p$ goes to infinity with sample size $n$. Results come from 10,000 Monte Carlo simulations.

Another possible concern addresses the notion of consistency when $p$ tends to infinity. For $p$ fixed, the alternative is given by a fixed covariance matrix $\Sigma$ and consistency means that the power of the test tends to one as the sample size $n$ tends to infinity. Of course, when $p$ increases the matrix $\Sigma$ of the alternative can no longer be fixed. Our approach is to work within an asymptotic framework that places certain restrictions on how $\Sigma$ can evolve, namely we require that the quantities $\alpha$ and $\delta^2$ cannot change; see Assumption 2. Obviously, this excludes certain alternatives of interest such as having all eigenvalues equal to 1 except for the largest, which is equal to $p^{\beta}$, for some $0 < \beta < 0.5$. For this sequence of alternatives, the test based on John's statistic $U$ is not consistent and a test based on another statistic would have to be devised (e.g., involving the maximum sample eigenvalue). Such other asymptotic frameworks are deferred to future research.

8. Conclusions. In this paper, we have studied the sphericity test and the identity test for covariance matrices when the dimensionality is large compared to the sample size, and in particular when it exceeds the sample size. Our analysis is restricted to an asymptotic framework that considers the first two moments of the eigenvalues of the true covariance matrix to be independent of the dimensionality. We found that the existing test for sphericity based on John's (1971) statistic $U$ is robust against high dimensionality. On the other hand, the related test for identity based on Nagao's (1973) statistic $V$ is inconsistent. We proposed a modification to the statistic $V$ which makes it robust against high dimensionality. Monte Carlo simulations confirmed that our asymptotic results tend to hold well in finite samples.

Directions for future research include: applying the method to other test statistics; finding limiting distributions under the alternative to compute power; searching for most powerful tests (within specific asymptotic frameworks for the sequence of alternatives); relaxing the normality assumption.

APPENDIX

PROOF OF PROPOSITION 1. The proof of this proposition is contained inside the proof of the main theorem of Yin and Krishnaiah (1983). Their paper deals with the product of two random matrices but it can be applied to our setup by taking one of them to be the identity matrix as a special case of a random matrix.
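Even though their main theorem is derived under assumptions on all the average moments of the eigenvalues of the population covariance matrix, careful inspection of their proof reveals that convergence in probability of the first two average moments requires only assumptions up to the fourth moment. The formulas for the limits come from Yin and Krishnaiah's (1983) second equation on the top of page 504.

PROOF OF PROPOSITION 2. Changing $\alpha$ simply amounts to rescaling $\frac{1}{p}\operatorname{tr}(S)$ by $\alpha$ and $\frac{1}{p}\operatorname{tr}(S^2)$ by $\alpha^2$, therefore we can assume without loss of generality that $\alpha = 1$. Jonsson's (1982) Theorem 4.1 shows that, under the assumptions of Proposition 2,

$$\begin{bmatrix} \dfrac{n}{n + p}\,\{\operatorname{tr}(S) - E[\operatorname{tr}(S)]\} \\[2ex] \dfrac{n^2}{(n + p)^2}\,\{\operatorname{tr}(S^2) - E[\operatorname{tr}(S^2)]\} \end{bmatrix} \tag{21}$$

converges in distribution to a bivariate normal. Since $p/n \to c \in (0, +\infty)$, this implies that

$$n \begin{bmatrix} \dfrac{1}{p}\operatorname{tr}(S) - E\!\left[\dfrac{1}{p}\operatorname{tr}(S)\right] \\[2ex] \dfrac{1}{p}\operatorname{tr}(S^2) - E\!\left[\dfrac{1}{p}\operatorname{tr}(S^2)\right] \end{bmatrix} \tag{22}$$

also converges in distribution to a bivariate normal. $\frac{1}{p}\operatorname{tr}(S)$ is the average of the diagonal elements of the unbiased sample covariance matrix, therefore its expectation is equal to one. John (1972), Lemma 2, shows that the expectation of $\frac{1}{p}\operatorname{tr}(S^2)$ is equal to $\frac{n + p + 1}{n}$. So far we have established that

$$n \begin{bmatrix} \dfrac{1}{p}\operatorname{tr}(S) - 1 \\[2ex] \dfrac{1}{p}\operatorname{tr}(S^2) - \dfrac{n + p + 1}{n} \end{bmatrix} \tag{23}$$

converges in distribution to a bivariate normal. Since this limiting bivariate normal has mean zero, the only task left is to compute its covariance matrix. This can be done by taking the limit of the covariance matrix of the expression in equation (23). Using once again the moments computed by John (1972), Lemma 2, we find that

$$\operatorname{Var}\!\left[\frac{n}{p}\operatorname{tr}(S)\right] = E\!\left[\left(\frac{n}{p}\operatorname{tr}(S)\right)^{2}\right] - \left(E\!\left[\frac{n}{p}\operatorname{tr}(S)\right]\right)^{2} = \frac{n(np + 2)}{p} - n^2 = \frac{2n}{p} \to \frac{2}{c},$$

$$\begin{aligned} \operatorname{Var}\!\left[\frac{n}{p}\operatorname{tr}(S^2)\right] &= E\!\left[\left(\frac{n}{p}\operatorname{tr}(S^2)\right)^{2}\right] - \left(E\!\left[\frac{n}{p}\operatorname{tr}(S^2)\right]\right)^{2} \\ &= \frac{p n^3 + (2p^2 + 2p + 8) n^2 + (p^3 + 2p^2 + 21p + 20) n + 8p^2 + 20p + 20}{np} - (n + p + 1)^2 \\ &= \frac{8n}{p} + \frac{20p^2 + 20p}{p^2} + \frac{8p^3 + 20p^2 + 20p}{n p^2} \to \frac{8}{c} + 20 + 8c. \end{aligned}$$

Finally we have to find the covariance term. Let $s_{ij}$ denote the entry $(i, j)$ of the unbiased sample covariance matrix $S$. We have

$$\begin{aligned} E[\operatorname{tr}(S)\operatorname{tr}(S^2)] &= \sum_{i=1}^{p}\sum_{j=1}^{p}\sum_{l=1}^{p} E[s_{ii}\,s_{jl}^2] \\ &= p(p - 1)(p - 2)E[s_{11}s_{23}^2] + p(p - 1)E[s_{11}s_{22}^2] + 2p(p - 1)E[s_{11}s_{12}^2] + pE[s_{11}^3] \\ &= \frac{p(p - 1)(p - 2)}{n} + \frac{p(p - 1)(n + 2)}{n} + \frac{2p(p - 1)(n + 2)}{n^2} + \frac{p(n + 2)(n + 4)}{n^2} \\ &= p^2 + \frac{p^3 + p^2 + 4p}{n} + \frac{4p^2 + 4p}{n^2}. \end{aligned} \tag{24}$$

The moment formulas that appear in equation (24) are computed in the same fashion as in the proof of Lemma 2 by John (1972). This enables us to compute the limiting covariance term as

$$\begin{aligned} \operatorname{Cov}\!\left[\frac{n}{p}\operatorname{tr}(S), \frac{n}{p}\operatorname{tr}(S^2)\right] &= \frac{n^2}{p^2}E[\operatorname{tr}(S)\operatorname{tr}(S^2)] - E\!\left[\frac{n}{p}\operatorname{tr}(S)\right] E\!\left[\frac{n}{p}\operatorname{tr}(S^2)\right] \\ &= n^2 + \frac{n(p^2 + p + 4)}{p} + \frac{4p + 4}{p} - n(n + p + 1) \\ &= 4\left(1 + \frac{n}{p} + \frac{1}{p}\right) \to 4\left(1 + \frac{1}{c}\right). \end{aligned} \tag{25}$$

This completes the proof of Proposition 2.

PROOF OF PROPOSITION 3. Define the function $f(x, y) = \frac{y}{x^2} - 1$. Then $U = f\!\left(\frac{1}{p}\operatorname{tr}(S), \frac{1}{p}\operatorname{tr}(S^2)\right)$. Proposition 2 implies that, by the delta method,

$$n\left[U - f\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right)\right] \xrightarrow{D} N(0, \lim A),$$

where

$$A = \begin{bmatrix} \dfrac{\partial f}{\partial x}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \\[2ex] \dfrac{\partial f}{\partial y}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \end{bmatrix}^{\prime} \begin{bmatrix} \dfrac{2\alpha^2}{c} & 4\left(1 + \dfrac{1}{c}\right)\alpha^3 \\[2ex] 4\left(1 + \dfrac{1}{c}\right)\alpha^3 & 4\left(\dfrac{2}{c} + 5 + 2c\right)\alpha^4 \end{bmatrix} \begin{bmatrix} \dfrac{\partial f}{\partial x}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \\[2ex] \dfrac{\partial f}{\partial y}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \end{bmatrix}$$

and $\prime$ denotes the transpose. Notice that

$$f\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right) = \frac{p + 1}{n}, \tag{26}$$

$$\frac{\partial f}{\partial x}\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right) = -\frac{2}{\alpha}\,\frac{n + p + 1}{n}, \tag{27}$$

$$\frac{\partial f}{\partial y}\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right) = \frac{1}{\alpha^2}. \tag{28}$$

Placing the last two expressions into the formula for $A$ yields

$$\begin{aligned} A &= \frac{8}{c}\,\frac{(n + p + 1)^2}{n^2} - 16\left(1 + \frac{1}{c}\right)\frac{n + p + 1}{n} + 4\left(\frac{2}{c} + 5 + 2c\right) \\ &\to \frac{8(1 + c)^2}{c} - \frac{16(1 + c)^2}{c} + 4\left(\frac{2}{c} + 5 + 2c\right) = 4. \end{aligned} \tag{29}$$

This completes the proof of Proposition 3.

PROOF OF PROPOSITION 4. Let $z_1, z_2, \ldots$ denote a sequence of i.i.d. standard normal random variables. Then $Y_{p_k(p_k+1)/2 + a}$ has the same distribution as $z_1^2 + \cdots + z_{p_k(p_k+1)/2 + a}^2$. Since $E[z_1^2] = 1$ and $\operatorname{Var}[z_1^2] = 2$, the Lindeberg-Lévy central limit theorem implies that

$$\sqrt{\frac{p_k(p_k + 1)}{2} + a}\,\left[\frac{Y_{p_k(p_k+1)/2 + a}}{p_k(p_k + 1)/2 + a} - 1\right] \xrightarrow{D} N(0, 2). \tag{31}$$

Multiplying the left-hand side by $\sqrt{p_k(p_k + 1) + 2a}\,/\,p_k$, which converges to one, does not affect the limit, therefore

$$\frac{\sqrt{2}}{p_k}\,Y_{p_k(p_k+1)/2 + a} - \frac{p_k + 1}{\sqrt{2}} - \frac{a\sqrt{2}}{p_k} \xrightarrow{D} N(0, 2). \tag{32}$$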
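Subtracting from the left-hand side $-a\sqrt{2}/p_k$, which converges to zero, does not affect the limit, therefore

$$\frac{\sqrt{2}}{p_k}\,Y_{p_k(p_k+1)/2 + a} - \frac{p_k + 1}{\sqrt{2}} \xrightarrow{D} N(0, 2). \tag{33}$$

Rescaling equation (33) yields (10).

PROOF OF PROPOSITION 5. Define the function $g(x, y) = y - 2x + 1$. Then $V = g\!\left(\frac{1}{p}\operatorname{tr}(S), \frac{1}{p}\operatorname{tr}(S^2)\right)$. Proposition 2 implies that, by the delta method,

$$n\left[V - g\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right)\right] \xrightarrow{D} N(0, \lim B),$$

where

$$B = \begin{bmatrix} \dfrac{\partial g}{\partial x}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \\[2ex] \dfrac{\partial g}{\partial y}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \end{bmatrix}^{\prime} \begin{bmatrix} \dfrac{2\alpha^2}{c} & 4\left(1 + \dfrac{1}{c}\right)\alpha^3 \\[2ex] 4\left(1 + \dfrac{1}{c}\right)\alpha^3 & 4\left(\dfrac{2}{c} + 5 + 2c\right)\alpha^4 \end{bmatrix} \begin{bmatrix} \dfrac{\partial g}{\partial x}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \\[2ex] \dfrac{\partial g}{\partial y}\!\left(\alpha, \dfrac{n + p + 1}{n}\alpha^2\right) \end{bmatrix}.$$

Notice that

$$g\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right) = (\alpha - 1)^2 + \frac{p + 1}{n}\alpha^2, \tag{34}$$

$$\frac{\partial g}{\partial x}\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right) = -2, \tag{35}$$

$$\frac{\partial g}{\partial y}\!\left(\alpha, \frac{n + p + 1}{n}\alpha^2\right) = 1. \tag{36}$$

Placing the last two expressions into the formula for $B$ yields

$$B = \frac{8}{c}\alpha^2 - 16\left(1 + \frac{1}{c}\right)\alpha^3 + 4\left(\frac{2}{c} + 5 + 2c\right)\alpha^4. \tag{37}$$

First let us find the $(n, p)$-limiting distribution of $V$ under the null. Setting $\alpha$ equal to one yields $g\!\left(1, \frac{n + p + 1}{n}\right) = \frac{p + 1}{n}$ and $B = 4 + 8c$. Hence, under the null,

$$n\left[V - \frac{p + 1}{n}\right] \xrightarrow{D} N(0, 4 + 8c). \tag{38}$$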
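The delta-method variances used in these proofs are quadratic forms in the limiting covariance matrix of Proposition 2, so they can be checked numerically. The sketch below is an informal check, not part of the proofs; the function names `clt_covariance` and `delta_variance` are invented for this example, and the gradients are entered at their limiting values (with $p/n$ replaced by $c$).

```python
import numpy as np

def clt_covariance(c, alpha=1.0):
    """Limiting covariance matrix from Proposition 2."""
    off = 4 * (1 + 1 / c) * alpha ** 3
    return np.array([
        [2 * alpha ** 2 / c, off],
        [off, 4 * (2 / c + 5 + 2 * c) * alpha ** 4],
    ])

def delta_variance(grad, c, alpha=1.0):
    """Quadratic form grad' * Sigma * grad used by the delta method."""
    g = np.asarray(grad, dtype=float)
    return float(g @ clt_covariance(c, alpha) @ g)

for c in (0.25, 1.0, 4.0):
    # V: g(x, y) = y - 2x + 1 has gradient (-2, 1); variance under the null (alpha = 1)
    B = delta_variance([-2.0, 1.0], c)
    # U: f(x, y) = y / x^2 - 1 has limiting gradient (-2 (1 + c) / alpha, 1 / alpha^2)
    A = delta_variance([-2.0 * (1 + c), 1.0], c)
    print(c, B, 4 + 8 * c, A)
```

For each concentration the value for $V$ matches $4 + 8c$, while the value for $U$ stays at 4, which is the difference between equations (15) and (9).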
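Now let us find the $(n, p)$-limiting distribution of $V$ under the alternative. Setting $\alpha$ equal to $\frac{1 - c}{1 + c}$ yields

$$g\!\left(\frac{1 - c}{1 + c}, \frac{n + p + 1}{n}\left(\frac{1 - c}{1 + c}\right)^{2}\right) = \left(\frac{2c}{1 + c}\right)^{2} + \frac{p + 1}{n}\left(\frac{1 - c}{1 + c}\right)^{2} = \frac{p + 1}{n} - \frac{4c(d + 1)}{n(1 + c)^2} + o\!\left(\frac{1}{n}\right)$$

and

$$B = \frac{8}{c}\left(\frac{1 - c}{1 + c}\right)^{2} - 16\left(1 + \frac{1}{c}\right)\left(\frac{1 - c}{1 + c}\right)^{3} + 4\left(\frac{2}{c} + 5 + 2c\right)\left(\frac{1 - c}{1 + c}\right)^{4} = \frac{4(1 - c)^2(1 + 5c^2 + 2c^3)}{(1 + c)^4}.$$

Hence, under the alternative,

$$n\left[V - \frac{p + 1}{n}\right] \xrightarrow{D} N\!\left(-\frac{4c(d + 1)}{(1 + c)^2},\ \frac{4(1 - c)^2(1 + 5c^2 + 2c^3)}{(1 + c)^4}\right). \tag{39}$$

Therefore the power of a test of significance level $\theta > 0$ to reject the null $\Sigma = I$ when the alternative $\Sigma = \frac{1 - c}{1 + c} I$ is true converges to

$$1 - \Phi\!\left(\frac{\sqrt{4 + 8c}\;\Phi^{-1}(1 - \theta) + 4c(d + 1)/(1 + c)^2}{\sqrt{4(1 - c)^2(1 + 5c^2 + 2c^3)/(1 + c)^4}}\right) < 1 \tag{40}$$

where $\Phi$ denotes the standard normal c.d.f.

PROOF OF PROPOSITION 6. Assuming $p$ fixed, it is easily seen that

$$n(W - V) = p\left[1 - \left(\frac{1}{p}\operatorname{tr}(S)\right)^{2}\right] \tag{41}$$

$$\xrightarrow{P} p(1 - \alpha^2) \quad (= 0 \text{ under the null}). \tag{42}$$

Hence, under the null, $n(W - V)$ converges to zero in probability, as $n$ goes to infinity for $p$ fixed. The proof is completed by applying Slutzky's theorem.

PROOF OF PROPOSITION 7. Define $h(x, y) = y - 2x + 1 - \frac{p}{n}x^2 + \frac{p}{n}$. Then $W = h\!\left(\frac{1}{p}\operatorname{tr}(S), \frac{1}{p}\operatorname{tr}(S^2)\right)$. Proposition 2 implies that, by the delta method,

$$n\left[W - h\!\left(1, \frac{n + p + 1}{n}\right)\right] \xrightarrow{D} N(0, \lim C),$$

where

$$C = \begin{bmatrix} \dfrac{\partial h}{\partial x}\!\left(1, \dfrac{n + p + 1}{n}\right) \\[2ex] \dfrac{\partial h}{\partial y}\!\left(1, \dfrac{n + p + 1}{n}\right) \end{bmatrix}^{\prime} \begin{bmatrix} \dfrac{2}{c} & 4\left(1 + \dfrac{1}{c}\right) \\[2ex] 4\left(1 + \dfrac{1}{c}\right) & 4\left(\dfrac{2}{c} + 5 + 2c\right) \end{bmatrix} \begin{bmatrix} \dfrac{\partial h}{\partial x}\!\left(1, \dfrac{n + p + 1}{n}\right) \\[2ex] \dfrac{\partial h}{\partial y}\!\left(1, \dfrac{n + p + 1}{n}\right) \end{bmatrix}.$$

Notice that

$$h\!\left(1, \frac{n + p + 1}{n}\right) = \frac{p + 1}{n}, \tag{43}$$

$$\frac{\partial h}{\partial x}\!\left(1, \frac{n + p + 1}{n}\right) = -2\,\frac{n + p}{n}, \tag{44}$$

$$\frac{\partial h}{\partial y}\!\left(1, \frac{n + p + 1}{n}\right) = 1. \tag{45}$$

Placing the last two expressions into the formula for $C$ yields

$$\begin{aligned} C &= \frac{8}{c}\,\frac{(n + p)^2}{n^2} - 16\left(1 + \frac{1}{c}\right)\frac{n + p}{n} + 4\left(\frac{2}{c} + 5 + 2c\right) \\ &\to \frac{8(1 + c)^2}{c} - \frac{16(1 + c)^2}{c} + 4\left(\frac{2}{c} + 5 + 2c\right) = 4. \end{aligned} \tag{46}$$

This completes the proof of Proposition 7.

Acknowledgments. We wish to thank Theodore W. Anderson for encouragement. We are also grateful to an Associate Editor and a referee for constructive criticisms that have led to an improved presentation of the paper.

REFERENCES

ALALOUF, I. S. (1978). An explicit treatment of the general linear model with singular covariance matrix. Sankhyā Ser. B.

ANDERSON, T. W. (1984). An Introduction to Multivariate Statistical Analysis, 2nd ed. Wiley, New York.

ARHAROV, L. V. (1971). Limit theorems for the characteristic roots of a sample covariance matrix. Soviet Math. Dokl.

BAI, Z. D. (1993). Convergence rate of expected spectral distributions of large random matrices. II. Sample covariance matrices. Ann. Probab.

BAI, Z. D., KRISHNAIAH, P. R. and ZHAO, L. C. (1989). On rates of convergence of efficient detection criteria in signal processing with white noise. IEEE Trans. Inform. Theory.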
More informationFIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. 1. Powers of a matrix
FIBONACCI NUMBERS: AN APPLICATION OF LINEAR ALGEBRA. Powers of a matrix We begi with a propositio which illustrates the usefuless of the diagoalizatio. Recall that a square matrix A is diogaalizable if
More information5: Introduction to Estimation
5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample
More informationSequences and Series
CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their
More informationProperties of MLE: consistency, asymptotic normality. Fisher information.
Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout
More informationKey Ideas Section 81: Overview hypothesis testing Hypothesis Hypothesis Test Section 82: Basics of Hypothesis Testing Null Hypothesis
Chapter 8 Key Ideas Hypothesis (Null ad Alterative), Hypothesis Test, Test Statistic, Pvalue Type I Error, Type II Error, Sigificace Level, Power Sectio 81: Overview Cofidece Itervals (Chapter 7) are
More information7. Sample Covariance and Correlation
1 of 8 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 7. Sample Covariace ad Correlatio The Bivariate Model Suppose agai that we have a basic radom experimet, ad that X ad Y
More informationDepartment of Computer Science, University of Otago
Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS200609 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly
More informationIn nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008
I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces
More informationSOLID MECHANICS DYNAMICS TUTORIAL DAMPED VIBRATIONS. On completion of this tutorial you should be able to do the following.
SOLID MECHANICS DYNAMICS TUTORIAL DAMPED VIBRATIONS This work overs elemets of the syllabus for the Egieerig Couil Eam D5 Dyamis of Mehaial Systems, C05 Mehaial ad Strutural Egieerig ad the Edeel HNC/D
More informationStatistica Siica 6(1996), 31139 EFFECT OF HIGH DIMENSION: BY AN EXAMPLE OF A TWO SAMPLE PROBLEM Zhidog Bai ad Hewa Saraadasa Natioal Su Yatse Uiversity Abstract: With the rapid developmet of moder computig
More informationLaws of Exponents. net effect is to multiply with 2 a total of 3 + 5 = 8 times
The Mathematis 11 Competey Test Laws of Expoets (i) multipliatio of two powers: multiply by five times 3 x = ( x x ) x ( x x x x ) = 8 multiply by three times et effet is to multiply with a total of 3
More informationDetermining the sample size
Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors
More informationCase Study. Normal and t Distributions. Density Plot. Normal Distributions
Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca
More informationChapter 6: Variance, the law of large numbers and the MonteCarlo method
Chapter 6: Variace, the law of large umbers ad the MoteCarlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value
More informationLesson 17 Pearson s Correlation Coefficient
Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) types of data scatter plots measure of directio measure of stregth Computatio covariatio of X ad Y uique variatio i X ad Y measurig
More informationMaximum Likelihood Estimators.
Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio
More informationSequences II. Chapter 3. 3.1 Convergent Sequences
Chapter 3 Sequeces II 3. Coverget Sequeces Plot a graph of the sequece a ) = 2, 3 2, 4 3, 5 + 4,...,,... To what limit do you thik this sequece teds? What ca you say about the sequece a )? For ǫ = 0.,
More informationLecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)
18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the BruMikowski iequality for boxes. Today we ll go over the
More informationMeasures of Spread and Boxplots Discrete Math, Section 9.4
Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,
More informationHere are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.
This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio
More informationSoving Recurrence Relations
Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree
More information1. C. The formula for the confidence interval for a population mean is: x t, which was
s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : pvalue
More informationChapter 7  Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:
Chapter 7  Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries
More information3. Covariance and Correlation
Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics
More informationx : X bar Mean (i.e. Average) of a sample
A quick referece for symbols ad formulas covered i COGS14: MEAN OF SAMPLE: x = x i x : X bar Mea (i.e. Average) of a sample x i : X sub i This stads for each idividual value you have i your sample. For
More informationConvexity, Inequalities, and Norms
Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for
More informationIncremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
More informationOutput Analysis (2, Chapters 10 &11 Law)
B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should
More informationLecture 4: Cheeger s Inequality
Spectral Graph Theory ad Applicatios WS 0/0 Lecture 4: Cheeger s Iequality Lecturer: Thomas Sauerwald & He Su Statemet of Cheeger s Iequality I this lecture we assume for simplicity that G is a dregular
More informationOverview of some probability distributions.
Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability
More informationNormal Distribution.
Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued
More information9.8: THE POWER OF A TEST
9.8: The Power of a Test CD91 9.8: THE POWER OF A TEST I the iitial discussio of statistical hypothesis testig, the two types of risks that are take whe decisios are made about populatio parameters based
More informationThe second difference is the sequence of differences of the first difference sequence, 2
Differece Equatios I differetial equatios, you look for a fuctio that satisfies ad equatio ivolvig derivatives. I differece equatios, istead of a fuctio of a cotiuous variable (such as time), we look for
More informationWeek 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable
Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5
More informationTHE TWOVARIABLE LINEAR REGRESSION MODEL
THE TWOVARIABLE LINEAR REGRESSION MODEL Herma J. Bieres Pesylvaia State Uiversity April 30, 202. Itroductio Suppose you are a ecoomics or busiess maor i a college close to the beach i the souther part
More informationConfidence Intervals for One Mean
Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a
More informationUniversity of California, Los Angeles Department of Statistics. Distributions related to the normal distribution
Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chisquare (χ ) distributio.
More informationAsymptotic Growth of Functions
CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll
More informationZTEST / ZSTATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown
ZTEST / ZSTATISTIC: used to test hypotheses about µ whe the populatio stadard deviatio is kow ad populatio distributio is ormal or sample size is large TTEST / TSTATISTIC: used to test hypotheses about
More informationConfidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.
Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).
More informationABSTRACT INTRODUCTION MATERIALS AND METHODS
INTENATIONAL JOUNAL OF AGICULTUE & BIOLOGY 156 853/6/8 1 5 9 http://www.fspublishers.org Multiplate Peetratio Tests to Predit Soil Pressuresiage Behaviour uder etagular egio M. ASHIDI 1, A. KEYHANI AND
More informationUnit 20 Hypotheses Testing
Uit 2 Hypotheses Testig Objectives: To uderstad how to formulate a ull hypothesis ad a alterative hypothesis about a populatio proportio, ad how to choose a sigificace level To uderstad how to collect
More informationAn example of nonquenched convergence in the conditional central limit theorem for partial sums of a linear process
A example of oqueched covergece i the coditioal cetral limit theorem for partial sums of a liear process Dalibor Volý ad Michael Woodroofe Abstract A causal liear processes X,X 0,X is costructed for which
More informationCenter, Spread, and Shape in Inference: Claims, Caveats, and Insights
Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the
More informationLinear Algebra II. 4 Determinants. Notes 4 1st November Definition of determinant
MTH6140 Liear Algebra II Notes 4 1st November 2010 4 Determiats The determiat is a fuctio defied o square matrices; its value is a scalar. It has some very importat properties: perhaps most importat is
More informationSTATISTICAL METHODS FOR BUSINESS
STATISTICAL METHODS FOR BUSINESS UNIT 7: INFERENTIAL TOOLS. DISTRIBUTIONS ASSOCIATED WITH SAMPLING 7.1. Distributios associated with the samplig process. 7.2. Iferetial processes ad relevat distributios.
More informationLecture 4: Cauchy sequences, BolzanoWeierstrass, and the Squeeze theorem
Lecture 4: Cauchy sequeces, BolzaoWeierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits
More informationInference on Proportion. Chapter 8 Tests of Statistical Hypotheses. Sampling Distribution of Sample Proportion. Confidence Interval
Chapter 8 Tests of Statistical Hypotheses 8. Tests about Proportios HT  Iferece o Proportio Parameter: Populatio Proportio p (or π) (Percetage of people has o health isurace) x Statistic: Sample Proportio
More informationExample 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).
BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook  Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly
More information0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5
Sectio 13 KolmogorovSmirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.
More informationDivide and Conquer. Maximum/minimum. Integer Multiplication. CS125 Lecture 4 Fall 2015
CS125 Lecture 4 Fall 2015 Divide ad Coquer We have see oe geeral paradigm for fidig algorithms: the greedy approach. We ow cosider aother geeral paradigm, kow as divide ad coquer. We have already see a
More information4.1 Sigma Notation and Riemann Sums
0 the itegral. Sigma Notatio ad Riema Sums Oe strategy for calculatig the area of a regio is to cut the regio ito simple shapes, calculate the area of each simple shape, ad the add these smaller areas
More informationSECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
More informationDefinition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean
1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.
More informationMEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book)
MEI Mathematics i Educatio ad Idustry MEI Structured Mathematics Module Summary Sheets Statistics (Versio B: referece to ew book) Topic : The Poisso Distributio Topic : The Normal Distributio Topic 3:
More informationPSYCHOLOGICAL STATISTICS
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics
More information3. Greatest Common Divisor  Least Common Multiple
3 Greatest Commo Divisor  Least Commo Multiple Defiitio 31: The greatest commo divisor of two atural umbers a ad b is the largest atural umber c which divides both a ad b We deote the greatest commo gcd
More informationCHAPTER 3 DIGITAL CODING OF SIGNALS
CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity
More informationAnalyzing Longitudinal Data from Complex Surveys Using SUDAAN
Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical
More informationA probabilistic proof of a binomial identity
A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two
More informationUC Berkeley Department of Electrical Engineering and Computer Science. EE 126: Probablity and Random Processes. Solutions 9 Spring 2006
Exam format UC Bereley Departmet of Electrical Egieerig ad Computer Sciece EE 6: Probablity ad Radom Processes Solutios 9 Sprig 006 The secod midterm will be held o Wedesday May 7; CHECK the fial exam
More informationHypergeometric Distributions
7.4 Hypergeometric Distributios Whe choosig the startig lieup for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you
More informationTHIN SEQUENCES AND THE GRAM MATRIX PAMELA GORKIN, JOHN E. MCCARTHY, SANDRA POTT, AND BRETT D. WICK
THIN SEQUENCES AND THE GRAM MATRIX PAMELA GORKIN, JOHN E MCCARTHY, SANDRA POTT, AND BRETT D WICK Abstract We provide a ew proof of Volberg s Theorem characterizig thi iterpolatig sequeces as those for
More informationQuantum Mechanics for Scientists and Engineers. David Miller
Quatum Mechaics for Scietists ad Egieers David Miller Measuremet ad expectatio values Measuremet ad expectatio values Quatummechaical measuremet Probabilities ad expasio coefficiets Suppose we take some
More information1.3 Binomial Coefficients
18 CHAPTER 1. COUNTING 1. Biomial Coefficiets I this sectio, we will explore various properties of biomial coefficiets. Pascal s Triagle Table 1 cotais the values of the biomial coefficiets ( ) for 0to
More informationMath C067 Sampling Distributions
Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters
More information1 Hypothesis testing for a single mean
BST 140.65 Hypothesis Testig Review otes 1 Hypothesis testig for a sigle mea 1. The ull, or status quo, hypothesis is labeled H 0, the alterative H a or H 1 or H.... A type I error occurs whe we falsely
More informationFactors of sums of powers of binomial coefficients
ACTA ARITHMETICA LXXXVI.1 (1998) Factors of sums of powers of biomial coefficiets by Neil J. Cali (Clemso, S.C.) Dedicated to the memory of Paul Erdős 1. Itroductio. It is well ow that if ( ) a f,a = the
More informationClass Meeting # 16: The Fourier Transform on R n
MATH 18.152 COUSE NOTES  CLASS MEETING # 16 18.152 Itroductio to PDEs, Fall 2011 Professor: Jared Speck Class Meetig # 16: The Fourier Trasform o 1. Itroductio to the Fourier Trasform Earlier i the course,
More informationBASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1)
BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS.. Radom Sample. The radom variables X,X 2,..., X are called a radom sample of size from the populatio f(x if X,X 2,..., X are mutually idepedet
More informationChapter 5: Inner Product Spaces
Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples
More informationLecture 10: Hypothesis testing and confidence intervals
Eco 514: Probability ad Statistics Lecture 10: Hypothesis testig ad cofidece itervals Types of reasoig Deductive reasoig: Start with statemets that are assumed to be true ad use rules of logic to esure
More informationInfinite Sequences and Series
CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...
More informationINFINITE SERIES KEITH CONRAD
INFINITE SERIES KEITH CONRAD. Itroductio The two basic cocepts of calculus, differetiatio ad itegratio, are defied i terms of limits (Newto quotiets ad Riema sums). I additio to these is a third fudametal
More informationCS103X: Discrete Structures Homework 4 Solutions
CS103X: Discrete Structures Homewor 4 Solutios Due February 22, 2008 Exercise 1 10 poits. Silico Valley questios: a How may possible sixfigure salaries i whole dollar amouts are there that cotai at least
More informationMannWhitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test)
NoParametric ivariate Statistics: WilcoxoMaWhitey 2 Sample Test 1 MaWhitey 2 Sample Test (a.k.a. Wilcoxo Rak Sum Test) The (Wilcoxo) MaWhitey (WMW) test is the oparametric equivalet of a pooled
More informationThe following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles
The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio
More informationBINOMIAL EXPANSIONS 12.5. In this section. Some Examples. Obtaining the Coefficients
652 (1226) Chapter 12 Sequeces ad Series 12.5 BINOMIAL EXPANSIONS I this sectio Some Examples Otaiig the Coefficiets The Biomial Theorem I Chapter 5 you leared how to square a iomial. I this sectio you
More informationTHE ABRACADABRA PROBLEM
THE ABRACADABRA PROBLEM FRANCESCO CARAVENNA Abstract. We preset a detailed solutio of Exercise E0.6 i [Wil9]: i a radom sequece of letters, draw idepedetly ad uiformly from the Eglish alphabet, the expected
More informationPROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM
PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics
More information.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth
Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,
More informationAnalyzing Patterns of User Content Generation in Online Social Networks
Aalyzig Patters of User Cotet Geeratio i Olie Soial Networks Lei Guo, Ehua Ta, Sogqig Che, Xiaodog Zhag, ad Yihog (Eri) Zhao Yahoo! I. 7 First Aveue Suyvale, CA 989, USA {lguo,yzhao}@yahooi.om Dept. of
More informationGregory Carey, 1998 Linear Transformations & Composites  1. Linear Transformations and Linear Composites
Gregory Carey, 1998 Liear Trasformatios & Composites  1 Liear Trasformatios ad Liear Composites I Liear Trasformatios of Variables Meas ad Stadard Deviatios of Liear Trasformatios A liear trasformatio
More informationChapter 10 Student Lecture Notes 101
Chapter 0 tudet Lecture Notes 0 Basic Busiess tatistics (9 th Editio) Chapter 0 Twoample Tests with Numerical Data 004 PreticeHall, Ic. Chap 0 Chapter Topics Comparig Two Idepedet amples Z test for
More information, a Wishart distribution with n 1 degrees of freedom and scale matrix.
UMEÅ UNIVERSITET Matematiskstatistiska istitutioe Multivariat dataaalys D MSTD79 PA TENTAMEN 00409 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multivariat dataaalys D, 5 poäg.. Assume that
More informationGibbs Distribution in Quantum Statistics
Gibbs Distributio i Quatum Statistics Quatum Mechaics is much more complicated tha the Classical oe. To fully characterize a state of oe particle i Classical Mechaics we just eed to specify its radius
More informationLesson 15 ANOVA (analysis of variance)
Outlie Variability betwee group variability withi group variability total variability Fratio Computatio sums of squares (betwee/withi/total degrees of freedom (betwee/withi/total mea square (betwee/withi
More informationDiscrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may
More informationApproximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find
1.8 Approximatig Area uder a curve with rectagles 1.6 To fid the area uder a curve we approximate the area usig rectagles ad the use limits to fid 1.4 the area. Example 1 Suppose we wat to estimate 1.
More informationhp calculators HP 12C Statistics  average and standard deviation Average and standard deviation concepts HP12C average and standard deviation
HP 1C Statistics  average ad stadard deviatio Average ad stadard deviatio cocepts HP1C average ad stadard deviatio Practice calculatig averages ad stadard deviatios with oe or two variables HP 1C Statistics
More informationLecture 3. denote the orthogonal complement of S k. Then. 1 x S k. n. 2 x T Ax = ( ) λ x. with x = 1, we have. i = λ k x 2 = λ k.
18.409 A Algorithmist s Toolkit September 17, 009 Lecture 3 Lecturer: Joatha Keler Scribe: Adre Wibisoo 1 Outlie Today s lecture covers three mai parts: CouratFischer formula ad Rayleigh quotiets The
More information