

Statistica Sinica 6 (1996)

EFFECT OF HIGH DIMENSION: BY AN EXAMPLE OF A TWO SAMPLE PROBLEM

Zhidong Bai and Hewa Saranadasa

National Sun Yat-sen University

Abstract: With the rapid development of modern computing techniques, statisticians are dealing with data of much higher dimension. Consequently, due to their loss of accuracy or power, some classical statistical inferences are being challenged by non-exact approaches. The purpose of this paper is to point out and briefly analyze such a phenomenon and to encourage statisticians to reexamine classical statistical approaches when they are dealing with high dimensional data. As an example, we derive the asymptotic power of the classical Hotelling's $T^2$ test and Dempster's non-exact test for a two-sample problem. Also, an asymptotically normally distributed test statistic is proposed. Our results show that both Dempster's non-exact test and the new test have higher power than Hotelling's test when the data dimension is proportionally close to the within sample degrees of freedom. Although our new test has an asymptotic power function similar to Dempster's, it does not rely on the normality assumption. Some simulation results are presented which show that the non-exact tests are more powerful than Hotelling's test even for moderately large dimension and sample sizes.

Key words and phrases: Edgeworth expansion, Hotelling $T^2$ test, hypothesis test, power function, significance test, $\chi^2$ approximation.

1. Introduction

Modern computation techniques make it possible to deal with high dimensional data. Some recent examples of interest in dealing with high dimensional data can be found in Narayanaswamy and Raghavarao (1991) and Saranadasa (1991, 1993). Examples may also be found in applied statistical inference handling samples of many measurements on individuals. For example, in a clinical trial in pharmaceutical studies, many blood chemistry measurements are taken on each individual. In some studies the number of variables is comparable to or even exceeds the total sample size.
The purpose of this article is to raise the following questions: What is new in high dimensional statistical inference, and what should be done? The difference between high dimensional statistical inference and classical statistical inference will be referred to as the "Effect of High Dimension" (EHD).

There are two aspects of the EHD. First, there may be too many interesting or nuisance parameters in the model. For example, in M-estimation in linear models, the number of regression parameters may be proportional to the sample size. This problem remains unsolved. The best results are due to Huber (1973), in which the consistency of estimation is proved under the assumption that $p^2/n \to 0$ and the asymptotic normality under $p^3/n \to 0$, where $n$ and $p$ are the sample size and the dimension of the regression coefficient vector. Although these requirements on the ratio of the dimension to the sample size were later reduced, very strong assumptions were made on the design sequence; see Portnoy (1984, 1985). Another example is the Errors in Variables model, in which the true regressor variables can be considered as nuisance parameters whose number is $np$ (while the number of observations is $n(p+1)$). In these cases, either the estimation is very poor or it is impossible to get an unbiased or consistent estimator.

The second aspect is that the dimension of the data itself is very high. Of course, the number of parameters to be estimated must then be very large. An example is the detection of the number of signals in omni-directional signal processing. When the number of sensors is increased, the detection accuracy is supposed to improve. However, simulation results show the opposite when the traditional method (the MUSIC method) is used, if the number of sensors is 10 or more. We believe that the reason is that the number of elements of the covariance matrix (parameters to be estimated) becomes very large ($2p^2$, which is 200 if $p = 10$). Some references in this direction are Bai, Krishnaiah and Zhao (1989) and Zhao, Krishnaiah and Bai (1986a,b). Although the EHD has been noticed in many different directions of multivariate statistical inference, the problem has not yet been clearly stated in the literature and no appropriate methods have been proposed to deal with the EHD.
To this end, we shall analyze these problems through the two sample problem, as an example to show how and why the EHD affects inferences and how the EHD can be reduced. A classical method to deal with this problem is the famous Hotelling's $T^2$ test. Its advantages include: it is invariant under linear transformation, its exact distribution is known under the null hypothesis, and it is powerful when the dimension of the data is sufficiently small compared with the sample sizes. However, Hotelling's test has the serious defect that the $T^2$ statistic is undefined when the dimension of the data is greater than the within sample degrees of freedom. Seeking remedies, Chung and Fraser (1958) proposed a nonparametric test and Dempster (1958, 1960) discussed the so-called "non-exact" significance test. Dempster (1960) also considered the so-called randomization test. These works seek alternatives to Hotelling's test in situations where the latter does not apply. The non-exact test is not only a remedy when $T^2$ is undefined: we show that, even when it is well

defined, the non-exact test is more powerful than the $T^2$ test when the dimension is proportionally "close to" the sample degrees of freedom (more discussion on the ratio will be given in Section 5). Both the $T^2$ test and Dempster's non-exact test strongly rely on the normality assumption. Moreover, Dempster's non-exact test statistic involves a complicated estimation of $r$, the "degrees of freedom" for the chi-square approximation. To simplify the testing procedure, a new method is proposed in Section 4. It is proven in Sections 3 and 4 that the asymptotic power of the new test is equivalent to that of Dempster's test. Simulation results further show that our new approach is slightly more powerful than Dempster's. We believe that the estimation of $r$ and its rounding to an integer in Dempster's procedure may cause an error of order $O(1/n)$. This might indicate that the new approach is superior to Dempster's test in the second order term in some Edgeworth-type expansions. We shall not discuss this in detail in this paper but hope to address it in future work. Some simulation results and discussions are presented in Section 5 and some technical proofs are given in the Appendix.

2. Asymptotic Power of Hotelling's Test

In this section, we derive the asymptotic power function of the $T^2$ test for the two sample problem. The model described here is the same as the one in Dempster's test given in the next section. Suppose that $x_{ij} \sim N_p(\mu_i, \Sigma)$, $j = 1, \ldots, N_i$, $i = 1, 2$, are two independent samples. To test the hypothesis $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$, traditionally one uses Hotelling's famous $T^2$ test, which is defined by

$$T^2 = \eta\,(\bar{x}_1 - \bar{x}_2)' A^{-1} (\bar{x}_1 - \bar{x}_2), \tag{2.1}$$

where $\bar{x}_i = \frac{1}{N_i}\sum_{j=1}^{N_i} x_{ij}$, $i = 1, 2$, $A = \sum_{i=1}^{2}\sum_{j=1}^{N_i}(x_{ij} - \bar{x}_i)(x_{ij} - \bar{x}_i)'$ and $\eta = n\,\frac{N_1 N_2}{N_1 + N_2}$ with $n = N_1 + N_2 - 2$. The purpose of this section is to investigate the power function of Hotelling's test when $p/n \to y \in (0, 1)$, which guarantees the existence of the $T^2$ statistic, and to compare it with the other non-exact tests given in later sections.
To derive the asymptotic power of Hotelling's test, we first derive an asymptotic expression for the threshold of the test. It is well known that under the null hypothesis, $\frac{n-p+1}{np}T^2$ has an $F$-distribution with degrees of freedom $p$ and $n - p + 1$. Let the significance level be chosen as $\alpha$ and the threshold be denoted by $F_\alpha(p, n-p+1)$. We have the following lemma.

Lemma 2.1.

$$\frac{p}{n-p+1}\,F_\alpha(p, n-p+1) = \frac{y_n}{1-y_n} + \sqrt{\frac{2y}{(1-y)^3 n}}\;\xi_\alpha + o\!\left(\frac{1}{\sqrt{n}}\right),$$

where $y_n = p/n$, $\lim_{n\to\infty} y_n = y \in (0, 1)$ and $\xi_\alpha$ is the $1-\alpha$ quantile of the standard normal distribution.

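Proof. Under the null hypothesis, by the Central Limit Theorem,

$$\sqrt{\frac{(1-y)^3 n}{2y}}\left(\frac{T^2}{n} - \frac{y_n}{1-y_n}\right) \to N(0, 1) \quad \text{as } n \to \infty,$$

from which the result follows immediately.

Now, we consider the behavior of $T^2/n$ under $H_1$. In this case, its distribution is the same as that of

$$(w + \tau^{-1/2}\delta)'\,U^{-1}\,(w + \tau^{-1/2}\delta), \tag{2.2}$$

where $\delta = \Sigma^{-1/2}(\mu_1 - \mu_2)$, $U = \sum_{i=1}^{n} u_i u_i'$, $w = (w_1, \ldots, w_p)'$ and $u_i$, $i = 1, \ldots, n$, are i.i.d. $N(0, I_p)$ random vectors and $\tau = (N_1 + N_2)/(N_1 N_2)$. Denote the spectral decomposition of $U^{-1}$ by $O'\,\mathrm{diag}[d_1, \ldots, d_p]\,O$ with eigenvalues $d_1 \ge \cdots \ge d_p > 0$. Then, (2.2) becomes

$$(Ow + \tau^{-1/2}\|\delta\| v)'\,\mathrm{diag}[d_1, \ldots, d_p]\,(Ow + \tau^{-1/2}\|\delta\| v), \tag{2.3}$$

where $v = O\delta/\|\delta\|$. Since $U$ has the Wishart distribution $W(n, I_p)$, the orthogonal matrix $O$ has the Haar distribution on the group of all orthogonal $p$-matrices, and hence the vector $v$ is uniformly distributed on the unit $p$-sphere. Note that the conditional distribution of $Ow$ given $O$ is $N(0, I_p)$, the same as that of $w$, which is independent of $O$. This shows that $Ow$ is independent of $v$. Therefore, replacing $Ow$ in (2.3) by $w$ does not change the joint distribution of $Ow$, $v$ and the $d_i$'s. Consequently, $T^2/n$ has the same distribution as

$$\sum_{i=1}^{p}\left(w_i^2 + 2 w_i v_i \tau^{-1/2}\|\delta\| + \tau^{-1}\|\delta\|^2 v_i^2\right) d_i, \tag{2.4}$$

where $v = (v_1, \ldots, v_p)'$ is uniformly distributed on the unit sphere of $R^p$ and is independent of $w$ and the $d_i$'s.

Lemma 2.2. Using the above notation, we have

$$\sqrt{n}\left(\sum_{i=1}^{p} d_i - \frac{y_n}{1-y_n}\right) \to 0 \quad\text{and}\quad n\sum_{i=1}^{p} d_i^2 \to \frac{y}{(1-y)^3} \quad \text{in probability}.$$

Proof. Recalling (2.4) with $\delta = 0$ under the null hypothesis and applying the Central Limit Theorem with $D = \{d_1, \ldots, d_p\}$ given, we have

$$
\begin{aligned}
P\!\left(\frac{T^2}{n} \le \frac{y_n}{1-y_n} + \sqrt{\frac{2y}{(1-y)^3 n}}\,x\right)
&= E\!\left[P\!\left(\frac{\sum_{i=1}^{p}(w_i^2 - 1)d_i}{\sqrt{2\sum_{i=1}^{p} d_i^2}} \le \frac{\sqrt{n}\left(\frac{y_n}{1-y_n} - \sum_{i=1}^{p} d_i\right) + \sqrt{\frac{2y}{(1-y)^3}}\,x}{\sqrt{2n\sum_{i=1}^{p} d_i^2}} \,\middle|\, D\right)\right] \\
&= E\,\Phi\!\left(\frac{\sqrt{n}\left(\frac{y_n}{1-y_n} - \sum_{i=1}^{p} d_i\right) + \sqrt{\frac{2y}{(1-y)^3}}\,x}{\sqrt{2n\sum_{i=1}^{p} d_i^2}}\right) + o(1),
\end{aligned}
\tag{2.5}
$$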

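where $\Phi$ is the distribution function of a standard normal variable. On the other hand, as shown in the proof of Lemma 2.1, the Central Limit Theorem implies that the above quantity tends to $\Phi(x)$ for all $x$. Hence, by the type-convergence theorem (see Page 16 of Loeve (1977)), the lemma is proved.

Now we are in a position to derive an approximation of the power function of Hotelling's test.

Theorem 2.1. If $y_n = p/n \to y \in (0, 1)$, $N_1/(N_1 + N_2) \to \kappa \in (0, 1)$ and $\|\delta\| = o(1)$, then

$$\beta_H(\delta) - \Phi\!\left(-\xi_\alpha + \sqrt{\frac{n(1-y)}{2y}}\;\kappa(1-\kappa)\|\delta\|^2\right) \to 0, \tag{2.6}$$

where $\beta_H(\delta)$ is the power function of Hotelling's test.

Remark 2.1. The usual consideration of the alternative hypothesis in limiting theorems is to assume that $\sqrt{n}\,\|\delta\|^2 \to a > 0$. Under this additional assumption, it follows from (2.6) that the limiting power of Hotelling's test is given by $\Phi\!\left(-\xi_\alpha + ((1-y)/(2y))^{1/2}\,\kappa(1-\kappa)\,a\right)$. This formula shows that, for $y$ close to 1, the limiting power of Hotelling's test increases only slowly as the non-central parameter (namely $a$) increases.

Proof. Write $D = (d_1, \ldots, d_p)$. Using the facts $E v_1^2 = 1/p$, $E v_1^4 = 3/[p(p+2)]$ and $E v_1^2 v_2^2 = 1/[p(p+2)]$ and then applying Lemma 2.2, one easily obtains

$$n\,E\!\left[\left(\sum_{i=1}^{p} w_i v_i d_i\,\tau^{-1/2}\|\delta\|\right)^{2} \,\middle|\, D\right] = \frac{n\,\tau^{-1}\|\delta\|^2}{p}\sum_{i=1}^{p} d_i^2 \to 0 \quad \text{in Pr.,} \tag{2.7}$$

$$n\,E\!\left[\left(\sum_{i=1}^{p}(v_i^2 - E v_i^2)\,\tau^{-1}\|\delta\|^2 d_i\right)^{2} \,\middle|\, D\right] = n\,\tau^{-2}\|\delta\|^4\left[\frac{2}{p(p+2)}\sum_{i=1}^{p} d_i^2 - \frac{2}{p^2(p+2)}\left(\sum_{i=1}^{p} d_i\right)^{2}\right] \to 0 \quad \text{in Pr.} \tag{2.8}$$

and

$$\sum_{i=1}^{p}(E v_i^2)\,\tau^{-1}\|\delta\|^2 d_i = \frac{1}{p}\,\tau^{-1}\|\delta\|^2\sum_{i=1}^{p} d_i = \frac{y_n\,\tau^{-1}\|\delta\|^2}{p(1-y_n)}\left(1 + o_p\!\left(\frac{1}{\sqrt{n}}\right)\right). \tag{2.9}$$

Thus, by the above and Lemma 2.1, we have

$$
\begin{aligned}
\beta_H(\delta) &= P\!\left(\sum_{i=1}^{p} w_i^2 d_i \ge \frac{y_n}{1-y_n} + \sqrt{\frac{2y}{(1-y)^3 n}}\,\xi_\alpha - \frac{y_n\,\tau^{-1}\|\delta\|^2}{p(1-y_n)} + o\!\left(\frac{1}{\sqrt{n}}\right)\right) \\
&= E\!\left[P\!\left(\frac{\sum_{i=1}^{p}(w_i^2 - 1)d_i}{\sqrt{2\sum_{i=1}^{p} d_i^2}} \ge \frac{\sqrt{\frac{2y}{(1-y)^3 n}}\,\xi_\alpha - \frac{y_n\,\tau^{-1}\|\delta\|^2}{p(1-y_n)} + o\!\left(\frac{1}{\sqrt{n}}\right)}{\sqrt{2\sum_{i=1}^{p} d_i^2}} \,\middle|\, D\right)\right] \\
&= \Phi\!\left(-\xi_\alpha + \sqrt{\frac{n(1-y)}{2y}}\;\kappa(1-\kappa)\|\delta\|^2\right) + o(1).
\end{aligned}
\tag{2.10}
$$

The proof of Theorem 2.1 is now complete.

3. Discussion on Dempster's Non-Exact Test

Dempster (1958, 1960) proposed a non-exact test for the hypothesis described in Section 2, with the dimension of the data possibly greater than the sample degrees of freedom. First, let us briefly describe his test. Denote $N = N_1 + N_2$, $X' = (x_{11}, x_{12}, \ldots, x_{1N_1}, x_{21}, \ldots, x_{2N_2})$ and by

$$H' = \left(\frac{1}{\sqrt{N}}J_N,\ \left(\sqrt{\frac{N_2}{N_1(N_1+N_2)}}\,J_{N_1}',\ -\sqrt{\frac{N_1}{N_2(N_1+N_2)}}\,J_{N_2}'\right)',\ h_3, \ldots, h_N\right)$$

a suitably chosen orthogonal matrix, where $J_d$ is a $d$-dimensional column vector of 1's. Let $Y = HX = (y_1, \ldots, y_N)'$. Then, the vectors $y_1, \ldots, y_N$ are independent normal random vectors with $E(y_1) = (N_1\mu_1 + N_2\mu_2)/\sqrt{N}$, $E(y_2) = \tau^{-1/2}(\mu_1 - \mu_2)$, $E(y_j) = 0$ for $3 \le j \le N$, and $\mathrm{Cov}(y_j) = \Sigma$, $1 \le j \le N$. Dempster then proposed his non-exact significance test statistic

$$F = \frac{Q_2}{\sum_{i=3}^{N} Q_i / n},$$

where $Q_i = y_i' y_i$ and $n = N - 2$. He used the so-called $\chi^2$ approximation technique, assuming that $Q_i$ is approximately distributed as $m\chi_r^2$, where the parameters $m$ and $r$ may be solved by the method of moments. Then, the distribution of $F$ is approximately $F_{r, nr}$. But generally the parameter $r$ (its explicit form is given in (3.3) below) is unknown. He estimated $r$ by either of the following two ways.

Approach 1: $\hat{r}_1$ is the solution of the equation

$$t = \left[\frac{1}{\hat{r}_1} + \frac{1 + \frac{1}{n}}{3\hat{r}_1^2}\right](n - 1). \tag{3.1}$$

Approach 2: $\hat{r}_2$ is the solution of the equation

$$t + w = \left[\frac{1}{\hat{r}_2} + \frac{1 + \frac{1}{n}}{3\hat{r}_2^2}\right](n - 1) + \left[\frac{1}{\hat{r}_2} + \frac{3}{2\hat{r}_2^2}\right]\binom{n}{2}, \tag{3.2}$$

where $t = n\left[\ln\!\left(\frac{1}{n}\sum_{i=3}^{N} Q_i\right)\right] - \sum_{i=3}^{N}\ln Q_i$, $w = -\sum_{3 \le i < j \le N}\ln\sin^2\theta_{ij}$ and $\theta_{ij}$ is the angle between the vectors $y_i$ and $y_j$, $3 \le i < j \le N$. Dempster's test is then to reject $H_0$ if $F > F_\alpha(\hat{r}, n\hat{r})$.

By elementary calculus, we have

$$r = \frac{(\mathrm{tr}\,\Sigma)^2}{\mathrm{tr}(\Sigma^2)} \quad\text{and}\quad m = \frac{\mathrm{tr}(\Sigma^2)}{\mathrm{tr}\,\Sigma}. \tag{3.3}$$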
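The moment-matching parameters in (3.3) are simple trace functionals of $\Sigma$, so they are easy to evaluate when $\Sigma$ is known. The following is a minimal sketch, assuming NumPy is available; the function names `dempster_r_m` and `equicorrelation` are illustrative and not from the paper. The two covariance matrices are the ones used for the normal case in Section 5, $\Sigma = I_p$ and $\Sigma = (1-\rho)I_p + \rho J_p$.

```python
import numpy as np


def dempster_r_m(sigma):
    """Return (r, m) of (3.3): r = (tr S)^2 / tr(S^2), m = tr(S^2) / tr S."""
    sigma = np.asarray(sigma, dtype=float)
    tr1 = np.trace(sigma)
    # For a symmetric matrix, tr(S^2) equals the sum of its squared entries.
    tr2 = np.sum(sigma * sigma)
    return tr1 ** 2 / tr2, tr2 / tr1


def equicorrelation(p, rho):
    """Covariance (1 - rho) I_p + rho J_p, with J_p the all-ones matrix."""
    return (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))


p = 40
r_identity, m_identity = dempster_r_m(np.eye(p))
r_equi, m_equi = dempster_r_m(equicorrelation(p, 0.5))
print(f"identity:        r = {r_identity:.3f}, m = {m_identity:.3f}")
print(f"equicorrelation: r = {r_equi:.3f}, m = {m_equi:.3f}")
```

For the identity matrix $r$ equals $p$, its largest possible value; strong correlation among the coordinates makes $r$ much smaller than $p$.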

From (3.3) and the Cauchy-Schwarz inequality, it follows that $r \le p$. On the other hand, under regular conditions, both $\mathrm{tr}(\Sigma)$ and $\mathrm{tr}(\Sigma^2)$ are of the order $O(n)$, and hence $r$ is of the same order. Under the wider conditions (3.7) and (3.8) given in Theorem 3.1 below, it can be proved that $r \to \infty$. Further, we may prove that $t$ and $w$ are asymptotically normal, concentrating around $n/r$ and $\binom{n}{2}/r$ respectively. From these estimates, one may conclude that both $\hat{r}_1$ and $\hat{r}_2$ are ratio-consistent (in the sense that $\hat{r}/r \to 1$). Therefore, the solutions of equations (3.1) and (3.2) should satisfy

$$\hat{r}_1 = \frac{n}{t} + O(1) \tag{3.4}$$

and

$$\hat{r}_2 = \frac{1}{w}\binom{n}{2} + O(1) \tag{3.5}$$

respectively. Since the random effect may cause an error of order $O(1)$, one may simply choose the estimates of $r$ as $n/t$ or $\frac{1}{w}\binom{n}{2}$.

In the remainder of this section, we derive an asymptotic power function of Dempster's non-exact test, under the conditions: $p/n \to y > 0$, $N_1/(N_1 + N_2) \to \kappa \in (0, 1)$ and the parameter $r$ is known. The reader should note that the limiting ratio $y$ is allowed to be greater than one in this case, which is different from what was assumed in Section 2. When $r$ is unknown, substituting the estimators $\hat{r}_1$ or $\hat{r}_2$ for $r$ may cause an error of high order smallness in the approximation of the power function of Dempster's non-exact test, as will be seen in the proof of Theorem 3.1. Similar to Lemma 2.1, one may show the following lemma.

Lemma 3.1. When $n, r \to \infty$,

$$F_\alpha(r, nr) = 1 + \sqrt{2/r}\;\xi_\alpha + o(1/\sqrt{r}). \tag{3.6}$$

Then we have the following approximation of the power function of Dempster's test.

Theorem 3.1. If

$$\mu'\Sigma\mu = o(\tau\,\mathrm{tr}\,\Sigma^2), \tag{3.7}$$

$$\lambda_{\max} = o\!\left(\sqrt{\mathrm{tr}\,\Sigma^2}\right) \tag{3.8}$$

and $r$ is known, then

$$\beta_D(\mu) - \Phi\!\left(-\xi_\alpha + \frac{n\kappa(1-\kappa)\|\mu\|^2}{\sqrt{2\,\mathrm{tr}\,\Sigma^2}}\right) \to 0, \tag{3.9}$$

where $\mu = \mu_1 - \mu_2$.
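The two approximations (2.6) and (3.9) are closed-form expressions, so they can be compared numerically. The sketch below uses only the Python standard library and evaluates both at the sample sizes of the simulation study in Section 5 ($N_1 = 25$, $N_2 = 20$, $p = 40$); the function names, and the choices $\Sigma = I_p$ (so that $\|\delta\|^2 = \|\mu\|^2$ and $\mathrm{tr}\,\Sigma^2 = p$) and $\|\mu\|^2 = 1$, are assumptions made for the illustration. The formulas are asymptotic, so the values are indicative only.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def hotelling_power(n, y, kappa, delta_sq, alpha=0.05):
    """Asymptotic power of Hotelling's test according to (2.6)."""
    xi = _STD_NORMAL.inv_cdf(1.0 - alpha)
    shift = sqrt(n * (1.0 - y) / (2.0 * y)) * kappa * (1.0 - kappa) * delta_sq
    return _STD_NORMAL.cdf(-xi + shift)


def nonexact_power(n, kappa, mu_sq, tr_sigma_sq, alpha=0.05):
    """Asymptotic power of Dempster's test according to (3.9)."""
    xi = _STD_NORMAL.inv_cdf(1.0 - alpha)
    shift = n * kappa * (1.0 - kappa) * mu_sq / sqrt(2.0 * tr_sigma_sq)
    return _STD_NORMAL.cdf(-xi + shift)


# Illustrative setting: N1 = 25, N2 = 20, p = 40, Sigma = I_p, ||mu||^2 = 1.
n1, n2, p = 25, 20, 40
n = n1 + n2 - 2
kappa = n1 / (n1 + n2)
mu_sq = 1.0

power_h = hotelling_power(n, p / n, kappa, mu_sq)
power_d = nonexact_power(n, kappa, mu_sq, float(p))
print(f"Hotelling: {power_h:.3f}   non-exact: {power_d:.3f}")
```

With $\Sigma = I_p$ the two shifts differ exactly by the factor $\sqrt{1-y}$, which is small when $p$ is close to $n$.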

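Remark 3.1. In usual cases when considering the asymptotic power of Dempster's test, the quantity $\|\mu\|^2$ is ordinarily assumed to have the same order as $1/\sqrt{n}$ and $\mathrm{tr}(\Sigma^2)$ to have order $n$. Thus, the quantities $n\|\mu\|^2/\sqrt{\mathrm{tr}\,\Sigma^2}$ and $\sqrt{n}\,\|\delta\|^2$ are both bounded away from zero and infinity. The expression for the asymptotic power of Hotelling's test involves a factor $\sqrt{1-y}$, which disappears in the expression for the asymptotic power of Dempster's test. This reveals the reason why the power of the Hotelling test increases much more slowly than that of the Dempster test as the non-central parameter increases, if $y$ is close to one.

Proof. Let $\delta = (\delta_1, \ldots, \delta_p)' = \Sigma^{-1/2}\mu$. Then,

$$\beta_D(\mu) = P\!\left(\frac{\sum_{i=1}^{p}\left(y_i^2 + 2\tau^{-1/2}\delta_i y_i + \tau^{-1}\delta_i^2\right)\lambda_i}{\sum_{i=1}^{p}\sum_{j=1}^{n} z_{ij}^2\lambda_i} > n^{-1}F_\alpha(r, nr)\right), \tag{3.10}$$

where $y_i$, $z_{ij}$, $i = 1, \ldots, p$, $j = 1, \ldots, n$, are i.i.d. $N(0, 1)$ variables and $\lambda_1, \ldots, \lambda_p$ are the eigenvalues of $\Sigma$. By the Central Limit Theorem, the laws of large numbers, (3.7) and (3.8), one may easily show that

$$\frac{\sum_{i=1}^{p}\left(y_i^2 - 1 + 2\tau^{-1/2}\delta_i y_i\right)\lambda_i}{\sqrt{2\,\mathrm{tr}\,\Sigma^2}} \simeq \frac{\sum_{i=1}^{p}\left(y_i^2 - 1 + 2\tau^{-1/2}\delta_i y_i\right)\lambda_i}{\sqrt{2\,\mathrm{tr}\,\Sigma^2 + 4\tau^{-1}\mu'\Sigma\mu}} \xrightarrow{D} N(0, 1) \tag{3.11}$$

and

$$\sum_{i=1}^{p}\sum_{j=1}^{n} z_{ij}^2\lambda_i = n(\mathrm{tr}\,\Sigma)\left[1 + \sqrt{\frac{2}{nr}}\,N(0, 1) + o_p\!\left(\sqrt{\frac{1}{nr}}\right)\right]. \tag{3.12}$$

Noting that $\sum_{i=1}^{p}\lambda_i\delta_i^2 = \|\mu\|^2$ and $r = (\mathrm{tr}\,\Sigma)^2/\mathrm{tr}\,\Sigma^2$, the result (3.9) follows from (3.7) and Lemma 3.1 immediately. The proof of Theorem 3.1 is now complete.

4. A New Approach to Test $H_0$

In this section, we propose a new test for $H_0$. Instead of the normality of the underlying distributions, we assume:

(a) $x_{ij} = \Gamma z_{ij} + \mu_j$, $i = 1, \ldots, N_j$, $j = 1, 2$, where $\Gamma$ is a $p \times m$ matrix ($m \le \infty$) with $\Gamma\Gamma' = \Sigma$ and the $z_{ij}$ are i.i.d. random $m$-vectors with independent components satisfying $E z_{ij} = 0$, $\mathrm{Var}(z_{ij}) = I_m$, $E z_{ijk}^4 = 3 + \Delta < \infty$ and $E\prod_{k=1}^{m} z_{ijk}^{\nu_k} = 0$ (and 1) when there is at least one $\nu_k = 1$ (there are two $\nu_k$'s equal to 2, correspondingly), whenever $\nu_1 + \cdots + \nu_m = 4$;

(b) $p/n \to y > 0$ and $N_1/(N_1 + N_2) \to \kappa \in (0, 1)$;

(c) (3.7) and (3.8) are true.

Here and later, it should be noted that all random variables and parameters depend on $n$. For simplicity we omit the subscript $n$ from all random variables except those statistics defined later.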

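Now, we begin to construct our test. Consider the statistic

$$M_n = (\bar{x}_1 - \bar{x}_2)'(\bar{x}_1 - \bar{x}_2) - \tau\,\mathrm{tr}\,S_n, \tag{4.1}$$

where $S_n = \frac{1}{n}A$, and $\bar{x}_1$, $\bar{x}_2$ and $A$ are defined in Section 2. Under $H_0$, we have $E M_n = 0$. If the conditions (a) - (c) are true, it may be proved (see the Appendix) that under $H_0$,

$$Z_n = \frac{M_n}{\sqrt{\mathrm{Var}\,M_n}} \to N(0, 1) \quad \text{as } n \to \infty. \tag{4.2}$$

If the underlying distributions are normal as described in Section 2, then under $H_0$ we have

$$\sigma_M^2 := \mathrm{Var}\,M_n = 2\tau^2\left(1 + \frac{1}{n}\right)\mathrm{tr}\,\Sigma^2. \tag{4.3}$$

If the underlying distributions are not normal but satisfy the conditions (a) - (c), one may show (see the Appendix) that

$$\mathrm{Var}\,M_n = \sigma_M^2(1 + o(1)). \tag{4.4}$$

Hence (4.2) is still true if the denominator of $Z_n$ is replaced by $\sigma_M$. Therefore, to complete the construction of our test statistic, we need only find a ratio-consistent estimator of $\mathrm{tr}(\Sigma^2)$ and substitute it into the denominator of $Z_n$. It seems that a natural estimator of $\mathrm{tr}\,\Sigma^2$ should be $\mathrm{tr}\,S_n^2$. However, unlike the case where $p$ is fixed, $\mathrm{tr}\,S_n^2$ is generally neither unbiased nor ratio-consistent, even under the normal assumption. If $nS_n \sim W_p(n, \Sigma)$, it is routine to verify that

$$B_n^2 = \frac{n^2}{(n+2)(n-1)}\left(\mathrm{tr}\,S_n^2 - \frac{1}{n}(\mathrm{tr}\,S_n)^2\right)$$

is an unbiased and ratio-consistent estimator of $\mathrm{tr}\,\Sigma^2$. Here, it should be noted that $\mathrm{tr}\,S_n^2 - \frac{1}{n}(\mathrm{tr}\,S_n)^2 \ge 0$, by the Cauchy-Schwarz inequality. In the Appendix, we shall prove that $B_n^2$ is still a ratio-consistent estimator of $\mathrm{tr}\,\Sigma^2$ under the Conditions (a) - (c). Replacing $\mathrm{tr}\,\Sigma^2$ in (4.3) by the ratio-consistent estimator $B_n^2$, we obtain our test statistic

$$Z = \frac{(\bar{x}_1 - \bar{x}_2)'(\bar{x}_1 - \bar{x}_2) - \tau\,\mathrm{tr}\,S_n}{\tau\sqrt{\frac{2(n+1)n}{(n+2)(n-1)}\left(\mathrm{tr}\,S_n^2 - n^{-1}(\mathrm{tr}\,S_n)^2\right)}} = \frac{\frac{N_1 N_2}{N_1 + N_2}(\bar{x}_1 - \bar{x}_2)'(\bar{x}_1 - \bar{x}_2) - \mathrm{tr}\,S_n}{\sqrt{\frac{2(n+1)}{n}}\,B_n} \to N(0, 1). \tag{4.5}$$

Due to (4.5), the test rejects $H_0$ if $Z > \xi_\alpha$. Regarding the asymptotic power of our new test, we have the following theorem.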

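Theorem 4.1. Under the Conditions (a) - (c),

$$\beta_{BS}(\mu) - \Phi\!\left(-\xi_\alpha + \frac{n\kappa(1-\kappa)\|\mu\|^2}{\sqrt{2\,\mathrm{tr}\,\Sigma^2}}\right) \to 0. \tag{4.6}$$

Proof. Let $\bar{z}_j$ be the sample mean of $z_{ij}$, $i = 1, \ldots, N_j$, $j = 1, 2$, and let $M_n^0 = (\bar{z}_1 - \bar{z}_2)'\Gamma'\Gamma(\bar{z}_1 - \bar{z}_2) - \tau\,\mathrm{tr}(S_n)$. Then, $M_n^0$ has the same distribution as $M_n$ under $H_0$. Thus, $\mathrm{Var}(M_n^0) = \sigma_M^2(1 + o(1))$ and $M_n^0/\sqrt{\mathrm{Var}(M_n^0)} \to N(0, 1)$. Note that $M_n = M_n^0 + 2\mu'\Gamma(\bar{z}_1 - \bar{z}_2) + \|\mu\|^2$ and, by (3.7),

$$\mathrm{Var}(\mu'\Gamma(\bar{z}_1 - \bar{z}_2)) = \tau\mu'\Sigma\mu = o(\tau^2\,\mathrm{tr}(\Sigma^2)).$$

Hence, $\mathrm{Var}(M_n^0)/\mathrm{Var}(M_n) \to 1$ and consequently

$$\frac{M_n - \|\mu\|^2}{\sqrt{\mathrm{Var}(M_n^0)}} \to N(0, 1).$$

Note that $2\tau^2\frac{n+1}{n}B_n^2 / \mathrm{Var}(M_n^0) \to 1$. Hence,

$$Z - \frac{n\kappa(1-\kappa)\|\mu\|^2}{\sqrt{2\,\mathrm{tr}(\Sigma^2)}} \to N(0, 1).$$

This implies that

$$
\begin{aligned}
\beta_{BS}(\mu) = P_{H_1}(Z > \xi_\alpha) &= P\!\left(\frac{M_n - \|\mu\|^2}{\sqrt{\mathrm{Var}\,M_n^0}} > \xi_\alpha - \frac{n\kappa(1-\kappa)\|\mu\|^2}{\sqrt{2\,\mathrm{tr}\,\Sigma^2}} + o(1)\right) \\
&= \Phi\!\left(-\xi_\alpha + \frac{n\kappa(1-\kappa)\|\mu\|^2}{\sqrt{2\,\mathrm{tr}\,\Sigma^2}}\right) + o(1),
\end{aligned}
\tag{4.7}
$$

which completes the proof of the theorem.

5. Discussions and Simulations

Comparing Theorems 2.1, 3.1 and 4.1, we find that, from the point of view of large sample theory, Hotelling's test is less powerful than the other two tests when $y$ is close to one, and that the latter two tests have the same asymptotic power function. Our simulation results show that even for moderate sample and dimension sizes, Hotelling's test is still less powerful than the other two tests when the underlying covariance structure is reasonably regular (i.e., the structure of $\Sigma$ does not cause too large a difference between $\mu'\Sigma^{-1}\mu$ and $\sqrt{n}\,\|\mu\|^2/\sqrt{\mathrm{tr}(\Sigma^2)}$), whereas the Type I error does not change much in the latter two tests. It would not be hard to see that, using the approach of this paper, one may easily derive similar results for the one-sample problem, namely, Hotelling's test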

is less powerful than a non-exact test, which can be defined as in Section 4, when the dimension of the data is high.

Now, we would like to explain why this phenomenon happens. The reason for the lower power of Hotelling's test is the "inaccuracy" of the estimator of the covariance matrix. Let $X_1, \ldots, X_n$ be i.i.d. random $p$-vectors with mean 0 and variance-covariance matrix $I_p$. By the law of large numbers, the sample covariance matrix $S_n = n^{-1}\sum_{i=1}^{n} X_i X_i'$ should be "close" to the identity $I_p$ with an error of the order $O_p(1/\sqrt{n})$ when $p$ is fixed. However, when $p$ is proportional to $n$ (say $p/n \to y \in (0, 1)$), the ratio of the largest and the smallest eigenvalues of $S_n$ tends to $(1 + \sqrt{y})^2/(1 - \sqrt{y})^2$ (see, e.g., Bai, Silverstein and Yin (1988), Bai and Yin (1993), Geman (1980), Silverstein (1985) and Yin, Bai and Krishnaiah (1988)). More precisely, in the theory of spectral analysis of large dimensional random matrices, it has been proven that the empirical distribution of the eigenvalues of $S_n$ tends to a limiting distribution spreading over $[(1 - \sqrt{y})^2, (1 + \sqrt{y})^2]$ as $n \to \infty$ (see, e.g., Jonsson (1982), Wachter (1978), Yin (1986) and Yin, Bai and Krishnaiah (1983)). These results show that $S_n$ is not close to $I_p$. In particular, when $y$ is "close to" one, $S_n$ has many small eigenvalues and hence $S_n^{-1}$ has many huge eigenvalues. This causes the deficiency of the $T^2$ test. We believe that in many other multivariate statistical inferences involving an inverse of a sample covariance matrix, the same phenomenon should exist (as another example, see Saranadasa (1991, 1993)).

Here we would like to explain our quotation-marked "'close to' one". Note that the ratio of the largest to the smallest eigenvalues of $S_n$ tends to $(1 + \sqrt{y})^2/(1 - \sqrt{y})^2$. For our simulation example, $y = 0.93$ and the ratio of the extreme eigenvalues is about 3039. That is very serious. Even for $y$ as small as 0.1 or 0.01, the ratio can be as large as 3.705 and 1.494.
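The limiting ratio of extreme eigenvalues quoted above is a one-line formula in $y$, so the figures in the text can be checked directly. A minimal sketch using only the standard library follows; the function name `extreme_eigenvalue_ratio` is illustrative.

```python
from math import sqrt


def extreme_eigenvalue_ratio(y):
    """Limiting ratio (1 + sqrt(y))^2 / (1 - sqrt(y))^2 for 0 < y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    root = sqrt(y)
    return ((1.0 + root) / (1.0 - root)) ** 2


for y in (0.01, 0.1, 0.93):
    print(f"y = {y:<5} limiting ratio = {extreme_eigenvalue_ratio(y):.3f}")
```

The ratio grows without bound as $y$ approaches one, which is the regime where the inverse of the sample covariance matrix becomes unreliable.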
These values show that the dimension of the data need not be very close to the degrees of freedom for the effect of high dimension to be visible. In fact, this has been shown by our simulation for $p = 4$.

Dempster's test statistic depends on the choice of the vectors $h_3, h_4, \ldots, h_N$ because different choices of these vectors would produce different estimates of the parameter $r$. On the other hand, the estimation of $r$ and the rounding of the estimates may cause an error (probably an error of second order smallness) in Dempster's test. Thus, we conjecture that our new test can be more powerful than Dempster's in the second terms of an Edgeworth type expansion of their power functions. This conjecture was strongly supported by our simulation results. Because our test statistic is mathematically simple, it is not difficult to get an Edgeworth expansion by using the results obtained in Babu and Bai (1993), Bai and Rao (1991) or Bhattacharya and Ghosh (1978). It seems difficult to get a similar expansion for Dempster's test due to his complicated estimation of $r$.

We conducted our simulation study to compare the power of the three tests for both normal and non-normal cases. Let $N_1 = 25$, $N_2 = 20$, and $p = 40$. For the non-normal case, observations were generated by the following moving average model. Let $\{U_{ijk}\}$ be a set of independent gamma variables with shape parameter 4 and scale parameter 1. Define

$$X_{ijk} = U_{ijk} + \rho\,U_{i,j+1,k} + \mu_{jk} \quad (j = 1, \ldots, p;\ i = 1, \ldots, N_k;\ k = 1, 2),$$

where $\rho$ and the $\mu$'s are constants. Under this model, $\Sigma = (\sigma_{ij})$ with $\sigma_{ii} = 4(1 + \rho^2)$, $\sigma_{i,i\pm 1} = 4\rho$ and $\sigma_{ij} = 0$ for $|i - j| > 1$. For the normal case, the covariance matrices were chosen to be $\Sigma = I_p$ and $\Sigma = (1 - \rho)I_p + \rho J_p$ with $\rho = 0.5$, where $J_p$ is a $p \times p$ matrix with all entries one. Simulation was also conducted for small $p$ (chosen as $p = 4$). The tests were made for size $\alpha = 0.05$ with 1000 repetitions. The power is evaluated at the standard parameter $\eta = \|\mu_1 - \mu_2\|^2/\sqrt{\mathrm{tr}\,\Sigma^2}$.

The simulation for the non-normal case was conducted for $\rho = 0$, .3, .6 and .9 (Table 5.1 and Figure 5.1). All three tests have almost the same significance level. Under the alternative hypothesis, the power curves of Dempster's test and our test are rather close, but that of our test is always higher than that of Dempster's test. Theoretically, the power function for Hotelling's test should increase very slowly when the noncentral parameter increases. This was also demonstrated by our simulation results. The reader should note that there are only 1000 repetitions for each value of the noncentral parameter in our simulation, which may cause an error of $\sqrt{1/1000} = 0.0316$ by the Central Limit Theorem. It is therefore not surprising that the simulated power function of Hotelling's test, whose magnitude is only around 0.05, does not appear to increase at some points of the noncentral parameter.
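The non-normal simulation design above, together with the statistic $Z$ of (4.5), can be sketched in a few lines. This is a minimal illustration, assuming NumPy is available; the function names, the random seed, the reduced number of repetitions and the use of a common shift for all coordinates of $\mu_1$ are choices made here and are not taken from the paper.

```python
from statistics import NormalDist

import numpy as np


def bs_statistic(x1, x2):
    """Statistic Z of (4.5) for samples x1 (N1 x p) and x2 (N2 x p)."""
    n1, n2 = x1.shape[0], x2.shape[0]
    n = n1 + n2 - 2
    tau = (n1 + n2) / (n1 * n2)
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    centered = np.vstack([x1 - x1.mean(axis=0), x2 - x2.mean(axis=0)])
    s = centered.T @ centered / n              # S_n = A / n
    tr_s = np.trace(s)
    tr_s2 = np.sum(s * s)                      # tr(S_n^2), S_n is symmetric
    b2 = n ** 2 / ((n + 2) * (n - 1)) * (tr_s2 - tr_s ** 2 / n)
    m_n = diff @ diff - tau * tr_s             # M_n of (4.1)
    return m_n / (tau * np.sqrt(2.0 * (n + 1) / n * b2))


def gamma_moving_average(rng, size, p, rho, shift):
    """Rows X_j = U_j + rho * U_{j+1} + shift with U ~ Gamma(shape 4, scale 1)."""
    u = rng.gamma(shape=4.0, scale=1.0, size=(size, p + 1))
    return u[:, :p] + rho * u[:, 1:] + shift


def rejection_rate(rho, shift, reps=300, alpha=0.05, seed=0):
    """Fraction of repetitions in which Z exceeds the normal quantile."""
    rng = np.random.default_rng(seed)
    xi = NormalDist().inv_cdf(1.0 - alpha)
    n1, n2, p = 25, 20, 40
    hits = 0
    for _ in range(reps):
        x1 = gamma_moving_average(rng, n1, p, rho, shift)
        x2 = gamma_moving_average(rng, n2, p, rho, 0.0)
        hits += int(bs_statistic(x1, x2) > xi)
    return hits / reps


print("estimated size :", rejection_rate(rho=0.3, shift=0.0))
print("estimated power:", rejection_rate(rho=0.3, shift=0.5))
```

The traces are computed from the $p \times p$ matrix $S_n$ directly, which is adequate for $p = 40$; for much larger $p$ one would work with the $N \times N$ Gram matrix of the centered observations instead.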
Similar tables are presented for the normal case (Table 5.2 and Figure 5.2). For higher dimension cases the power functions of Dempster's test and our test are almost the same, and our method is not worse than Hotelling's test even for $p = 4$.

Acknowledgement

The research of the first author was partially supported by US NSF grant DMS and partially by ROC NSC grant NSC M L.

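[Table 5.1. Simulated power functions of the three tests with multivariate Gamma distribution. $N = 45$, $p = 40$, $\alpha = .05$. Column groups: ($\rho = 0$, $\lambda = 1$), ($\rho = .3$, $\lambda = 3.4$), ($\rho = .6$, $\lambda = 15.6$), ($\rho = .9$, $\lambda = 235.8$); each group has columns H, D, BS.]

[Table 5.2. Simulated power functions of the three tests with multivariate normal distribution. Two panels, each headed $N = 45$, $p = 40$, $\alpha = .05$. Column groups of the first panel: ($\rho = 0$, $\lambda = 1$), ($\rho = .5$, $\lambda = 41$); column groups of the second panel: ($\rho = 0$, $\lambda = 1$), ($\rho = .5$, $\lambda = 5$); each group has columns H, D, BS.]

H: Hotelling's F test, D: Dempster's non-exact F test, BS: Proposed normal test, $\eta = \|\mu_1 - \mu_2\|^2/\sqrt{\mathrm{tr}\,\Sigma^2}$ and $\lambda = \lambda_{\max}/\lambda_{\min}$.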

[Figure 5.1. Simulated power functions of the three tests with multivariate Gamma distribution. Four panels, for $\rho = 0.0$, $0.3$, $0.6$ and $0.9$; vertical axis: Power; curves: Hotelling's test, Dempster's test, BS's test.]

[Figure 5.2. Simulated power functions of the three tests with multivariate normal distribution. Four panels, for $\rho = 0.0$ and $\rho = 0.5$ at each of the two dimensions considered; vertical axis: Power.]

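Appendix. Asymptotics Related to the Statistic $M_n$

A.1. The proof of (4.4): By definition, we have

$$M_n = (1 + N_1\tau n^{-1})\|\bar{x}_1\|^2 + (1 + N_2\tau n^{-1})\|\bar{x}_2\|^2 - 2\bar{x}_1'\bar{x}_2 - \tau n^{-1}\sum_{j=1}^{2}\sum_{i=1}^{N_j}\|x_{ij}\|^2.$$

Under $H_0$, we may assume $\mu_1 = \mu_2 = 0$. Write $\Gamma = [\Gamma_1, \ldots, \Gamma_p]' = [\gamma_{k\ell}]$ and $\Gamma'\Gamma = [\psi_{k\ell}]$. Then, by Conditions (a) - (c), we have

$$
\begin{aligned}
\mathrm{Var}\!\left(\tau n^{-1}\sum_{j=1}^{2}\sum_{i=1}^{N_j}\|x_{ij}\|^2\right) &= \tau^2 n^{-2}\,E\!\left[\sum_{j=1}^{2}\sum_{i=1}^{N_j}\sum_{k=1}^{p}\left[(\Gamma_k' z_{ij})^2 - \|\Gamma_k\|^2\right]\right]^2 \\
&= \tau^2 n^{-2} N\left[2\,\mathrm{tr}(\Sigma^2) + \Delta\sum_{\ell=1}^{m}\psi_{\ell\ell}^2\right] \\
&\le C n^{-1}\tau^2\left[\mathrm{tr}\,\Sigma^2 + \lambda_{\max}\,\mathrm{tr}\,\Sigma\right] = o(\sigma_M^2).
\end{aligned}
$$

Similarly, we may show that

$$\mathrm{Var}(\bar{x}_1'\bar{x}_2) = N_1^{-2}N_2^{-2}\,E\!\left[\sum_{i=1}^{N_1}\sum_{\ell=1}^{N_2} x_{i1}'x_{\ell 2}\right]^2 = \frac{\mathrm{tr}(\Sigma^2)}{N_1 N_2},$$

$\mathrm{Var}(\|\bar{x}_1\|^2) = \frac{2}{N_1^2}\mathrm{tr}(\Sigma^2) + \frac{\Delta}{N_1^3}\sum_{\ell=1}^{m}\psi_{\ell\ell}^2$, $\mathrm{Var}(\|\bar{x}_2\|^2) = \frac{2}{N_2^2}\mathrm{tr}(\Sigma^2) + \frac{\Delta}{N_2^3}\sum_{\ell=1}^{m}\psi_{\ell\ell}^2$, $\mathrm{Cov}(\|\bar{x}_1\|^2, \|\bar{x}_2\|^2) = 0$ and $\mathrm{Cov}(\bar{x}_1'\bar{x}_2, \|\bar{x}_j\|^2) = 0$ for $j = 1, 2$. Therefore, by the fact that $\sum_{\ell=1}^{m}\psi_{\ell\ell}^2 \le p\lambda_{\max}^2$, we have

$$\mathrm{Var}(M_n) = \left[2\tau^2\,\mathrm{tr}(\Sigma^2) + \Delta\left(\frac{1}{N_1^3} + \frac{1}{N_2^3}\right)\sum_{\ell=1}^{m}\psi_{\ell\ell}^2\right](1 + o(1)) = \sigma_M^2(1 + o(1)).$$

The proof of (4.4) is then complete.

A.2. The asymptotic normality of $Z_n$ under $H_0$: From the proof of A.1, one can see that $\tau(\mathrm{tr}(S_n) - \mathrm{tr}(\Sigma))/\sigma_M \to 0$. Therefore, to show that $Z_n \to N(0, 1)$, we need only show that $[\|\bar{x}_1 - \bar{x}_2\|^2 - E(\|\bar{x}_1 - \bar{x}_2\|^2)]/\sigma_M \to N(0, 1)$. We may rewrite

$$
\begin{aligned}
\|\bar{x}_1 - \bar{x}_2\|^2 - E(\|\bar{x}_1 - \bar{x}_2\|^2) &= \sum_{k=1}^{p}\left[(\Gamma_k'(\bar{z}_1 - \bar{z}_2))^2 - \tau\|\Gamma_k\|^2\right] \\
&:= \sigma_M\sum_{\ell=1}^{m}\sum_{h=1}^{N}\left[U_{\ell,h} + V_{\ell,h} + \sigma_M^{-1}\psi_{\ell\ell}\left(w_{\ell,h}^2 - E(w_{\ell,h}^2)\right)\right],
\end{aligned}
$$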

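where $\bar{z}_j = N_j^{-1}\sum_{i=1}^{N_j} z_{ij}$, $\bar{z}_{jk}$ denotes the $k$th component of $\bar{z}_j$, and

$$U_{\ell,h} = 2\sigma_M^{-1}\,w_{\ell,h}\,\psi_{\ell\ell}\sum_{k_1=1}^{h-1} w_{\ell,k_1}, \qquad V_{\ell,h} = 2\sigma_M^{-1}\,w_{\ell,h}\sum_{\ell_1=1}^{\ell-1}\psi_{\ell_1\ell}\,(\bar{z}_{1\ell_1} - \bar{z}_{2\ell_1}),$$

with the convention that $\sum_{\ell_1=1}^{0} = 0$ and the notation

$$w_{\ell,h} = \begin{cases} \dfrac{1}{N_1}\,z_{h1\ell} & \text{if } h = 1, \ldots, N_1, \\[2mm] -\dfrac{1}{N_2}\,z_{h-N_1,2,\ell} & \text{if } h = N_1 + 1, \ldots, N. \end{cases}$$

Since

$$\mathrm{Var}\!\left(\sigma_M^{-1}\sum_{\ell=1}^{m}\sum_{h=1}^{N}\psi_{\ell\ell}\left(w_{\ell,h}^2 - E(w_{\ell,h}^2)\right)\right) = (2 + \Delta)\left(\frac{1}{N_1^3} + \frac{1}{N_2^3}\right)\sum_{\ell=1}^{m}\psi_{\ell\ell}^2\Big/\sigma_M^2 \to 0,$$

we need only show that $\sum_{\ell=1}^{m}\sum_{h=1}^{N}[U_{\ell,h} + V_{\ell,h}] \to N(0, 1)$. Note that $\{U_{N(\ell-1)+h} = U_{\ell,h} + V_{\ell,h}\}$ forms a sequence of martingale differences with $\sigma$-fields $\mathcal{F}_{N(\ell-1)+h} = \mathcal{F}(z_{ijt},\ j = 1, 2,\ t < \ell,\ i = 1, \ldots, N_j$ and $w_{\ell,i},\ i \le h)$. Then the asymptotic normality may be proved by employing Corollary 3.1 in Hall and Heyde (1980) with routine verification of the following:

$$\sum_{\ell=1}^{m}\sum_{h=1}^{N} E\!\left(U_{\ell,h}^4 + V_{\ell,h}^4\right) \to 0$$

and

$$\mathrm{Var}\!\left(\sum_{\ell=1}^{m}\sum_{h=1}^{N} E\!\left[(U_{\ell,h} + V_{\ell,h})^2 \,\middle|\, \mathcal{F}_{N(\ell-1)+h}\right]\right) \to 0.$$

The proof of (4.2) is now complete.

A.3. The ratio-consistency of $B_n^2$: We need only show that $\tilde{B}_n^2 = \mathrm{tr}\,S_n^2 - \frac{1}{n}(\mathrm{tr}\,S_n)^2$ is ratio-consistent for $\mathrm{tr}(\Sigma^2)$. Without loss of generality, we assume that $\mu_1 = \mu_2 = 0$. Note that

$$S_n = n^{-1}\left[\sum_{j=1}^{2}\sum_{i=1}^{N_j} x_{ij}x_{ij}' - N_1\bar{x}_1\bar{x}_1' - N_2\bar{x}_2\bar{x}_2'\right].$$

Since $E\bar{x}_j'\bar{x}_j = N_j^{-1}\mathrm{tr}(\Sigma) = o(\sqrt{\mathrm{tr}(\Sigma^2)})$, $j = 1, 2$, it follows that $\bar{x}_j'\bar{x}_j = o_p(\sqrt{\mathrm{tr}(\Sigma^2)})$. Therefore, we need only show that

$$\hat{B}_n^2 = \mathrm{tr}\!\left(\frac{1}{n}\sum_{j=1}^{2}\sum_{i=1}^{N_j} x_{ij}x_{ij}'\right)^{2} - \frac{1}{n}\left[\mathrm{tr}\!\left(\frac{1}{n}\sum_{j=1}^{2}\sum_{i=1}^{N_j} x_{ij}x_{ij}'\right)\right]^{2}$$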

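is a ratio-consistent estimator of $\mathrm{tr}(\Sigma^2)$. By elementary calculation, we have $E\!\left(\mathrm{tr}\!\left(\frac{1}{n}\sum_{j=1}^{2}\sum_{i=1}^{N_j} x_{ij}x_{ij}'\right)\right) = \frac{N}{n}\mathrm{tr}(\Sigma)$ and $\mathrm{Var}\!\left(\mathrm{tr}\!\left(\frac{1}{n}\sum_{j=1}^{2}\sum_{i=1}^{N_j} x_{ij}x_{ij}'\right)\right) = O(n^{-1}\mathrm{tr}(\Sigma^2))$. These, together with the bound $\mathrm{tr}(\Sigma) \le \sqrt{p\,\mathrm{tr}(\Sigma^2)}$, imply that

$$\frac{1}{n}\left[\mathrm{tr}\!\left(\frac{1}{n}\sum_{j=1}^{2}\sum_{i=1}^{N_j} x_{ij}x_{ij}'\right)\right]^{2} = \frac{N^2}{n^3}(\mathrm{tr}(\Sigma))^2 + o_p(\mathrm{tr}(\Sigma^2)).$$

Rewrite

$$
\begin{aligned}
\mathrm{tr}\!\left(\frac{1}{n}\sum_{j=1}^{2}\sum_{i=1}^{N_j} x_{ij}x_{ij}'\right)^{2} &= \frac{N^2}{n^2}\mathrm{tr}(\Sigma^2) + \frac{2N}{n^2}\sum_{j=1}^{2}\sum_{i=1}^{N_j}\mathrm{tr}\!\left((\Gamma'\Gamma)^2(z_{ij}z_{ij}' - I_m)\right) \\
&\quad + \frac{1}{n^2}\sum_{j=1}^{2}\sum_{i=1}^{N_j}\sum_{j'=1}^{2}\sum_{i'=1}^{N_{j'}}\mathrm{tr}\!\left((\Gamma'\Gamma)(z_{ij}z_{ij}' - I_m)(\Gamma'\Gamma)(z_{i'j'}z_{i'j'}' - I_m)\right) \\
&:= \frac{N^2}{n^2}\mathrm{tr}(\Sigma^2) + H_1 + H_2.
\end{aligned}
$$

We have $E(H_1) = 0$ and

$$\mathrm{Var}(H_1) = \frac{4N^3}{n^4}\left[2\,\mathrm{tr}(\Sigma^4) + \Delta\sum_{i=1}^{m}\left([(\Gamma'\Gamma)^2]_{ii}\right)^2\right] = o(\mathrm{tr}^2(\Sigma^2)).$$

Thus, $H_1 = o_p(\mathrm{tr}(\Sigma^2))$.

Write $H_2 = H_{21} + H_{22} + H_{23} + H_{24} + H_{25}$, where $H_{21}$ collects the terms of $H_2$ with $(i, j) \ne (i', j')$,

$$H_{21} = \frac{1}{n^2}\sum_{(i,j) \ne (i',j')}\mathrm{tr}\!\left((\Gamma'\Gamma)(z_{ij}z_{ij}' - I_m)(\Gamma'\Gamma)(z_{i'j'}z_{i'j'}' - I_m)\right),$$

and the remaining terms, with $(i, j) = (i', j')$, are split according to which index pairs coincide:

$$H_{22} = \frac{1}{n^2}\sum_{j=1}^{2}\sum_{i=1}^{N_j}\sum_{(k' \ne \ell,\ \ell' \ne k)}\psi_{k\ell}\,\psi_{k'\ell'}\,(z_{ij\ell}z_{ijk'}z_{ij\ell'}z_{ijk}),$$

while $H_{23}$, $H_{24}$ and $H_{25}$ are, up to constant factors, the sums over $j$ and $i$ of

$$\sum_{\ell \ne k \ne \ell'}\psi_{k\ell}\,\psi_{\ell\ell'}\,(z_{ij\ell}^2 - 1)(z_{ij\ell'}z_{ijk}), \qquad \sum_{\ell \ne k}\psi_{k\ell}\,\psi_{\ell\ell}\,(z_{ij\ell}^2 - 1)(z_{ij\ell}z_{ijk}) \qquad\text{and}\qquad \sum_{k,\ell=1}^{m}\psi_{k\ell}^2\,(z_{ij\ell}^2 - 1)(z_{ijk}^2 - 1),$$

respectively, each divided by $n^2$.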

We have $E(H_{21}) = 0$, $E(H_{22}) = \frac{N}{n^2}\left[\mathrm{tr}(\Sigma^2) + \mathrm{tr}^2(\Sigma) - 2\sum_{k=1}^{m}\psi_{kk}^2\right] = \frac{N}{n^2}\mathrm{tr}^2(\Sigma) + o(\mathrm{tr}(\Sigma^2))$ and

$$\mathrm{Var}(H_{21}) = \frac{2N(N-1)}{n^4}\left[4\,\mathrm{tr}^2(\Sigma^2) + 4\Delta\sum_{i,j,t=1}^{m}\psi_{ij}^2\psi_{it}^2 + \Delta^2\sum_{i,j=1}^{m}\psi_{ij}^4\right] = o(\mathrm{tr}^2(\Sigma^2)).$$

Similarly, we may show that $\mathrm{Var}(H_{22})$ and $\mathrm{Var}(H_{23})$ have the same order. Finally, one may show that

$$E|H_{24}| \le \frac{CN}{n^2}\sum_{\ell=1}^{m}\psi_{\ell\ell}\,E\left|\sum_{k=1}^{m}\psi_{k\ell}z_{11k}\right| \le \frac{CN}{n^2}\sum_{\ell=1}^{m}\psi_{\ell\ell}\sqrt{\sum_{k=1}^{m}\psi_{k\ell}^2} \le \frac{CN}{n^2}\lambda_{\max}\,\mathrm{tr}(\Sigma) = o(\mathrm{tr}(\Sigma^2))$$

and

$$E|H_{25}| \le \frac{CN}{n^2}\sum_{k,\ell=1}^{m}\psi_{k\ell}^2 = o(\mathrm{tr}(\Sigma^2)).$$

Combining the above, we obtain $H_2 = \frac{N}{n^2}\mathrm{tr}^2(\Sigma) + o_p(\mathrm{tr}(\Sigma^2))$. Thus, $\hat{B}_n^2 = \mathrm{tr}(\Sigma^2)[1 + o_p(1)]$ and consequently, the ratio-consistency of $\hat{B}_n^2$ follows.

References

Babu, G. J. and Bai, Z. D. (1993). Edgeworth expansions of a function of sample means under minimal moment conditions and partial Cramer's conditions. Sankhya Ser. A 55.

Bai, Z. D., Krishnaiah, P. R. and Zhao, L. C. (1989). On rates of convergence of efficient detection criteria in signal processing with white noise. IEEE Information 35.

Bai, Z. D. and Rao, C. R. (1991). Edgeworth expansion of a function of sample means. Ann. Statist. 19.

Bai, Z. D., Silverstein, J. W. and Yin, Y. Q. (1988). A note on the largest eigenvalue of a large dimensional sample covariance matrix. J. Multivariate Anal. 26.

Bai, Z. D. and Yin, Y. Q. (1993). Limit of the smallest eigenvalue of large dimensional sample covariance matrix. Ann. Probab. 21.

Bhattacharya, R. N. and Ghosh, J. K. (1988). On moment conditions for valid formal Edgeworth expansions. J. Multivariate Anal. 27.

Chung, J. H. and Fraser, D. A. S. (1958). Randomization tests for a multivariate two-sample problem. J. Amer. Statist. Assoc. 53.

Dempster, A. P. (1958). A high dimensional two sample significance test. Ann. Math. Statist. 29.

Dempster, A. P. (1960). A significance test for the separation of two highly multivariate small samples. Biometrics 16.

Geman, S. (1980). A limit theorem for the norm of random matrices. Ann. Probab. 8, 252-261.

Hall, P. G. and Heyde, C. C. (1980). Martingale Limit Theory and Its Applications. Academic Press, New York.

Huber, Peter J. (1973). Robust regression: Asymptotics, conjectures and Monte Carlo. Ann. Statist. 1.

Jonsson, D. (1982). Some limit theorems for the eigenvalues of a sample covariance matrix. J. Multivariate Anal. 12.

Loeve, M. (1977). Probability Theory, 4th Ed. Springer-Verlag, New York.

Narayanaswamy, C. R. and Raghavarao, D. (1991). Principal component analysis of large dispersion matrices. Appl. Statist. 40.

Portnoy, S. (1984). Asymptotic behavior of M-estimators of p regression parameters when p²/n is large. I. Consistency. Ann. Statist. 12.

Portnoy, S. (1985). Asymptotic behavior of M-estimators of p regression parameters when p²/n is large: II. Normal approximation (Corr: 91V19 p2282). Ann. Statist. 13.

Saranadasa, H. (1991). Discriminant analysis based on experimental design concepts. Ph.D. Thesis, Department of Statistics, Temple University.

Saranadasa, H. (1993). Asymptotic expansion of the misclassification probabilities of D- and A-criteria for discrimination from two high dimensional populations using the theory of large dimensional random matrices. J. Multivariate Anal. 46.

Silverstein, J. W. (1985). The smallest eigenvalue of a large dimensional Wishart matrix. Ann. Probab. 13.

Wachter, K. W. (1978). The strong limits of random matrix spectra for sample matrices of independent elements. Ann. Probab. 6.

Yin, Y. Q. (1986). Limiting spectral distribution for a class of random matrices. J. Multivariate Anal. 20.

Yin, Y. Q., Bai, Z. D. and Krishnaiah, P. R. (1983). Limiting behavior of the eigenvalues of a multivariate F matrix. J. Multivariate Anal. 13.

Yin, Y. Q., Bai, Z. D. and Krishnaiah, P. R. (1988). On the limit of the largest eigenvalue of the large dimensional sample covariance matrix. Probab. Theory Related Fields 78.

Zhao, L. C., Krishnaiah, P. R. and Bai, Z. D. (1986a). On detection of the number of signals in presence of white noise. J. Multivariate Anal. 20, 1-25.

Zhao, L. C., Krishnaiah, P. R. and Bai, Z. D. (1986b). On detection of the number of signals when the noise covariance matrix is arbitrary. J. Multivariate Anal. 20.

Department of Applied Mathematics, National Sun Yat-sen University, Kaohsiung 80424, Taiwan.

The R. W. Johnson Pharmaceutical Research Institute, Preclinical Biostatistics, Welsh and McKean Road, Spring House, PA, U.S.A.

(Received July 1993; accepted April 1995)