On Mardia s Tests of Multinormality


 Jessie Dorsey
 1 years ago
 Views:
Transcription
1 On Mardia s Tests of Multinormality Kankainen, A., Taskinen, S., Oja, H. Abstract. Classical multivariate analysis is based on the assumption that the data come from a multivariate normal distribution. The tests of multinormality have therefore received very much attention. Several tests for assessing multinormality, among them Mardia s popular multivariate skewness kurtosis statistics, are based on stardized third fourth moments. In Mardia s construction of the affine invariant test statistics, the data vectors are first stardized using the sample mean vector the sample covariance matrix. In this paper we investigate whether, in the test construction, it is advantagoeus to replace the regular sample mean vector sample covariance matrix by their affine equivariant robust competitors. Limiting distributions of the stardized third fourth moments the resulting test statistics are derived under the null hypothesis are shown to be strongly dependent on the choice of the location vector scatter matrix estimate. Finally, the effects of the modification on the limiting finitesample efficiencies are illustrated by simple examples in the case of testing for the bivariate normality. In the cases studied, the modifications seem to increase the power of the tests. 1. Introduction Classical methods of multivariate analysis are for the most part based on the assumption that the data are coming from a multivariate normal population. In the onesample multinormal case, the regular sample mean vector sample covariance matrix are complete sufficient, but on the other h highly sensitive to outlying observations. It is also well known that the tests estimates based on the sample mean vector sample covariance matrix have poor efficiency properties in the case of heavy tailed noise distributions. Testing for departures from multinormality helps to make practical choices between competing methods of analysis. In the univariate case, stardized third fourth moments b 1 b 2 are often used to indicate the skewness kurtosis. For a rom sample x 1,..., x n from a pvariate distribution with sample mean vector x sample covariance Received by the editors May 2, Mathematics Subject Classification. Primary 62H15; Secondary 62F02. Key words phrases. Kurtosis, Multinormality, Robustness, Skewness.
2 2 Kankainen, A., Taskinen, S., Oja, H. matrix S, Mardia (1970,1974,1980) defined the pvariate skewness kurtosis statistics as b 1,p = ave i,j [(x i x) T S 1 (x j x)] 3 b 2,p = ave i [(x i x) T S 1 (x i x)] 2, respectively. The statistics b 1,p b 2,p are functions of stardized third fourth moments, respectively. From the definition one easily sees that b 1,p b 2,p are invariant under affine transformations. In the univariate case these reduce to the usual univariate skewness kurtosis statistics b 1 b 2. Mardia advocated using the skewness kurtosis statistics to test for multinormality as they are distributionfree under multinormality. See also Bera John (1983) Koziol (1993) for the use of the stardized third fourth moments in the test construction. Intuitively, as in the univariate case, the skewness kurtosis statistics b 1,p b 2,p compare the variation measured by third fourth moments to that measured by second moments. The second, third fourth moments are all highly nonrobust statistics, however, more efficient procedures may be obtained through comparisons of robust nonrobust measures of variations. It is therefore an appealing idea to study whether it is useful or reasonable to replace the regular sample mean vector x sample covariance matrix S by some affine equivariant robust competitors, location vector estimate ˆµ scatter matrix estimate ˆΣ. The paper is organised as follows. In Section 2 the limiting distributions of Mardia s test statistics under the null model of multinormality are discussed. The test statistics are based on the stardized third fourth moments; the regular mean vector regular covariance matrix are used in the stardization. Section 3 lists the tools assumptions for alternative nconsistent affine equivariant location scatter estimates. The location scatter estimates are then used in the regular manner to stardize the data vectors. The limiting distributions of the stardized third fourth moments in this general case are found in Section 4. As an illustration of the theory, the effect of the way of stardization on the limiting small sample efficiencies in the bivariate case are studied in Sections 5 6. The auxiliary lemmas proofs have been postponed to the Appendix. 2. Limiting null distributions of Mardia s statistics In this section we recall the wellknown results concerning the limiting null distributions of the Mardia s skewness kurtosis statistics. As, due to affine invariance, the distributions of the test statistics do not depend on the unknown µ Σ, it is not a restriction to assume in the following derivations that x 1,..., x n is a rom sample from the N p (0, I p ) distribution. Write z i = S 1/2 (x i x), i = 1,..., n,
3 On Mardia s Tests of Multinormality 3 for the data vectors stardized in the classical way. Mardia s skewness statistic can be decomposed as b 1,p = 6 j<k<l[ave i {z ij z ik z il }] i {zijz j k[ave 2 ik }] 2 + [ave i {zij}] 3 2 j the Mardia s kurtosis statistic is similarly b 2,p = j k ave i {z 2 ijz 2 ik} + j ave i {z 4 ij}. See e.g. Koziol (1993). Under multinormality b 1,p b 2,p are affine invariant, all the stardized third fourth moments are asymptotically normal asymptotically independent consequently the limiting distributions of n b 1,p 6 n b 2,p p(p + 2) 8p(p + 2) are a chisquare distribution with p(p+1)(p+2)/6 degrees of freedom a N(0, 1) distribution, respectively. 3. Location scatter estimates In Mardia s test construction, the data vectors are stardized using the regular sample mean vector x sample covariance matrix S. In this paper we thus consider what happens if these are replaced by other nconsistent affine equivariant location scatter estimates of µ Σ, say, ˆµ ˆΣ. Next we list assumptions useful notations for location vector estimate ˆµ scatter matrix estimate ˆΣ. First we assume that, for the N p (0, I p ) distribution, the influence functions of these location scatter functionals at z are γ(r) u α(r) uu T β(r) I p, respectively, where r = z u = z 1 z. We will later see that the limiting distributions of the modified test statistics depend on the used location scatter estimates through these functions. For the inluence functions of location scatter functionals, see also Hampel et al. (1986), Croux Haesbroek (2000) Ollila et al. (2003a,b). We further assume that, again under the N p (0, I p ) distribution, t = n ave{γ(r i )u i } = n ˆµ + o P (1) C = n ave{α(r i )u i u T i β(r i )I p } = n (ˆΣ I p ) + o P (1). Finally, assume that, for the N p (0, I p ) distribution, the expected values E(γ 2 (r)), E(α 2 (r)) E(β 2 (r)) are finite. These assumptions imply that the limiting distributions of nˆµ n vec(ˆσ I p ) are p p 2 variate normal distributions
4 4 Kankainen, A., Taskinen, S., Oja, H. with zero mean vectors covariance matrices E[γ 2 (r)] I p p E[α 2 (r)] p(p + 2) [I p 2+I p,p+vec(i p )vec(i p ) T ]+ (E[β 2 (r)] + 2p ) E[α(r)β(r)] vec(i p )vec(i p ) T, respectively. (Here I p,p is the so called commutation matrix. See e.g. Ollila et al. (2003b).) 4. Limiting distribution of the stardized third fourth moments Consider now any estimators ˆµ ˆΣ write z i = ˆΣ 1/2 (x i ˆµ), i = 1,..., n, for the stardized observations. Denote the stardized third fourth moments by U jkl = n ave i {z ij z ik z il }, V jk = n ave i {z 2 ijz 2 ik 1} V jj = n ave i {z 4 ij 3}. The limiting distributions of the stardized third fourth moments are given by the following theorem. The results easily follow from Lemmas stated proven in the Appendix. Theorem 4.1. Under multinormality, the limiting distribution of the vector of all possible third moments U jkl (j < k < l), U jjk (j k) U jjj is a multivariate normal distribution with zero mean vector limiting variances covariances where V ar(u jkl ) = 1 V ar(u jjk ) = 2 + κ V ar(u jjj ) = 6 + 9κ Cov(U jjk, U llk ) = κ Cov(U jjk, U kkk ) = 3κ κ = 1 + E[γ2 (r)] p All the other covariances are zero. 2 E[r3 γ(r)] p(p + 2).
5 On Mardia s Tests of Multinormality 5 Theorem 4.2. Under multinormality, the limiting distribution of the vector of fourth moments V jk (j k) V jj is a multivariate normal distribution with zero mean vector limiting variances covariances V ar(v jk ) = 4 + τ 1 + 2τ 2 V ar(v jj ) = τ τ 2 Cov(V jk, V jl ) = τ 1 + τ 2 Cov(V jk, V ml ) = τ 1 Cov(V jj, V kk ) = 9τ 1 Cov(V jk, V jj ) = 3τ 1 + 6τ 2 Cov(V jk, V ll ) = 3τ 1 where τ 1 = 4 [ E[α 2 ] (r)] p(p + 2) 2E[α(r)β(r)] + E[β 2 (r)] E[α(r)r4 ] p p(p + 2)(p + 4) + E[β(r)r4 ] p(p + 2) τ 2 = E[α2 (r)] p(p + 2) 4 E[α(r)r 4 ] p(p + 2)(p + 4). We close this section with some discussion on the implications of the above results. First note that, if the regular mean vector sample covariance matrix are used in the stardization, then simply which gives γ(r) = r, α(r) = r 2 β(r) = 1 κ = τ 1 = τ 2 = 0 all the third moments fourth moments are asymptotically mutually independent, respectively. For other location scatter statistics this is not necessarily true also the limiting variances may vary. This means that the limiting distributions efficiencies of b 1,p b 2,p may drastically change. The limiting distribution of nb 1,p /6 is a weighted sum of chisquare variables with one degree of freedom which makes the calculation of the approximate pvalue less practical. The limiting distribution of n(b 2,p p(p + 2)) is still normal with mean value zero, but its limiting variance depends on the chosen scatter matrix. In the last two section we illustrate the impact of the choice of the location vector scatter matrix estimate on the limiting distribution efficiency in the bivariate case. The extension to the general pvariate case is straightforward. 5. Tests for bivariate normality Assume now that x 1,..., x n is a rom sample from an unknown bivariate distribution we wish to test the null hypothesis H 0 of bivariate normality. For
6 6 Kankainen, A., Taskinen, S., Oja, H. limiting power considerations we use the contaminated bivariate normal distribution, the so called Tukey distribution: We say that the distribution of x is a bivariate Tukey distribution T (, µ, σ) if the pdf of x is f(x) = (1 )φ(x) + σ 2 φ((x µ)/σ), where φ(x) is the pdf of N 2 (0, I 2 ) distribution [0, 1]. Alternative sequences H n 1 : T (, µ, 1) H n 2 : T (, 0, σ) with = δ/ n, r = µ > 0 σ > 1 are then used for the comparisons of the skewness kurtosis test statistics, respectively. Write now T 1 = (U 112, U 221, U 111, U 222 ) T T 2 = (V 11, V 22, V 12 ) T for the vectors of the stardized third fourth moments. Using the results in Section 4, one directly gets the following results. Corollary 5.1. Under H 0, the limiting distribution of nt 1 is a 4variate normal distribution with zero mean vector covariance matrix Ω 1 = κ Corollary 5.2. Under H 0, the limiting distribution of nt 2 is a 3variate normal distribution with zero mean vector covariance matrix Ω 2 = τ τ It is now possible to construct the limiting distributions of the affine invariant test statistics b 1,2 b 2,2. The comparison of the effect of different stardizations is possible but not easy, however, as the limiting distributions may be of different type. This is avoided if one uses the following test statistic versions: Definition 5.3. Modified Mardia s skewness kurtosis statistics in the bivariate case are b 1,2 = T 1 T Ω 1 1 T 1 b 2,2 = (1, 1, 2)T 2. Note that, if the sample mean covariance matrix are used, the definition gives b 1,2 = b 1,2 /6 b 2,2 = b 2,2 8, respectively. In the former case, with general ˆµ ˆΣ, the affine equivariance property is lost. The alternative to the regular sample mean vector sample covariance matrix used in this example is the affine equivariant Oja median (Oja, 1983) the related scatter matrix estimate based on the Oja sign covariance matrix (SCM)..
7 On Mardia s Tests of Multinormality 7 See Visuri et al. (2000) Ollila et al. (2003b) for the SCM. Note also that the scatter matrix estimate based on the Oja SCM is asymptotically equivalent with the zonoid covariance matrix (ZCM) can be seen as an affine equivariant extension on mean deviation. For the definition use of the ZCM, see Koshevoy et al (2003). We chose these statistics as their influence functions are relatively simple at the N 2 (0, I 2 ) case, the influence functions are given by [ ] 2 2 γ(r) = 2 π, α(r) = 2 2 π r 1 β(r) = 1. The constants needed for the asymptotic variances covariances are then κ = 4 π 1 2, τ 1 = 32 π 29 τ 2 = 16 3 π Consider first the alternative contiguous sequence H1 n with µ = (r, 0,..., 0) T. Using classical LeCam s third lemma one then easily sees that, under the alternative sequence H1 n, the limiting distribution of nb 1,2 is a noncentral chisquare distribution with 4 degrees of freedom noncentrality parameter δ T 1 Ω 1 1 δ 1, where δ 1 depends on the location estimate used in the stardization. In the regular case (sample mean vector), the Oja median gives δ 1 = (0, 0, r 3, 0) T δ 1 = (0, r q(r)2 2/π, r 3 + 3r q(r)6 2/π, 0) T. The function q(r) = R(µ) is easily computable defined through the so called spatial rank function R(µ) = E( µ x 1 (µ x)) with x N 2 (0, I 2 ). See Möttönen Oja (1995). Next we consider the sequence H2 n.then V ar(b 2,2) = (1, 1, 2) Ω under H2 n the limiting distribution of n(b 2,2) 2 /V ar(b 2,2) is a noncentral chisquare distribution with one degree of freedom noncentrality parameter [(1, 1, 2)δ 2 ] 2 /V ar(b 2,2), where δ 2 in this case depends on the scatter estimate used in the stardization. In the regular case (sample covariance matrix) we get the ZCM gives δ 2 = (3(σ 2 1) 2, 3(σ 2 1) 2, (σ 2 1) 2 ) T δ 2 = (3(σ 4 1) 12(σ 1), 3(σ 4 1) 12(σ 1), (σ 4 1) 4(σ 1)) T In Figure 1 the asymptotical relative efficiencies of the new tests (using the Oja median the ZCM estimate) relative to the regular tests (using the sample mean sample covariance matrix) are illustrated for different values of r
8 8 Kankainen, A., Taskinen, S., Oja, H. for different values of σ. The AREs are then simply the ratios of the noncentrality parameters. The new tests seem to perform very well: The test statistics using robust stardization seem to perform better than the tests with regular location scatter estimates. ARE ARE r σ Figure 1. Limiting efficiencies of the tests based on the Oja median the ZCM relative to the regular Mardia s tests for different values of r σ. The left (right) panel illustrates the efficiency of b 1,2 (b 2,2). 6. A real data example In our real data example we consider two bivariate data sets of n = 50 observations the combined data set (n = 100). The data sets are shown explained in Figure 2. We wish to test the null hypothesis of bivariate normality. Again the the skewness kurtosis statistics to be compared are those based on the sample mean vector sample covariance matrix (regular Mardia s b 1,2 b 2,2 ) those based on the Oja median the ZCM (denoted by b 1,2 b 2,2). The values of the test statistics the approximate pvalues were calculated for the three data sets ( Boys, Girls, Combined ). See Table 1 for the results. As expected, the observed pvalues were much smaller in the combined data case. As discussed in the introduction, the statistics b 1,2 b 2,2 compare the third fourth central moments to the second ones (given by the sample covariance matrix). The statistics b 1,2 b 2,2 use robust location vector scatter matrix estimates in this comparison. One can then expect that, for data sets with outliers or for data sets coming from heavy tailed distributions, the comparison to robust estimates tends to produce higher values of the test statistics smaller pvalues.
9 On Mardia s Tests of Multinormality 9 Weights at birth (kg) Heights of the mothers (cm) Figure 2. The height of the mother the birth weight of the child measured on 50 boys who have siblings (o) 50 girls (x) who do not have siblings. Table 1. The observed values of the test statistics with corresponding pvalues for the three data sets. Boys Girls Combined Test statistic pvalue Test statistic pvalue Test statistic pvalue b 1, b 1, b 2, b 2, In our examples for the analyses of bivariate data we used the Oja median the ZCM to stardize the data vectors. In the general pvariate case (with large p large n) these estimates are, however, computationally highly intensive, should be replaced by more practical robust location scatter estimates.
10 10 Kankainen, A., Taskinen, S., Oja, H. 7. Appendix: Auxiliary lemmas The limiting distributions of the stardized third fourth moments are given by the following linear approximations. Lemma 7.1. Let x 1,..., x n be a rom sample from N(0, I p )distribution let z i = ˆΣ 1/2 (x i ˆµ), i = 1,..., n, be the stardized observations. Write also r i = x i u i = x i 1 x i. Then n avei {z ij z ik z il } = n ave i {x ij x ik x il } + o P (1) = n ave i {r 3 i u ij u ik u il } + o P (1) for j < k < l, n avei {z 2 ijz ik } = n ave i {x 2 ijx ik } t k + o P (1) = n ave i {r 3 i u 2 iju ik γ(r i )u ik } + o P (1) for j k n avei {z 3 ij} = n ave i {x 3 ij} 3t j + o P (1) = n ave i {r 3 i u 3 ij 3γ(r i )u ij } + o P (1). Lemma 7.2. Let x 1,..., x n be a rom sample from N(0, I p )distribution let z i = ˆΣ 1/2 (x i ˆµ), i = 1,..., n, be the stardized observations. Again, r i = x i u i = x i 1 x i. Then for j k, n avei {z 2 ijz 2 ik 1} = n ave i {x 2 ijx 2 ik 1} C jj C kk + o P (1) = n ave i {r 4 i u 2 iju 2 ik 1 α(r i )[u 2 ij + u 2 ik] + 2β(r i )} + o P (1) n avei {z 4 ij 3} = n ave i {x 4 ij 3} 6C jj + o P (1) = n ave i {r 4 i u 4 ij 3 6α(r i )u 2 ij + 6β(r i )} + o P (1). Proofs of the lemmas: Note first that Thus z i = ˆΣ 1/2 (x i ˆµ) = x i 1 n t 1 2 n Cx i + o(1/ n). z ij = x ij 1 n t j 1 2 n z 2 ij = x 2 ij 2 n t j x ij 1 n z 3 ij = x 3 ij 3 n t j x 2 ij 3 2 n p C jr x ir + o(1/ n), r=1 p C jr x ir x ij + o(1/ n), r=1 p C jr x ir x 2 ij + o(1/ n) r=1
11 On Mardia s Tests of Multinormality 11 z 4 ij = x 4 ij 4 n t j x 3 ij 2 n p C jr x ir x 3 ij + o(1/ n). As the observations come from the N p (0, I p ) distribution (all the moments exist), the stard law of large numbers central limit theorem together with Slutsky s lemma yield the result. r=1 Acknowledgements We thank the referees for the critical constructive comments. Finally, we wish to mention that, after writing the paper, we learned about the paper Croux et al. (2003) with a very similar but more general approach. Croux his cowriters study the behaviour of the infromation matrix (IM) test when the maximum likelihood estimates are replaced by robust estimates. One of their examples is the modified JarqueBera test for univariate skewness kurtosis. Their conclusion also is that the use of the robust estimates icreases the power of the IM test. References [1] A. Bera S. John, Tests for multivariate normality with Pearson alternatives. Commun. Statist. Theor. Meth. 12(1) (1983), [2] C. Croux, G. Dhaene D. Hoorelbeke, Testing the information matrix equality with robust estimators. Submitted. (Website address: [3] C. Croux G. Haesbroeck, Principal component analysis based on robust estimators of the covariance or correlation matrix: Influence functions efficiencies. Biometrika 87 (2000), [4] F.R. Hampel, E.M. Ronchetti, P.J. Rousseeuw, P. J. W.A. Stahel, Robust Statistics: The Approach Based on Influence Functions. Wiley, [5] G. Koshevoy, J. Mttnen H. Oja, A scatter matrix estimate based on the zonotope. Ann. Statist. 31 (2003), [6] J. A. Koziol, Probability plots for assessing multivariate normality. Statistician 42 (1993), [7] K. V. Mardia, Measures of multivariate skewness kurtosis with applications. Biometrika 57 (1970), [8] K. V. Mardia, Applications of some measures of multivariate skewness kurtosis in testing normality robustness studies. Sankhyã, Ser B 36(2) (1974), [9] K. V. Mardia, Tests of univariate multivariate normality. In: P.R. Krishnaiah (ed.), Hbook of Statistics, vol 1, , North Holl, [10] J. Möttönen H. Oja, Multivariate spatial sign rank methods. J. Nonparam. Statist. 5 (1995), [11] H. Oja, Descriptive statistics for multivariate distributions. Statist. Probab. Lett. 1 (1983),
12 12 Kankainen, A., Taskinen, S., Oja, H. [12] E. Ollila, T.P. Hettmansperger H. Oja, Affine equivariant multivariate sign methods. (2003a) Under revision. [13] E. Ollila, H. Oja C. Croux, The affine equivariant sign covariance matrix: Asymptotic behavior efficiencies. J. Multiv.Analysis (2003b), in print. [14] S. Visuri, V. Koivunen, H. Oja, Sign rank covariance matrices. J. Statist. Plann. Inference 91 (2000), University of Jyväskylä, Department of Mathematics Statistics, P.O. Box 35 (MaD), FIN University of Jyväskylä, Finl address: University of Jyväskylä, Department of Mathematics Statistics, P.O. Box 35 (MaD), FIN University of Jyväskylä, Finl address: University of Jyväskylä, Department of Mathematics Statistics, P.O. Box 35 (MaD), FIN University of Jyväskylä, Finl address:
MINITAB ASSISTANT WHITE PAPER
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationTests of MeanVariance Spanning
ANNALS OF ECONOMICS AND FINANCE 131, 145 193 (2012) Tests of MeanVariance Spanning Raymond Kan Rotman School of Management, University of Toronto, Canada Email: kan@chass.utoronto.ca and GuoFu Zhou
More informationA MeanVariance Framework for Tests of Asset Pricing Models
A MeanVariance Framework for Tests of Asset Pricing Models Shmuel Kandel University of Chicago TelAviv, University Robert F. Stambaugh University of Pennsylvania This article presents a meanvariance
More informationBeyond Correlation: Extreme Comovements Between Financial Assets
Beyond Correlation: Extreme Comovements Between Financial Assets Roy Mashal Columbia University Assaf Zeevi Columbia University This version: October 14, 2002 Abstract This paper investigates the potential
More informationINFERENCE FOR NORMAL MIXTURES IN MEAN AND VARIANCE
Statistica Sinica 18(2008), 443465 INFERENCE FOR NORMAL MIXTURES IN MEAN AND VARIANCE Jiahua Chen 1, Xianming Tan 2 and Runchu Zhang 2 1 University of British Columbia and 2 LPMC Nankai University Abstract:
More informationRecall this chart that showed how most of our course would be organized:
Chapter 4 OneWay ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
More informationIndependent Component Analysis: Algorithms and Applications
Independent Component Analysis: Algorithms and Applications Aapo Hyvärinen and Erkki Oja Neural Networks Research Centre Helsinki University of Technology P.O. Box 5400, FIN02015 HUT, Finland Neural Networks,
More informationOn estimation efficiency of the central mean subspace
J. R. Statist. Soc. B (2014) 76, Part 5, pp. 885 901 On estimation efficiency of the central mean subspace Yanyuan Ma Texas A&M University, College Station, USA and Liping Zhu Shanghai University of Finance
More informationAn Introduction to Regression Analysis
The Inaugural Coase Lecture An Introduction to Regression Analysis Alan O. Sykes * Regression analysis is a statistical tool for the investigation of relationships between variables. Usually, the investigator
More informationStrong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach
J. R. Statist. Soc. B (2004) 66, Part 1, pp. 187 205 Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach John D. Storey,
More informationIntroduction to partitioningbased clustering methods with a robust example
Reports of the Department of Mathematical Information Technology Series C. Software and Computational Engineering No. C. 1/2006 Introduction to partitioningbased clustering methods with a robust example
More informationThe Variability of PValues. Summary
The Variability of PValues Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 276958203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report
More informationPÓLYA URN MODELS UNDER GENERAL REPLACEMENT SCHEMES
J. Japan Statist. Soc. Vol. 31 No. 2 2001 193 205 PÓLYA URN MODELS UNDER GENERAL REPLACEMENT SCHEMES Kiyoshi Inoue* and Sigeo Aki* In this paper, we consider a Pólya urn model containing balls of m different
More informationTests in a case control design including relatives
Tests in a case control design including relatives Stefanie Biedermann i, Eva Nagel i, Axel Munk ii, Hajo Holzmann ii, Ansgar Steland i Abstract We present a new approach to handle dependent data arising
More informationExplaining Cointegration Analysis: Part II
Explaining Cointegration Analysis: Part II David F. Hendry and Katarina Juselius Nuffield College, Oxford, OX1 1NF. Department of Economics, University of Copenhagen, Denmark Abstract We describe the concept
More informationTWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS
TWO L 1 BASED NONCONVEX METHODS FOR CONSTRUCTING SPARSE MEAN REVERTING PORTFOLIOS XIAOLONG LONG, KNUT SOLNA, AND JACK XIN Abstract. We study the problem of constructing sparse and fast mean reverting portfolios.
More informationMultivariate Analysis of Variance (MANOVA): I. Theory
Gregory Carey, 1998 MANOVA: I  1 Multivariate Analysis of Variance (MANOVA): I. Theory Introduction The purpose of a t test is to assess the likelihood that the means for two groups are sampled from the
More information1 Theory: The General Linear Model
QMIN GLM Theory  1.1 1 Theory: The General Linear Model 1.1 Introduction Before digital computers, statistics textbooks spoke of three procedures regression, the analysis of variance (ANOVA), and the
More informationA direct approach to false discovery rates
J. R. Statist. Soc. B (2002) 64, Part 3, pp. 479 498 A direct approach to false discovery rates John D. Storey Stanford University, USA [Received June 2001. Revised December 2001] Summary. Multiplehypothesis
More informationSubspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity
Subspace Pursuit for Compressive Sensing: Closing the Gap Between Performance and Complexity Wei Dai and Olgica Milenkovic Department of Electrical and Computer Engineering University of Illinois at UrbanaChampaign
More informationRegression. Chapter 2. 2.1 Weightspace View
Chapter Regression Supervised learning can be divided into regression and classification problems. Whereas the outputs for classification are discrete class labels, regression is concerned with the prediction
More informationRESAMPLING FEWER THAN n OBSERVATIONS: GAINS, LOSSES, AND REMEDIES FOR LOSSES
Statistica Sinica 7(1997), 131 RESAMPLING FEWER THAN n OBSERVATIONS: GAINS, LOSSES, AND REMEDIES FOR LOSSES P. J. Bickel, F. Götze and W. R. van Zwet University of California, Berkeley, University of
More informationNew insights on the meanvariance portfolio selection from de Finetti s suggestions. Flavio Pressacco and Paolo Serafini, Università di Udine
New insights on the meanvariance portfolio selection from de Finetti s suggestions Flavio Pressacco and Paolo Serafini, Università di Udine Abstract: In this paper we offer an alternative approach to
More informationAn analysis method for a quantitative outcome and two categorical explanatory variables.
Chapter 11 TwoWay ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that
More informationAdaptive linear stepup procedures that control the false discovery rate
Biometrika (26), 93, 3, pp. 491 57 26 Biometrika Trust Printed in Great Britain Adaptive linear stepup procedures that control the false discovery rate BY YOAV BENJAMINI Department of Statistics and Operations
More informationFalse Discovery Rate Control with Groups
False Discovery Rate Control with Groups James X. Hu, Hongyu Zhao and Harrison H. Zhou Abstract In the context of largescale multiple hypothesis testing, the hypotheses often possess certain group structures
More informationwww.rmsolutions.net R&M Solutons
Ahmed Hassouna, MD Professor of cardiovascular surgery, AinShams University, EGYPT. Diploma of medical statistics and clinical trial, Paris 6 university, Paris. 1A Choose the best answer The duration
More informationThe InStat guide to choosing and interpreting statistical tests
Version 3.0 The InStat guide to choosing and interpreting statistical tests Harvey Motulsky 19902003, GraphPad Software, Inc. All rights reserved. Program design, manual and help screens: Programming:
More informationDIRECT REDUCTION OF BIAS OF THE CLASSI CAL HILL ESTIMATOR
REVSTAT Statistical Journal Volume 3, Number 2, November 2005, 113 136 DIRECT REDUCTION OF BIAS OF THE CLASSI CAL HILL ESTIMATOR Authors: Frederico Caeiro Universidade Nova de Lisboa, FCTDM) and CEA,
More information9. Sampling Distributions
9. Sampling Distributions Prerequisites none A. Introduction B. Sampling Distribution of the Mean C. Sampling Distribution of Difference Between Means D. Sampling Distribution of Pearson's r E. Sampling
More information