On Mardia s Tests of Multinormality



Similar documents
Multivariate Normal Distribution

Sections 2.11 and 5.8

Multivariate Statistical Inference and Applications

Multivariate normal distribution and testing for means (see MKB Ch 3)

Department of Economics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

PROPERTIES OF THE SAMPLE CORRELATION OF THE BIVARIATE LOGNORMAL DISTRIBUTION

II. DISTRIBUTIONS distribution normal distribution. standard scores

NOTES ON LINEAR TRANSFORMATIONS

Introduction to General and Generalized Linear Models

Module 3: Correlation and Covariance

Convex Hull Probability Depth: first results

Least Squares Estimation

MATHEMATICAL METHODS OF STATISTICS

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A MULTIVARIATE OUTLIER DETECTION METHOD

IRREDUCIBLE OPERATOR SEMIGROUPS SUCH THAT AB AND BA ARE PROPORTIONAL. 1. Introduction

An extension of the factoring likelihood approach for non-monotone missing data

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Life Table Analysis using Weighted Survey Data

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

OPTIMAl PREMIUM CONTROl IN A NON-liFE INSURANCE BUSINESS

Factor analysis. Angela Montanari

MEASURES OF LOCATION AND SPREAD

Financial Time Series Analysis (FTSA) Lecture 1: Introduction

A SURVEY ON CONTINUOUS ELLIPTICAL VECTOR DISTRIBUTIONS

Uncertainty quantification for the family-wise error rate in multivariate copula models

Mathematics within the Psychology Curriculum

Data Mining: Algorithms and Applications Matrix Math Review

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

Recall this chart that showed how most of our course would be organized:

Non Parametric Inference

Statistics Graduate Courses

The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.

THE SELECTION OF RETURNS FOR AUDIT BY THE IRS. John P. Hiniker, Internal Revenue Service

How To Check For Differences In The One Way Anova

Mathematics Course 111: Algebra I Part IV: Vector Spaces

Exploratory Data Analysis as an Efficient Tool for Statistical Analysis: A Case Study From Analysis of Experiments

Multivariate Analysis of Ecological Data

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

SYSTEMS OF REGRESSION EQUATIONS

1 Another method of estimation: least squares

Communication on the Grassmann Manifold: A Geometric Approach to the Noncoherent Multiple-Antenna Channel

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Section Inner Products and Norms

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

Fairfield Public Schools

Module 5: Multiple Regression Analysis

Estimating an ARMA Process

Statistical Machine Learning

Quantitative Methods for Finance

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

PÓLYA URN MODELS UNDER GENERAL REPLACEMENT SCHEMES

Simple Linear Regression Inference

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

A repeated measures concordance correlation coefficient

Multiple group discriminant analysis: Robustness and error rate

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Nonparametric adaptive age replacement with a one-cycle criterion

This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Statistical Models in R

Profile analysis is the multivariate equivalent of repeated measures or mixed ANOVA. Profile analysis is most commonly used in two cases:

SAS Software to Fit the Generalized Linear Model

Introduction to Statistics and Quantitative Research Methods

Introduction to Multivariate Analysis

Mehtap Ergüven Abstract of Ph.D. Dissertation for the degree of PhD of Engineering in Informatics

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Vector Time Series Model Representations and Analysis with XploRe

APPLIED MISSING DATA ANALYSIS

How To Identify Noisy Variables In A Cluster

The degrees of freedom of the Lasso in underdetermined linear regression models

Derivative Approximation by Finite Differences

Description. Textbook. Grading. Objective

Multivariate Analysis of Variance (MANOVA): I. Theory

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Week 1. Exploratory Data Analysis

THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE. Alexander Barvinok

Continued Fractions and the Euclidean Algorithm

Estimating Industry Multiples

Applications to Data Smoothing and Image Processing I

State Space Time Series Analysis

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools

STANDARDISATION OF DATA SET UNDER DIFFERENT MEASUREMENT SCALES. 1 The measurement scales of variables

Goodness of fit assessment of item response theory models

Component Ordering in Independent Component Analysis Based on Data Power

A linear algebraic method for pricing temporary life annuities

Lecture 3: Linear methods for classification

Multivariate Analysis. Overview

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

Stat 704 Data Analysis I Probability Review

3.1 Least squares in matrix form

Transcription:

On Mardia s Tests of Multinormality Kankainen, A., Taskinen, S., Oja, H. Abstract. Classical multivariate analysis is based on the assumption that the data come from a multivariate normal distribution. The tests of multinormality have therefore received very much attention. Several tests for assessing multinormality, among them Mardia s popular multivariate skewness kurtosis statistics, are based on stardized third fourth moments. In Mardia s construction of the affine invariant test statistics, the data vectors are first stardized using the sample mean vector the sample covariance matrix. In this paper we investigate whether, in the test construction, it is advantagoeus to replace the regular sample mean vector sample covariance matrix by their affine equivariant robust competitors. Limiting distributions of the stardized third fourth moments the resulting test statistics are derived under the null hypothesis are shown to be strongly dependent on the choice of the location vector scatter matrix estimate. Finally, the effects of the modification on the limiting finite-sample efficiencies are illustrated by simple examples in the case of testing for the bivariate normality. In the cases studied, the modifications seem to increase the power of the tests. 1. Introduction Classical methods of multivariate analysis are for the most part based on the assumption that the data are coming from a multivariate normal population. In the one-sample multinormal case, the regular sample mean vector sample covariance matrix are complete sufficient, but on the other h highly sensitive to outlying observations. It is also well known that the tests estimates based on the sample mean vector sample covariance matrix have poor efficiency properties in the case of heavy tailed noise distributions. Testing for departures from multinormality helps to make practical choices between competing methods of analysis. In the univariate case, stardized third fourth moments b 1 b 2 are often used to indicate the skewness kurtosis. For a rom sample x 1,..., x n from a p-variate distribution with sample mean vector x sample covariance Received by the editors May 2, 2003. 1991 Mathematics Subject Classification. Primary 62H15; Secondary 62F02. Key words phrases. Kurtosis, Multinormality, Robustness, Skewness.

2 Kankainen, A., Taskinen, S., Oja, H. matrix S, Mardia (1970,1974,1980) defined the p-variate skewness kurtosis statistics as b 1,p = ave i,j [(x i x) T S 1 (x j x)] 3 b 2,p = ave i [(x i x) T S 1 (x i x)] 2, respectively. The statistics b 1,p b 2,p are functions of stardized third fourth moments, respectively. From the definition one easily sees that b 1,p b 2,p are invariant under affine transformations. In the univariate case these reduce to the usual univariate skewness kurtosis statistics b 1 b 2. Mardia advocated using the skewness kurtosis statistics to test for multinormality as they are distribution-free under multinormality. See also Bera John (1983) Koziol (1993) for the use of the stardized third fourth moments in the test construction. Intuitively, as in the univariate case, the skewness kurtosis statistics b 1,p b 2,p compare the variation measured by third fourth moments to that measured by second moments. The second, third fourth moments are all highly non-robust statistics, however, more efficient procedures may be obtained through comparisons of robust nonrobust measures of variations. It is therefore an appealing idea to study whether it is useful or reasonable to replace the regular sample mean vector x sample covariance matrix S by some affine equivariant robust competitors, location vector estimate ˆµ scatter matrix estimate ˆΣ. The paper is organised as follows. In Section 2 the limiting distributions of Mardia s test statistics under the null model of multinormality are discussed. The test statistics are based on the stardized third fourth moments; the regular mean vector regular covariance matrix are used in the stardization. Section 3 lists the tools assumptions for alternative n-consistent affine equivariant location scatter estimates. The location scatter estimates are then used in the regular manner to stardize the data vectors. The limiting distributions of the stardized third fourth moments in this general case are found in Section 4. As an illustration of the theory, the effect of the way of stardization on the limiting small sample efficiencies in the bivariate case are studied in Sections 5 6. The auxiliary lemmas proofs have been postponed to the Appendix. 2. Limiting null distributions of Mardia s statistics In this section we recall the wellknown results concerning the limiting null distributions of the Mardia s skewness kurtosis statistics. As, due to affine invariance, the distributions of the test statistics do not depend on the unknown µ Σ, it is not a restriction to assume in the following derivations that x 1,..., x n is a rom sample from the N p (0, I p ) distribution. Write z i = S 1/2 (x i x), i = 1,..., n,

On Mardia s Tests of Multinormality 3 for the data vectors stardized in the classical way. Mardia s skewness statistic can be decomposed as b 1,p = 6 j<k<l[ave i {z ij z ik z il }] 2 + 3 i {zijz j k[ave 2 ik }] 2 + [ave i {zij}] 3 2 j the Mardia s kurtosis statistic is similarly b 2,p = j k ave i {z 2 ijz 2 ik} + j ave i {z 4 ij}. See e.g. Koziol (1993). Under multinormality b 1,p b 2,p are affine invariant, all the stardized third fourth moments are asymptotically normal asymptotically independent consequently the limiting distributions of n b 1,p 6 n b 2,p p(p + 2) 8p(p + 2) are a chi-square distribution with p(p+1)(p+2)/6 degrees of freedom a N(0, 1) distribution, respectively. 3. Location scatter estimates In Mardia s test construction, the data vectors are stardized using the regular sample mean vector x sample covariance matrix S. In this paper we thus consider what happens if these are replaced by other n-consistent affine equivariant location scatter estimates of µ Σ, say, ˆµ ˆΣ. Next we list assumptions useful notations for location vector estimate ˆµ scatter matrix estimate ˆΣ. First we assume that, for the N p (0, I p ) distribution, the influence functions of these location scatter functionals at z are γ(r) u α(r) uu T β(r) I p, respectively, where r = z u = z 1 z. We will later see that the limiting distributions of the modified test statistics depend on the used location scatter estimates through these functions. For the inluence functions of location scatter functionals, see also Hampel et al. (1986), Croux Haesbroek (2000) Ollila et al. (2003a,b). We further assume that, again under the N p (0, I p ) distribution, t = n ave{γ(r i )u i } = n ˆµ + o P (1) C = n ave{α(r i )u i u T i β(r i )I p } = n (ˆΣ I p ) + o P (1). Finally, assume that, for the N p (0, I p ) distribution, the expected values E(γ 2 (r)), E(α 2 (r)) E(β 2 (r)) are finite. These assumptions imply that the limiting distributions of nˆµ n vec(ˆσ I p ) are p- p 2 -variate normal distributions

4 Kankainen, A., Taskinen, S., Oja, H. with zero mean vectors covariance matrices E[γ 2 (r)] I p p E[α 2 (r)] p(p + 2) [I p 2+I p,p+vec(i p )vec(i p ) T ]+ (E[β 2 (r)] + 2p ) E[α(r)β(r)] vec(i p )vec(i p ) T, respectively. (Here I p,p is the so called commutation matrix. See e.g. Ollila et al. (2003b).) 4. Limiting distribution of the stardized third fourth moments Consider now any estimators ˆµ ˆΣ write z i = ˆΣ 1/2 (x i ˆµ), i = 1,..., n, for the stardized observations. Denote the stardized third fourth moments by U jkl = n ave i {z ij z ik z il }, V jk = n ave i {z 2 ijz 2 ik 1} V jj = n ave i {z 4 ij 3}. The limiting distributions of the stardized third fourth moments are given by the following theorem. The results easily follow from Lemmas 7.1 7.2 stated proven in the Appendix. Theorem 4.1. Under multinormality, the limiting distribution of the vector of all possible third moments U jkl (j < k < l), U jjk (j k) U jjj is a multivariate normal distribution with zero mean vector limiting variances covariances where V ar(u jkl ) = 1 V ar(u jjk ) = 2 + κ V ar(u jjj ) = 6 + 9κ Cov(U jjk, U llk ) = κ Cov(U jjk, U kkk ) = 3κ κ = 1 + E[γ2 (r)] p All the other covariances are zero. 2 E[r3 γ(r)] p(p + 2).

On Mardia s Tests of Multinormality 5 Theorem 4.2. Under multinormality, the limiting distribution of the vector of fourth moments V jk (j k) V jj is a multivariate normal distribution with zero mean vector limiting variances covariances V ar(v jk ) = 4 + τ 1 + 2τ 2 V ar(v jj ) = 24 + 9τ 1 + 36τ 2 Cov(V jk, V jl ) = τ 1 + τ 2 Cov(V jk, V ml ) = τ 1 Cov(V jj, V kk ) = 9τ 1 Cov(V jk, V jj ) = 3τ 1 + 6τ 2 Cov(V jk, V ll ) = 3τ 1 where τ 1 = 4 [ E[α 2 ] (r)] p(p + 2) 2E[α(r)β(r)] + E[β 2 (r)] E[α(r)r4 ] p p(p + 2)(p + 4) + E[β(r)r4 ] p(p + 2) τ 2 = 2 + 2 E[α2 (r)] p(p + 2) 4 E[α(r)r 4 ] p(p + 2)(p + 4). We close this section with some discussion on the implications of the above results. First note that, if the regular mean vector sample covariance matrix are used in the stardization, then simply which gives γ(r) = r, α(r) = r 2 β(r) = 1 κ = τ 1 = τ 2 = 0 all the third moments fourth moments are asymptotically mutually independent, respectively. For other location scatter statistics this is not necessarily true also the limiting variances may vary. This means that the limiting distributions efficiencies of b 1,p b 2,p may drastically change. The limiting distribution of nb 1,p /6 is a weighted sum of chi-square variables with one degree of freedom which makes the calculation of the approximate p-value less practical. The limiting distribution of n(b 2,p p(p + 2)) is still normal with mean value zero, but its limiting variance depends on the chosen scatter matrix. In the last two section we illustrate the impact of the choice of the location vector scatter matrix estimate on the limiting distribution efficiency in the bivariate case. The extension to the general p-variate case is straightforward. 5. Tests for bivariate normality Assume now that x 1,..., x n is a rom sample from an unknown bivariate distribution we wish to test the null hypothesis H 0 of bivariate normality. For

6 Kankainen, A., Taskinen, S., Oja, H. limiting power considerations we use the contaminated bivariate normal distribution, the so called Tukey distribution: We say that the distribution of x is a bivariate Tukey distribution T (, µ, σ) if the pdf of x is f(x) = (1 )φ(x) + σ 2 φ((x µ)/σ), where φ(x) is the pdf of N 2 (0, I 2 ) distribution [0, 1]. Alternative sequences H n 1 : T (, µ, 1) H n 2 : T (, 0, σ) with = δ/ n, r = µ > 0 σ > 1 are then used for the comparisons of the skewness kurtosis test statistics, respectively. Write now T 1 = (U 112, U 221, U 111, U 222 ) T T 2 = (V 11, V 22, V 12 ) T for the vectors of the stardized third fourth moments. Using the results in Section 4, one directly gets the following results. Corollary 5.1. Under H 0, the limiting distribution of nt 1 is a 4-variate normal distribution with zero mean vector covariance matrix 2 0 0 0 1 0 0 3 Ω 1 = 0 2 0 0 0 0 6 0 + κ 0 1 3 0 0 3 9 0. 0 0 0 6 3 0 0 9 Corollary 5.2. Under H 0, the limiting distribution of nt 2 is a 3-variate normal distribution with zero mean vector covariance matrix Ω 2 = 24 0 0 0 24 0 0 0 4 + τ 1 9 9 3 9 9 3 3 3 1 + τ 2 36 0 6 0 36 6 6 6 2 It is now possible to construct the limiting distributions of the affine invariant test statistics b 1,2 b 2,2. The comparison of the effect of different stardizations is possible but not easy, however, as the limiting distributions may be of different type. This is avoided if one uses the following test statistic versions: Definition 5.3. Modified Mardia s skewness kurtosis statistics in the bivariate case are b 1,2 = T 1 T Ω 1 1 T 1 b 2,2 = (1, 1, 2)T 2. Note that, if the sample mean covariance matrix are used, the definition gives b 1,2 = b 1,2 /6 b 2,2 = b 2,2 8, respectively. In the former case, with general ˆµ ˆΣ, the affine equivariance property is lost. The alternative to the regular sample mean vector sample covariance matrix used in this example is the affine equivariant Oja median (Oja, 1983) the related scatter matrix estimate based on the Oja sign covariance matrix (SCM)..

On Mardia s Tests of Multinormality 7 See Visuri et al. (2000) Ollila et al. (2003b) for the SCM. Note also that the scatter matrix estimate based on the Oja SCM is asymptotically equivalent with the zonoid covariance matrix (ZCM) can be seen as an affine equivariant extension on mean deviation. For the definition use of the ZCM, see Koshevoy et al (2003). We chose these statistics as their influence functions are relatively simple at the N 2 (0, I 2 ) case, the influence functions are given by [ ] 2 2 γ(r) = 2 π, α(r) = 2 2 π r 1 β(r) = 1. The constants needed for the asymptotic variances covariances are then κ = 4 π 1 2, τ 1 = 32 π 29 τ 2 = 16 3 π 14 3. Consider first the alternative contiguous sequence H1 n with µ = (r, 0,..., 0) T. Using classical LeCam s third lemma one then easily sees that, under the alternative sequence H1 n, the limiting distribution of nb 1,2 is a noncentral chi-square distribution with 4 degrees of freedom noncentrality parameter δ T 1 Ω 1 1 δ 1, where δ 1 depends on the location estimate used in the stardization. In the regular case (sample mean vector), the Oja median gives δ 1 = (0, 0, r 3, 0) T δ 1 = (0, r q(r)2 2/π, r 3 + 3r q(r)6 2/π, 0) T. The function q(r) = R(µ) is easily computable defined through the so called spatial rank function R(µ) = E( µ x 1 (µ x)) with x N 2 (0, I 2 ). See Möttönen Oja (1995). Next we consider the sequence H2 n.then V ar(b 2,2) = (1, 1, 2) Ω 2 1 1 2 under H2 n the limiting distribution of n(b 2,2) 2 /V ar(b 2,2) is a noncentral chisquare distribution with one degree of freedom noncentrality parameter [(1, 1, 2)δ 2 ] 2 /V ar(b 2,2), where δ 2 in this case depends on the scatter estimate used in the stardization. In the regular case (sample covariance matrix) we get the ZCM gives δ 2 = (3(σ 2 1) 2, 3(σ 2 1) 2, (σ 2 1) 2 ) T δ 2 = (3(σ 4 1) 12(σ 1), 3(σ 4 1) 12(σ 1), (σ 4 1) 4(σ 1)) T In Figure 1 the asymptotical relative efficiencies of the new tests (using the Oja median the ZCM estimate) relative to the regular tests (using the sample mean sample covariance matrix) are illustrated for different values of r

8 Kankainen, A., Taskinen, S., Oja, H. for different values of σ. The AREs are then simply the ratios of the noncentrality parameters. The new tests seem to perform very well: The test statistics using robust stardization seem to perform better than the tests with regular location scatter estimates. ARE 0.0 0.5 1.0 1.5 2.0 ARE 0.0 0.5 1.0 1.5 2.0 0 5 10 15 r 2 4 6 8 10 12 14 σ Figure 1. Limiting efficiencies of the tests based on the Oja median the ZCM relative to the regular Mardia s tests for different values of r σ. The left (right) panel illustrates the efficiency of b 1,2 (b 2,2). 6. A real data example In our real data example we consider two bivariate data sets of n = 50 observations the combined data set (n = 100). The data sets are shown explained in Figure 2. We wish to test the null hypothesis of bivariate normality. Again the the skewness kurtosis statistics to be compared are those based on the sample mean vector sample covariance matrix (regular Mardia s b 1,2 b 2,2 ) those based on the Oja median the ZCM (denoted by b 1,2 b 2,2). The values of the test statistics the approximate p-values were calculated for the three data sets ( Boys, Girls, Combined ). See Table 1 for the results. As expected, the observed p-values were much smaller in the combined data case. As discussed in the introduction, the statistics b 1,2 b 2,2 compare the third fourth central moments to the second ones (given by the sample covariance matrix). The statistics b 1,2 b 2,2 use robust location vector scatter matrix estimates in this comparison. One can then expect that, for data sets with outliers or for data sets coming from heavy tailed distributions, the comparison to robust estimates tends to produce higher values of the test statistics smaller p-values.

On Mardia s Tests of Multinormality 9 Weights at birth (kg) 2.5 3.0 3.5 4.0 4.5 145 150 155 160 165 170 Heights of the mothers (cm) Figure 2. The height of the mother the birth weight of the child measured on 50 boys who have siblings (o) 50 girls (x) who do not have siblings. Table 1. The observed values of the test statistics with corresponding p-values for the three data sets. Boys Girls Combined Test statistic p-value Test statistic p-value Test statistic p-value b 1,2 3.034 0.552 3.376 0.497 9.242 0.055 b 1,2 5.152 0.272 2.336 0.674 11.257 0.024 b 2,2 0.016 0.900 0.282 0.596 2.275 0.132 b 2,2 0.834 0.361 1.152 0.283 5.166 0.023 In our examples for the analyses of bivariate data we used the Oja median the ZCM to stardize the data vectors. In the general p-variate case (with large p large n) these estimates are, however, computationally highly intensive, should be replaced by more practical robust location scatter estimates.

10 Kankainen, A., Taskinen, S., Oja, H. 7. Appendix: Auxiliary lemmas The limiting distributions of the stardized third fourth moments are given by the following linear approximations. Lemma 7.1. Let x 1,..., x n be a rom sample from N(0, I p )-distribution let z i = ˆΣ 1/2 (x i ˆµ), i = 1,..., n, be the stardized observations. Write also r i = x i u i = x i 1 x i. Then n avei {z ij z ik z il } = n ave i {x ij x ik x il } + o P (1) = n ave i {r 3 i u ij u ik u il } + o P (1) for j < k < l, n avei {z 2 ijz ik } = n ave i {x 2 ijx ik } t k + o P (1) = n ave i {r 3 i u 2 iju ik γ(r i )u ik } + o P (1) for j k n avei {z 3 ij} = n ave i {x 3 ij} 3t j + o P (1) = n ave i {r 3 i u 3 ij 3γ(r i )u ij } + o P (1). Lemma 7.2. Let x 1,..., x n be a rom sample from N(0, I p )-distribution let z i = ˆΣ 1/2 (x i ˆµ), i = 1,..., n, be the stardized observations. Again, r i = x i u i = x i 1 x i. Then for j k, n avei {z 2 ijz 2 ik 1} = n ave i {x 2 ijx 2 ik 1} C jj C kk + o P (1) = n ave i {r 4 i u 2 iju 2 ik 1 α(r i )[u 2 ij + u 2 ik] + 2β(r i )} + o P (1) n avei {z 4 ij 3} = n ave i {x 4 ij 3} 6C jj + o P (1) = n ave i {r 4 i u 4 ij 3 6α(r i )u 2 ij + 6β(r i )} + o P (1). Proofs of the lemmas: Note first that Thus z i = ˆΣ 1/2 (x i ˆµ) = x i 1 n t 1 2 n Cx i + o(1/ n). z ij = x ij 1 n t j 1 2 n z 2 ij = x 2 ij 2 n t j x ij 1 n z 3 ij = x 3 ij 3 n t j x 2 ij 3 2 n p C jr x ir + o(1/ n), r=1 p C jr x ir x ij + o(1/ n), r=1 p C jr x ir x 2 ij + o(1/ n) r=1

On Mardia s Tests of Multinormality 11 z 4 ij = x 4 ij 4 n t j x 3 ij 2 n p C jr x ir x 3 ij + o(1/ n). As the observations come from the N p (0, I p ) distribution (all the moments exist), the stard law of large numbers central limit theorem together with Slutsky s lemma yield the result. r=1 Acknowledgements We thank the referees for the critical constructive comments. Finally, we wish to mention that, after writing the paper, we learned about the paper Croux et al. (2003) with a very similar but more general approach. Croux his cowriters study the behaviour of the infromation matrix (IM) test when the maximum likelihood estimates are replaced by robust estimates. One of their examples is the modified Jarque-Bera test for univariate skewness kurtosis. Their conclusion also is that the use of the robust estimates icreases the power of the IM test. References [1] A. Bera S. John, Tests for multivariate normality with Pearson alternatives. Commun. Statist. Theor. Meth. 12(1) (1983), 103 117. [2] C. Croux, G. Dhaene D. Hoorelbeke, Testing the information matrix equality with robust estimators. Submitted. (Website address: www.econ.kuleuven.ac.be/public/ndbae06/public.htm) [3] C. Croux G. Haesbroeck, Principal component analysis based on robust estimators of the covariance or correlation matrix: Influence functions efficiencies. Biometrika 87 (2000), 603 618. [4] F.R. Hampel, E.M. Ronchetti, P.J. Rousseeuw, P. J. W.A. Stahel, Robust Statistics: The Approach Based on Influence Functions. Wiley, 1986. [5] G. Koshevoy, J. Mttnen H. Oja, A scatter matrix estimate based on the zonotope. Ann. Statist. 31 (2003), 1439 1459. [6] J. A. Koziol, Probability plots for assessing multivariate normality. Statistician 42 (1993), 161 173. [7] K. V. Mardia, Measures of multivariate skewness kurtosis with applications. Biometrika 57 (1970), 519 530. [8] K. V. Mardia, Applications of some measures of multivariate skewness kurtosis in testing normality robustness studies. Sankhyã, Ser B 36(2) (1974), 115 128. [9] K. V. Mardia, Tests of univariate multivariate normality. In: P.R. Krishnaiah (ed.), Hbook of Statistics, vol 1, 279 320, North Holl, 1980. [10] J. Möttönen H. Oja, Multivariate spatial sign rank methods. J. Nonparam. Statist. 5 (1995), 201 213. [11] H. Oja, Descriptive statistics for multivariate distributions. Statist. Probab. Lett. 1 (1983), 327 332.

12 Kankainen, A., Taskinen, S., Oja, H. [12] E. Ollila, T.P. Hettmansperger H. Oja, Affine equivariant multivariate sign methods. (2003a) Under revision. [13] E. Ollila, H. Oja C. Croux, The affine equivariant sign covariance matrix: Asymptotic behavior efficiencies. J. Multiv.Analysis (2003b), in print. [14] S. Visuri, V. Koivunen, H. Oja, Sign rank covariance matrices. J. Statist. Plann. Inference 91 (2000), 557 575. University of Jyväskylä, Department of Mathematics Statistics, P.O. Box 35 (MaD), FIN-40014 University of Jyväskylä, Finl E-mail address: kankaine@maths.jyu.fi University of Jyväskylä, Department of Mathematics Statistics, P.O. Box 35 (MaD), FIN-40014 University of Jyväskylä, Finl E-mail address: slahola@maths.jyu.fi University of Jyväskylä, Department of Mathematics Statistics, P.O. Box 35 (MaD), FIN-40014 University of Jyväskylä, Finl E-mail address: ojahannu@maths.jyu.fi