Ecological Archives XXX-XXX-XX
Marti J. Anderson and Daniel C. I. Walsh. PERMANOVA, ANOSIM and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing?

Appendix A. Description of statistical tests and related methods.

Let $\mathbf{Y}$ be an $N \times p$ matrix of $i = 1, \ldots, N$ multivariate observations (rows) by $k = 1, \ldots, p$ variables (columns). Let $\mathbf{D} = \{d_{ij}\}$ be a square symmetric $N \times N$ matrix of distances (or dissimilarities) between all pairs of observations $i = 1, \ldots, N$ and $j = 1, \ldots, N$, with diagonal elements $d_{ij} = 0$ for all $i = j$. For example, the Euclidean distance is:

$$d_{ij}^{(E)} = \sqrt{\sum_{k=1}^{p} (y_{ik} - y_{jk})^2} \tag{A.1}$$

In ecology, typically, some other resemblance may be calculated for non-negative count data, such as the Bray-Curtis measure:

$$d_{ij}^{(BC)} = \frac{\sum_{k=1}^{p} |y_{ik} - y_{jk}|}{\sum_{k=1}^{p} (y_{ik} + y_{jk})} \tag{A.2}$$

which ranges from 0 to 1, and is also often expressed as a percent similarity: $s_{ij}^{(BC)} = 100\,(1 - d_{ij}^{(BC)})$. Another commonly-used measure, calculated on presence-absence data and directly interpretable as the proportion of unshared species, is the Jaccard measure:

$$d_{ij}^{(J)} = \frac{\sum_{k=1}^{p} \left[ \mathbb{1}(y_{ik}) + \mathbb{1}(y_{jk}) - 2\,\mathbb{1}(y_{ik})\,\mathbb{1}(y_{jk}) \right]}{\sum_{k=1}^{p} \mathbb{1}(y_{ik} + y_{jk})} \tag{A.3}$$

where $\mathbb{1}(\cdot)$ is an indicator function such that
$$\mathbb{1}(x) = \begin{cases} 0 & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases}$$

Next, suppose the $N$ observations belong a priori to $\ell = 1, \ldots, g$ groups, with sample sizes $n_1, n_2, \ldots, n_g$ and $N = \sum_{\ell=1}^{g} n_\ell$. Let $\mathbf{X}$ be an $N \times (g - 1)$ matrix of full rank containing orthogonal contrasts among the $g$ groups. We can construct an $N \times N$ projection matrix for this group structure according to the classical linear least-squares solutions to the Gauss-Markov normal equations (e.g., Plackett 1949) as:

$$\mathbf{H} = \mathbf{X} [\mathbf{X}'\mathbf{X}]^{-1} \mathbf{X}' \tag{A.4}$$

As outlined in McArdle and Anderson (2001), this can be used to obtain a partitioning of the multivariate variability inherent in matrix $\mathbf{D}$, by relying on the following transformation, due to Gower (1966) and highlighted for this purpose originally by McArdle (99). Let matrix $\mathbf{A}$ consist of elements $a_{ij} = -\tfrac{1}{2} d_{ij}^2$, which, after centering on its rows and columns, gives a matrix directly interpretable as sums of squares and cross products (SSCP). Namely, matrix $\mathbf{G}$ of elements:

$$g_{ij} = a_{ij} - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot\cdot} \tag{A.5}$$

where $\bar{a}_{i\cdot} = \frac{1}{N} \sum_{j} a_{ij}$, $\bar{a}_{\cdot j} = \frac{1}{N} \sum_{i} a_{ij}$ and $\bar{a}_{\cdot\cdot} = \frac{1}{N^2} \sum_{i} \sum_{j} a_{ij}$. Indeed, supposing each variable is centered on its mean to yield the centered data matrix $\mathbf{Y}_c$ with elements $\{y_{ij}^{(C)}\}$ (namely, where for each column variable $j$, we have $y_{ij}^{(C)} = (y_{ij} - \bar{y}_{\cdot j})$ and $\bar{y}_{\cdot j} = \frac{1}{N} \sum_{i} y_{ij}$) and we have used Euclidean distances $d_{ij}^{(E)}$, then matrix $\mathbf{G}$ is equivalent to the outer product

$$\mathbf{G} = \mathbf{Y}_c \mathbf{Y}_c' \tag{A.6}$$

The total sum of squares is obtained as the trace (sum of diagonal elements) of this matrix. If Euclidean distance is used, this is equivalent to the trace of the inner product SSCP for $\mathbf{Y}$; i.e.
$$\operatorname{tr}[\mathbf{G}] = \operatorname{tr}[\mathbf{Y}_c \mathbf{Y}_c'] = \operatorname{tr}[\mathbf{Y}_c' \mathbf{Y}_c] \tag{A.7}$$

where tr indicates the trace of a matrix. If $d_{ij}^{(BC)}$, $d_{ij}^{(J)}$ or some other resemblance measure is used to construct $\mathbf{D}$, however, then the relationship between $\mathbf{G}$ and $\mathbf{Y}$ is not so straightforward.

Now the one-way PERMANOVA test-statistic (McArdle and Anderson 2001) is easily obtained through a partitioning of the $\mathbf{G}$ matrix to yield a pseudo-F statistic:

$$F_{\text{pseudo}} = \frac{\operatorname{tr}[\mathbf{H}\mathbf{G}] / v_1}{\operatorname{tr}[(\mathbf{I} - \mathbf{H})\mathbf{G}] / v_2} \tag{A.8}$$

where $v_1 = (g - 1)$, $v_2 = (N - g)$ and $\mathbf{I}$ is an $N \times N$ identity matrix with ones along the diagonal and zeros elsewhere. Note that this equation is equivalent to equation (4) in McArdle and Anderson (2001) because of the idempotency of matrix $\mathbf{H}$ (i.e., $\mathbf{H}\mathbf{H} = \mathbf{H}$) and the fact that $\operatorname{tr}[\mathbf{H}\mathbf{G}\mathbf{H}] = \operatorname{tr}[\mathbf{H}\mathbf{H}\mathbf{G}]$. Now, with just one variable ($p = 1$) and Euclidean distances, the value of pseudo-F is precisely equal to the original univariate F ratio (Snedecor 1934) used in classical analysis of variance.

For the one-way case, a p value is calculated for PERMANOVA by a random re-ordering (permutation or randomization) of the observation rows of $\mathbf{Y}$ relative to the fixed ordered list of $n_1 + n_2 + \ldots + n_g$ labels for the groups (Edgington 1995, Manly 2006). This is equivalent to a random simultaneous re-ordering of the rows and columns of matrix $\mathbf{D}$, which maintains the inter-point structure in the multivariate space, but changes the group label with which each point is associated (Anderson 2001b). If the design is balanced, then all observations have an equal chance of falling into any particular group. If the design is unbalanced, then this is not true; however, the structure of the existing imbalance in the number of replicates per group is maintained under randomization and all re-orderings of the observations relative to this structure are equally likely. The test-statistic is re-calculated for each randomization ($F^{\pi}$, say)
and a distribution of $F^{\pi}$ is thereby generated under a null hypothesis of no differences among the groups, conditional on the observed data. A random subset of all possible re-orderings can be used for accurate inference (Hope 1968). A p value is calculated as the proportion of the $F^{\pi}$ values obtained under randomization that are greater than or equal to the observed value of pseudo-F.

Note also that (A.8) can be calculated directly from sums of squared distances (or dissimilarities) in matrix $\mathbf{D}$ as described in Anderson (2001a); namely,

$$F_{\text{pseudo}} = \frac{(SS_T - SS_W) / v_1}{SS_W / v_2} \tag{A.9}$$

where $SS_T$ is the sum of squared inter-point dissimilarities divided by the number of points:

$$SS_T = \frac{1}{N} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} d_{ij}^2 \tag{A.10}$$

and $SS_W$ is the sum of squared inter-point dissimilarities within each group divided by the number of observations within that group, and then summed across all groups:

$$SS_W = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \frac{\epsilon_{ij}\, d_{ij}^2}{n_\ell} \tag{A.11}$$

where, for each within-group pair, $n_\ell$ is the size of the group containing both $i$ and $j$. Here and in what follows, $\epsilon_{ij}$ is an indicator such that $\epsilon_{ij} = 1$ if sample units $i$ and $j$ are in the same group, or else $\epsilon_{ij} = 0$. Note also that $\operatorname{tr}[\mathbf{G}] = SS_T$. Legendre and Anderson (1999, see Theorem in Appendix B therein) have shown the equivalence of (A.11) with the sum-of-squared distances to group centroids in the case of Euclidean distances. A geometric F statistic constructed using sums of squared Euclidean distances to centroids within and between groups was described as a possible multivariate randomization test by Edgington (1995, pp. 88 9). Pillar and Orlóci (1996) had also suggested the use of a related test-statistic, $Q_B = (SS_T - SS_W)$, which, in the specific case of a one-way ANOVA model only, is monotonic on the pseudo-F
statistic given in (A.9) under permutation, as the degrees of freedom ($v_1$ and $v_2$), and also $SS_T$, will all remain constant for any random re-ordering of the data, so identical p values will be obtained.

It may be noted here that the PERMANOVA test-statistic has the advantage of being constructed as a pivotal test-statistic (i.e., $F_{\text{pseudo}}$ calculated from a Euclidean distance matrix for 1 variable is equivalent to the classical univariate F statistic), so should not be affected adversely by the presence of nuisance parameters and can be easily extended to multi-way designs. It is also clearly not restricted to the use of the Euclidean distance. Note also, however, that the construction of the test effectively relies holistically on sums of squared distances within (and between) groups, without any regard whatsoever for the particular direction of those distances within the multivariate space, which distinguishes it from the classical MANOVA test statistics.

Next, the ANOSIM statistic of Clarke (1993) is easily described as a function of the ranks of matrix $\mathbf{D}$. There will be $M = N(N - 1)/2$ inter-point distance values $d_{ij}$ within the upper-triangular (or, equivalently, the lower-triangular) portion of matrix $\mathbf{D}$ (excluding the diagonal); namely, for $i = 1, \ldots, (N - 1)$ and $j = (i + 1), \ldots, N$. Let the values $d_{ij}$ be replaced by the rank order of their values, $r_{ij}$, where the lowest value of $d_{ij}$ is given a value of $r_{ij} = 1$ and the highest value of $d_{ij}$ is given a value of $r_{ij} = M$. The ANOSIM test-statistic (Clarke 1993) is then given by:

$$R = \frac{(\bar{r}_B - \bar{r}_W)}{M / 2} \tag{A.12}$$

where $\bar{r}_W$ is the average of the ranked dissimilarities between observations within the same group:
$$\bar{r}_W = \frac{\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \epsilon_{ij}\, r_{ij}}{\sum_{\ell=1}^{g} n_\ell (n_\ell - 1) / 2} \tag{A.13}$$

and $\bar{r}_B$ is the average of the ranked dissimilarities between observations in different groups:

$$\bar{r}_B = \frac{\sum_{i=1}^{N-1} \sum_{j=i+1}^{N} (1 - \epsilon_{ij})\, r_{ij}}{M - \sum_{\ell=1}^{g} n_\ell (n_\ell - 1) / 2} \tag{A.14}$$

A p value is obtained for the one-way case in the same way for ANOSIM as for PERMANOVA, using random re-orderings of the observations relative to the group structure and calculating a distribution of $R^{\pi}$ against which the value of $R$ for the original ordering is then compared to provide a p value for the test.

The Mantel test was first described as a test of association between two resemblance matrices (Mantel 1967, Mantel and Valand 1970). For a given set of $N$ observations, suppose there are two $N \times N$ resemblance matrices; for example, the first might be dissimilarities based on species data while the second might be geographic distances. A cross-product (or Pearson or Spearman correlation coefficient) is calculated between the matched paired values in the two matrices and this is compared with the distribution of the same under random re-ordering of the original observations for one of the two matrices. The Mantel test may also be used for a goodness-of-fit test between a matrix of resemblances and a model matrix (Legendre and Legendre 1998, see pp ). For example, to model the group structure as in ANOVA, the model matrix may have zeros in place of the between-group distances and ones in place of the within-group distances. In other words, the model matrix consists of the indicators $\epsilon_{ij}$, as defined in equation (A.11) above. A cross-product between the sub-diagonal elements (as these matrices are symmetric) then simply gives the sum of the within-group dissimilarities,

$$z_{(1,0)} = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \epsilon_{ij}\, d_{ij} \tag{A.15}$$

Note that the value of $z_{(1,0)}$ will decrease with increasing degree of clumping within groups, so the p-value for the test using this statistic must be calculated as the proportion of values of $z^{\pi}_{(1,0)}$ that are less than or equal to the observed value of $z_{(1,0)}$.

For one-way designs, other arbitrary contrast coefficients can be used in the indicator model matrix to distinguish the within-group versus the between-group dissimilarities, yet would yield the same result. For example, consider the use of $(-1, +1)$ rather than $(0, 1)$ to give:

$$z_{(-1,+1)} = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} (1 - \epsilon_{ij})\, d_{ij} - \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \epsilon_{ij}\, d_{ij} \tag{A.16}$$

As the sum of all the dissimilarities in the sub-diagonal matrix of $\mathbf{D}$ is a constant, (A.16) will yield a cross-product that is monotonic with (A.15) under permutation, so will result in equivalent p values for the Mantel test.

Furthermore, Legendre and Legendre (1998, p. 56) demonstrated the clear relationship between the Mantel test and ANOSIM. Specifically, in the model matrix, let the code for within-group resemblances be:

$$c_W = \frac{-1}{(M/2) \sum_{\ell=1}^{g} n_\ell (n_\ell - 1) / 2} \tag{A.17}$$

and the code for the between-group resemblances be:

$$c_B = \frac{1}{(M/2) \left( M - \sum_{\ell=1}^{g} n_\ell (n_\ell - 1) / 2 \right)} \tag{A.18}$$
then the Mantel cross-product statistic $z$ yields a test statistic with an equivalent structure to the R-statistic of ANOSIM, but it is calculated on the averages of the between-group and within-group dissimilarity values themselves, rather than on the averages of their ranks, namely:

$$z_{(c_W, c_B)} = \frac{(\bar{d}_B - \bar{d}_W)}{M / 2} \tag{A.19}$$

The use of (A.19) will yield an equivalent p-value for the one-way model as the use of either (A.15) or (A.16). It will not, however, yield the same results as the ANOSIM R statistic in (A.12), which is based on ranks. The form of the Mantel test-statistic given in (A.19) was the one we used in our simulations.

To draw further parallels, the Mantel test also has a clear and close kinship with the resemblance-based permutation test statistic described by Good (1982) and Smith et al. (1990), namely $\bar{d}_B / \bar{d}_W$. This would also be monotonic under permutation with any of (A.15), (A.16) or (A.19), and thus would yield identical p values to the Mantel test for these one-way model simulations. (Although originally described in terms of average similarity, $\bar{s}$, rather than dissimilarity, $\bar{d}$, generally one can easily write a simple inverse function $d = 1 - s$, so the result still holds.)

Other important parallels can be drawn between the methods we have included in our simulations and the multi-response permutation procedure (MRPP, Mielke et al. 1981, Mielke and Berry 2001). The general formulation of the MRPP statistic is given by

$$\delta = \sum_{\ell=1}^{g} C_\ell\, \xi_\ell \tag{A.20}$$

where $C_\ell > 0$ is a group weight, $\sum_{\ell=1}^{g} C_\ell = 1$, and

$$\xi_\ell = \frac{1}{n_\ell (n_\ell - 1) / 2} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \Delta_{ij}\, \psi_{ij}^{(\ell)} \tag{A.21}$$
is the average of pairwise distance function values $\Delta_{ij}$ within each group ($\ell = 1, \ldots, g$), where $\psi_{ij}^{(\ell)}$ is an indicator such that $\psi_{ij}^{(\ell)} = 1$ if sample units $i$ and $j$ are both within group $\ell$. The test statistic $\delta$ gets smaller with increased clumping of observations within groups, so the p-value for the MRPP test is calculated as the proportion of values of $\delta^{\pi}$ under permutation that are less than or equal to the observed value of $\delta$.

If we let $\Delta_{ij} = d_{ij}$ and assign the weights $C_\ell$ to be proportional to the group sample sizes, i.e., $C_\ell = n_\ell / N$, then we have the direct result that $\delta = z_{(c_W, 0)}$, where $c_W = 2 / [N (n_\ell - 1)]$. Thus, the MRPP test is equivalent to the Mantel test coded $z_{(c_W, 0)}$ in this way for either balanced or unbalanced designs. Note, however, that under permutation the test statistic $z_{(c_W, 0)}$ will only be monotonic on the Mantel test statistics of $z_{(1,0)}$, $z_{(-1,+1)}$ or $z_{(c_W, c_B)}$ (as given in equations (A.15), (A.16), and (A.19) above, respectively) when there are equal numbers of replicate sample units per group. Thus, the MRPP test using $\Delta_{ij} = d_{ij}$ (and with $C_\ell = n_\ell / N$) will yield equivalent permutation p-values to these more general implementations of the Mantel test only for balanced one-way designs.

Mielke and Berry (2001, p. ) have also shown, for the one-way case, that MRPP, when based on squared Euclidean distances for a single variable, yields p values equivalent to the univariate F statistic under permutation. It is therefore easy to show here the relationship between MRPP and PERMANOVA more generally for one-way models. First, it is important that the distances be squared, i.e., let $\Delta_{ij} = d_{ij}^2$. Then, let the weights be $C_\ell = (n_\ell - 1) / (N - g)$, and the relationship between the PERMANOVA statistic of (A.9) and the MRPP statistic of (A.20) is

$$F_{\text{pseudo}} = \frac{2\, SS_T - v_2\, \delta}{v_1\, \delta} \tag{A.22}$$
As the values of $SS_T$, $v_1$ and $v_2$ are all constant under permutation, $\delta$ based on squared dissimilarities with this choice of weights will yield a p value for MRPP that is equivalent to PERMANOVA. This relationship holds for either balanced or unbalanced one-way designs.

In this study, the resemblance-based permutation tests (PERMANOVA, ANOSIM and Mantel) were compared with one another and with the classical MANOVA test statistic described by Pillai (1955). Given that the $(p \times p)$ SSCP matrix for the within-group variation is $\mathbf{W} = \mathbf{Y}_c' (\mathbf{I} - \mathbf{H}) \mathbf{Y}_c$ and the $(p \times p)$ SSCP matrix for the between-group variation is $\mathbf{B} = \mathbf{Y}_c' \mathbf{H} \mathbf{Y}_c$, then Pillai's trace is defined as $V^{(s)} = \operatorname{tr}[\mathbf{B} (\mathbf{W} + \mathbf{B})^{-1}]$. To obtain a p-value, the following F-approximation (Pillai 1955) was used:

$$F_{\text{Pillai}} = \frac{(2t + s + 1)\, V^{(s)}}{(2q + s + 1)\,(s - V^{(s)})} \tag{A.23}$$

with $s(2q + s + 1)$ and $s(2t + s + 1)$ degrees of freedom, where $s = \min(v_1, p)$, $q = (|v_1 - p| - 1)/2$ and $t = (v_2 - p - 1)/2$. Note that we must have $v_2 \ge p$. Also note that for Euclidean distances only, we can write the PERMANOVA pseudo-F as:

$$F_{\text{pseudo}} = \frac{\operatorname{tr}[\mathbf{B}] / v_1}{\operatorname{tr}[\mathbf{W}] / v_2} \tag{A.24}$$

which highlights how it differs from Pillai's trace. Pseudo-F is a ratio of two traces, each of these being a pure sum of individual sums of squares, thus ignoring all off-diagonal cross-products and hence correlation structure. For Pillai's trace, in contrast, the off-diagonal cross-product terms will play a role through the calculation of an inverse followed by the matrix multiplication, both of which occur prior to taking the trace.
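The contrast between Pillai's trace (A.23) and the Euclidean form of pseudo-F (A.24) can be checked numerically. The following is a minimal sketch in Python with NumPy, not the authors' implementation. The function names (`sscp_matrices`, `pillai_trace`, `pseudo_f_euclidean`) are hypothetical, and the projection matrix is built from 0/1 group-indicator columns rather than the orthogonal contrasts of (A.4), which is assumed to give the same $\mathbf{B}$ and $\mathbf{W}$ once the data are centred. Turning the F-approximation into a p-value would need an F distribution function, which is left out here.

```python
import numpy as np

def sscp_matrices(Y, groups):
    """Return (B, W, g): between- and within-group SSCP matrices and group count."""
    Y = np.asarray(Y, dtype=float)
    groups = np.asarray(groups)
    N = Y.shape[0]
    Yc = Y - Y.mean(axis=0)                      # centre each column on its mean
    labels = np.unique(groups)
    # Assumption: 0/1 indicator coding for X instead of orthogonal contrasts.
    X = (groups[:, None] == labels[None, :]).astype(float)
    H = X @ np.linalg.inv(X.T @ X) @ X.T         # projection matrix, as in (A.4)
    B = Yc.T @ H @ Yc
    W = Yc.T @ (np.eye(N) - H) @ Yc
    return B, W, len(labels)

def pillai_trace(Y, groups):
    """Pillai's trace, its F-approximation (A.23) and the two degrees of freedom."""
    Y = np.asarray(Y, dtype=float)
    N, p = Y.shape
    B, W, g = sscp_matrices(Y, groups)
    v1, v2 = g - 1, N - g
    V = np.trace(B @ np.linalg.inv(W + B))
    s = min(v1, p)
    q = (abs(v1 - p) - 1) / 2
    t = (v2 - p - 1) / 2
    F = ((2 * t + s + 1) * V) / ((2 * q + s + 1) * (s - V))
    return V, F, (s * (2 * q + s + 1), s * (2 * t + s + 1))

def pseudo_f_euclidean(Y, groups):
    """Pseudo-F written as a ratio of traces, as in (A.24)."""
    Y = np.asarray(Y, dtype=float)
    B, W, g = sscp_matrices(Y, groups)
    N = Y.shape[0]
    return (np.trace(B) / (g - 1)) / (np.trace(W) / (N - g))

# One variable, two groups of three: both statistics reduce to the classical F.
y = np.array([1.0, 2.0, 3.0, 2.0, 4.0, 6.0])
labels = np.array([0, 0, 0, 1, 1, 1])
V1, F1, df1 = pillai_trace(y[:, None], labels)
F1_pseudo = pseudo_f_euclidean(y[:, None], labels)

# Three correlated-free random variables, three groups: the two F values differ.
rng = np.random.default_rng(0)
Y3 = rng.normal(size=(15, 3))
labels3 = np.repeat([0, 1, 2], 5)
V3, F3, df3 = pillai_trace(Y3, labels3)
F3_pseudo = pseudo_f_euclidean(Y3, labels3)
print(F1, F1_pseudo, F3, F3_pseudo)
```

With a single variable the two statistics coincide; with several variables they generally differ, because only Pillai's trace uses the off-diagonal cross-products.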
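The two routes to the PERMANOVA pseudo-F, the trace form (A.8) and the sums-of-squared-dissimilarities form (A.9) to (A.11), together with the permutation p value, can be sketched as follows. This is an illustrative Python/NumPy sketch under stated assumptions, not the authors' code. All function names are hypothetical; $\mathbf{H}$ is built from 0/1 group indicators rather than orthogonal contrasts; and the observed ordering is counted as one of the re-orderings when forming the p value, a common convention that the text above does not spell out.

```python
import numpy as np

def gower_center(D):
    """Matrix G of (A.5): double-centre A = -0.5 * D**2 on rows and columns."""
    A = -0.5 * np.asarray(D, dtype=float) ** 2
    return A - A.mean(axis=1, keepdims=True) - A.mean(axis=0, keepdims=True) + A.mean()

def pseudo_f_trace(D, groups):
    """Pseudo-F via the trace form (A.8)."""
    groups = np.asarray(groups)
    labels = np.unique(groups)
    N, g = len(groups), len(labels)
    # Assumption: 0/1 indicator coding for X. G is already centred, so the
    # intercept direction contained in this H adds nothing to tr[HG].
    X = (groups[:, None] == labels[None, :]).astype(float)
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    G = gower_center(D)
    return (np.trace(H @ G) / (g - 1)) / (np.trace((np.eye(N) - H) @ G) / (N - g))

def pseudo_f_sums(D, groups):
    """Pseudo-F via sums of squared dissimilarities, (A.9) to (A.11)."""
    D = np.asarray(D, dtype=float)
    labels, inverse, counts = np.unique(groups, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    N, g = len(inverse), len(labels)
    i, j = np.triu_indices(N, k=1)               # all pairs with i < j
    d2 = D[i, j] ** 2
    same = inverse[i] == inverse[j]              # epsilon_ij
    ss_t = d2.sum() / N                          # (A.10)
    ss_w = np.sum(d2[same] / counts[inverse[i][same]])   # (A.11)
    return ((ss_t - ss_w) / (g - 1)) / (ss_w / (N - g))

def permutation_p(D, groups, stat=pseudo_f_sums, n_perm=999, seed=0):
    """Observed statistic and permutation p value from shuffled group labels."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    observed = stat(D, groups)
    permuted = np.array([stat(D, rng.permutation(groups)) for _ in range(n_perm)])
    # Assumption: the observed ordering counts as one of the re-orderings.
    p = (1 + np.sum(permuted >= observed)) / (n_perm + 1)
    return observed, p

# One variable, two groups of three; Euclidean distance is |y_i - y_j|.
y = np.array([1.0, 2.0, 3.0, 2.0, 4.0, 6.0])
labels = np.array([0, 0, 0, 1, 1, 1])
D = np.abs(y[:, None] - y[None, :])
f_trace = pseudo_f_trace(D, labels)
f_sums = pseudo_f_sums(D, labels)
f_obs, p_value = permutation_p(D, labels, n_perm=199)
print(f_trace, f_sums, p_value)
```

Shuffling the group labels while leaving $\mathbf{D}$ fixed is the same operation as re-ordering the rows and columns of $\mathbf{D}$ simultaneously, so the distance matrix only has to be computed once.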
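The ANOSIM R statistic (A.12) to (A.14) and the Mantel statistics (A.15) and (A.19) share one structure and differ only in whether ranks or raw dissimilarities are averaged. The sketch below, in Python with NumPy, illustrates that structure under stated assumptions. The function names are hypothetical, and tied dissimilarities are given their average rank, a choice the text above does not specify.

```python
import numpy as np

def upper_pairs(D, groups):
    """Upper-triangular dissimilarities and the same-group indicator epsilon_ij."""
    D = np.asarray(D, dtype=float)
    groups = np.asarray(groups)
    i, j = np.triu_indices(len(groups), k=1)
    return D[i, j], groups[i] == groups[j]

def average_ranks(values):
    """Ranks 1..M; ties receive the mean of the ranks they span (an assumption)."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    upper = np.cumsum(counts)                    # highest ordinal rank in each tie block
    return (upper - (counts - 1) / 2.0)[inverse.ravel()]

def anosim_r(D, groups):
    """ANOSIM R of (A.12): difference in mean ranks, scaled by M / 2."""
    d, same = upper_pairs(D, groups)
    r = average_ranks(d)
    return (r[~same].mean() - r[same].mean()) / (d.size / 2.0)

def mantel_group_stat(D, groups):
    """Mantel statistic of (A.19): same form as R, on raw dissimilarities."""
    d, same = upper_pairs(D, groups)
    return (d[~same].mean() - d[same].mean()) / (d.size / 2.0)

def within_sum(D, groups):
    """Mantel cross-product z_(1,0) of (A.15): sum of within-group dissimilarities."""
    d, same = upper_pairs(D, groups)
    return d[same].sum()

# Two well-separated groups on a line: every between-group distance exceeds
# every within-group distance.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
groups = np.array([0, 0, 0, 1, 1, 1])
D = np.abs(x[:, None] - x[None, :])
R = anosim_r(D, groups)
z_obs = mantel_group_stat(D, groups)

# Under permutation, (A.19) is a decreasing linear function of (A.15).
rng = np.random.default_rng(0)
perms = [rng.permutation(groups) for _ in range(20)]
w = np.array([within_sum(D, gp) for gp in perms])
z = np.array([mantel_group_stat(D, gp) for gp in perms])
print(R, z_obs)
```

Because group sizes are fixed under permutation, the number of within-group pairs is constant, which is why (A.19) and (A.15) order the permutations identically (in reverse) and give the same p value.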
LITERATURE CITED

Anderson, M. J. 2001a. A new method for non-parametric multivariate analysis of variance. Austral Ecology 26:32–46.
Anderson, M. J. 2001b. Permutation tests for univariate or multivariate analysis of variance and regression. Canadian Journal of Fisheries and Aquatic Sciences 58.
Clarke, K. R. 1993. Non-parametric multivariate analyses of changes in community structure. Australian Journal of Ecology 18:117–143.
Edgington, E. S. 1995. Randomization tests, 3rd edition. Marcel Dekker, New York, USA.
Good, I. J. 1982. An index of separateness of clusters and a permutation test for its significance. Journal of Statistical Computation and Simulation 5:6 75.
Gower, J. C. 1966. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53.
Hope, A. C. A. 1968. A simplified Monte Carlo significance test procedure. Journal of the Royal Statistical Society, Series B 30.
Legendre, P., and M. J. Anderson. 1999. Distance-based redundancy analysis: testing multispecies responses in multifactorial ecological experiments. Ecological Monographs 69:1–24.
Legendre, P., and L. Legendre. 1998. Numerical ecology, Second English edition. Elsevier, Amsterdam, The Netherlands.
Manly, B. F. J. 2006. Randomization, bootstrap and Monte Carlo methods in biology, 3rd edition. Chapman and Hall, London, United Kingdom.
Mantel, N. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research 27:209–220.
Mantel, N., and R. S. Valand. 1970. A technique of nonparametric multivariate analysis. Biometrics 26.
McArdle, B. H. 99. Detecting and displaying impacts of biological monitoring: spatial problems and partial solutions. Pages in Proceedings of Invited Papers, XVth International Biometrics Conference, IBC, Budapest, Hungary.
McArdle, B. H., and M. J. Anderson. 2001. Fitting multivariate models to community data: a comment on distance-based redundancy analysis. Ecology 82.
Mielke, P. W., K. J. Berry, P. J. Brockwell, and J. S. Williams. 1981. A class of nonparametric tests based on multiresponse permutation procedures. Biometrika 68.
Mielke, P. W., and K. J. Berry. 2001. Permutation methods: a distance function approach. Springer-Verlag, New York, USA.
Pillai, K. C. S. 1955. Some new test criteria in multivariate analysis. Annals of Mathematical Statistics 26:117–121.
Pillar, V. D. P., and L. Orlóci. 1996. On randomization testing in vegetation science: multifactor comparisons of relevé groups. Journal of Vegetation Science 7.
Plackett, R. L. 1949. A historical note on the method of least squares. Biometrika 36.
Smith, E. P., K. W. Pontasch, and J. Cairns. 1990. Community similarity and the analysis of multispecies environmental data: a unified statistical approach. Water Research 24.
Snedecor, G. W. 1934. Calculation and interpretation of analysis of variance and covariance. Collegiate Press, Ames, Iowa, USA.
Department of Economics On Testing for Diagonality of Large Dimensional Covariance Matrices George Kapetanios Working Paper No. 526 October 2004 ISSN 1473-0278 On Testing for Diagonality of Large Dimensional
More informationVector and Matrix Norms
Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty
More informationSimilarity and Diagonalization. Similar Matrices
MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that
More informationUNIVERSITY OF NAIROBI
UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER
More informationDimensionality Reduction: Principal Components Analysis
Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely
More informationNotes on Determinant
ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without
More informationby the matrix A results in a vector which is a reflection of the given
Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the y-axis We observe that
More informationStatistical tests for SPSS
Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly
More informationOnline Appendix Assessing the Incidence and Efficiency of a Prominent Place Based Policy
Online Appendix Assessing the Incidence and Efficiency of a Prominent Place Based Policy By MATIAS BUSSO, JESSE GREGORY, AND PATRICK KLINE This document is a Supplemental Online Appendix of Assessing the
More informationQuantitative Methods for Finance
Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain
More informationSolving Systems of Linear Equations
LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how
More informationSample Size and Power in Clinical Trials
Sample Size and Power in Clinical Trials Version 1.0 May 011 1. Power of a Test. Factors affecting Power 3. Required Sample Size RELATED ISSUES 1. Effect Size. Test Statistics 3. Variation 4. Significance
More informationMatrix Differentiation
1 Introduction Matrix Differentiation ( and some other stuff ) Randal J. Barnes Department of Civil Engineering, University of Minnesota Minneapolis, Minnesota, USA Throughout this presentation I have
More information2. Simple Linear Regression
Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according
More informationDESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.
DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,
More informationNon-Inferiority Tests for Two Means using Differences
Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous
More informationa 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.
Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given
More informationWater Quality and Environmental Treatment Facilities
Geum Soo Kim, Youn Jae Chan, David S. Kelleher1 Paper presented April 2009 and at the Teachin APPAM-KDI Methods International (Seoul, June Seminar 11-13, on 2009) Environmental Policy direct, two-stae
More informationHandling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
More informationLAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics
Period Date LAB : THE CHI-SQUARE TEST Probability, Random Chance, and Genetics Why do we study random chance and probability at the beginning of a unit on genetics? Genetics is the study of inheritance,
More informationINTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
More informationSimilar matrices and Jordan form
Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive
More informationUsing Excel for inferential statistics
FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied
More informationFrom the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
More informationA linear algebraic method for pricing temporary life annuities
A linear algebraic method for pricing temporary life annuities P. Date (joint work with R. Mamon, L. Jalen and I.C. Wang) Department of Mathematical Sciences, Brunel University, London Outline Introduction
More informationNOTES ON LINEAR TRANSFORMATIONS
NOTES ON LINEAR TRANSFORMATIONS Definition 1. Let V and W be vector spaces. A function T : V W is a linear transformation from V to W if the following two properties hold. i T v + v = T v + T v for all
More informationOrthogonal Diagonalization of Symmetric Matrices
MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding
More informationMULTIPLE REGRESSION WITH CATEGORICAL DATA
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting
More informationLinear Codes. Chapter 3. 3.1 Basics
Chapter 3 Linear Codes In order to define codes that we can encode and decode efficiently, we add more structure to the codespace. We shall be mainly interested in linear codes. A linear code of length
More informationLinear Algebra Notes for Marsden and Tromba Vector Calculus
Linear Algebra Notes for Marsden and Tromba Vector Calculus n-dimensional Euclidean Space and Matrices Definition of n space As was learned in Math b, a point in Euclidean three space can be thought of
More informationDecember 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS
December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation
More informationReview Jeopardy. Blue vs. Orange. Review Jeopardy
Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?
More informationMultivariate Analysis of Variance (MANOVA)
Multivariate Analysis of Variance (MANOVA) Aaron French, Marcelo Macedo, John Poulsen, Tyler Waterson and Angela Yu Keywords: MANCOVA, special cases, assumptions, further reading, computations Introduction
More informationChapter 12 Nonparametric Tests. Chapter Table of Contents
Chapter 12 Nonparametric Tests Chapter Table of Contents OVERVIEW...171 Testing for Normality...... 171 Comparing Distributions....171 ONE-SAMPLE TESTS...172 TWO-SAMPLE TESTS...172 ComparingTwoIndependentSamples...172
More informationAn introduction to Value-at-Risk Learning Curve September 2003
An introduction to Value-at-Risk Learning Curve September 2003 Value-at-Risk The introduction of Value-at-Risk (VaR) as an accepted methodology for quantifying market risk is part of the evolution of risk
More informationbusiness statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
More informationSome probability and statistics
Appendix A Some probability and statistics A Probabilities, random variables and their distribution We summarize a few of the basic concepts of random variables, usually denoted by capital letters, X,Y,
More informationRELIABILITY BASED MAINTENANCE (RBM) Using Key Performance Indicators (KPIs) To Drive Proactive Maintenance
RELIABILITY BASED MAINTENANCE (RBM) Usin Key Performance Indicators (KPIs) To Drive Proactive Maintenance Robert Ford, CMRP GE Power Generation Services 4200 Wildwood Parkway, Atlanta, GA 30339 USA Abstract
More informationPrinciple Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression
Principle Component Analysis and Partial Least Squares: Two Dimension Reduction Techniques for Regression Saikat Maitra and Jun Yan Abstract: Dimension reduction is one of the major tasks for multivariate
More informationSolving Linear Systems, Continued and The Inverse of a Matrix
, Continued and The of a Matrix Calculus III Summer 2013, Session II Monday, July 15, 2013 Agenda 1. The rank of a matrix 2. The inverse of a square matrix Gaussian Gaussian solves a linear system by reducing
More informationTesting for Granger causality between stock prices and economic growth
MPRA Munich Personal RePEc Archive Testing for Granger causality between stock prices and economic growth Pasquale Foresti 2006 Online at http://mpra.ub.uni-muenchen.de/2962/ MPRA Paper No. 2962, posted
More informationA Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More information