Lecture 12: Indirect gradient analysis IV: weighted averaging, Correspondence Analysis (CA) and Detrended Correspondence Analysis (DCA)
- Rudolf Flynn
1 Lecture 12: Indirect gradient analysis IV: weighted averaging, Correspondence Analysis (CA) and Detrended Correspondence Analysis (DCA) 1. Why go further than PCA? 2. Weighted averaging 3. Correspondence Analysis (CA) 4. Detrended Correspondence Analysis (DCA) 5. Biplot diagrams 6. Review of all ordination techniques
2 PCA: why go further? The linear model of PCA usually doesn't fit most species data. Species come and go across a gradient, often with a unimodal response (a Gaussian or bell-shaped distribution). If you sample across a broad enough gradient, this unimodal pattern usually occurs. Correspondence analysis is another eigenvector-based technique, one that assumes a unimodal, rather than linear, relationship among the variables. In terms of what you can do with it and what the graphs look like, it is very similar to PCA; the big change is how the axes are derived.
3 Weighted Averaging A form of direct gradient analysis, weighted averaging was first used for vegetation data in 1967 by Robert Whittaker. It led quickly to reciprocal averaging (correspondence analysis) by Curtis and McIntosh in their studies of upland forests in southern Wisconsin. It is an important precursor to the application of more advanced multivariate techniques. Knowledge of species response along a known environmental gradient is used to order stands of vegetation along the same gradient.
4 Weighted averaging The weighted average for each stand is calculated by multiplying the abundance of each species by that species' weighting factor, summing these weighted cover scores across all species, and dividing by the sum of the unweighted cover scores of all species.
5 Weighted averaging: an example Suppose that we had a set of vegetation samples and knew the wetland status of all the species within them. We could then calculate a wetland rating for each sample by weighting the abundance of each species by its wetland ranking.

Sample A:

SPECIES   WETLAND RANKING   COVER   WR x C
Juntri    ...               ...     ...
Bigglu    ...               ...     ...
Lupplu    ...               ...     ...
sum:                        9       15

In this example the wetland rating for the sample is 15/9 = 1.67, reflecting that it lies toward the dry end of the gradient (closer to 1 than to 3). (Figure: our sample placed on the wetland scale.) If there were several samples, they could be ordered in a one-dimensional ordination according to their weighted wetland ratings.
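The calculation above can be sketched in code. The per-species rankings and covers below are hypothetical placeholders (the slide's individual values did not survive), chosen only so that the totals match the example: covers sum to 9 and weighted scores to 15, giving 15/9 = 1.67.

```python
# Weighted averaging for one sample. The rankings and covers are
# hypothetical, chosen so the totals match the slide's example
# (cover sums to 9, ranking x cover sums to 15).
sample_a = {
    # species: (wetland_ranking, cover)
    "Juntri": (1, 5),
    "Bigglu": (2, 2),
    "Lupplu": (3, 2),
}

def wetland_rating(sample):
    """Sum(ranking * cover) / Sum(cover) for a single sample."""
    weighted = sum(rank * cover for rank, cover in sample.values())
    total_cover = sum(cover for _, cover in sample.values())
    return weighted / total_cover

print(round(wetland_rating(sample_a), 2))  # 1.67
```

With several samples, sorting them by this rating gives the one-dimensional ordination described above.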
6 Weighted averaging: an example (cont.) INTERPRETATION The method is a form of direct gradient analysis. It shows only one axis and is useful for situations where there is only one primary environmental gradient under consideration. The major criticism of the method is that the original weights assigned to each species are based on a subjective assessment of the position of the species with respect to the gradient, and these scores could vary from one worker to another.
7 Reciprocal averaging (Correspondence Analysis) Reciprocal averaging works on much the same principle as weighted averaging, but simultaneously orders both the columns and rows of a matrix. Also, rather than forcing an external structure onto the results through a weighting factor, it finds the inherent structure within a data set. It provides the basis for more advanced ordination methods developed later; papers by Hill (1973, 1974) first made CA well known to ecologists.
8 Basic idea of CA: to get a simultaneous ordination of the variables in the columns and rows (sample plots and species) based on the inherent structure of the data, i.e. the redundancy created by groups of species co-occurring within the set of plots and by multiple plots sharing the same groups of species. If you arrange a set of species and samples according to their first-axis CA order, they will look similar to a sorted B-B (Braun-Blanquet) table, with dominant species in the middle and rare ones at the ends. The method of weighted averages is applied to a data matrix such that quadrat scores are derived from species scores and weightings. These are carried out successively using an iterative procedure, and the scores eventually stabilize to give a set of scores for quadrats and a set of scores for species. This two-way weighted averaging is only slightly more complex than the one-way weighted averaging of the preceding example.
9 Calculation of the first axis Calculate the row and column totals and allocate arbitrary weights (scores) to the species. Reciprocal averaging then commences: sample scores are calculated as weighted averages of the species scores, and the averaging process is then applied in reverse to give a new set of scores for the species using the sample scores. To avoid calculation with very small numbers, the scores are rescaled to a 0 to 100 range each iteration. The species scores of the final iteration are the positions of the species along the first axis (0 to 100) of the ordination, and the sample scores are the positions of the samples along the first axis.
10 Reciprocal averaging (an example): a species-by-sample cover matrix (Juntri, Bigglu, Lupplu by Samples 1-3) with a total, an average, and a rescaled score for each sample. Rescaling: (average score - lowest average score) / (range of the average scores) x 100. Example: rescaled value for Sample 1 = (... - ...) / (...) x 100 = ...
11 Reciprocal averaging (an example): the next step is to calculate an average value for each species, weighted by the first-axis sample scores. For Juntri: ((83.3 x 2) + (100 x 8) + (1 x 1)) / 11 = 87.97, and so on for the other species.
12 Reciprocal averaging (an example): this new species vector is used to calculate a new weighted average for each sample, which is then rescaled. For Sample 1: ((87.97 x 2) + (72.03 x 4) + (78.88 x 8)) / 14 = 78.22, and so on. Note: rescaling is done to keep the values from getting very small and to keep them in a reasonable range. It is done once each iteration (for the sample scores in this case).
13 Reciprocal averaging (an example): keep going, alternating weighted averages and rescaling (the slide's table carries the example through six iterations), until the amount of change between successive iterations for both the species and sample vectors is minimized. The species scores are rescaled on the last iteration, so that both the species and sample axes are scaled from 0 to 100.
14 Reciprocal averaging (an example): If there is indeed a diagonal structure to the matrix (where the species and samples correspond to a diagonal along the middle), then the scores will stabilize. Once they stabilize, you have the first axis! The species axis is usually used as the convergence test, and the sample axis is the weighted average of the species axis.
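The iteration just described can be sketched as follows. The 3 x 3 cover matrix is hypothetical (only fragments of the slide's matrix survive; the Juntri row and the Sample 1 column do match the worked example), and the starting scores and tolerance are arbitrary. The structure follows the slides: alternate weighted averages, with the sample scores rescaled to 0-100 each iteration.

```python
# One-axis reciprocal averaging: alternate weighted averaging of species
# and sample scores, rescaling the sample scores each iteration, until
# the scores stabilize.

def rescale(scores):
    """Linearly rescale a list of scores to the range 0-100."""
    lo, hi = min(scores), max(scores)
    return [100.0 * (s - lo) / (hi - lo) for s in scores]

def first_axis(cover, tol=1e-9, max_iter=500):
    n_species, n_samples = len(cover), len(cover[0])
    species = [float(i) for i in range(n_species)]  # arbitrary start
    samples = []
    for _ in range(max_iter):
        # sample scores: weighted average of species scores, then rescale
        samples = rescale([
            sum(cover[i][j] * species[i] for i in range(n_species)) /
            sum(cover[i][j] for i in range(n_species))
            for j in range(n_samples)
        ])
        # species scores: weighted average of sample scores (no rescale)
        new_species = [
            sum(cover[i][j] * samples[j] for j in range(n_samples)) /
            sum(cover[i][j] for j in range(n_samples))
            for i in range(n_species)
        ]
        converged = max(abs(a - b) for a, b in zip(new_species, species)) < tol
        species = new_species
        if converged:
            break
    # rescale species scores once, on the final iteration
    return rescale(species), samples

cover = [[2, 8, 1],   # Juntri (row matches the slide's totals)
         [4, 1, 0],   # Bigglu (hypothetical)
         [8, 0, 1]]   # Lupplu (hypothetical)
species_scores, sample_scores = first_axis(cover)
```

Both returned vectors run from 0 to 100 and give the positions of the species and samples along the first axis.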
15 Calculation of the eigenvalue As with PCA, the eigenvalue can be thought of as a measure of the proportion of the total variation in the data explained by the axis. It is obtained by taking the range of the unscaled scores in the final iteration and dividing by the range of the rescaled scores. In our example, the eigenvalue of the first axis = (...)/100 = ... There is no specific eigenvalue cutoff for significant axes, but generally axes with eigenvalues >0.25 should be considered in the analysis. In our trivial example with 3 plots and 3 species, the eigenvalue of the first axis is close to 0.25, and no higher axes would be close to this.
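As a numeric illustration of the formula above (the slide's own numbers were lost, so the final-iteration scores here are made up):

```python
# Eigenvalue as described above: range of the unscaled scores in the
# final iteration divided by the range of the rescaled scores (100).
# The scores below are invented for illustration only.
unscaled_final = [12, 30, 55, 87]
eigenvalue = (max(unscaled_final) - min(unscaled_final)) / 100.0
print(eigenvalue)  # 0.75
```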
16 Calculation of the second axis It is possible to extract a second axis. This will likely be necessary if there are plots that lie close together on the first axis but which differ greatly in species composition. The second axis is extracted by the same iteration process, with one extra step in which the trial scores for the second axis are made uncorrelated with the first axis. In other words, the linear correlation with the first axis is removed. This is done by taking the trial scores for the second axis and regressing them against the site scores for the first axis; the residuals from this regression become the new trial axis. This is done once for the sample scores at each iteration.
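The extra step can be sketched with a simple least-squares regression. All score values here are hypothetical.

```python
# Make trial second-axis scores uncorrelated with the first axis by
# regressing them on the first-axis site scores and keeping the
# residuals. The scores are hypothetical.
import numpy as np

axis1 = np.array([0.0, 25.0, 50.0, 75.0, 100.0])   # final first-axis scores
trial2 = np.array([10.0, 40.0, 20.0, 80.0, 55.0])  # trial second-axis scores

# least-squares fit trial2 = intercept + slope * axis1
slope, intercept = np.polyfit(axis1, trial2, 1)
residuals = trial2 - (intercept + slope * axis1)   # the new trial axis

# the residuals have (essentially) zero correlation with axis 1
print(abs(np.corrcoef(axis1, residuals)[0, 1]) < 1e-8)
```

Because ordinary least squares leaves residuals orthogonal to the regressor, the new trial axis is linearly uncorrelated with the first axis by construction.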
17 Example data set for reciprocal averaging, from Kent and Coker. It uses binary (presence/absence) data and several species.
18 Forming the ordination The positions of the quadrats and species in the ordination space are determined by the species scores and quadrat scores for the first two axes. The table below provides the scores for the samples and species for the second axis. The first axis scores were derived in Table 6.1 in Kent and Coker.
19 Review of reciprocal averaging (correspondence analysis) Axis 1 Assign random scores to each species. Use these to calculate a weighted average for each sample. Rescale these sample scores. Use the sample scores to calculate a weighted average for each species. Use the species scores to calculate a weighted average for each sample. Continue until scores converge to a unique solution. Axis 2 Assign a new set of random scores to each species. Calculate a trial axis as above. Perform a multiple regression between the trial axis and the final axis obtained (above). Take the residual values as the new trial axis.
20 Problems with CA THE ARCH EFFECT: a mathematical artifact corresponding to no real structure in the data; the second axis is a quadratic distortion of the first axis. In data sets where there is no strong controlling gradient for the second axis, the arch effect is likely to occur. COMPRESSION NEAR THE ENDS OF THE AXES: related to the arch effect; the compressed axis does not show the actual comings and goings of species along the first axis.
21 Arch effect and compression of the axes resulting from CA in an idealized data set In this data set there should be no variation in the y axis scores of plots 1 to 7. The CA will create variation where there is none. It also compresses the Axis 1 scores towards the ends of the axis. Detrending and rescaling in DCA removes these effects.
22 Detrended Correspondence Analysis (DCA) Correcting for the arch effect (detrending) The first axis is divided into a number of segments and within each segment, the second axis scores are recalculated so that they have an average of zero. Correcting for the compression effect Also overcome by segmenting the first axis and rescaling the species ordination (not the sample ordination), such that the coming and going of species is about equal along the gradient.
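The detrending-by-segments step can be sketched as follows. The arched configuration and the choice of four segments are arbitrary; within each segment of the first axis, second-axis scores are shifted so that their mean is zero.

```python
# Detrending by segments: cut the first axis into segments and, within
# each segment, shift the second-axis scores so their mean is zero.
# The arched data and 4 segments are arbitrary choices.
import numpy as np

axis1 = np.linspace(0.0, 100.0, 20)
axis2 = 50.0 - (axis1 - 50.0) ** 2 / 50.0   # artificial arch

n_segments = 4
edges = np.linspace(axis1.min(), axis1.max(), n_segments + 1)
detrended = axis2.copy()
for k in range(n_segments):
    lo, hi = edges[k], edges[k + 1]
    if k < n_segments - 1:
        in_seg = (axis1 >= lo) & (axis1 < hi)
    else:
        in_seg = (axis1 >= lo) & (axis1 <= hi)   # include the right edge
    detrended[in_seg] -= axis2[in_seg].mean()
```

After the loop, each segment of second-axis scores is centered on zero, which flattens the arch (DCA's rescaling of the segments is a separate step, not shown here).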
23 Detrended Correspondence Analysis (DCA): scaling of the axes Scaling of the axes in sd units The axes in DCA are scaled into units that are the average standard deviation of species turnover (sd units). A 50% change in species composition occurs over a distance of about 1 sd unit. Species appear, rise to their modes, and disappear over a distance of about 4 sd units. The more sd units that occur along the axis, the more change in species composition is shown. Thus, the sd units of DCA are a useful measure of beta diversity in the total data set. The positions of samples along the 1st axis are thus shifted to equalize beta diversity, and species have, on average, a habitat breadth (as measured by standard deviations) of 1.
24 Detrended Correspondence Analysis (DCA): an example comparing results from CA with DCA
25 Output from PC-ORD DCA List of residuals: eigenvectors are found by iteration, and the residuals indicate how close a given vector is to an eigenvector at each iteration. A vector is considered an eigenvector if its residual falls below a small threshold. The eigenvalue is calculated at the final iteration. For the first axis it can be considered in the same way as in PCA, as an indicator of the amount of variance accounted for by that axis. However, for higher axes, detrending in DCA destroys the correspondence between the eigenvalue and the structure along the axis. PC-ORD recommends using an after-the-fact coefficient of determination between the relative Euclidean distance in the unreduced species space and the Euclidean distance in the ordination space (Graph / Ordination / Statistics / Percent of Variance). Length of segments: these lengths indicate the degree of rescaling that occurred. For example, small values for the middle segments suggest that the gradient was compressed at the ends before rescaling. Length of the gradient: this is the total length of the axis before and after rescaling. The units are sd units, a measure of beta diversity; four sd units is approximately equal to one complete turnover of species. Note: in PC-ORD 5, the length of the gradient is multiplied by 100.
26 PC-ORD DCA output (cont.) At the end of the output are the species scores for the three axes and their rankings along the first two axes, along with the eigenvalue for each axis. This is followed by a similar table for the sample scores.
27 How do you tell if DCA is for you? A general rule of thumb: if the first axis is less than 2 sd units long, you should consider PCA; if it is more than 4 units long, DCA will likely be appropriate; in between, you will have to look harder at the data and make some educated choices.
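The rule of thumb can be written down directly; the function name and the wording of the in-between case are of course just illustrative.

```python
def suggested_ordination(gradient_length_sd):
    """Rule of thumb from the slide: gradient length of the first
    axis in sd units -> suggested indirect ordination method."""
    if gradient_length_sd < 2:
        return "PCA (short gradient, roughly linear responses)"
    if gradient_length_sd > 4:
        return "DCA (long gradient, clear unimodal responses)"
    return "either; examine the data and make an educated choice"

print(suggested_ordination(1.5))
print(suggested_ordination(5.0))
print(suggested_ordination(3.0))
```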
28 Criticism of DCA The methods used for correcting the arch effect and the compression have no empirical or theoretical basis (Wartenberg et al. 1987). The assumption that species turnover is constant or even along gradients is likely not true. DCA removes, with brute force if needed, any arch, even if the arch reflects real structure in your data set. Because detrending is done by arbitrarily dividing the axis into pieces and then shifting those pieces up or down (to give a constant mean), relationships within a segment are maintained, but relationships across segments can be ripped apart.
29 DCA OVERALL Despite these criticisms, DCA remains one of the most popular methods of indirect gradient analysis and is computationally very efficient. It is the best method to use when there are no environmental data. The interpretation of results from DCA is best carried out with some knowledge of its limitations and with comparison against other techniques. By doing this, one can get a better feel for patterns created by actual structure within the data set.
30 Joint plots, Biplots, Triplots A useful diagram that simultaneously shows the ordination together with the trends and strength of environmental factor correlations with the species data (Gabriel, 1971). A biplot is scaled somewhat differently than a joint plot: biplot scaling is centered and normalized, while joint plot scaling is not. (These scaling options are part of the CCA ordination.) Both species and environmental factors are plotted on the same graph but using different scales. Arrows are drawn from the centroid (origin) of the centered ordination axes to the points representing the environmental variables. The direction of an arrow indicates the direction in which the abundance of a variable increases most rapidly, and its length indicates the rate of change in abundance in that direction. The method of joint plot calculation used in PC-ORD can be applied to any ordination technique. In PC-ORD the number of variables displayed is controlled by the joint plot cutoff: only variables in the 2nd matrix with r² values greater than the cutoff are displayed.
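One way the arrow for a single environmental variable can be derived (a sketch only; PC-ORD's exact scaling may differ): correlate the variable with each ordination axis and draw the arrow from the origin to those correlations. The ordination scores and the "moisture" variable below are simulated.

```python
# Sketch of a joint-plot arrow for one environmental variable: its
# correlation with each ordination axis gives the arrow's direction
# and length. All data are simulated for illustration.
import numpy as np

rng = np.random.default_rng(42)
axis1 = rng.normal(size=50)          # sample scores on ordination axis 1
axis2 = rng.normal(size=50)          # sample scores on ordination axis 2
moisture = 2.0 * axis1 + rng.normal(scale=0.5, size=50)  # tracks axis 1

r1 = np.corrcoef(moisture, axis1)[0, 1]
r2 = np.corrcoef(moisture, axis2)[0, 1]
arrow = (r1, r2)            # arrow drawn from the origin to this point
r_squared = r1**2 + r2**2   # approximates the r2 compared to the cutoff
                            # (exact only if the axes are uncorrelated)
```

Here the arrow points strongly along axis 1, as expected, and a variable whose r_squared falls below the joint-plot cutoff would simply not be drawn.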
31 Joint plot: an example using a sample plot from DCA
32 Review of ordination techniques

Technique: Bray and Curtis (Polar Ordination)
Summary: The first ordination technique. The axes are calculated using floristic similarity coefficients. The samples that are least similar to each other form the ends of the first axis, and other plots are positioned along this axis according to their dissimilarity to both ends. The length of the axis is the dissimilarity between the plots at its two ends. The second axis is artificially made perpendicular to the first axis.
When to use: Any plant community data set can be ordinated with Bray and Curtis. The technique is not used much now because the first and second axes are not mathematically orthogonal (uncorrelated), and there is criticism of the subjective choice of the axis end points. It is, however, an easily understood method and is good for teaching the principles of ordination. It has recently regained favor among some ecologists because modern computer algorithms have eliminated much of the subjectivity in the method (e.g., BCORD used in the PC-ORD program), and it has performed well in comparison with other methods.

Technique: Principal Components Analysis (PCA)
Summary: PCA is based on the assumption that there are linear correlations among the variables being reduced. It is an eigenvector technique. It reduces data sets with many highly correlated variables to a set of components that are uncorrelated with each other. The eigenvalues are correlation coefficients, and can therefore be used directly to measure the variance explained by the ordination.
When to use: Any data set with high linear correlations between variables is appropriate for PCA. This is often the case with series of related data such as climate, soils, or biogeochemical data from plots. It can be appropriate for species data that have mostly linear correlations. It is generally not applied to plant community data with high beta diversity (high species turnover along the first axis).

Technique: Reciprocal Averaging (Correspondence Analysis, CA)
Summary: CA is also an eigenvector method. It simultaneously solves for the scores of plots and species through an iterative process that eventually results in scores that don't change with additional iterations; these scores are the solution for each axis. It is based on the assumption that the data have a unimodal response (Gaussian distribution) to an underlying gradient. The axes are generally scaled from 0 to 100, and both species and samples are plotted in the same ordination space.
When to use: It can be used for most plant community data, but the arch effect and the compression of scores toward the ends of the axes can be severe limitations. A great advantage is that both plots and species are plotted on the same axes.
33 Review of all ordination techniques (cont.)

Technique: Detrended Correspondence Analysis (DCA)
Summary: DCA is similar to CA except that it corrects for the arch effect and the compression of scores toward the ends of the axes. It does this by dividing the axis into segments and recalculating the scores within each segment so that the average score is zero. It also rescales each segment so that the coming and going of species in each segment is about equal. The axes are measured in sd units, which measure the turnover of species along the axes.
When to use: DCA was extensively used in the 1980s and 1990s, but has lost favor in the past few years because the detrending process can have undesirable effects on ordinations. It remains a very good method for ordinating plots and species when there are no environmental data.

Technique: Other methods: Canonical Correspondence Analysis (CCA), Non-metric Multidimensional Scaling (NMDS)
Summary: CCA is a direct ordination method that uses correlation and regression between the floristic data and environmental data. Through an iterative process similar to that in CA and DCA, it selects the linear combination of environmental variables that explains most of the variation in the species scores on each axis. NMDS has only recently gained favor among ecologists (Clarke 1993). It is an iterative search for the best positions of n entities in k dimensions (axes) that minimizes the stress of the k-dimensional configuration.
When to use: These more advanced techniques have gained favor in the past few years, and students wishing to use ordination for their theses are advised to seriously consider them. CCA requires a good set of environmental data, whereas all other ordinations can be based purely on floristic data (environmental relationships are determined indirectly with these other methods).
34 Differences and advantages of NMDS vs. DCA

Topic                                                   DCA                                    NMDS
Computation time                                        Low                                    High
Distance metric                                         Do not need to specify                 Highly sensitive to choice
Simultaneous ordination of species and plots            Yes                                    No
Arch effect                                             Artificially and inelegantly removed   Rarely occurs
Related to direct gradient analysis methods             Yes                                    No
Need to pre-specify number of dimensions                No                                     Yes
Need to pre-specify parameters (no. of segments, etc.)  Yes                                    No
Solution changes depending on no. of axes viewed        No                                     Yes
Handles samples with high noise levels                  Yes                                    No?
Axes show measure of beta diversity (sd units)          Yes                                    No
Used in other disciplines                               No                                     Widely
Axes interpretable as gradients                         Yes                                    No
Derived from a model of species response to gradients   Yes                                    No
Guaranteed to reach the global solution                 Yes                                    No

Adapted from Michael Palmer's web page.
35 Two especially good references for ordination users Analysis of Ecological Communities, by Bruce McCune and James B. Grace, with Dean L. Urban, on methods for analyzing multivariate data in community ecology, published by MjM Software Design: an expanded supplement to the PC-ORD manual, by the authors of PC-ORD. Ordination Methods: An Overview, by Michael Palmer: a condensed modern overview of the wide variety of ordination methods.
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationOrthogonal Diagonalization of Symmetric Matrices
MATH10212 Linear Algebra Brief lecture notes 57 Gram Schmidt Process enables us to find an orthogonal basis of a subspace. Let u 1,..., u k be a basis of a subspace V of R n. We begin the process of finding
More informationDESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.
DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,
More informationby the matrix A results in a vector which is a reflection of the given
Eigenvalues & Eigenvectors Example Suppose Then So, geometrically, multiplying a vector in by the matrix A results in a vector which is a reflection of the given vector about the y-axis We observe that
More informationElements of a graph. Click on the links below to jump directly to the relevant section
Click on the links below to jump directly to the relevant section Elements of a graph Linear equations and their graphs What is slope? Slope and y-intercept in the equation of a line Comparing lines on
More informationDATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
More informationNotes on Orthogonal and Symmetric Matrices MENU, Winter 2013
Notes on Orthogonal and Symmetric Matrices MENU, Winter 201 These notes summarize the main properties and uses of orthogonal and symmetric matrices. We covered quite a bit of material regarding these topics,
More information2. Linearity (in relationships among the variables--factors are linear constructions of the set of variables) F 2 X 4 U 4
1 Neuendorf Factor Analysis Assumptions: 1. Metric (interval/ratio) data. Linearity (in relationships among the variables--factors are linear constructions of the set of variables) 3. Univariate and multivariate
More informationVector Math Computer Graphics Scott D. Anderson
Vector Math Computer Graphics Scott D. Anderson 1 Dot Product The notation v w means the dot product or scalar product or inner product of two vectors, v and w. In abstract mathematics, we can talk about
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationMATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors. Jordan canonical form (continued).
MATH 423 Linear Algebra II Lecture 38: Generalized eigenvectors Jordan canonical form (continued) Jordan canonical form A Jordan block is a square matrix of the form λ 1 0 0 0 0 λ 1 0 0 0 0 λ 0 0 J = 0
More informationThe ith principal component (PC) is the line that follows the eigenvector associated with the ith largest eigenvalue.
More Principal Components Summary Principal Components (PCs) are associated with the eigenvectors of either the covariance or correlation matrix of the data. The ith principal component (PC) is the line
More informationExploratory data analysis for microarray data
Eploratory data analysis for microarray data Anja von Heydebreck Ma Planck Institute for Molecular Genetics, Dept. Computational Molecular Biology, Berlin, Germany heydebre@molgen.mpg.de Visualization
More informationOverview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationLocal outlier detection in data forensics: data mining approach to flag unusual schools
Local outlier detection in data forensics: data mining approach to flag unusual schools Mayuko Simon Data Recognition Corporation Paper presented at the 2012 Conference on Statistical Detection of Potential
More informationSolving Systems of Linear Equations
LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how
More informationIntroduction to Multivariate Analysis
Introduction to Multivariate Analysis Lecture 1 August 24, 2005 Multivariate Analysis Lecture #1-8/24/2005 Slide 1 of 30 Today s Lecture Today s Lecture Syllabus and course overview Chapter 1 (a brief
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More informationStepwise Regression. Chapter 311. Introduction. Variable Selection Procedures. Forward (Step-Up) Selection
Chapter 311 Introduction Often, theory and experience give only general direction as to which of a pool of candidate variables (including transformed variables) should be included in the regression model.
More informationExploratory Factor Analysis
Exploratory Factor Analysis Definition Exploratory factor analysis (EFA) is a procedure for learning the extent to which k observed variables might measure m abstract variables, wherein m is less than
More informationFactor analysis. Angela Montanari
Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number
More informationEcology and Simpson s Diversity Index
ACTIVITY BRIEF Ecology and Simpson s Diversity Index The science at work Ecologists, such as those working for the Environmental Agency, are interested in species diversity. This is because diversity is
More informationExploratory Factor Analysis and Principal Components. Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016
and Principal Components Pekka Malo & Anton Frantsev 30E00500 Quantitative Empirical Research Spring 2016 Agenda Brief History and Introductory Example Factor Model Factor Equation Estimation of Loadings
More informationQuadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
More informationMATRIX ALGEBRA AND SYSTEMS OF EQUATIONS
MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a
More informationHow to report the percentage of explained common variance in exploratory factor analysis
UNIVERSITAT ROVIRA I VIRGILI How to report the percentage of explained common variance in exploratory factor analysis Tarragona 2013 Please reference this document as: Lorenzo-Seva, U. (2013). How to report
More informationFactor Analysis. Advanced Financial Accounting II Åbo Akademi School of Business
Factor Analysis Advanced Financial Accounting II Åbo Akademi School of Business Factor analysis A statistical method used to describe variability among observed variables in terms of fewer unobserved variables
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
More informationFACTOR ANALYSIS NASC
FACTOR ANALYSIS NASC Factor Analysis A data reduction technique designed to represent a wide range of attributes on a smaller number of dimensions. Aim is to identify groups of variables which are relatively
More informationBEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents
More informationANALYSIS OF TREND CHAPTER 5
ANALYSIS OF TREND CHAPTER 5 ERSH 8310 Lecture 7 September 13, 2007 Today s Class Analysis of trends Using contrasts to do something a bit more practical. Linear trends. Quadratic trends. Trends in SPSS.
More informationRisk Decomposition of Investment Portfolios. Dan dibartolomeo Northfield Webinar January 2014
Risk Decomposition of Investment Portfolios Dan dibartolomeo Northfield Webinar January 2014 Main Concepts for Today Investment practitioners rely on a decomposition of portfolio risk into factors to guide
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationSimilarity and Diagonalization. Similar Matrices
MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that
More informationTorgerson s Classical MDS derivation: 1: Determining Coordinates from Euclidean Distances
Torgerson s Classical MDS derivation: 1: Determining Coordinates from Euclidean Distances It is possible to construct a matrix X of Cartesian coordinates of points in Euclidean space when we know the Euclidean
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationNonlinear Iterative Partial Least Squares Method
Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for
More informationComputer program review
Journal of Vegetation Science 16: 355-359, 2005 IAVS; Opulus Press Uppsala. - Ginkgo, a multivariate analysis package - 355 Computer program review Ginkgo, a multivariate analysis package Bouxin, Guy Haute
More informationTeaching Multivariate Analysis to Business-Major Students
Teaching Multivariate Analysis to Business-Major Students Wing-Keung Wong and Teck-Wong Soon - Kent Ridge, Singapore 1. Introduction During the last two or three decades, multivariate statistical analysis
More informationMedical Information Management & Mining. You Chen Jan,15, 2013 You.chen@vanderbilt.edu
Medical Information Management & Mining You Chen Jan,15, 2013 You.chen@vanderbilt.edu 1 Trees Building Materials Trees cannot be used to build a house directly. How can we transform trees to building materials?
More informationChapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
More informationChapter 7 Factor Analysis SPSS
Chapter 7 Factor Analysis SPSS Factor analysis attempts to identify underlying variables, or factors, that explain the pattern of correlations within a set of observed variables. Factor analysis is often
More informationHow to Get More Value from Your Survey Data
Technical report How to Get More Value from Your Survey Data Discover four advanced analysis techniques that make survey research more effective Table of contents Introduction..............................................................2
More informationChapter 11. Correspondence Analysis
Chapter 11 Correspondence Analysis Software and Documentation by: Bee-Leng Lee This chapter describes ViSta-Corresp, the ViSta procedure for performing simple correspondence analysis, a way of analyzing
More informationBindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8
Spaces and bases Week 3: Wednesday, Feb 8 I have two favorite vector spaces 1 : R n and the space P d of polynomials of degree at most d. For R n, we have a canonical basis: R n = span{e 1, e 2,..., e
More information1 The Brownian bridge construction
The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof
More informationMultiple Regression: What Is It?
Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in
More informationImproving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP
Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation
More informationLecture 9: Introduction to Pattern Analysis
Lecture 9: Introduction to Pattern Analysis g Features, patterns and classifiers g Components of a PR system g An example g Probability definitions g Bayes Theorem g Gaussian densities Features, patterns
More informationMachine Learning and Pattern Recognition Logistic Regression
Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,
More informationPsychology 7291, Multivariate Analysis, Spring 2003. SAS PROC FACTOR: Suggestions on Use
: Suggestions on Use Background: Factor analysis requires several arbitrary decisions. The choices you make are the options that you must insert in the following SAS statements: PROC FACTOR METHOD=????
More informationCovariance and Correlation
Covariance and Correlation ( c Robert J. Serfling Not for reproduction or distribution) We have seen how to summarize a data-based relative frequency distribution by measures of location and spread, such
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More information