# From: Statnotes: Topics in Multivariate Analysis, by G. David Garson (accessed on )


## Transcription

From: Statnotes: Topics in Multivariate Analysis, by G. David Garson (accessed on )

Planned multiple comparison t-tests, also just called "multiple comparison tests". In one-way ANOVA for confirmatory research, when difference of means tests are pre-planned and not just post hoc, as when a researcher plans to compare each treatment group mean with the mean of the control group, one may apply a simple t-test, a Bonferroni-adjusted t-test, the Sidak test, or Dunnett's test. The last two are also variants of the t-test. The t-test is thus a test of significance of the difference in the means of a single interval dependent, for the case of two groups formed by a categorical independent. The difference between the planned multiple comparison tests discussed in this section and the post-hoc multiple comparison tests discussed in the next section is one of power, not purpose. Some, including SPSS, lump all the tests together as "post hoc tests", as illustrated below. This figure shows the SPSS post hoc tests dialog after the Post Hoc button is pressed in the GLM Univariate dialog. (There is a similar dialog when Analyze, Compare Means, One-Way ANOVA is chosen, invoking the SPSS ONEWAY procedure, which the GLM procedure has superseded.) The essential difference is that the planned multiple comparison tests in this section are based on the t-test, which generally has more power than the post-hoc tests listed in the next section.

Warning! The model, discussed above, will make a difference for multiple comparison tests. A factor (ex., race) may display different multiple comparison results depending on what other factors are in the model. Covariates cannot be in the model at all for these tests to be done. Interactions may be in the model, but multiple comparison tests are not available to test them. Also note that all these t-tests are subject to the equality of variances assumption, and therefore the data should pass Levene's test of homogeneity of variances, discussed below.
Finally, note that the significance level (.05 is default) may be set using the Options button off the main GLM dialog.

1. Simple t-test difference of means. The simple t-test is recommended when the researcher has a single planned comparison (a comparison of means specified beforehand on the basis of a priori theory). In SPSS, for One-Way ANOVA, select Analyze, Compare Means, One-Way ANOVA; click Post Hoc; select the multiple comparison test you want. If the Bonferroni test is requested, SPSS will print out a table of "Multiple Comparisons" giving the mean difference in the dependent variable between any two groups (ex., differences in test scores for any two educational groups). The significance of this difference is also printed, and an asterisk is printed next to differences significant at the .05 level or better. SPSS supports the Bonferroni test in its GLM and UNIANOVA procedures. In SPSS, a simple t-test, with or without Bonferroni adjustment, may be obtained by selecting Statistics, Compare Means, One-Way ANOVA.

2. Bonferroni-adjusted t-test. Also called the Dunn test, Bonferroni-adjusted t-tests are used when there are planned multiple comparisons of means. As a general principle, when comparisons of group means are selected on a post hoc basis simply because they are large, there is an expected increase in variability for which the researcher must compensate by applying a more conservative test; otherwise, the likelihood of Type I errors will be substantial. The Bonferroni adjustment is perhaps the most common approach to making post-hoc significance tests more conservative. The Bonferroni method applies the simple t-test, but then adjusts the observed significance level by multiplying it by the number of comparisons being made. For instance, a finding of .01 significance for 9 comparisons becomes .09. This is equivalent to saying that if the target alpha significance level is .05, then the t-test must show alpha/9 (ex., .05/9 = .0056) or lower for a finding of significance to be made.
Bonferroni-adjusted multiple t-tests are usually employed only when there are few comparisons, as with many it quickly becomes practically impossible to show significance. If the independents formed 8 groups there would be 8!/(6!2!) = 28 comparisons, and if one used the unadjusted .05 significance level, one would expect at least one of the comparisons to generate a false positive by chance alone (28 x .05 = 1.4), that is, thinking you had a relationship when you did not. Note this adjustment may be applied to F-tests as well as t-tests. That is, it can handle nonpairwise as well as pairwise comparisons.

The Bonferroni-adjusted t-test imposes an extremely small alpha significance level as the number of comparisons becomes large. That is, this method is not recommended when the number of comparisons is large because the power of the test becomes low. Klockars and Sax (1986: 38-39) recommend using a simple .05 alpha rate when there are few comparisons, but using the more stringent Bonferroni-adjusted multiple t-test when the number of planned comparisons is greater than the number of degrees of freedom for between-groups mean square (which is k-1, where k is the number of groups). Nonetheless, researchers still try to limit the number of comparisons, trying to reduce the probability of Type II errors (accepting a false null hypothesis). This test is not recommended when the researcher wishes to perform all possible pairwise comparisons.

By the Bonferroni test, the figure above shows whites are significantly different from blacks but not from "other" races, with respect to mean highest year of education completed (the dependent variable).

3. Sidak test. The Sidak test, also called the Dunn-Sidak test, is a variant on the Dunn or Bonferroni approach, using a t-test for pairwise multiple comparisons. The alpha significance level for multiple comparisons is adjusted to tighter (more accurate) bounds than for the Bonferroni test (Howell, 1997: 364). SPSS supports the Sidak test in its GLM and UNIANOVA procedures. In the figure above, the Sidak test shows the same pattern as the Bonferroni test.

4. Dunnett's test is a t-statistic which is used when the researcher wishes to compare each treatment group mean with the mean of the control group, and for this purpose has better power than alternative tests. Dunnett's test does not require a prior finding of significance in the overall F test "as it controls the familywise error rate independently" (Cardinal & Aitken, 2005: 89). This test, based on a 1955 article by Dunnett, is not to be confused with Dunnett's C or Dunnett's T3, discussed below. In the example illustrated above, Dunnett's test leaves out the last category ("other" race) as the reference category and shows whites are not significantly different from "other" but blacks are.

Hsu's multiple comparison with the best (MCB) test. Hsu's MCB is an adaptation of Dunnett's method for the situation where the researcher wishes to compare the mean of each level with the best level, as in a treatment experiment where the best treatment is known. In such analyses the purpose is often to identify alternative treatments which are not significantly different from the best treatment but which may cost less or have other desirable features. Hsu's MCB is supported by SAS JMP but not SPSS. Hsu's unconstrained multiple comparison with the best (UMCB) test is a variant which takes each treatment group in turn as a possible best treatment and compares all others to it.

Post-hoc multiple comparison tests, also just called "post-hoc tests," are used in exploratory research to assess which group means differ from which others, after the overall F test has demonstrated at least one difference exists. If the F test establishes that there is an effect on the dependent variable, the researcher then proceeds to determine just which group means differ significantly from others. That is, post-hoc tests are used when the researcher is exploring differences, not limited by ones specified in advance on the basis of theory. These tests may also be used for confirmatory research, but the t-test-based tests in the previous section are generally preferred. In comparing group means on a post-hoc basis, one is comparing the means on the dependent variable for each of the k groups formed by the categories of the independent factor(s). The possible number of comparisons is k(k-1)/2. Multiple comparisons help specify the exact nature of the overall effect determined by the F test. However, note that post hoc tests do not control for the levels of other factors or for covariates (that is, interaction and control effects are not taken into account).
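
As a rough numerical illustration of the counts and adjustments discussed above, the following sketch computes the number of pairwise comparisons k(k-1)/2, the Bonferroni per-comparison alpha (alpha/m), and the Sidak per-comparison alpha, 1 - (1 - alpha)^(1/m). The function names are hypothetical and not taken from the source, and the familywise error formula assumes independent comparisons, which pairwise comparisons sharing a group are not, so it should be read as an approximation only.

```python
from math import comb


def n_pairwise(k: int) -> int:
    """Number of pairwise comparisons among k group means: k(k-1)/2."""
    return comb(k, 2)


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-comparison alpha under the Bonferroni adjustment (alpha / m)."""
    return alpha / m


def sidak_alpha(alpha: float, m: int) -> float:
    """Per-comparison alpha under the Sidak adjustment."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def bonferroni_p(p: float, m: int) -> float:
    """Bonferroni-adjusted p-value: p times m, capped at 1."""
    return min(1.0, p * m)


def familywise_error(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m tests.

    Assumes the m tests are independent (an illustrative simplification).
    """
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    print(n_pairwise(8))                 # comparisons among 8 groups
    print(bonferroni_alpha(0.05, 9))     # threshold for 9 planned comparisons
    print(bonferroni_p(0.01, 9))         # an observed .01 after adjustment
    print(sidak_alpha(0.05, 9))          # slightly above the Bonferroni value
    print(familywise_error(0.05, 28))    # unadjusted risk over 28 comparisons
```

For 8 groups this gives the 28 comparisons mentioned in the text, and .05/9 rounds to the .0056 threshold quoted for the Bonferroni example. The Sidak threshold is always slightly larger than the Bonferroni one for the same alpha and m, which is the sense in which its bounds are tighter.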
Findings of significance or nonsignificance between factor levels must be understood in the context of full ANOVA F-test findings, not just post hoc tests, which are subordinate to the overall F test. Note the model cannot contain covariates when employing these tests.

Computation. The q-statistic, also called the q range statistic or the Studentized range statistic, is commonly used in coefficients for post-hoc multiple comparisons, though some post hoc tests use the t statistic. In contrast to the planned comparison t-test, coefficients based on the q-statistic are commonly used for post-hoc comparisons (that is, when the researcher wishes to explore the data to uncover large differences, without limiting investigation by a priori theory). Both the q and t statistics use the difference of means in the numerator, but where the t statistic uses the standard error of the difference between the means in the denominator, q uses the standard error of the mean. Consequently, where the t test tests the difference between two means, the q-statistic tests the probability that the largest mean and smallest mean among the k groups formed by the categories of the independent(s) were sampled from the same population. If the q-statistic computed for the two sample means is not as large as the criterion q value in a table of critical q values, then the researcher cannot reject the null hypothesis that the groups do not differ at the given alpha significance level (usually .05). If the null hypothesis is not rejected for the largest compared to smallest group means, it follows that all intermediate groups are also drawn from the same population, so the q-statistic is also a test of homogeneity for all k groups formed by the independent variable(s).

Output formats: pairwise vs. multiple range. In pairwise comparison tests, output is produced similar to the Bonferroni and Sidak tests above, for the LSD, Games-Howell, Tamhane's T2, Dunnett's C, and Dunnett's T3 tests. Homogeneous subsets for range tests are provided for S-N-K, Tukey's b, Duncan, R-E-G-W F, R-E-G-W Q, and Waller. Some tests are of both types: Tukey's honestly significant difference test, Hochberg's GT2, Gabriel's test, and Scheffé's test.

Warning! The model, discussed above, will make a difference for post hoc tests. A factor (ex., race) may display different multiple comparison results depending on what other factors are in the model. Covariates cannot be in the model at all for these tests to be done. Interactions may be in the model, but multiple comparison tests are not available to test them. Also note that all the post-hoc tests are subject to the equality of variances assumption, and therefore the data should pass Levene's test, discussed below, with the exception of Tamhane's T2, Dunnett's T3, Games-Howell, and Dunnett's C, all of which are tailored for data where equal variances cannot be assumed. Finally, note that the significance level (.05 is default) may be set using the Options button off the main GLM dialog.

Tests assuming equal variances

1. Least significant difference (LSD) test. This test, also called Fisher's LSD, the protected LSD, or the protected t test, is based on the t-statistic and thus can be considered a form of t-test. "Protected" means the LSD test should be applied only after the overall F test is shown to be significant. LSD compares all possible pairs of means after the F-test rejects the null hypothesis that groups do not differ (this is a requirement of the test). (Note some computer packages wrongly report LSD t-test coefficients for comparisons even if the F test leads to acceptance of the null hypothesis.) It can handle both pairwise and nonpairwise comparisons and does not require equal sample sizes. LSD is the most liberal of the post-hoc tests (it is most likely to reject the null hypothesis in favor of finding groups do differ).
It controls the experimentwise Type I error rate at a selected alpha level (typically 5%), but only for the omnibus (overall) test of the null hypothesis. LSD allows higher Type I errors for the partial null hypotheses involved in the comparisons. Toothaker (1993: 42) recommends against any use of LSD on the grounds that it has poor control of experimentwise alpha significance, and better alternatives exist such as Shaffer-Ryan, discussed below. Others, such as Cardinal & Aitken (2005: 86) recommend its use only for factors with three levels. However, the LSD test is the default in SPSS for pairwise comparisons in its GLM or UNIANOVA procedures. As illustrated below, the LSD test is interpreted in the same manner as the Bonferroni test above and for this example yields the same substantive results: whites differ significantly from blacks but not other races on mean highest school year completed.

The Fisher-Hayter test is a modification of the LSD test meant to control for the liberal alpha significance level allowed by LSD. It is used when all pairwise comparisons are done post-hoc, but power may be low for fewer comparisons. See Toothaker (1993: 43-44). SPSS does not support the Fisher-Hayter test.

2. Tukey's test, a.k.a. Tukey honestly significant difference (HSD) test: As illustrated below, the multiple comparisons table for the Tukey test displays all pairwise comparisons between groups, interpreted in the same way as for the Bonferroni test discussed above. The Tukey test is conservative when group sizes are unequal. It is often preferred when the number of groups is large precisely because it is a conservative pairwise comparison test, and researchers often prefer to be conservative when the large number of groups threatens to inflate Type I errors. HSD is among the most conservative of the post-hoc tests in that it is among the most likely to accept the null hypothesis of no group differences. Some recommend it only when all pairwise comparisons are being tested. When all pairwise comparisons are being tested, the Tukey HSD test is more powerful than the Dunn test (Dunn may be more powerful for fewer than all comparisons). The Tukey HSD test is based on the q-statistic (the Studentized range distribution) and is limited to pairwise comparisons. Select "Tukey" on the SPSS Post Hoc dialog.
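
The difference between the t and q statistics described in the computation notes above can be sketched as follows for a balanced design (equal group size n), assuming the pooled within-group mean square (MSE) is already available from the ANOVA table. The function names and the numbers in the usage lines are hypothetical; critical q values would still have to come from a Studentized range table or a statistics package.

```python
from math import sqrt


def t_statistic(mean_a: float, mean_b: float, mse: float, n: int) -> float:
    """t: difference of means over the standard error of the difference."""
    return (mean_a - mean_b) / sqrt(2.0 * mse / n)


def q_statistic(mean_a: float, mean_b: float, mse: float, n: int) -> float:
    """Studentized range q: difference of means over the standard error
    of a single mean."""
    return (mean_a - mean_b) / sqrt(mse / n)


if __name__ == "__main__":
    # Hypothetical values: largest and smallest group means, MSE, group size.
    t = t_statistic(14.0, 12.0, 4.0, 16)
    q = q_statistic(14.0, 12.0, 4.0, 16)
    print(t, q, q / t)  # in a balanced design q / t equals sqrt(2)
```

Because the two denominators differ only by the factor 2 under the square root, q is t multiplied by the square root of 2 in the balanced case. The tests differ in the reference distribution against which the statistic is judged, not in the information taken from the data.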

3. Tukey-b test, a.k.a. Tukey's wholly significant difference (WSD) test, also shown above, is a less conservative version of Tukey's HSD test, also based on the q-statistic. The critical value of WSD (Tukey-b) is the mean of the corresponding values for Tukey's HSD test and the Newman-Keuls test, discussed below. In the illustration above, note that no "Sig" significance value is output in the range test table for Tukey-b. Rather, the table shows there are two significantly different homogeneous subsets on highest year of school completed, with the first group being blacks and the second group being whites and other race.

4. S-N-K or Student-Newman-Keuls test, also called the Newman-Keuls test, is a little-used post-hoc comparison test of the range type, also based on the q-statistic, which is used to evaluate partial null hypotheses (hypotheses that all but g of the k means come from the same population). It is recommended for one-way balanced ANOVA designs when there are only three means to be compared (Cardinal & Aitken, 2005: 87). Let k = the number of groups formed by categories of the independent variable(s). First all combinations of k-1 means are tested, then k-2 groups, and so on until sets of 2 means are tested. As one is proceeding toward testing ever smaller sets, testing stops if an insignificant range is discovered (that is, if the q-statistic for the comparison of the highest and lowest mean in the set [the "stretch"] is not as great as the critical value of q for the number of groups in the set). Klockars and Sax (1986: 57) recommend the Student-Newman-Keuls test when the researcher wants to compare adjacent means (pairs adjacent to each other when all means are presented in rank order). Toothaker (1993: 29) recommends Newman-Keuls only when the number of groups to be compared equals 3, assuming one wants to control the comparison error rate at the experimentwise alpha rate (ex., .05), but states that the Ryan or Shaffer-Ryan, or the Fisher-Hayter tests are preferable (Toothaker, 1993: 46). The example below shows the same homogeneous groups as in the Tukey-b test above.

Duncan test. A range test somewhat similar to the S-N-K test and also not commonly used due to poor control (Cardinal & Aitken, 2005: 88). Illustrated further below.

5. Ryan test (REGWQ): This is the Ryan-Einot-Gabriel-Welsch multiple range test based on range and is the usual Ryan test, a modified Student-Newman-Keuls test adjusted so critical values decrease as stretch size (the range from highest to lowest mean in the set being considered) decreases. The Ryan test is more powerful than the S-N-K test or the Duncan multiple range test discussed below. It is considered a conservative test and is recommended for one-way balanced ANOVA designs and is not recommended for unbalanced designs. The result is that Ryan controls the experimentwise alpha rate at the desired level (ex., .05) even when the number of groups exceeds 3, but at a cost of being less powerful (more chance of Type II errors) than Newman-Keuls. As with Newman-Keuls, Ryan is a step-down procedure such that one will not get to smaller stretch comparisons if the null hypothesis is accepted for larger stretches of which they are a subset. Toothaker (1993: 56) calls Ryan the "best choice" among tests supported by major statistical packages because it maintains good alpha control (ex., better than Newman-Keuls) while having at least 75% of the power of the most powerful tests (ex., better than Tukey HSD). Cardinal and Aitken (2005: 87) consider the Ryan test a "good compromise" between the liberal Student-Newman-Keuls test and the conservative Tukey HSD test. For the same data, it comes to the same conclusion as illustrated below.

6. Ryan test (REGWF): This is the Ryan test based on the F statistic rather than range.
It is a bit more powerful than REGWQ, though less common and more computationally intensive. Also a conservative test, it tends to come to the same substantive conclusions as the ordinary Ryan test. REGWF is supported by SPSS but not SAS.

The Shaffer-Ryan test modifies the Ryan test. It is also a protected or step-down test, requiring that the overall F test reject the null hypothesis first, but it uses slightly different critical values. To date, Shaffer-Ryan is not supported by SAS or SPSS, but it is recommended by Toothaker (1993: 55) as "one of the best multiple comparison tests in terms of power."

7. The Scheffé test is a widely-used range test which works by first requiring the overall F test of the null hypothesis be rejected. If the null hypothesis is not rejected overall, then it is not rejected for any comparison null hypothesis. If the overall null hypothesis is rejected, however, then F values are computed simultaneously for all possible comparison pairs and must be higher than an even larger critical value of F than for the overall F test described above. Let F be the critical value of F as used for the overall test. For the Scheffé test, the new, higher critical value, F', is (k-1)F. The Scheffé test can be used to analyze any linear combination of group means. Output, illustrated below, is similar to other range tests discussed above and for this example comes to the same conclusions.
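
The Scheffé criterion F' = (k-1)F can be sketched as below for a single linear contrast of group means. This is a minimal illustration under stated assumptions: the contrast F formula (squared contrast estimate over MSE times the sum of squared weights divided by group sizes) is the standard one-degree-of-freedom form rather than something given in the source, the function names are hypothetical, and the overall-test critical value F is supplied by the user from an F table or a statistics package.

```python
from typing import Sequence


def contrast_f(means: Sequence[float], weights: Sequence[float],
               ns: Sequence[int], mse: float) -> float:
    """F statistic for a linear contrast of group means.

    The contrast weights must sum to zero.
    """
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    estimate = sum(c * m for c, m in zip(weights, means))
    variance = mse * sum(c * c / n for c, n in zip(weights, ns))
    return estimate ** 2 / variance


def scheffe_critical(k: int, f_crit: float) -> float:
    """Scheffe critical value F' = (k - 1) * F, where F is the critical
    value used for the overall test."""
    return (k - 1) * f_crit


def scheffe_significant(means: Sequence[float], weights: Sequence[float],
                        ns: Sequence[int], mse: float, f_crit: float) -> bool:
    """True if the contrast F exceeds the Scheffe critical value."""
    return contrast_f(means, weights, ns, mse) > scheffe_critical(len(means), f_crit)


if __name__ == "__main__":
    # Hypothetical data: three groups of 5; compare groups 1 and 2 against group 3.
    # 3.89 is roughly the .05 critical F for 2 and 12 degrees of freedom.
    print(scheffe_significant([10.0, 12.0, 15.0], [1, 1, -2], [5, 5, 5], 4.0, 3.89))
```

Multiplying the critical value by k-1 is what lets the same criterion cover every possible linear combination of means, and it is also why the test loses power when only a few pairwise comparisons are of interest.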

While the Scheffé test has the advantage of maintaining an experimentwise .05 significance level in the face of multiple comparisons, it does so at the cost of a loss in statistical power (more Type II errors may be made, that is, thinking you do not have a relationship when you do). That is, the Scheffé test is a very conservative one (more conservative than Dunn or Tukey, for ex.), not appropriate for planned comparisons but rather restricted to post hoc comparisons. Even for post hoc comparisons, the test is used for complex comparisons and is not recommended for pairwise comparisons due to "an unacceptably high level of Type II errors" (Brown and Melamed, 1990: 35). Toothaker (1993: 28) recommends the Scheffé test only for complex comparisons, or when the number of comparisons is large. The Scheffé test is low in power and thus not preferred for particular comparisons, but it can be used when one wishes to do all or a large number of comparisons. Tukey's HSD is preferred for making all pairwise comparisons among group means, and Scheffé for making all or a large number of other linear combinations of group means.

8. Hochberg GT2 test. A range test considered similar to Tukey's HSD but which is quite robust against violation of homogeneity of variances except when cell sizes are extremely unbalanced. It is generally less powerful than Tukey's HSD when factor cell sizes are not equal.

9. Gabriel test. A range test based on the Studentized maximum modulus test. The Gabriel test is similar to but more powerful than the Hochberg GT2 test when cell sizes are unequal, but it tends to display a liberal bias as cell sizes vary greatly.

10. Waller-Duncan test. A range test based on a Bayesian approach, making it different from other tests in this section. When factor cells are not equal, it uses the harmonic mean of the sample sizes. The kratio is specified by the researcher in advance in lieu of specifying an alpha significance level (ex., .05). The kratio is known as the Type 1/Type 2 error seriousness ratio. The default value is 100, which loosely corresponds to a .05 alpha level; kratio = 500 loosely corresponds to alpha = .01.

Tests not assuming equal variances. If the model is a one-way ANOVA with only one factor and no covariates and no interactions, then four additional tests are available which do not require the usual ANOVA assumption of homogeneity of variances.

1. Tamhane's T2 test. Tamhane's T2 is a conservative test. It is considered more appropriate than Tukey's HSD when cell sizes are unequal and/or when homogeneity of variances is violated.
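
The harmonic mean of sample sizes mentioned for unequal cells can be computed as in the short sketch below. The group sizes are hypothetical, and the sketch shows only the averaging step, not the Waller-Duncan procedure itself.

```python
from statistics import harmonic_mean, mean

# Hypothetical unequal cell sizes for three groups.
cell_sizes = [10, 20, 40]

n_harmonic = harmonic_mean(cell_sizes)   # k / sum(1 / n_i)
n_arithmetic = mean(cell_sizes)

# The harmonic mean is pulled toward the smallest cell, so it never
# exceeds the arithmetic mean.
print(n_harmonic, n_arithmetic)
```

Using the harmonic mean in place of a common n weights the small cells more heavily, which is the cautious choice because the smallest cells contribute the largest standard errors.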

2. Games-Howell test. The Games-Howell test is a modified HSD test which is appropriate when the homogeneity of variances assumption is violated. It is designed for unequal variances and unequal sample sizes, and is based on the q-statistic distribution. Games-Howell is slightly less conservative than Tamhane's T2, can be liberal when sample size is small, and is recommended only when group sample sizes are greater than 5. Because Games-Howell is only slightly liberal and because it is more powerful than Dunnett's C or T3, it is recommended over these tests. Toothaker (1993: 66) recommends Games-Howell for the situation of unequal (or equal) sample sizes and unequal or unknown variances.

3. Dunnett's T3 test and Dunnett's C test. These tests might be used in lieu of Games-Howell when it is essential to maintain strict control over the alpha significance level across multiple tests, similar to the purpose of Bonferroni adjustments (ex., exactly .05 or better).

4. The Tukey-Kramer test: This test, described in Toothaker (1993: 60), who also gives an appendix with critical values, controls experimentwise alpha. Requires equal population variances. Toothaker (p. 66) recommends this test for the situation of equal variances but unequal sample sizes. In SPSS, if you ask for the Tukey test and sample sizes are unequal, you will get the Tukey-Kramer test, using the harmonic mean. Not supported by SPSS.

5. The Miller-Winer test: Not recommended unless equal population variances are assured. Not supported by SPSS.
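
To make the unequal-variance idea concrete, the sketch below computes the pairwise statistic and degrees of freedom in the form commonly given for the Games-Howell test: a q-type statistic that uses each group's own variance instead of a pooled MSE, with Welch-Satterthwaite degrees of freedom. The source does not state these formulas, so they are an assumption here, and the function names are hypothetical. The critical value would still come from the Studentized range distribution.

```python
from math import sqrt


def games_howell_q(mean_a: float, var_a: float, n_a: int,
                   mean_b: float, var_b: float, n_b: int) -> float:
    """q-type statistic using separate group variances (no pooling)."""
    se = sqrt(0.5 * (var_a / n_a + var_b / n_b))
    return abs(mean_a - mean_b) / se


def welch_df(var_a: float, n_a: int, var_b: float, n_b: int) -> float:
    """Welch-Satterthwaite approximate degrees of freedom for the pair."""
    a, b = var_a / n_a, var_b / n_b
    return (a + b) ** 2 / (a ** 2 / (n_a - 1) + b ** 2 / (n_b - 1))


if __name__ == "__main__":
    # Hypothetical pair: a small, noisy group against a larger, tighter one.
    print(games_howell_q(14.0, 9.0, 8, 12.0, 2.0, 30))
    print(welch_df(9.0, 8, 2.0, 30))
```

When the two variances and sample sizes are equal, the statistic reduces to the ordinary q and the degrees of freedom to 2(n-1); as the variances diverge, the degrees of freedom shrink toward those of the noisier group.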

### Statistical notes for clinical researchers: post-hoc multiple comparisons

Open lecture on statistics ISSN 2234-7658 (print) / ISSN 2234-7666 (online) Statistical notes for clinical researchers: post-hoc multiple comparisons Hae-Young Kim* Department of Health Policy and Management,

### Multiple-Comparison Procedures

Multiple-Comparison Procedures References A good review of many methods for both parametric and nonparametric multiple comparisons, planned and unplanned, and with some discussion of the philosophical

### Appendix 10: Post Hoc Tests 1

Appendix 0: Post Hoc Tests Notation Post hoc tests in SPSS are available in more than one procedure, including ONEWAY and GLM. The following notation is used throughout this appendix unless otherwise stated:

### INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

### SIMULTANEOUS COMPARISONS AND THE CONTROL OF TYPE I ERRORS CHAPTER 6

SIMULTANEOUS COMPARISONS AND THE CONTROL OF TYPE I ERRORS CHAPTER 6 ERSH 8310 Lecture 8 September 18, 2007 Today s Class Discussion of the new course schedule. Take-home midterm (one instead of two) and

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### SPSS and AMOS. Miss Brenda Lee 2:00p.m. 6:00p.m. 24 th July, 2015 The Open University of Hong Kong

Seminar on Quantitative Data Analysis: SPSS and AMOS Miss Brenda Lee 2:00p.m. 6:00p.m. 24 th July, 2015 The Open University of Hong Kong SBAS (Hong Kong) Ltd. All Rights Reserved. 1 Agenda MANOVA, Repeated

### UNDERSTANDING THE ONE-WAY ANOVA

UNDERSTANDING The One-way Analysis of Variance (ANOVA) is a procedure for testing the hypothesis that K population means are equal, where K >. The One-way ANOVA compares the means of the samples or groups

### The Type I error rate is the fraction of times a Type I error is made. Comparison-wise type I error rate CER. Experiment-wise type I error rate EER

Topic 5. Mean separation: Multiple comparisons [S&T Ch.8 except 8.3] 5. 1. Basic concepts If there are more than treatments the problem is to determine which means are significantly different. This process

### Fisher's least significant difference (LSD) 2. If outcome is do not reject H, then! stop. Otherwise continue to #3.

Fisher's least significant difference (LSD) Procedure: 1. Perform overall test of H : vs. H a :. Á. Á â Á. " # >. œ. œ â œ.! " # > 2. If outcome is do not reject H, then! stop. Otherwise continue to #3.

### Type I Error Of Four Pairwise Mean Comparison Procedures Conducted As Protected And Unprotected Tests

Journal of odern Applied Statistical ethods Volume 4 Issue 2 Article 1 11-1-25 Type I Error Of Four Pairwise ean Comparison Procedures Conducted As Protected And Unprotected Tests J. Jackson Barnette University

### o Exercise A Comparisonwise versus Experimentwise Error Rates

Multiple Comparisons Contents The Estimation of Group (Treatment) Means o Example Multiple Comparisons o Fisher's Least Significant Difference (LSD) Theory Example o Tukey's Honest Significant Difference

### Contrasts and Post Hoc Tests for One-Way Independent ANOVA Using SPSS

Contrasts and Post Hoc Tests for One-Way Independent ANOVA Using SPSS Running the Analysis In last week s lecture we came across an example, from Field (2013), about the drug Viagra, which is a sexual

### ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups

ANOVA ANOVA Analysis of Variance Chapter 6 A procedure for comparing more than two groups independent variable: smoking status non-smoking one pack a day > two packs a day dependent variable: number of

### Simple Tricks for Using SPSS for Windows

Simple Tricks for Using SPSS for Windows Chapter 14. Follow-up Tests for the Two-Way Factorial ANOVA The Interaction is Not Significant If you have performed a two-way ANOVA using the General Linear Model,

### A posteriori multiple comparison tests

A posteriori multiple comparison tests 09/30/12 1 Recall the Lakes experiment Source of variation SS DF MS F P Lakes 48.933 2 24.467 5.872 0.017 Error 50.000 12 4.167 Total 98.933 14 The ANOVA tells us

### ANOVA Analysis of Variance

ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,

### Lecture 23 Multiple Comparisons & Contrasts

Lecture 23 Multiple Comparisons & Contrasts STAT 512 Spring 2011 Background Reading KNNL: 17.3-17.7 23-1 Topic Overview Linear Combinations and Contrasts Pairwise Comparisons and Multiple Testing Adjustments

### IBM SPSS Advanced Statistics 20

IBM SPSS Advanced Statistics 20 Note: Before using this information and the product it supports, read the general information under Notices on p. 166. This edition applies to IBM SPSS Statistics 20 and

### Contrasts ask specific questions as opposed to the general ANOVA null vs. alternative

Chapter 13 Contrasts and Custom Hypotheses Contrasts ask specific questions as opposed to the general ANOVA null vs. alternative hypotheses. In a one-way ANOVA with a k level factor, the null hypothesis

### Multivariate analysis of variance

21 Multivariate analysis of variance In previous chapters, we explored the use of analysis of variance to compare groups on a single dependent variable. In many research situations, however, we are interested

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### Introduction. Two variables: 1 Categorical variable (factor/iv), 1 Quantitative variable (response/dv) Main Question: Do (the means of) the

One-way ANOVA Introduction Two variables: 1 Categorical variable (factor/iv), 1 Quantitative variable (response/dv) Main Question: Do (the means of) the quantitative variables depend on which group (given

### 13: Additional ANOVA Topics. Post hoc Comparisons

13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Kruskal-Wallis Test Post hoc Comparisons In the prior

### MULTIVARIATE GLM, MANOVA, AND MANCOVA Edition by G. David Garson and Statistical Associates Publishing Page 1

Copyright @c 2015 by G. David Garson and Statistical Associates Publishing Page 1 @c 2015 by G. David Garson and Statistical Associates Publishing. All rights reserved worldwide in all media. No permission

### Multiple Comparison Tests for Balanced, One-Factor Designs

Multiple Comparison Tests for Balanced, One-Factor Designs Term Paper FRST 533 Dec. 15, 2005 Craig Farnden 1.0 Introduction Frequently after completing an analysis of variance test in a single factor experimental

### 4.4. Further Analysis within ANOVA

4.4. Further Analysis within ANOVA 1) Estimation of the effects Fixed effects model: α i = µ i µ is estimated by a i = ( x i x) if H 0 : µ 1 = µ 2 = = µ k is rejected. Random effects model: If H 0 : σa

### SPSS Guide: Tests of Differences

SPSS Guide: Tests of Differences I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

### INTERPRETING THE REPEATED-MEASURES ANOVA

INTERPRETING THE REPEATED-MEASURES ANOVA USING THE SPSS GENERAL LINEAR MODEL PROGRAM RM ANOVA In this scenario (based on a RM ANOVA example from Leech, Barrett, and Morgan, 2005) each of 12 participants

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Inferential Statistics. Probability. From Samples to Populations. Katie Rommel-Esham Education 504

Inferential Statistics Katie Rommel-Esham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice

### EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM

EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable

### A Review of Experimentwise Type I Error: Implications for Univariate Post Hoc and for. Multivariate Testing. Dan Altman

Experimentwise Error 1 Running Head: REVIEW OF EXPERIMENTWISE ERROR A Review of Experimentwise Type I Error: Implications for Univariate Post Hoc and for Multivariate Testing Dan Altman Texas A&M University

### Notes on Maxwell & Delaney

Notes on Maxwell & Delaney PSY710 5 Chapter 5 - Multiple Comparisons of Means 5.1 Inflation of Type I Error Rate When conducting a statistical test, we typically set α = .05 or α = .01 so that the probability

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### ANSWERS TO EXERCISES AND REVIEW QUESTIONS

ANSWERS TO EXERCISES AND REVIEW QUESTIONS PART FIVE: STATISTICAL TECHNIQUES TO COMPARE GROUPS Before attempting these questions read through the introduction to Part Five and Chapters 16-21 of the SPSS

### UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA We have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

### 1. Why the hell do we need statistics?

1. Why the hell do we need statistics? "There are three kinds of lies: lies, damned lies, and statistics," British Prime Minister Benjamin Disraeli (as credited by Mark Twain): It is easy to lie with statistics,

### Multivariate Analysis of Variance. The general purpose of multivariate analysis of variance (MANOVA) is to determine

Multivariate Analysis of Variance. What Multivariate Analysis of Variance is: The general purpose of multivariate analysis of variance (MANOVA) is to determine whether multiple levels

### CHAPTER 3 COMMONLY USED STATISTICAL TERMS

CHAPTER 3 COMMONLY USED STATISTICAL TERMS There are many statistics used in social science research and evaluation. The two main areas of statistics are descriptive and inferential. The third class of

### 1 Overview. Fisher's Least Significant Difference (LSD) Test. Lynne J. Williams Hervé Abdi

In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 2010 Fisher's Least Significant Difference (LSD) Test Lynne J. Williams Hervé Abdi 1 Overview When an analysis of variance

### SPSS Explore procedure

SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

### Analysis of Data. Organizing Data Files in SPSS. Descriptive Statistics

Analysis of Data Claudia J. Stanny PSY 67 Research Design Organizing Data Files in SPSS All data for one subject entered on the same line Identification data Between-subjects manipulations: variable to

### IBM SPSS Advanced Statistics 22

IBM SPSS Advanced Statistics 22 Note Before using this information and the product it supports, read the information in Notices on page 103. Product Information This edition applies to version 22, release

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulalongkorn University,

### Analysis of variance (ANOVA) is a

Steven F. Sawyer, PT, PhD Analysis of variance (ANOVA) is a statistical tool used to detect differences between experimental group means. ANOVA is warranted in experimental designs with one dependent variable

### Multiple Comparisons. Cohen Chpt 13

Multiple Comparisons Cohen Chpt 13 How many t-tests? We do an experiment, 1 factor, 3 levels (= 3 groups). The ANOVA gives us a significant F-value. What now? 4 levels, 1 factor: how many independent comparisons?

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Analysis of Variance. MINITAB User's Guide 2

3 Analysis of Variance: Analysis of Variance Overview; One-Way Analysis of Variance; Two-Way Analysis of Variance; Analysis of Means; Overview of Balanced ANOVA and GLM; Balanced

### Analysis of Variance ANOVA

Analysis of Variance ANOVA Overview We've used the t-test to compare the means from two independent groups. Now we've come to the final topic of the course: how to compare means from more than two populations.

### One-Way Analysis of Variance

Spring, 000 - - Administrative Items One-Way Analysis of Variance Midterm Grades. Make-up exams, in general. Getting help See me today -:0 or Wednesday from -:0. Send an e-mail to stine@wharton. Visit

### One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

### One-Way Analysis of Variance (ANOVA) with Tukey's HSD Post-Hoc Test

One-Way Analysis of Variance (ANOVA) with Tukey's HSD Post-Hoc Test Prepared by Allison Horst for the Bren School of Environmental Science & Management Introduction When you are comparing two samples to

### Analysis of numerical data S4

Basic medical statistics for clinical and experimental research Analysis of numerical data S4 Katarzyna Jóźwiak k.jozwiak@nki.nl 3rd November 2015 1/42 Hypothesis tests: numerical and ordinal data 1 group:

### UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA) In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design

### How to choose a statistical test. Francisco J. Candido dos Reis DGO-FMRP University of São Paulo

How to choose a statistical test Francisco J. Candido dos Reis DGO-FMRP University of São Paulo Choosing the right test One of the most common queries in stats support is "Which analysis should I use?" There

### By Hui Bian Office for Faculty Excellence

By Hui Bian Office for Faculty Excellence 1 K-group between-subjects MANOVA with SPSS Factorial between-subjects MANOVA with SPSS How to interpret SPSS outputs How to report results 2 We use 2009 Youth

### Example: Multivariate Analysis of Variance

1 of 36 Example: Multivariate Analysis of Variance Multivariate analyses of variance (MANOVA) differs from univariate analyses of variance (ANOVA) in the number of dependent variables utilized. The major

### Testing Hypotheses using SPSS

Is the mean hourly rate of male workers \$2.00? T-Test One-Sample Statistics Std. Error N Mean Std. Deviation Mean 2997 2.0522 6.6282.2 One-Sample Test Test Value = 2 95% Confidence Interval Mean of the

### Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests. The Applied Research Center. Module 9 Overview: Nonparametric Tests; Parametric vs. Nonparametric Tests; Restrictions of Nonparametric Tests; One-Sample Chi-Square Test

### c. The factor is the type of TV program that was watched. The treatment is the embedded commercials in the TV programs.

STAT E-150 - Statistical Methods Assignment 9 Solutions Exercises 12.8, 12.13, 12.75 For each test: Include appropriate graphs to see that the conditions are met. Use Tukey's Honestly Significant Difference

### Allelopathic Effects on Root and Shoot Growth: One-Way Analysis of Variance (ANOVA) in SPSS. Dan Flynn

Allelopathic Effects on Root and Shoot Growth: One-Way Analysis of Variance (ANOVA) in SPSS Dan Flynn Just as t-tests are useful for asking whether the means of two groups are different, analysis of variance

### Tukey's HSD (Honestly Significant Difference).

Agenda for Week 4 (Tuesday, Jan 26) Week 4 Hour 1 AnOVa review. Week 4 Hour 2 Multiple Testing Tukey's HSD (Honestly Significant Difference). Week 4 Hour 3 (Thursday) Two-way AnOVa. Sometimes you'll need

About Single Factor ANOVAs. Table of contents: About Single Factor ANOVAs; What is a SINGLE FACTOR ANOVA; Single Factor ANOVA; Calculating Single Factor ANOVAs; STEP 1: State the hypotheses...

### THE KRUSKAL-WALLIS TEST

THE KRUSKAL-WALLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON

### SCHOOL OF HEALTH AND HUMAN SCIENCES DON'T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON'T FORGET TO RECODE YOUR

### SPSS: Descriptive and Inferential Statistics. For Windows

For Windows, August 2012. Table of Contents: Section 1: Summarizing Data; 1.1 Descriptive Statistics; Section 2: Inferential Statistics; 2.1 Chi-Square Test; 2.2 T tests; 2.3 Correlation...

### Pairwise Multiple Comparison Test Procedures: An Update for Clinical Child and Adolescent Psychologists. H. J. Keselman. University of Manitoba

Pairwise Comparisons 1 Pairwise Multiple Comparison Test Procedures: An Update for Clinical Child and Adolescent Psychologists H. J. Keselman University of Manitoba Robert A. Cribbie York University and

### ANOVA - Analysis of Variance

ANOVA - Analysis of Variance ANOVA - Analysis of Variance Extends independent-samples t test Compares the means of groups of independent observations Don't be fooled by the name. ANOVA does not compare

### 6 Comparison of differences between 2 groups: Student's T-test, Mann-Whitney U-Test, Paired Samples T-test and Wilcoxon Test

6 Comparison of differences between 2 groups: Student's T-test, Mann-Whitney U-Test, Paired Samples T-test and Wilcoxon Test Having finally arrived at the bottom of our decision tree, we are now going

### CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY The hypothesis testing statistics detailed thus far in this text have all been designed to allow comparison of the means of two or more samples

### Hypothesis testing S2

Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to

### The Statistics Tutor's Quick Guide to

statstutor community project encouraging academics to share statistics support resources All stcp resources are released under a Creative Commons licence The Statistics Tutor's Quick Guide to Stcp-marshallowen-7

### Multiple Analysis of Variance (MANOVA) Kate Tweedy and Alberta Lunardelli

Multiple Analysis of Variance (MANOVA) Kate Tweedy and Alberta Lunardelli Generally speaking, multivariate analysis of variance (MANOVA) is an extension of ANOVA. However, rather than measuring the effect

### The Statistics Tutor's

statstutor community project encouraging academics to share statistics support resources All stcp resources are released under a Creative Commons licence Stcp-marshallowen-7 The Statistics Tutor's www.statstutor.ac.uk

### EPS 625 INTERMEDIATE STATISTICS FRIEDMAN TEST

EPS 625 INTERMEDIATE STATISTICS The Friedman test is an extension of the Wilcoxon test. The Wilcoxon test can be applied to repeated-measures data if participants are assessed on two occasions or conditions

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### Chapter 16 Appendix. Nonparametric Tests with Excel, JMP, Minitab, SPSS, CrunchIt!, R, and TI-83-/84 Calculators

The Wilcoxon Rank Sum Test Chapter 16 Appendix Nonparametric Tests with Excel, JMP, Minitab, SPSS, CrunchIt!, R, and TI-83-/84 Calculators These nonparametric tests make no assumption about Normality.

### The General Linear Model: Theory

Gregory Carey, 1998 General Linear Model - 1 The General Linear Model: Theory 1.0 Introduction In the discussion of multiple regression, we used the following equation to express the linear model for a

### SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistics (including median) Choose Analyze > Descriptive Statistics > Frequencies... Click on variable(s), then press the arrow button to move them into the Variable(s): list

### Categorical Variables in Regression: Implementation and Interpretation By Dr. Jon Starkweather, Research and Statistical Support consultant

Interpretation and Implementation 1 Categorical Variables in Regression: Implementation and Interpretation By Dr. Jon Starkweather, Research and Statistical Support consultant Use of categorical variables

### Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish

Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

### Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

### ANOVA must be modified to take correlated errors into account when multiple measurements are made for each subject.

Chapter 14 Within-Subjects Designs ANOVA must be modified to take correlated errors into account when multiple measurements are made for each subject. 14.1 Overview of within-subjects designs Any categorical

### taken together, can provide strong support. Using a method for combining probabilities, it can be determined that combining the probability values of

taken together, can provide strong support. Using a method for combining probabilities, it can be determined that combining the probability values of 0.11 and 0.07 results in a probability value of 0.045.
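The excerpt does not name the method it uses. As a minimal sketch, assuming it is Fisher's combined probability test (an assumption, not something the excerpt states), the quoted figure can be checked in Python. The function name `fisher_combined_p` is hypothetical and introduced here only for illustration.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (assumed method)."""
    k = len(p_values)
    # Test statistic X = -2 * sum(ln p_i); under H0 it is chi-square with 2k df.
    # We work with X / 2 directly because the closed form below uses it.
    half_x = -sum(math.log(p) for p in p_values)
    # For even degrees of freedom (2k), the chi-square upper-tail probability
    # has the closed form exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    return math.exp(-half_x) * sum(
        half_x ** i / math.factorial(i) for i in range(k)
    )


combined = fisher_combined_p([0.11, 0.07])
print(round(combined, 3))  # 0.045
```

Under this assumption the two non-significant values 0.11 and 0.07 combine to roughly 0.045, matching the figure in the excerpt. The method presumes the two tests are independent.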

### Lecture Notes #3: Contrasts and Post Hoc Tests 3-1

Lecture Notes #3: Contrasts and Post Hoc Tests 3-1 Richard Gonzalez Psych 613 Version 2.4 (2013/09/18 13:11:59) LECTURE NOTES #3: Contrasts and Post Hoc tests Reading assignment Read MD chs 4, 5, & 6 Read

### 1 Overview. Tukey's Honestly Significant Difference (HSD) Test. Hervé Abdi Lynne J. Williams

In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 2010 Tukey's Honestly Significant Difference (HSD) Test Hervé Abdi Lynne J. Williams 1 Overview When an analysis of variance

### QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

### 0.1 Estimating and Testing Differences in the Treatment Means

0.1 Estimating and Testing Differences in the Treatment Means. If the F-test is significant, we only learn that not all population means are the same, but through this test we cannot determine where the differences

### Guidelines for Multiple Testing in Impact Evaluations of Educational Interventions

Contract No.: ED-04-CO-0112/0006 MPR Reference No.: 6300-080 Guidelines for Multiple Testing in Impact Evaluations of Educational Interventions Final Report May 2008 Peter Z. Schochet Submitted to: Institute

### Chapter 12 Statistical Foundations: Analysis of Variance 377. Chapter 12 Statistical Foundations: Analysis of Variance

Chapter 12 Statistical Foundations: Analysis of Variance There are many instances when a researcher is faced with the task of examining three

### Multivariate Analysis of Variance (MANOVA) Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016

Multivariate Analysis of Variance (MANOVA) Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016 Multivariate Analysis of Variance Multivariate Analysis of Variance (MANOVA) ~ a dependence technique that

### Comparing three or more groups (one-way ANOVA...)

Page 1 of 36 Comparing three or more groups (one-way ANOVA...) You've measured a variable in three or more groups, and the means (and medians) are distinct. Is that due to chance? Or does it tell you the

### MS&E 226: Small Data. Lecture 17: Additional topics in inference (v1) Ramesh Johari

MS&E 226: Small Data Lecture 17: Additional topics in inference (v1) Ramesh Johari ramesh.johari@stanford.edu 1 / 34 Warnings 2 / 34 Modeling assumptions: Regression Remember that most of the inference

### UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

### research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

1 Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC 2 Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric