One-Way Analysis of Variance (ANOVA) with Tukey's HSD Post-Hoc Test

Prepared by Allison Horst for the Bren School of Environmental Science & Management

Introduction

When we are comparing two samples to determine whether they are significantly different, we can use parametric tests (t-tests) or non-parametric tests (Wilcoxon Signed Rank, Mann-Whitney U). What if we are trying to determine whether significant differences exist when there are more than two groups (populations, samples, treatments, etc.) of interest?

If we are trying to compare means for more than two groups, and the data can be assumed to satisfy parametric assumptions, an appropriate method for determining whether any significant differences exist between groups is Analysis of Variance, or ANOVA. For ANOVA, the null and alternative hypotheses are as follows:

H0: means across groups do not differ
H1: means differ between at least two groups

If we're just trying to compare means between groups again, then why not just use multiple two-sample t-tests? Remember that with each t-test, we have a 5% chance (at a significance level of α = 0.05) of making a Type I Error when the null hypothesis is true. If we perform multiple two-sample t-tests, our opportunity for a Type I Error increases with each additional test. If you perform three independent t-tests, your total probability of committing at least one Type I Error is already about 14% (1 - 0.95^3 ≈ 0.143), which is relatively high! Performing a single ANOVA test on all groups simultaneously keeps the overall Type I Error rate at the chosen significance level.

ANOVA, Conceptually

When using a t-test, we essentially try to determine whether the difference between the means is equal to zero (the null hypothesis) or not equal to zero (the alternative hypothesis). How do we expand this to include multiple groups that are influenced by one predictor variable (one-way ANOVA)?
While we are comparing means, we actually use an analysis of sample variability to determine whether differences between samples are significant, hence "Analysis of Variance" instead of "Analysis of Means." In essence, ANOVA asks whether the variability between the means of the groups is large compared to the variability within each group. The basic question that ANOVA answers is: Do the sample means show differences from each other that are large relative to the differences among individual cases within each sample? In other words, how does the variability in sample means compare to the variability within each sample?
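One standard way to write this comparison is as a ratio of mean squares. The notation below is introduced here for illustration and is not taken from the handout: k is the number of groups, n_i the number of observations in group i, N the total number of observations, \bar{x}_i the mean of group i, and \bar{x} the overall mean.

```latex
F \;=\; \frac{MS_{\text{between}}}{MS_{\text{within}}}
  \;=\; \frac{\displaystyle \sum_{i=1}^{k} n_i \,(\bar{x}_i - \bar{x})^2 \,\big/\, (k-1)}
             {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \,\big/\, (N-k)}
```

The numerator measures how far the group means sit from the overall mean, and the denominator measures the spread of individual observations around their own group mean. The divisors k - 1 and N - k are the between-groups and within-groups degrees of freedom.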
If the variability between group means is sufficiently greater than the variability within groups (where "sufficiently" depends on the significance level selected by the user), then the null hypothesis is rejected. If the variability between group means is not sufficiently greater than the variability within groups, then the null hypothesis is retained. ANOVA summarizes this comparison with the F-statistic (which we saw earlier for determining whether variances are equal), which can be converted into a p-value using the F-distribution.

Performing one-way ANOVA with more than two samples does NOT tell you which particular samples differ from each other. It can only help you conclude whether there are no significant differences across all samples, or whether at least two treatments differ significantly.

To determine which samples may be significantly different, you can follow a significant ANOVA result (i.e., you reject the null hypothesis) with a post-hoc test ("post hoc" translates to "after this"). There are several options for post-hoc statistical tests, including the Bonferroni approach, stepdown procedures, and Dunnett's and Hsu's procedures (found easily online or in basic statistics texts). One post-hoc test that works particularly well following ANOVA, is widely accepted in the statistical literature, is easily performed in R, and is somewhat conservative is Tukey's Honestly Significant Difference (Tukey's HSD) test.

**Note: If you are interested in the mathematics behind either one-way ANOVA or Tukey's HSD and would like to see an example done by hand, contact Allison at ahorst@bren.ucsb.edu.

One-Way ANOVA with Post-Hoc Tukey's HSD in RStudio

Follow along with the example in this section to learn how to perform one-way ANOVA with post-hoc Tukey's HSD, using a dataset describing the effects of four different enzymes on a certain reaction rate.
Step 1. Organize your data

When working in RStudio, the easiest way to organize your data for one-way ANOVA is with one column identifying the group and one column containing the measured values, one observation per row. Note that only the first part of the dataset is shown here; the actual dataset (see the .xlsx file titled "Enzymatic Reaction Rates" accompanying this file on the website to follow along in RStudio) includes data for Enzymes A - D:

Enzyme    Reaction Rate (moles/second)
A         18.6
A         26.2
A         28.1
A         22.6
A         23.5
A         24.1
A         18.6
A         19.6
A         17.2
B         24.6
B         22.0
B         21.3
B         28.7
B         30.1
B         25.5
B         21.7
B         20.8
B         22.5
C         27.7
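As a sketch of this one-row-per-observation layout in R, the snippet below builds a small data frame by hand. The values are made up for illustration (they are not the handout's dataset), and the object name ExampleData is an assumption.

```r
# Hypothetical values for illustration only (not the handout's dataset)
ExampleData <- data.frame(
  Enzyme = factor(rep(c("A", "B", "C"), each = 3)),  # grouping column
  Rate   = c(18.1, 22.4, 20.9,   # enzyme A
             24.0, 21.7, 23.3,   # enzyme B
             30.2, 33.1, 31.6)   # enzyme C
)

str(ExampleData)            # Enzyme should be a factor, Rate numeric
table(ExampleData$Enzyme)   # number of observations in each group
```

If the data are instead read from a file with read.csv(), the grouping column may arrive as character text rather than a factor; converting it with factor() before the analysis is a safe habit.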
**REMEMBER: Simplify all datasets and column names before saving as a .csv file and loading into RStudio. For the example here, the column names were simplified to Enzyme and Rate, respectively, and the data were saved as a .csv file titled Enzymes.csv. You are encouraged to do the same to follow along with the example presented here.

There are several options for performing one-way ANOVA in RStudio. Here, I will use the aov() function. Explore the function (?aov). The function works as follows:

> TESTNAME <- aov(ValuesColumnName ~ SubsetColumn, data = DatasetName)

For the enzymatic reaction rate data, the command would look like this (if the data are loaded using the dataset name Enzymes):

> EnzymeANOVA <- aov(Rate ~ Enzyme, data = Enzymes)

Note that R is case-sensitive, so column and object names must be typed exactly as they were defined. Explore the results using the summary() function.

> summary(EnzymeANOVA)

The p-value in the ANOVA summary helps you decide whether you will reject or retain the null hypothesis (see hypotheses above). If you decide to reject the null hypothesis (thereby retaining the alternative hypothesis that there is a significant difference between at least two of the treatments in your ANOVA), then you will likely want to examine further which treatments may be different. That's when post-hoc tests come in. One common (and robust) test that is already programmed in R to work with the aov() function is Tukey's HSD. Tukey's HSD is performed using the TukeyHSD() function in RStudio as follows:

> PostHocTestName <- TukeyHSD(ANOVATestName)

**Note: You must have already performed one-way ANOVA using the aov() function and assigned that test to a variable name. That variable name is what is entered into the TukeyHSD() function to run the post-hoc test.

Perform Tukey's HSD with the EnzymeANOVA results:

> PostHoc <- TukeyHSD(EnzymeANOVA)
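The commands above can be strung together into one short script. The sketch below does so with simulated data so that it runs without the Enzymes.csv file. The group means and standard deviation passed to rnorm(), and the names SimData, SimANOVA, and SimPostHoc, are assumptions chosen for illustration, so the numbers it prints will not match the handout's output.

```r
set.seed(42)  # make the simulated data reproducible

# Simulated (hypothetical) data: 4 enzymes, 9 observations each
SimData <- data.frame(
  Enzyme = factor(rep(c("A", "B", "C", "D"), each = 9)),
  Rate   = c(rnorm(9, mean = 20, sd = 4),
             rnorm(9, mean = 22, sd = 4),
             rnorm(9, mean = 32, sd = 4),
             rnorm(9, mean = 40, sd = 4))
)

SimANOVA <- aov(Rate ~ Enzyme, data = SimData)  # one-way ANOVA
summary(SimANOVA)                               # F-value, df, and p-value

SimPostHoc <- TukeyHSD(SimANOVA)                # all pairwise comparisons
SimPostHoc$Enzyme                               # matrix: diff, lwr, upr, p adj
```

With four groups there are six pairwise comparisons, so the Tukey matrix has six rows (B-A, C-A, D-A, C-B, D-B, D-C).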
Review the results from the post-hoc test. What do they tell you?

> PostHoc
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Rate ~ Enzyme, data = Enzymes)

$Enzyme
         diff       lwr       upr     p adj
B-A  2.077778 -3.267837  7.423392 0.7198683
C-A 10.244444  4.898830 15.590059 0.0000646
D-A 17.677778 12.332163 23.023392 0.0000000
C-B  8.166667  2.821052 13.512281 0.0012904
D-B 15.600000 10.254385 20.945615 0.0000000
D-C  7.433333  2.087719 12.778948 0.0035636

Notice that RStudio very nicely displays the results of the post-hoc test. Each row gives the difference in means for one pair of treatments (diff), the lower and upper bounds of its confidence interval (lwr, upr), and the adjusted p-value (p adj). The ONLY TWO treatments that are NOT significantly different following one-way ANOVA with post-hoc Tukey's analysis are Enzymes B and A (adjusted p = 0.72, and the confidence interval for their difference includes zero).

Communicating Results of One-Way ANOVA

Reporting ANOVA results in text is somewhat involved, as you may want to include:

- Whether the ANOVA is one-way or multi-way
- What the independent and dependent variables are, and the conditions studied
- The outcome of the hypothesis test
- The significance level (α)
- The between-groups (numerator) and within-groups (denominator) degrees of freedom (k - 1 and N - k, respectively, where k is the number of groups and N is the total number of observations)
- The calculated F-value
- The corresponding p-value

For example:

One-way ANOVA was performed to compare the influence of species (that's the independent variable) on metabolic rate (there's the dependent variable) for dogs, mantis shrimp, and great white sharks (those are the "conditions"). There was a significant effect of species on metabolic rate (that's the result of the hypothesis test) at a significance level of α = 0.05 (that's the significance level) for the three species [F(2,21) = 16.02; p = 0.00006] (that's where you report the F-value, the degrees of freedom, and the p-value).

Removing the notes in there, it would read as follows:

One-way ANOVA was performed to compare the influence of species on metabolic rate for dogs, mantis shrimp, and great white sharks.
There was a significant effect of species on metabolic rate at a significance level of α = 0.05 for the three species [F(2,21) = 16.02; p = 0.00006]. Another, shorter approach (though slightly less informative) for a different dataset is:
There was no significant difference in bacterial growth rate for cultures exposed to Low, Medium or High uranium concentrations (α = 0.05; one-way ANOVA, F(2,21) = 0.94; p = 0.34).

Depending on how many samples you compare, you may also want to show the results visually using a chart with asterisks or like letters indicating significance. Often you will see like letters or symbols indicating values that are not significantly different. From the Tukey's HSD test, determine which means are statistically the same. Add like letters to indicate values that do not differ significantly, usually above the column error bars (in Excel, select series > right click > Add Data Labels > edit data labels to contain the correct letters or symbols to indicate significance). For example:

[Figure: bar chart titled "Effect of Enzyme Type on Reaction Rate"; x-axis: Enzyme (A, B, C, D); y-axis: Reaction Rate (moles/s); like letters shown above the error bars]

Figure 1. Effect of enzyme type on reaction rate. Reaction rates (moles/s) for the conversion of Compound X to Compound Y for Enzymes A, B, C, and D. Like letters above error bars indicate values that are not significantly different by one-way ANOVA (F(3,32) = 33.7; p < 0.001, α = 0.05) with post-hoc Tukey's HSD (α = 0.05). Error bars indicate ± 1 standard deviation.

The results can also be summarized in a table, using superscripts to indicate means that do not differ significantly. For example:

Table 1. Enzyme effect on reaction rate. Reaction rates (moles/s) for the conversion of X to Y in the presence of enzyme A, B, C, or D (M ± SD). Like letters indicate values that are not significantly different by one-way ANOVA (F(3,32) = 33.7; p < 0.001, α = 0.05) with post-hoc Tukey's HSD (α = 0.05).

Enzyme    Reaction Rate (moles/s)
A         22.06 ± 3.77 a
B         24.13 ± 3.37 a
C         32.30 ± 4.03 b
D         39.73 ± 5.31 c
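The numbers needed for this kind of reporting (degrees of freedom, F-value, p-value, and group means with standard deviations) can be pulled out of the R objects directly. The sketch below shows one way to do it on simulated data. All values and the names SimData, SimANOVA, and AnovaTable are assumptions for illustration, and the printed results will not match the figures reported in this handout.

```r
set.seed(1)  # reproducible simulated data

# Simulated (hypothetical) data: 4 enzymes, 9 observations each
SimData <- data.frame(
  Enzyme = factor(rep(c("A", "B", "C", "D"), each = 9)),
  Rate   = c(rnorm(9, mean = 20, sd = 4),
             rnorm(9, mean = 22, sd = 4),
             rnorm(9, mean = 32, sd = 4),
             rnorm(9, mean = 40, sd = 4))
)

SimANOVA <- aov(Rate ~ Enzyme, data = SimData)

# summary() of an aov fit is a list; its first element is the ANOVA table
AnovaTable <- summary(SimANOVA)[[1]]
df_between <- AnovaTable[["Df"]][1]       # k - 1
df_within  <- AnovaTable[["Df"]][2]       # N - k
f_value    <- AnovaTable[["F value"]][1]
p_value    <- AnovaTable[["Pr(>F)"]][1]

# Text fragment in the style F(df1,df2) = ...; p = ...
sprintf("F(%d,%d) = %.2f; p = %.3g", df_between, df_within, f_value, p_value)

# Group means and standard deviations (M and SD) for a summary table
group_means <- tapply(SimData$Rate, SimData$Enzyme, mean)
group_sds   <- tapply(SimData$Rate, SimData$Enzyme, sd)
round(cbind(M = group_means, SD = group_sds), 2)
```

Rows of the ANOVA table are indexed by position here (1 for the factor, 2 for the residuals) so the code does not depend on how R formats the row labels.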