Introduction to ANOVA
Copyright 2000, 2011, J. Toby Mordkoff

Probably, the best way to start thinking about ANOVA is in terms of factors with levels. (I say this because this is how they are described when researchers talk to each other and this is how the initial analysis is conducted.) As defined in the chapter on experimental design (back in Part 1), a factor is an independent variable (i.e., some property, characteristic, or quality that can be manipulated) that is being used as a predictor or explainer of variance in the data analysis. In most cases, each specific value of the IV defines a level within the factor, but that doesn't have to be true, so we have two different labels. The way to keep these straight is to remember that an IV is created and exists when the experiment is being run; a factor is part of the analysis. Sometimes, for any of a variety of reasons, you can change your mind about the best way to approach the experiment between the time that you collected the data (and had levels of the IV) and when you conduct the analysis (and have levels of the factor). For example, sometimes we collapse two or more levels of an IV into one level of a factor.

Experimental Designs (revisited)

There are two manners in which experimental designs are described. The simple method only specifies the number of factors, as in "one-way" or "two-way" for experiments with one or two factors, respectively. The more complicated method specifies both the number of factors and the number of levels within each factor. For example, if an experiment involves two factors, one of which has two levels and the other of which has three levels, then the experiment is said to employ a two-by-three design. The number of numbers in this description tells you how many factors; each of the numbers tells you how many levels. I suggest that you use the more complicated manner in most situations. Note: it is traditional to list the factors from smallest to largest; thus, one would not often say "three-by-two design," but you can if that really would be better.

It is also a good habit to specify whether the factors are within- or between-subjects. If all of the factors are of the same sort, just append the label at the end of the factors & levels description; e.g., "two-by-three, between-subjects design" or "two-by-three, within-subjects design." If the factor types are mixed, append the compound modifier "mixed-factor" and then say which factor or factors are within subjects using the label "repeated measures"; e.g., "two-by-three, mixed-factor design, with repeated measures on the first factor" if the two-level factor is within-subjects and the three-level factor is between-subjects. Note: be very careful to call these mixed-factor designs; do not, for example, call them mixed-effect designs, because those are a very different thing. Note, also, that there are other ways to say these things. For example, "factorial" is another label for a completely between-subjects design.

One-way, Between-subjects ANOVA

The easiest way to describe the theory behind ANOVA is to talk about a one-way (i.e., one-factor), between-subjects experiment. In fact, maybe because of its simplicity, SPSS lists this very specific type of analysis separately from all other forms of ANOVA; SPSS puts one-way, between-subjects ANOVA with the t-tests, under Analyze... Compare Means... But don't be fooled by where it appears in the menus; this is an ANOVA, not a t-test. (Plus, I don't suggest using this version; use Analyze... General Linear Model... Univariate, instead, for several reasons.)
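As an aside outside of SPSS (the example below and its variable names are my own illustration, not the author's): the same point can be made in Python by fitting the one-way, between-subjects ANOVA as a one-factor general linear model; the resulting ANOVA table carries the same F and p that a dedicated one-way routine (such as scipy.stats.f_oneway, the analogue of Compare Means) would give.

    # A hypothetical illustration (not from the notes): a one-way,
    # between-subjects ANOVA fit as a one-factor general linear model.
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    # Made-up scores for one between-subjects factor with three levels.
    df = pd.DataFrame({
        "group": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
        "score": [4, 5, 6, 5, 4, 7, 8, 6, 7, 8, 5, 6, 5, 7, 6],
    })

    # The GLM route (the analogue of Analyze... General Linear Model... Univariate).
    fit = smf.ols("score ~ C(group)", data=df).fit()
    print(anova_lm(fit))   # same F and p as a dedicated one-way ANOVA routine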

For the purposes of discussion, imagine that we have conducted an experiment concerning motion-sickness with three groups of subjects. One group was in the control condition, which we'll call C; nothing was given or done to these subjects other than putting them in a rotating drum and asking them to report how ill they feel on a ten-point scale. Another group was given Dramamine, so this is group D, and then they, too, were put in the drum and asked for an illness value. The last group was given a placebo that looks like Dramamine before being put in the drum; this is group P. There were seven subjects in each group.

To be clear (and to recap some issues that were covered above or before): we have one nominal IV which took on three values (C, D, or P) and was manipulated between subjects. Paralleling this, in the analysis we'll have one between-subjects factor with three levels. The DV was quantitative and discrete, because the ratings were whole numbers between one and ten. Therefore, the data file will have two columns: one control variable that specifies condition (C, D, or P) and one data variable that contains the illness ratings (1-10). There were seven subjects in each group, so our data file will have 21 rows. The null hypothesis is that the population means for C, D, and P are all the same. This should be written as H0: μ_C = μ_D = μ_P.

The big question is: how does a one-way ANOVA test this hypothesis? Before answering this question, try thinking about this one, instead: if you took 21 random and independent samples from a single population (that has non-zero variance), then randomly divided these 21 observations into three groups of seven and calculated the mean for each of the groups, would the three means be exactly the same? If that is too abstract, imagine that you rolled a die 21 times, put the first seven rolls in Group 1, the next seven in Group 2, and the last seven in Group 3. The correct answer (to the question "would the three means be exactly the same?") is no or, at least, not very often. By random chance, one of the groups will have the highest mean and another will have the lowest. In other words, even if the null hypothesis is exactly true (because the three samples were taken from the same population), we do not expect the three sample means to be the same. We would only expect them to be the same if the samples were very, very large and/or the variance within the population was very, very small.

With that in mind, we can now go back and address the question of how one-way ANOVA works. There are, of course, a variety of ways to think about this; the following is my favorite because it parallels how I like to think about t-tests. According to the null hypothesis, the three populations that were being sampled have the same mean. Under all forms of ANOVA, the three populations are assumed to have the same variance and are assumed to be normally distributed. Therefore, according to the null hypothesis, the three populations are exactly the same, because they have the same center, spread, and shape. Because of this, we can pool all of the data to calculate one, common, hypothetical sampling distribution of the mean.

In contrast to the independent-samples t-test, where we had the clinical-trials version to fall back on, there is no such thing as an equal-variance-not-assumed version of ANOVA. If the equal-variance assumption is violated, then you have to do something to correct the problem or switch to a different form of analysis. Even more: because SPSS has no clue what to do about a violation of the equal-variance assumption if it happens, it won't even test the assumption unless you ask it to.

As always for parametric statistics, the hypothetical sampling distribution (for the mean) is assumed to be normal with a spread that depends on two things: the variance in the sampled population and the size(s) of the sample(s). Back when we were doing t-tests, we talked about the spread of the sampling distribution in terms of its standard deviation, which is called the standard error. (Read that again if this isn't already something that you're comfortable with: the standard deviation of the sampling distribution for the mean is the standard error; the standard error is the standard deviation of the hypothetical sampling distribution of the mean.) The calculation of the standard error for a t-test is simple: it's your best guess about the standard deviation (s) divided by the square-root of the size of the sample. Now that we're doing ANOVA, we need to work in terms of variance, instead of standard deviations (for reasons you'll see soon). So, we now talk about the variance of the sampling distribution for the mean, which is just your best guess about the variance in the population divided by the sample size.

Now you've got everything that you need: a center, a spread (albeit in variance format), and a shape. With this hypothetical sampling distribution in hand, it is relatively easy to calculate the probability of observing three sample means that are as different and extreme (i.e., as far from the overall mean) as the three that we have. If this probability is very small (i.e., less than 5%), then we reject the idea that the three samples came from the same population. In particular, we reject the idea that the population means are the same; we don't reject (or even question) any of the assumptions. This is the same bass-ackward logic that we use for t-tests, complete with the special status for assumptions over null hypotheses. We are not calculating the probability that the null hypothesis is true given the data; we are calculating the probability of getting the data given the null.

A second way to think about one-way, between-subjects ANOVA is in terms of a ratio of variances. The story starts out the same as the above, but doesn't use the hypothetical sampling distribution of the mean to calculate the probability of observing the three sample means. Instead, it refers to the spread of the hypothetical distribution as the within-group or unexplained variance. This version also doesn't talk about the three sample means as being different from each other in a pair-wise sense, but simply calculates the variance across these three values and calls this the between-group or explained variance. Then it calculates a ratio by dividing the between-group variance by the within-group variance. This value is compared to a critical value in a table; if the observed ratio is above the critical value -- implying that the group means are too variable to be consistent with the idea that they all came from the same population and are only different due to chance -- then the null hypothesis is rejected.
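If you want to see both framings on concrete numbers, here is a minimal sketch in Python; the illness ratings below are invented for illustration (they are not the author's data), and scipy stands in for SPSS. Levene's test plays the role of the equal-variance check that SPSS will only run if asked, and f_oneway returns the F-ratio along with the probability of getting sample means this far apart when the null hypothesis is true.

    # Hypothetical data for the C / D / P example (ratings are invented;
    # seven subjects per group, whole numbers from 1 to 10).
    from scipy import stats

    control   = [7, 8, 6, 9, 7, 8, 7]   # group C
    dramamine = [3, 4, 2, 5, 3, 4, 3]   # group D
    placebo   = [6, 7, 5, 8, 6, 7, 6]   # group P

    # Levene's test for the equal-variance assumption (the analogue of
    # what SPSS will only do if you explicitly ask for it).
    w, p_levene = stats.levene(control, dramamine, placebo)
    print("Levene W = %.2f, p = %.3f" % (w, p_levene))

    # The one-way, between-subjects ANOVA itself: the F-ratio and the
    # probability of means this far apart if H0 (equal means) is true.
    f, p = stats.f_oneway(control, dramamine, placebo)
    print("F(2, 18) = %.2f, p = %.4f" % (f, p))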

Puzzler: assume that you take three samples of 10 each from a single population that has a true variance (σ²) of 420.00. (I.e., I'm telling you that the null hypothesis is true; the three samples came from the same distribution.) What do you expect the best-guess variance across the three sample means to be? Note: I'm not asking you about the best-guess variance across all of the data; that's 420.00, because s² is an unbiased estimator of σ² and we know that σ² is 420.00. I'm asking you about the variance across the three means.

Hey! Did you actually solve the puzzler -- or, at least, spend some time on it -- or did you just keep reading like it was just another paragraph? If you took it seriously and worked on it, then you have my apologies for the interruption (as well as for the unflattering inference behind it); please carry on. If you just breezed on by, however, then please go back and try to solve it. It wasn't a koan (i.e., an unsolvable problem that helps you to achieve enlightenment through some process that I don't understand); it was a real problem that I was hoping that you could solve. Hint: Deep Thought might be helpful.

A third way to think about one-way ANOVA is close to the second, but even farther removed from the way that we talk about t-tests. This is the approach from which ANOVA gets its name, because it analyzes (i.e., breaks up) the total variance into various components. We start with a general model that says that all observed values are the sum of several components. Because summing is linear, the model is called the General Linear Model (GLM). In the case of one-way, between-subjects ANOVA, the GLM equation for each observed value is:

O_ki = F_k + S_i + ε

where O_ki is the observed value for subject i who was in condition k; F_k is the fixed effect of condition k, which is a level of the factor; S_i is the fixed mean of subject i; and ε is normally-distributed error. Because it isn't possible to separate the effect of the subject from the error (because we only measure each subject once), it is useful to think of the above as:

O_ki = F_k + (S_i + ε)

The version of the GLM equation that I've given here embodies the claim that the observed value is determined by the mean of the subject plus two additive influences (viz., the factor effect and the random error). Other people prefer to use a slightly different equation which is a little less focused on the subjects -- O_ki = M + F_k + S_i + ε -- which claims that the observed value is determined by some overall mean for all subjects (M), plus additive effects from the factor, the subject, and the error. These two versions are equivalent because ANOVA concerns variance: an additive constant (such as M) has no variance, and adding or subtracting the overall mean from each of the subjects would not have any effect on the variance across subjects, so whether you keep a separate overall mean or fold it into the subject terms is irrelevant.

Before going on, note or recall the following rule regarding variance values: the variance of the sum (of two or more statistically-independent variables) is equal to the sum of the variances. This is a key to ANOVA, which is why you were probably asked to memorize some version of this statement during undergrad stats; it is also why we use variance, instead of standard deviations.
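Because everything that follows rests on this additivity rule, here is a quick numerical check (my own sketch, not part of the notes): for independent variables, the variance of the sum comes out equal to the sum of the variances.

    # Numerical check of the additivity rule (illustration only):
    # for independent X and Y, var(X + Y) is approximately var(X) + var(Y).
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(loc=0.0, scale=3.0, size=100_000)   # var(X) = 9
    y = rng.normal(loc=0.0, scale=4.0, size=100_000)   # var(Y) = 16, independent of X

    print(np.var(x + y))            # close to 25 = 9 + 16
    print(np.var(x) + np.var(y))    # also close to 25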

Because of the additivity of variance, the GLM equation above implies this:

σ²_O = σ²_F + σ²_S + σ²_ε

which can also be written as:

σ²_O = σ²_F + σ²_(S+ε)

This last equation can be read as: the variance of the observed values equals the variance of the fixed factor effects plus the variance of the sum of the subject means and the error. As mentioned above, in between-subjects ANOVA we use the second version of the variance equation, because we have no way of separating the variance due to subjects from the variance due to error (because we only measure each subject one time).

The first computational step in one-way ANOVA calculates the total variance in the sample. This step ignores that there are separate conditions and simply gets an estimate of the variance across all values of the DV. This is σ²_O. The second step uses the means in each of the conditions to estimate the values of F_k. (Note that the F_k values are deviations from the overall mean, so they must sum to zero.) The variance across these values is used to estimate σ²_F. The third step notes that, if σ²_O = σ²_F + σ²_(S+ε), then σ²_(S+ε) = σ²_O - σ²_F (by some simple algebra). So we can use the difference between our estimates of σ²_O and σ²_F to estimate σ²_(S+ε). We have now analyzed or partitioned the total variance into two components: one component that is associated with differences between conditions and another that is associated with differences between subjects (within each of the conditions) plus error. These are often referred to as explained and unexplained variance, respectively, on the grounds that the former can be explained in terms of the experimental manipulation that defines the conditions, while the latter cannot be explained.

Because σ²_F is estimated (and should, therefore, probably be written as s²_F, but no-one does that), it has an associated degrees of freedom. Because it was estimated using the k condition means and we always lose one degree of freedom to the overall mean of any set of values (because the mean is needed to calculate variance), it has k - 1 degrees of freedom. Because σ²_F is going to end up in the numerator of something called the F-ratio, k - 1 is the numerator degrees of freedom. Likewise, because σ²_(S+ε) is estimated (albeit by subtracting two other values), it also has a certain number of degrees of freedom. Because it was estimated using N pieces of data which were divided into k groups, each with its own mean (which each had to be calculated), it has N - k degrees of freedom. Finally, because σ²_(S+ε) will be in the denominator of the F-ratio, N - k is the denominator degrees of freedom.

That's enough for now.
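As a concrete companion to the three computational steps and the two degrees-of-freedom values, here is a sketch of the partition in Python (my own illustration, again with invented ratings, not the author's code). In practice the bookkeeping is done with sums of squares, which become the explained and unexplained variance estimates (mean squares) once they are divided by k - 1 and N - k.

    # Partitioning the total variability into between-conditions and
    # within-conditions (subjects-plus-error) pieces, then forming F.
    import numpy as np
    from scipy import stats

    groups = [
        np.array([7, 8, 6, 9, 7, 8, 7]),   # C (hypothetical ratings)
        np.array([3, 4, 2, 5, 3, 4, 3]),   # D
        np.array([6, 7, 5, 8, 6, 7, 6]),   # P
    ]
    all_data = np.concatenate(groups)
    k, N = len(groups), all_data.size
    grand_mean = all_data.mean()

    # Step 1: total variability, ignoring condition.
    ss_total = ((all_data - grand_mean) ** 2).sum()

    # Step 2: variability of the condition means (the F_k effects)
    # around the grand mean, weighted by group size.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)

    # Step 3: what is left over is subjects-plus-error.
    ss_within = ss_total - ss_between

    # Mean squares and the F-ratio, with k - 1 and N - k degrees of freedom.
    ms_between = ss_between / (k - 1)    # "explained" variance estimate
    ms_within = ss_within / (N - k)      # "unexplained" variance estimate
    F = ms_between / ms_within

    print(F, stats.f_oneway(*groups).statistic)   # the two should match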