Introduction to ANOVA

Similar documents
1.5 Oneway Analysis of Variance

Regression Analysis: A Complete Example

12: Analysis of Variance. Introduction

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

The ANOVA for 2x2 Independent Groups Factorial Design

CHAPTER 13. Experimental Design and Analysis of Variance

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

Section 13, Part 1 ANOVA. Analysis Of Variance

Statistiek II. John Nerbonne. October 1, Dept of Information Science

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

ELEMENTARY STATISTICS

1 Basic ANOVA concepts

One-Way Analysis of Variance (ANOVA) Example Problem

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

One-Way Analysis of Variance

2. Simple Linear Regression

Analysis of Variance ANOVA

Recall this chart that showed how most of our course would be organized:

10. Analysis of Longitudinal Studies Repeat-measures analysis

2 Sample t-test (unequal sample sizes and unequal variances)

One-Way ANOVA using SPSS SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

Predictor Coef StDev T P Constant X S = R-Sq = 0.0% R-Sq(adj) = 0.

Elementary Statistics Sample Exam #3

Chapter 7. One-way ANOVA

Univariate Regression

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Regression step-by-step using Microsoft Excel

UNDERSTANDING THE TWO-WAY ANOVA

Simple Tricks for Using SPSS for Windows

The Analysis of Variance ANOVA

How to calculate an ANOVA table

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

Post-hoc comparisons & two-way analysis of variance. Two-way ANOVA, II. Post-hoc testing for main effects. Post-hoc testing 9.

Statistical Models in R

Solution Let us regress percentage of games versus total payroll.

Study Guide for the Final Exam

Chapter 5 Analysis of variance SPSS Analysis of variance

CS 147: Computer Systems Performance Analysis

How To Check For Differences In The One Way Anova

Introduction to proc glm

Multiple Linear Regression

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Analysing Questionnaires using Minitab (for SPSS queries contact -)

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

An analysis method for a quantitative outcome and two categorical explanatory variables.

Multiple-Comparison Procedures

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

4. Multiple Regression in Practice

Module 5: Multiple Regression Analysis

Paired 2 Sample t-test

Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

Linear Models in STATA and ANOVA

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

Projects Involving Statistics (& SPSS)

Part 2: Analysis of Relationship Between Two Variables

General Regression Formulae ) (N-2) (1 - r 2 YX

SPSS Guide: Regression Analysis

MULTIPLE REGRESSION WITH CATEGORICAL DATA

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

1.1. Simple Regression in Excel (Excel 2010).

research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

individualdifferences

STAT 350 Practice Final Exam Solution (Spring 2015)

Final Exam Practice Problem Answers

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Analysis of Variance. MINITAB User s Guide 2 3-1

Notes on Applied Linear Regression

Chapter 13 Introduction to Linear Regression and Correlation Analysis

N-Way Analysis of Variance

Chapter 2 Simple Comparative Experiments Solutions

Two-sample hypothesis testing, II /16/2004

Multiple Regression: What Is It?

CHAPTER 14 NONPARAMETRIC TESTS

CHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression

ECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2

Chapter 7. Comparing Means in SPSS (t-tests) Compare Means analyses. Specifically, we demonstrate procedures for running Dependent-Sample (or

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Solutions to Homework 10 Statistics 302 Professor Larget

Regression III: Advanced Methods

Simple Regression Theory II 2010 Samuel L. Baker

Randomized Block Analysis of Variance

STT 200 LECTURE 1, SECTION 2,4 RECITATION 7 (10/16/2012)

PSYC 381 Statistics Arlo Clark-Foos, Ph.D.

5. Linear Regression

Chapter 19 Split-Plot Designs

Use of Monte Carlo Simulation for a Peer Review Process Performance Model

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

17. SIMPLE LINEAR REGRESSION II

c 2015, Jeffrey S. Simonoff 1

Comparing Means in Two Populations

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Chapter 2 Probability Topics SPSS T tests

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

The F distribution and the basic principle behind ANOVAs. Situating ANOVAs in the world of statistical tests

Illustration (and the use of HLM)

Transcription:

Introduction to ANOVA Lamb Weight Gain Example from Text The following table contains fictitious data on the weight gain of lambs on three different diets over a 2 week period. Individual Value Plot of Diet 1, Diet 2, Diet 3 Weight Gain (lbs.) 22.5 Diet 1 Diet 2 Diet 3 20.0 8 9 15 17.5 16 16 10 15.0 9 21 17 12.5 11 6 10.0 18 Data 7.5 What is a question of interest? 5.0 Diet 1 Diet 2 Diet 3 How do we analyze this data? We have independent samples so why not use independent samples t tests? Answer: Using the t distribution to make more than one comparison of a pair of independent samples drives up the chance for error. Recall: The Type I error rate (the probability of going with the alternative when we shouldn t) of a t test is the significance level, α. The lamb weight data is comprised of three independent samples. How many pair wise comparisons can we make? Then, we can make a Type I Error in any or all of these comparisons. Let s look at what happens to the probability of at least one Type I Error when making multiple comparisons: More on what to do about this error rate problem later for now, we introduce the ANOVA.

A one way analysis of variance, or just ANOVA, that we ll be learning is a hypothesis testing procedure that uses the following hypotheses: H O : H A : The term one way refers to the fact that there is only one variable defining the groups (in our example this is Diet ). Notation: I = number of groups i denotes the i th group and j denotes the j th observation y ij = y 12 denotes the 2 nd observation in the first group n i = sample size for the i th group = sample mean for the i th group n. = (the total sample size across all groups) = (sample mean combining data across all groups). Sum of Squares (SS), Degrees of Freedom (df), and Mean Squares (MS) SS(within) = = 1 MS(within) = SSwithin df(within) = n. I dfwithin SS(between) = = MS(between) = SSbetween df(between) = I 1 SS(total) = MS(total) = SStotal df(total) = n. 1 dftotal dfbetween Introduction to ANOVA Page 2

Consider the deviation from an observation to the overall mean written in the following way: Notice that the left side is at the heart of SS(total), and the right side has the analogous pieces of SS(within) and SS(between). It actually works out (with a bit of math) that SS(total) = SS(within) + SS(between) The analysis of variance is centered around this idea of breaking down the total variation of the observations from their grand mean into its two pieces: the variation within groups (error variation) and the variation between groups (treatment variation). The F test for ANOVA The test statistic for the ANOVA is If the data indicates large differences in group means compared to each other relative to the variability within groups around group means, F s will be large. Big values of F s indicate evidence against H O (if the group means are the same, the variability between them should not be larger than the natural variation of the data within each group). The F distribution has degrees of freedom for the numerator (between) and the denominator (within). We say F s ~ F(ν 1, ν 2 ). The following picture from Wikipedia illustrates a few F pdf s. P values for the F test in ANOVA are tail area quantities that will be calculated for you. Introduction to ANOVA Page 3

Software packages performing the F test for ANOVA return something called an ANOVA table. The following is the ANOVA output from Minitab 16 for the lamb weight data. Notice where the numbers in the table come from. One-way ANOVA: Diet 1, Diet 2, Diet 3 Source DF SS MS F P Factor 2 36.0 18.0 0.77 0.491 Error 9 210.0 23.3 Total 11 246.0 S = 4.830 R-Sq = 14.63% R-Sq(adj) = 0.00% Individual 95% CIs For Mean Based on Pooled StDev Level N Mean StDev --------+---------+---------+---------+- Diet 1 3 11.000 4.359 (---------------*--------------) Diet 2 5 15.000 4.950 (------------*-----------) Diet 3 4 12.000 4.967 (-------------*-------------) --------+---------+---------+---------+- 8.0 12.0 16.0 20.0 Pooled StDev = 4.830 ANOVA in R weightgain<-c(8,16,9,9,16,21,11,18,15,10,17,6) diet<-c(1,1,1,2,2,2,2,2,3,3,3,3) diet<-factor(diet) > summary(aov(weightgain~diet)) Df Sum Sq Mean Sq F value Pr(>F) diet 2 36 18.00 0.771 0.491 Residuals 9 210 23.33 Conduct an ANOVA for the lamb data. Introduction to ANOVA Page 4

A few final notes on the one way ANOVA: It assumes a common standard deviation for the populations. Notice that MS(within) = MS(error) = MSE = is a pooled (weighted or average). variance for the groups. We view this MS(error) as an estimate of the population common variance across the treatments. Then, the estimate of the common population standard deviation is And this is often referred to as We check the assumption that the variation is the same for all groups before carrying out an ANOVA, possibly by looking at side by side boxplots (among other ways). There are other assumptions (and methods of checking them) we will discuss in the Regression / Correlation lecture notes. So, what do we do if the ANOVA F test rejects H O and we conclude there is at least one population mean that is different? Then, we can turn to the methods under a blanket of topics called multiple comparisons where we make pairwise comparisons to determine which treatments are significantly larger (or smaller) than the others. Healthy study is given to this set of theories, as careful attention must be paid to the error rates. Section 11.9 of your text gives an introduction to this topic. Introduction to ANOVA Page 5