# Section 13, Part 1 ANOVA. Analysis Of Variance

## Transcription


Slide 2. Course Overview (PubH 6414, Section 13, Part 1)

So far in this course we've covered:

- Descriptive statistics
  - Summary statistics
  - Tables and graphs
- Probability
  - Probability rules
  - Probability distributions
  - Sampling distributions and the CLT
- Statistical inference
  - Estimating parameters; confidence intervals of means and proportions
  - Hypothesis tests of means and proportions

Slide 3. Course Overview

Statistical inference still to cover:

- ANOVA (Section 13, Part 1): comparing means between three or more groups
- Inference about odds ratios and relative risks (Section 13, Part 2): confidence intervals of the odds ratio and relative risk
- Simple linear regression (Section 14): describing the linear relationship between two variables

Slide 4. ANOVA

- Many studies involve comparisons between more than two groups of subjects.
- If the outcome is continuous, ANOVA can be used to compare the means between groups.
- ANOVA is an abbreviation for the full name of the method: ANalysis Of VAriance.
- The test for ANOVA is the ANOVA F-test.

Slide 5. History

ANOVA was developed in the 1920s by R.A. Fisher. He was described by Anders Hald as "a genius who almost single-handedly created the foundations for modern statistical science." The ANOVA F-test is named for Fisher.

Slide 6. Why use ANOVA instead of multiple t-tests?

If you are comparing means between more than two groups, why not just do several two-sample t-tests to compare the mean from one group with the mean from each of the other groups? One problem with this approach is that the number of tests grows quickly as the number of groups increases.

Slide 7. The problem with multiple t-tests

The probability of making a Type I error increases as the number of tests increases. If the probability of a Type I error for each test is set at 0.05 and 10 t-tests are done, the overall probability of at least one Type I error for the set of tests is $1 - (0.95)^{10} = 0.40$ instead of 0.05. (This calculation treats the tests as independent.)

Note: The formula for calculating overall Type I error on page 164 in the text is incorrect.
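The calculation on this slide can be sketched in R for any number of tests. This is a minimal illustration that assumes the tests are independent, which is only an approximation for pairwise t-tests that share groups; `familywise_error` is a hypothetical helper name, not a base R function.

```r
# Probability of at least one Type I error across m independent tests,
# each run at significance level alpha (independence is an assumption).
familywise_error <- function(m, alpha = 0.05) {
  1 - (1 - alpha)^m
}

# Error rate for 1, 3, 10, and 20 tests at alpha = 0.05
n_tests <- c(1, 3, 10, 20)
error_table <- data.frame(
  tests      = n_tests,
  familywise = round(familywise_error(n_tests), 3)
)
print(error_table)
```

The row for 10 tests reproduces the 0.40 figure from the slide.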

Slide 8. Multiple Comparisons Problem

Another way to describe the multiple comparisons problem is to think about the meaning of an alpha level of 0.05.

- An alpha of 0.05 implies that, by chance, there will be on average one Type I error in every 20 tests of a true null hypothesis: 1/20 = 0.05.
- This means that, by chance, the null hypothesis will be incorrectly rejected about once in every 20 such tests.
- As the number of tests increases, the probability of finding a significant result by chance increases.

Slide 9. ANOVA: a single test for simultaneous comparison

The advantage of using ANOVA over multiple t-tests is that the ANOVA F-test identifies, with a single test, whether any of the group means is significantly different from at least one of the other group means. If the significance level is set at α, the probability of a Type I error for the ANOVA F-test is α regardless of the number of groups being compared.

Slide 10. ANOVA Hypotheses

The null hypothesis for ANOVA is that the means for all $k$ groups are equal:

$$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$$

The alternative hypothesis for ANOVA is that at least two of the means are not equal. Rejecting the null hypothesis doesn't require that ALL means are significantly different from each other. If at least two of the means are significantly different, the null hypothesis is rejected.

Slide 11. Analysis of Variance

ANOVA is used to compare means between three or more groups, so why is it called Analysis of VARIANCE? ANOVA is based on a comparison of two different sources of variation:

- The average variation of observations within each group around the group mean
- The average variation of the group means around the grand mean

The grand mean is the mean of all observations.

Slide 12. Variability Among and Within Groups

- The average variation of observations within each group around the group mean is called the within-group variability.
- The average variation of the group means around the grand mean is called the among-group variability.
- The ANOVA F-test statistic is the ratio of the average variability among groups to the average variability within groups.

$$F = \frac{\text{average variability among groups}}{\text{average variability within groups}}$$

Slide 13. Small variability among groups compared to variability within groups: small F-statistic

[Figure: observations in three groups, marking the Group 1, Group 2, and Group 3 means, the grand mean, and the within-group spread. Source: Introduction to the Practice of Statistics, Moore and McCabe]

Slide 14. Large variability among groups relative to variability within groups: large F-statistic

[Figure: observations in three groups, marking the Group 1, Group 2, and Group 3 means, the grand mean, and the within-group spread. Source: Introduction to the Practice of Statistics, Moore and McCabe]

Slide 15. ANOVA: F-statistic and F-distribution

The F-distribution is indexed by two degrees of freedom:

- Numerator df = number of groups minus 1
- Denominator df = total sample size minus number of groups

The shape of the F-distribution varies depending on the degrees of freedom.

Slide 16. Five different F-distributions

[Figure: density curves of five different F-distributions. Source: Wikipedia]

Slide 17. Partitioned difference from the grand mean

The difference of each observation, $X$, from the grand mean can be partitioned into two parts:

- The difference between the individual observation and the group mean
- The difference between the group mean and the grand mean

$$(X - \bar{X}) = (X - \bar{X}_j) + (\bar{X}_j - \bar{X})$$

where $\bar{X}$ is the grand mean and $\bar{X}_j$ is the group mean.

Slide 18. Sums of Squares

This partitioned relationship is also true for the sums of the squared differences, which are called sums of squares. In the formula below, $X_{ij}$ refers to an individual observation $i$ in group $j$.

$$\sum_{ij} (X_{ij} - \bar{X})^2 = \sum_{ij} (X_{ij} - \bar{X}_j)^2 + \sum_{ij} (\bar{X}_j - \bar{X})^2$$

$$SS_T = SS_E + SS_A$$

In words, the total sum of squares ($SS_T$) is equal to the error sum of squares ($SS_E$) plus the sum of squares among groups ($SS_A$).

Slide 19. Mean Squares

- $MS_A$ is the mean square among groups: $MS_A = SS_A / (j - 1)$. $MS_A$ has $j - 1$ degrees of freedom, where $j$ = number of groups.
- $MS_E$ is the error mean square: $MS_E = SS_E / (N - j)$. $MS_E$ has $N - j$ degrees of freedom, where $N$ = total number of observations and $j$ = number of groups.

Slide 20. F-statistic calculated

Recall that the F-statistic is the ratio of the average variability among groups to the average variability within groups:

$$F = \frac{MS_A}{MS_E}$$
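The sums of squares, mean squares, and F-statistic defined on the last few slides can be computed by hand in R. The sketch below uses a small made-up data set (the scores are hypothetical, chosen only so the arithmetic is easy to follow) and checks that the partition $SS_T = SS_E + SS_A$ holds.

```r
# Hypothetical scores for three groups of three (illustration only).
scores <- c(4, 5, 6, 7, 8, 9, 10, 11, 12)
group  <- factor(rep(c("A", "B", "C"), each = 3))

grand_mean  <- mean(scores)
group_means <- tapply(scores, group, mean)    # one mean per group
group_sizes <- tapply(scores, group, length)  # n_j for each group

# Each observation's own group mean, aligned with `scores`
own_group_mean <- group_means[as.character(group)]

ss_total <- sum((scores - grand_mean)^2)                   # SS_T
ss_error <- sum((scores - own_group_mean)^2)               # SS_E
ss_among <- sum(group_sizes * (group_means - grand_mean)^2)  # SS_A

n_groups <- nlevels(group)
n_total  <- length(scores)

ms_among <- ss_among / (n_groups - 1)        # MS_A
ms_error <- ss_error / (n_total - n_groups)  # MS_E
f_stat   <- ms_among / ms_error              # F = MS_A / MS_E

# Cross-check against R's built-in one-way ANOVA table
print(summary(aov(scores ~ group)))
```

For these made-up scores the hand calculation gives $SS_T = 60$, $SS_E = 6$, $SS_A = 54$, and $F = 27$, and the same values should appear in the `aov` table.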

Slide 21. ANOVA Assumptions

There are some assumptions that should be met before using ANOVA:

- The observations are from a random sample and they are independent of each other.
- The observations are assumed to be normally distributed within each group. ANOVA is still appropriate if this assumption is not met but the sample size in each group is large (> 30).
- The variances are approximately equal between groups. If the ratio of the largest SD to the smallest SD is less than 2, this assumption is considered to be met.

It is not required to have equal sample sizes in all groups.

Slide 22. ANOVA: the steps

- Step 1. State the hypotheses
- Step 2. Calculate the test statistic
- Step 3. Calculate the p-value
- Step 4. Conclusion
  - If you fail to reject the null hypothesis, stop.
  - If the null hypothesis is rejected, indicating that at least two means are different, proceed with post-hoc tests to identify the differences.

Slide 23. ANOVA Example

Researchers were interested in evaluating whether different therapy methods had an effect on mobility among elderly patients. 18 subjects were enrolled in the study and were randomly assigned to one of three treatment groups:

- Control: no therapy
- Trt. 1: physical therapy only
- Trt. 2: physical therapy with support group

After 8 weeks, subjects responded to a series of questions from which a mobility score was calculated.

Slide 24. Data for ANOVA example

The hypothetical data below represent mobility scores for subjects in the 3 treatment groups. A higher score indicates better mobility. Assume that mobility scores are approximately normally distributed.

[Table: mobility scores by group (Control, Trt. 1, Trt. 2)]

Slide 25. ANOVA Step 1: State the Hypotheses

- Null hypothesis: $\mu_{control} = \mu_{trt1} = \mu_{trt2}$
- Alternative hypothesis: at least two of the means ($\mu_{control}$, $\mu_{trt1}$, $\mu_{trt2}$) are not equal

ANOVA will identify if at least two of the means are significantly different.

Slide 26. Identify the test and statistic

- The test for comparing more than 2 means is ANOVA.
- Before proceeding with ANOVA, check that the variances are approximately equal by calculating the sample SD for each group.

[Table: sample SD for the Control, Trt. 1, and Trt. 2 groups]

- If the ratio of the largest SD to the smallest SD is less than 2, the variances are approximately equal. Here 5.33 / 4.86 = 1.1, so ANOVA is appropriate.
- The test statistic is the ANOVA F-statistic.
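The SD-ratio rule of thumb is simple enough to wrap in a small helper. The sketch below is one way to do it; `sd_ratio_ok` is a hypothetical name, and the two SDs passed in are the largest and smallest values quoted on the slide.

```r
# Rule of thumb from the slide: largest SD / smallest SD < 2.
# `sd_ratio_ok` is a hypothetical helper, not part of base R.
sd_ratio_ok <- function(sds, limit = 2) {
  ratio <- max(sds) / min(sds)
  list(ratio = ratio, ok = ratio < limit)
}

# Largest and smallest group SDs quoted on the slide
sd_check <- sd_ratio_ok(c(5.33, 4.86))
print(sd_check)
```

With raw data in a vector `scores` and a grouping factor `group`, the per-group SDs could be obtained with `tapply(scores, group, sd)` and passed to the same helper.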

Slide 27. Significance Level and Critical Values

- Set the significance level: use the conventional alpha = 0.05.
- Identify the critical value from the F-distribution with 2, 15 df.
  - Numerator df = number of groups minus 1 = 3 - 1 = 2
  - Denominator df = total N minus number of groups = 18 - 3 = 15

Slide 28. F-distribution for 2, 15 df

The F-distribution for the F-statistic for 18 observations and 3 groups has 2, 15 df. The critical value for α = 0.05 in the $F_{2,15}$ distribution is 3.68.

[Figure: F-distribution density f(x) for 2, 15 df]

Slide 29. F distribution in R

- To get the critical value, use `qf(probability, dfnum, dfden)`. For example, `qf(0.95, 2, 15)` returns approximately 3.68.
- To get the p-value, use `1 - pf(F, dfnum, dfden)`. For example, `1 - pf(3.68, 2, 15)` returns approximately 0.05.

Slide 30. ANOVA Step 2: Calculate the F-statistic

First calculate the grand mean, the group means, and the group variances (SD²).

Grand mean = sum of all 18 observations divided by 18 = 41.7

[Table: group mean and SD² for the Control, Trt. 1, and Trt. 2 groups]

Notice that none of the three group means is equal to the grand mean and that there are differences between the group means. ANOVA will identify if any of these differences are significant.

Slide 31. Calculate $SS_A$ and $MS_A$

$$SS_A = \sum_{ij} (\bar{X}_j - \bar{X})^2 = \sum_j n_j (\bar{X}_j - \bar{X})^2$$

With 6 subjects per group and group means of 36, 44, and 45:

$$SS_A = 6(36 - 41.7)^2 + 6(44 - 41.7)^2 + 6(45 - 41.7)^2 = 292$$

To get $MS_A$, divide $SS_A$ by the degrees of freedom $(j - 1)$:

$$MS_A = 292 / (3 - 1) = 292 / 2 = 146$$

Slide 32. Calculate $SS_E$ and $MS_E$

$$SS_E = \sum_{ij} (X_{ij} - \bar{X}_j)^2 = \sum_j (n_j - 1) s_j^2$$

With $n_j - 1 = 5$ in each group, $SS_E$ is 5 times the sum of the three group variances (one of which is 28.4), giving $SS_E = 382$.

$MS_E$ is $SS_E$ divided by the degrees of freedom $(N - j)$:

$$MS_E = 382 / (18 - 3) = 382 / 15 = 25.47$$

Slide 33. Step 3: Calculate the p-value

- The ANOVA F-statistic = $MS_A / MS_E$.
- The ANOVA F-statistic for this example = 146 / 25.47 = 5.73.
- The p-value is the area beyond 5.73 under the F-distribution with 2, 15 df.

Slide 34. F-distribution for 2, 15 df

In R, `1 - pf(5.73, 2, 15)` returns approximately 0.014.

[Figure: F-distribution density f(x) for 2, 15 df]

Slide 35. ANOVA Step 4: Conclusion

The null hypothesis of equal means is rejected by the critical value method (5.73 > 3.68) or the p-value method (0.014 < 0.05). We can conclude that AT LEAST two of the means are significantly different.
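Steps 2 through 4 can be reproduced in R from the summary values quoted on the slides (group means of 36, 44, and 45, six subjects per group, and $SS_E = 382$). This is a sketch under those stated inputs rather than a replacement for running `aov` on the raw data; the variable names are illustrative.

```r
# Summary values taken from the slides
group_means <- c(control = 36, trt1 = 44, trt2 = 45)
n_per_group <- 6
ss_error    <- 382

n_groups <- length(group_means)
n_total  <- n_groups * n_per_group

# With equal group sizes the grand mean is the mean of the group means
grand_mean <- mean(group_means)

ss_among <- sum(n_per_group * (group_means - grand_mean)^2)  # SS_A
ms_among <- ss_among / (n_groups - 1)                        # MS_A
ms_error <- ss_error / (n_total - n_groups)                  # MS_E
f_stat   <- ms_among / ms_error

df_num <- n_groups - 1
df_den <- n_total - n_groups

f_crit  <- qf(0.95, df_num, df_den)                     # critical value
p_value <- pf(f_stat, df_num, df_den, lower.tail = FALSE)  # upper-tail area

reject_null <- (f_stat > f_crit) && (p_value < 0.05)
print(c(F = f_stat, F_crit = f_crit, p = p_value))
```

Using `lower.tail = FALSE` is equivalent to the `1 - pf(...)` form shown on the slides.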

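Slide 36. ANOVA table for Example

[Table: SUMMARY of Count, Sum, Average, and Variance for each of the three groups]

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 292 ($SS_A$) | 2 | 146 ($MS_A = SS_A$/df) | 5.73 | 0.014 | 3.68 |
| Within Groups | 382 ($SS_E$) | 15 | 25.47 ($MS_E = SS_E$/df) | | | |
| Total | 674 | 17 | | | | |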

Slide 37. ANOVA: Post-hoc tests

- A significant ANOVA F-test is evidence that not all group means are equal, but it does not identify where the differences exist.
- Methods used to find group differences after the ANOVA null hypothesis has been rejected are called post-hoc tests. Post hoc is Latin for "after this".
- Post-hoc comparisons should only be done when the ANOVA F-test is significant.

Slide 38. Adjusting the α-level for post-hoc comparisons

When post-hoc multiple comparisons are being done, it is customary to adjust the α-level of each individual comparison so that the overall experiment significance level remains at 0.05.

Slide 39. Bonferroni post-hoc comparisons

- For an ANOVA with 3 groups, there are 3 pairwise t-test comparisons.
- A conservative adjustment is to divide 0.05 by the number of comparisons.
- For our example with 3 groups, the adjusted α-level = 0.05 / 3 = 0.0167.
- A difference will only be considered significant if the p-value for the t-test statistic is < 0.0167.
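The Bonferroni arithmetic can be sketched in R as follows. The number of pairwise comparisons comes from `choose`, and base R's `p.adjust` offers the equivalent adjustment on the p-value scale (multiplying each p-value by the number of comparisons instead of dividing α). The two raw p-values used here are the ones reported later in this deck for the Control vs. Trt. 2 and Trt. 1 vs. Trt. 2 comparisons.

```r
n_groups      <- 3
n_comparisons <- choose(n_groups, 2)     # number of pairwise comparisons
alpha_adj     <- 0.05 / n_comparisons    # Bonferroni-adjusted alpha

# Equivalent view: inflate the p-values and compare them with 0.05.
# n = 3 tells p.adjust that three comparisons were made in total.
p_raw <- c(control_vs_trt2 = 0.012, trt1_vs_trt2 = 0.743)
p_adj <- p.adjust(p_raw, method = "bonferroni", n = n_comparisons)

print(alpha_adj)
print(p_adj)
print(p_raw < alpha_adj)  # same decisions as p_adj < 0.05
```

Comparing raw p-values with the adjusted α and comparing adjusted p-values with 0.05 lead to the same decisions.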

Slide 40. Post-hoc Comparisons for Example

Our example data had 3 groups, so there are three two-sample post-hoc t-tests:

- Control group compared to Treatment 1 group
- Control group compared to Treatment 2 group
- Treatment 1 group compared to Treatment 2 group

Slide 41. Post-hoc Comparisons for Example

We know from the significant F-test (p = 0.014) that at least two of the groups have significantly different means. A two-sample t-test will be done for each comparison, and the result will be considered significant if the p-value is < 0.0167.

Slide 42. Multiple Comparison Procedure

$$H_0: \mu_{group} = \mu_{another\ group} \quad \text{vs.} \quad H_A: \mu_{group} \ne \mu_{another\ group}$$

$$t_{df} = \frac{\bar{x}_{group} - \bar{x}_{another\ group}}{\sqrt{MS_E \left( \dfrac{1}{n_{group}} + \dfrac{1}{n_{another\ group}} \right)}}, \quad \text{where } df = N - j$$
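The test statistic on this slide can be computed directly from summary values. The sketch below applies it to the Control and Trt. 2 groups using the means, group sizes, and $MS_E$ from the example; `posthoc_t` is a hypothetical helper name. Because this version pools the variance across all three groups and uses $N - j = 15$ df, its p-value need not match p-values obtained from ordinary two-sample t-tests on two groups at a time.

```r
# Post-hoc t statistic using the ANOVA error mean square (MS_E).
# `posthoc_t` is a hypothetical helper, not part of base R.
posthoc_t <- function(mean_a, mean_b, n_a, n_b, mse, df_error) {
  se      <- sqrt(mse * (1 / n_a + 1 / n_b))
  t_stat  <- (mean_a - mean_b) / se
  p_value <- 2 * pt(abs(t_stat), df_error, lower.tail = FALSE)  # two-sided
  list(t = t_stat, df = df_error, p = p_value)
}

# Trt. 2 (mean 45) vs. Control (mean 36), n = 6 per group,
# MS_E = 382 / 15 with N - j = 18 - 3 = 15 df
res <- posthoc_t(45, 36, 6, 6, mse = 382 / 15, df_error = 15)
print(res)
```

With raw data, `pairwise.t.test(scores, group, p.adjust.method = "bonferroni")` in base R runs all pairwise comparisons with a pooled SD and reports Bonferroni-adjusted p-values; `scores` and `group` here are assumed variable names.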

Slide 43. Post-hoc comparison: Control and Treatment 1

Results of the two-sample t-test assuming equal variance to compare mean mobility scores between the control group and the treatment 1 group:

- Control mean score = 36
- Treatment 1 mean score = 44
- The p-value for the two-sample t-test is not significant at the adjusted α-level of 0.0167 but could be considered marginally significant since it is close to 0.0167.

Conclusion: After adjusting for multiple comparisons, the control group mean mobility score is only marginally different from the mean mobility score for the group that received physical therapy.

Slide 44. Post-hoc comparison: Control and Treatment 2

A two-sample t-test assuming equal variance is done to compare mean mobility scores between the control group and the treatment 2 group:

- Control mean score = 36
- Treatment 2 mean score = 45
- P-value for two-sample t-test = 0.012. This is a significant difference at the adjusted α-level of 0.0167.

Conclusion: After adjusting for multiple comparisons, the control group mean mobility score is significantly different from the mean mobility score for the group that received physical therapy and a support group (p = 0.012).

Slide 45. Post-hoc comparison: Treatment 1 and Treatment 2

A two-sample t-test assuming equal variance is done to compare mean mobility scores between the two treatment groups:

- Treatment 1 mean score = 44
- Treatment 2 mean score = 45
- P-value for two-sample t-test = 0.743. There is not a significant difference between these two treatment groups.

Conclusion: There is no significant difference in mean mobility score between the two treatment groups (p = 0.743).

Slide 46. ANOVA results summary

ANOVA was done to evaluate differences in mean mobility score between three groups: a control group, a group that received physical therapy only, and a group that received physical therapy and a support group. The significant ANOVA F-test result indicated that at least two of the mean mobility scores were significantly different.

Slide 47. ANOVA results summary

Post-hoc t-tests with an adjusted α-level of 0.0167 were done. Results of the post-hoc comparisons indicated that:

- the treatment group with both physical therapy and a support group had a significantly different mean mobility score than the control group (p = 0.012);
- the treatment group with physical therapy only had a marginally different mean mobility score than the control group;
- there was no significant difference in mean mobility score between the two treatment groups.

Slide 48. Post-hoc test procedures

There are other procedures for adjusting the α-level for multiple comparisons, which you may see referenced in the medical literature and which are described in the text:

- Bonferroni procedure
- Dunnett's procedure
- Newman-Keuls procedure
- Scheffé's procedure
- Tukey's HSD procedure
- Duncan multiple range procedure

Many statistical software packages can accommodate these adjustments, but we won't use them in this class.

Slide 49. Other ANOVA procedures

- More than one factor can be included for two-, three-, or four-way ANOVA (PH6415).
- A continuous variable can be added to the model; this is Analysis of Covariance, ANCOVA (PH6415).
- Repeated measures ANOVA can handle replicated measurements on the same subject.

Slide 50. What if ANOVA is used to compare the means between two groups?

The F-statistic for ANOVA with only two groups is equal to the t-statistic squared: $F = t^2$ when both tests (ANOVA and the equal-variance two-sample t-test) are applied to the same two-group comparison of means.
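The relationship $F = t^2$ can be checked numerically in R. The sketch below uses two small made-up samples (hypothetical values, for illustration only) and compares the squared statistic from an equal-variance two-sample t-test with the F-statistic from a one-way ANOVA on the same data.

```r
# Hypothetical data for two groups (illustration only)
x <- c(4, 5, 6, 7)
y <- c(6, 8, 9, 11)

values <- c(x, y)
group  <- factor(rep(c("x", "y"), each = 4))

# Equal-variance two-sample t-test
t_stat <- unname(t.test(x, y, var.equal = TRUE)$statistic)

# One-way ANOVA F-statistic for the same comparison
anova_table <- anova(lm(values ~ group))
f_stat <- anova_table[["F value"]][1]

print(c(t_squared = t_stat^2, F = f_stat))
```

The equality relies on the pooled-variance form of the t-test; the default Welch test (`var.equal = FALSE`) would not generally give the same value.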

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

### Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

### Lecture 23 Multiple Comparisons & Contrasts

Lecture 23 Multiple Comparisons & Contrasts STAT 512 Spring 2011 Background Reading KNNL: 17.3-17.7 23-1 Topic Overview Linear Combinations and Contrasts Pairwise Comparisons and Multiple Testing Adjustments

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### ANOVA - Analysis of Variance

ANOVA - Analysis of Variance ANOVA - Analysis of Variance Extends independent-samples t test Compares the means of groups of independent observations Don t be fooled by the name. ANOVA does not compare

### ANOVA Analysis of Variance

ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,

### c. The factor is the type of TV program that was watched. The treatment is the embedded commercials in the TV programs.

STAT E-150 - Statistical Methods Assignment 9 Solutions Exercises 12.8, 12.13, 12.75 For each test: Include appropriate graphs to see that the conditions are met. Use Tukey's Honestly Significant Difference

### A posteriori multiple comparison tests

A posteriori multiple comparison tests 09/30/12 1 Recall the Lakes experiment Source of variation SS DF MS F P Lakes 48.933 2 24.467 5.872 0.017 Error 50.000 12 4.167 Total 98.933 14 The ANOVA tells us

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### ANOVA MULTIPLE CHOICE QUESTIONS. In the following multiple-choice questions, select the best answer.

ANOVA MULTIPLE CHOICE QUESTIONS In the following multiple-choice questions, select the best answer. 1. Analysis of variance is a statistical method of comparing the of several populations. a. standard

### Analysis of Variance ANOVA

Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

### Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

### ANOVA ANOVA. Two-Way ANOVA. One-Way ANOVA. When to use ANOVA ANOVA. Analysis of Variance. Chapter 16. A procedure for comparing more than two groups

ANOVA ANOVA Analysis of Variance Chapter 6 A procedure for comparing more than two groups independent variable: smoking status non-smoking one pack a day > two packs a day dependent variable: number of

### SUBMODELS (NESTED MODELS) AND ANALYSIS OF VARIANCE OF REGRESSION MODELS

1 SUBMODELS (NESTED MODELS) AND ANALYSIS OF VARIANCE OF REGRESSION MODELS We will assume we have data (x 1, y 1 ), (x 2, y 2 ),, (x n, y n ) and make the usual assumptions of independence and normality.

### An example ANOVA situation. 1-Way ANOVA. Some notation for ANOVA. Are these differences significant? Example (Treating Blisters)

An example ANOVA situation Example (Treating Blisters) 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

### Tukey s HSD (Honestly Significant Difference).

Agenda for Week 4 (Tuesday, Jan 26) Week 4 Hour 1 AnOVa review. Week 4 Hour 2 Multiple Testing Tukey s HSD (Honestly Significant Difference). Week 4 Hour 3 (Thursday) Two-way AnOVa. Sometimes you ll need

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### UNDERSTANDING THE ONE-WAY ANOVA

UNDERSTANDING The One-way Analysis of Variance (ANOVA) is a procedure for testing the hypothesis that K population means are equal, where K >. The One-way ANOVA compares the means of the samples or groups

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### Chris Slaughter, DrPH. GI Research Conference June 19, 2008

Chris Slaughter, DrPH Assistant Professor, Department of Biostatistics Vanderbilt University School of Medicine GI Research Conference June 19, 2008 Outline 1 2 3 Factors that Impact Power 4 5 6 Conclusions

### 4.4. Further Analysis within ANOVA

4.4. Further Analysis within ANOVA 1) Estimation of the effects Fixed effects model: α i = µ i µ is estimated by a i = ( x i x) if H 0 : µ 1 = µ 2 = = µ k is rejected. Random effects model: If H 0 : σa

### Contrasts ask specific questions as opposed to the general ANOVA null vs. alternative

Chapter 13 Contrasts and Custom Hypotheses Contrasts ask specific questions as opposed to the general ANOVA null vs. alternative hypotheses. In a one-way ANOVA with a k level factor, the null hypothesis

### Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

### Multiple Hypothesis Testing: The F-test

Multiple Hypothesis Testing: The F-test Matt Blackwell December 3, 2008 1 A bit of review When moving into the matrix version of linear regression, it is easy to lose sight of the big picture and get lost

### Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### 13: Additional ANOVA Topics. Post hoc Comparisons

13: Additional ANOVA Topics Post hoc Comparisons ANOVA Assumptions Assessing Group Variances When Distributional Assumptions are Severely Violated Kruskal-Wallis Test Post hoc Comparisons In the prior

### Math 62 Statistics Sample Exam Questions

Math 62 Statistics Sample Exam Questions 1. (10) Explain the difference between the distribution of a population and the sampling distribution of a statistic, such as the mean, of a sample randomly selected

### Inferential Statistics. Probability. From Samples to Populations. Katie Rommel-Esham Education 504

Inferential Statistics Katie Rommel-Esham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice

### AP Statistics 2002 Scoring Guidelines

AP Statistics 2002 Scoring Guidelines The materials included in these files are intended for use by AP teachers for course and exam preparation in the classroom; permission for any other use must be sought

### 0.1 Estimating and Testing Differences in the Treatment means

0. Estimating and Testing Differences in the Treatment means Is the F-test significant, we only learn that not all population means are the same, but through this test we can not determine where the differences

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### 1 Basic ANOVA concepts

Math 143 ANOVA 1 Analysis of Variance (ANOVA) Recall, when we wanted to compare two population means, we used the 2-sample t procedures. Now let s expand this to compare k 3 population means. As with the

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

### ABSORBENCY OF PAPER TOWELS

ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?

### Chapter 11: Two Variable Regression Analysis

Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

### Chapter 7. One-way ANOVA

Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

### Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

### EPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM

EPS 6 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable

### individualdifferences

1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,

### Background to ANOVA. One-Way Analysis of Variance: ANOVA. The One-Way ANOVA Summary Table. Hypothesis Test in One-Way ANOVA

Background to ANOVA One-Way Analysis of Variance: ANOVA Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Recall from

### Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 7 Multiple Linear Regression (Contd.) This is my second lecture on Multiple Linear Regression

### Null Hypothesis Significance Testing Signifcance Level, Power, t-tests. 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom

Null Hypothesis Significance Testing Signifcance Level, Power, t-tests 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom Simple and composite hypotheses Simple hypothesis: the sampling distribution is

### Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

### Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a

### Analysis of Data. Organizing Data Files in SPSS. Descriptive Statistics

Analysis of Data Claudia J. Stanny PSY 67 Research Design Organizing Data Files in SPSS All data for one subject entered on the same line Identification data Between-subjects manipulations: variable to

### Two-Sample T-Test from Means and SD's

Chapter 07 Two-Sample T-Test from Means and SD's Introduction This procedure computes the two-sample t-test and several other two-sample tests directly from the mean, standard deviation, and sample size.

### Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario

Statistics 112 Regression Cheatsheet Section 1B - Ryan Rosario I have found that the best way to practice regression is by brute force That is, given nothing but a dataset and your mind, compute everything

### Basic Statistics Formula Sheet

Basic Statistics Formula Sheet Steven W. Nydick May 5, 0 This document is only intended to review basic concepts/formulas from an introduction to statistics course. Only mean-based procedures are reviewed,

### Paired vs. 2 sample comparisons. Comparing means. Paired comparisons allow us to account for a lot of extraneous variation.

Comparing means! Tests with one categorical and one numerical variable Paired vs. 2 sample comparisons! Goal: to compare the mean of a numerical variable for different groups. Paired comparisons allow us

### UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA We have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

### The Analysis of Variance ANOVA

The Analysis of Variance ANOVA Lecture 0909.400.0 / 0909.400.0 Dr. P.'s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S Analysis

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### General Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test

Five types of statistical analysis General Procedure for Hypothesis Test Descriptive Inferential Differences Associative Predictive What are the characteristics of the respondents? What are the characteristics

About Single Factor ANOVAs TABLE OF CONTENTS About Single Factor ANOVAs What is a SINGLE FACTOR ANOVA Single Factor ANOVA Calculating Single Factor ANOVAs STEP 1: State the hypotheses...

### Post-hoc comparisons & two-way analysis of variance. Two-way ANOVA, II. Post-hoc testing for main effects. Post-hoc testing 9.

Two-way ANOVA, II Post-hoc comparisons & two-way analysis of variance 9.7 4/9/4 Post-hoc testing As before, you can perform post-hoc tests whenever there's a significant F But don't bother if it's a main

### Bivariate Analysis. Comparisons of proportions: Chi Square Test (X 2 test) Variable 1. Variable 2 2 LEVELS >2 LEVELS CONTINUOUS

Bivariate Analysis Variable 1 2 LEVELS >2 LEVELS CONTINUOUS Variable 2 2 LEVELS X 2 chi square test >2 LEVELS X 2 chi square test CONTINUOUS t-test X 2 chi square test X 2 chi square test ANOVA (F-test)

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### Multiple Linear Regression

Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

### Hypothesis testing S2

Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to

### research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

1 Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC 2 Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric

### One-Way Analysis of Variance (ANOVA) with Tukey's HSD Post-Hoc Test

One-Way Analysis of Variance (ANOVA) with Tukey's HSD Post-Hoc Test Prepared by Allison Horst for the Bren School of Environmental Science & Management Introduction When you are comparing two samples to

### Testing: is my coin fair?

Testing: is my coin fair? Formally: we want to make some inference about P(head) Try it: toss coin several times (say 7 times) Assume that it is fair ( P(head)= ), and see if this assumption is compatible

### 1 Overview. Fisher's Least Significant Difference (LSD) Test. Lynne J. Williams Hervé Abdi

In Neil Salkind (Ed.), Encyclopedia of Research Design. Thousand Oaks, CA: Sage. 2010 Fisher's Least Significant Difference (LSD) Test Lynne J. Williams Hervé Abdi 1 Overview When an analysis of variance

### One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

### 13 Two-Sample T Tests

www.ck12.org CHAPTER 13 Two-Sample T Tests Chapter Outline 13.1 TESTING A HYPOTHESIS FOR DEPENDENT AND INDEPENDENT SAMPLES 270 www.ck12.org Chapter 13. Two-Sample T Tests 13.1 Testing a Hypothesis for

### Multiple-Comparison Procedures

Multiple-Comparison Procedures References A good review of many methods for both parametric and nonparametric multiple comparisons, planned and unplanned, and with some discussion of the philosophical

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### SPSS Guide: Tests of Differences

SPSS Guide: Tests of Differences I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

### Contents 1. Contents

Contents 3 K-sample Methods 3.1 Setup 3.2 Classic Method Based on Normality Assumption 3.3 Permutation F-test 3.4 Kruskal-Wallis

### The calculations lead to the following values: d 2 = 46, n = 8, s d 2 = 4, s d = 2, SEof d = s d n s d n

EXAMPLE 1: Paired t-test and t-interval DBP Readings by Two Devices The diastolic blood pressures (DBP) of 8 patients were determined using two techniques: the standard method used by medical personnel

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction 2. Hypothesis Testing 3. Hypothesis Tests Concerning the Mean 4. Hypothesis Tests

### Review on Univariate Analysis of Variance (ANOVA) Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016

Review on Univariate Analysis of Variance (ANOVA) Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016 Analysis of Variance Analysis of Variance (ANOVA) ~ special case of multiple regression,

### Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference that is frequently used concerns tests of hypotheses.

### Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

### Regression in ANOVA. James H. Steiger. Department of Psychology and Human Development Vanderbilt University

Regression in ANOVA James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Regression in ANOVA 1 Introduction 2 Basic Linear

### From: Statnotes: Topics in Multivariate Analysis, by G. David Garson (accessed

From: Statnotes: Topics in Multivariate Analysis, by G. David Garson http://faculty.chass.ncsu.edu/garson/pa765/anova.htm (accessed 20.10.2010) Planned multiple comparison t-tests, also just called "multiple

### Lesson Lesson Outline Outline

Lesson 15 Linear Regression Lesson 15 Outline Review correlation analysis Dependent and Independent variables Least Squares Regression line Calculating the slope Calculating the Intercept Residuals and

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### Difference of Means and ANOVA Problems

Difference of Means and ANOVA Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firms. It is interested in evaluating its fee structure, particularly

### Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.

Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under

### Chapter 16 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 16 Multiple Choice Questions (The answers are provided after the last question.) 1. Which of the following symbols represents a population parameter? a. SD b. σ c. r d. 0 2. If you drew all possible

### Factor B: Curriculum New Math Control Curriculum (B (B 1 ) Overall Mean (marginal) Females (A 1 ) Factor A: Gender Males (A 2) X 21

1 Factorial ANOVA The ANOVA designs we have dealt with up to this point, known as simple ANOVA or one-way ANOVA, had only one independent grouping variable or factor. However, oftentimes a researcher has

### COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies

### Regression analysis in the Assistant fits a model with one continuous predictor and one continuous response and can fit two types of models:

This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. The simple regression procedure in the

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some