# PubH 6414, Section 13, Part 1: ANOVA (Analysis of Variance)


1 Section 13, Part 1: ANOVA (Analysis of Variance)

2 Course Overview
So far in this course we've covered:
- Descriptive statistics: summary statistics, tables and graphs
- Probability: probability rules, probability distributions, sampling distributions and the CLT
- Statistical inference: estimating parameters, confidence intervals for means and proportions, hypothesis tests of means and proportions

3 Course Overview
Statistical inference still to cover:
- ANOVA (Section 13, Part 1): comparing means among three or more groups
- Inference about odds ratios and relative risks (Section 13, Part 2): confidence intervals for the odds ratio and relative risk
- Simple linear regression (Section 14): describing the linear relationship between two variables

4 ANOVA
Many studies involve comparisons among more than two groups of subjects. If the outcome is continuous, ANOVA can be used to compare the means between the groups. ANOVA is an abbreviation for the full name of the method: ANalysis Of Variance. The test used in ANOVA is the ANOVA F-test.

5 ANOVA was developed in the 1920s by R.A. Fisher, who was described by Anders Hald as "a genius who almost single-handedly created the foundations for modern statistical science." The ANOVA F-test is named for Fisher.

6 Why use ANOVA instead of multiple t-tests?
If you are comparing means among more than two groups, why not just do several two-sample t-tests, comparing the mean of one group with the mean of each of the other groups? One problem with this approach is the increasing number of tests as the number of groups increases.

7 The problem with multiple t-tests
The probability of making a Type I error increases as the number of tests increases. If the probability of a Type I error for each test is set at 0.05 and 10 t-tests are done, the overall probability of at least one Type I error for the set of tests is 1 − (0.95)^10 = 0.40 instead of 0.05. (Note: the formula for calculating overall Type I error on page 164 of the text is incorrect.)
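The overall error rate above can be checked with a couple of lines of R (a minimal sketch using only the slide's own numbers):

```r
# Familywise Type I error for m independent tests, each at alpha = 0.05:
# P(at least one Type I error) = 1 - (1 - alpha)^m
alpha <- 0.05
m <- 10
fwer <- 1 - (1 - alpha)^m
round(fwer, 2)  # 0.4
```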

8 Multiple Comparisons Problem
Another way to describe the multiple comparisons problem is to think about the meaning of an alpha level of 0.05. Alpha of 0.05 implies that, by chance, there will be one Type I error in every 20 tests: 1/20 = 0.05. This means that, by chance, the null hypothesis will be incorrectly rejected once in every 20 tests. As the number of tests increases, the probability of finding a significant result by chance increases.

9 ANOVA: a single test for simultaneous comparison
The advantage of ANOVA over multiple t-tests is that the ANOVA F-test will identify, with a single test, whether any of the group means are significantly different from at least one of the other group means. If the significance level is set at α, the probability of a Type I error for the ANOVA F-test is α regardless of the number of groups being compared.

10 ANOVA Hypotheses
The null hypothesis for ANOVA is that the means of all k groups are equal:

H₀: µ₁ = µ₂ = µ₃ = … = µₖ

The alternative hypothesis for ANOVA is that at least two of the means are not equal. Rejecting the null hypothesis doesn't require that ALL means are significantly different from each other: if at least two of the means are significantly different, the null hypothesis is rejected.

11 Analysis of Variance
ANOVA is used to compare means among three or more groups, so why is it called analysis of VARIANCE? ANOVA is based on a comparison of two different sources of variation:
- The average variation of observations within each group around the group mean
- The average variation of the group means around the grand mean
The grand mean is the mean of all observations.

12 Variability Among and Within Groups
The average variation of observations within each group around the group mean is called the within-group variability. The average variation of the group means around the grand mean is called the among-group variability. The ANOVA F-test statistic is the ratio of the two:

F = (average variability among groups) / (average variability within groups)

13 Small variability among groups compared to variability within groups: small F-statistic
[Figure: three group distributions whose means lie close to the grand mean relative to the within-group spread. Source: Introduction to the Practice of Statistics, Moore and McCabe]

14 Large variability among groups relative to variability within groups: large F-statistic
[Figure: three group distributions whose means are widely separated around the grand mean relative to the within-group spread. Source: Introduction to the Practice of Statistics, Moore and McCabe]

15 ANOVA: F-statistic and F-distribution
The F-distribution is indexed by two degrees of freedom:
- Numerator df = number of groups minus 1
- Denominator df = total sample size minus number of groups
The shape of the F-distribution varies depending on the degrees of freedom.
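In R, qf() gives the critical value for any pair of degrees of freedom; a quick sketch (the df pair other than 2, 15 is an arbitrary illustration):

```r
# Critical values of the F-distribution change with both df parameters.
qf(0.95, df1 = 2, df2 = 15)   # 3.68, the critical value used later in this section
qf(0.95, df1 = 4, df2 = 30)   # a different df pair gives a smaller critical value
```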

16 Five different F-distributions (figure from Wikipedia)

17 Partitioned difference from the grand mean
The difference of each observation X from the grand mean X̄ can be partitioned into two parts: the difference between the individual observation and its group mean X̄_j, and the difference between the group mean and the grand mean:

(X − X̄) = (X − X̄_j) + (X̄_j − X̄)

18 Sums of Squares
This partitioned relationship is also true for the sums of the squared differences, which are called sums of squares. In the formula below, X_ij refers to individual observation i in group j, and n_j is the size of group j:

Σ_ij (X_ij − X̄)² = Σ_ij (X_ij − X̄_j)² + Σ_j n_j (X̄_j − X̄)²

SS_T = SS_E + SS_A

In words, the total sum of squares (SS_T) is equal to the error sum of squares (SS_E) plus the sum of squares among groups (SS_A).
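The partition SS_T = SS_E + SS_A can be verified numerically in R; the data below are made up for illustration (not the mobility-score data):

```r
# Verifying the sum-of-squares partition SS_T = SS_E + SS_A on made-up data.
x <- c(1, 3, 5,  2, 4, 6,  7, 8, 9)          # observations
g <- factor(rep(c("a", "b", "c"), each = 3)) # group labels
grand <- mean(x)
means <- tapply(x, g, mean)                  # group means
ss_t <- sum((x - grand)^2)                   # total SS
ss_e <- sum((x - means[g])^2)                # within-group (error) SS
ss_a <- sum(3 * (means - grand)^2)           # among-group SS (n_j = 3 here)
all.equal(ss_t, ss_e + ss_a)                 # TRUE
```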

19 Mean Squares
MS_A is the mean square among groups: MS_A = SS_A / (j − 1). MS_A has j − 1 degrees of freedom, where j = number of groups.
MS_E is the error mean square: MS_E = SS_E / (N − j). MS_E has N − j degrees of freedom, where N = total number of observations and j = number of groups.

20 F-statistic calculated
Recall that the F-statistic is the ratio of the average variability among groups to the average variability within groups:

F = MS_A / MS_E

21 ANOVA Assumptions
There are some assumptions that should be met before using ANOVA:
- The observations are from a random sample and are independent of each other
- The observations are assumed to be normally distributed within each group (ANOVA is still appropriate if this assumption is not met but the sample size in each group is large, > 30)
- The variances are approximately equal between groups (if the ratio of the largest SD to the smallest SD is < 2, this assumption is considered met)
It is not required to have equal sample sizes in all groups.
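The SD rule of thumb is easy to script; in this sketch the three SDs are illustrative placeholders, not the course data:

```r
# Rule of thumb: equal-variance assumption is acceptable if max(SD)/min(SD) < 2.
sds <- c(5.33, 4.86, 5.05)    # hypothetical group SDs
ratio <- max(sds) / min(sds)  # ~1.1
ratio < 2                     # TRUE: treat variances as approximately equal
```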

22 ANOVA: the steps
Step 1. State the hypotheses
Step 2. Calculate the test statistic
Step 3. Calculate the p-value
Step 4. Conclusion: if you fail to reject the null hypothesis, stop; if the null hypothesis is rejected, indicating that at least two means are different, proceed with post-hoc tests to identify the differences

23 ANOVA Example
Researchers were interested in evaluating whether different therapy methods had an effect on mobility among elderly patients. 18 subjects were enrolled in the study and randomly assigned to one of three treatment groups:
- Control: no therapy
- Trt. 1: physical therapy only
- Trt. 2: physical therapy with a support group
After 8 weeks, subjects responded to a series of questions from which a mobility score was calculated.

24 Data for ANOVA example
The hypothetical data represent mobility scores for the six subjects in each of the 3 treatment groups (Control, Trt. 1, Trt. 2). A higher score indicates better mobility. Assume that mobility scores are approximately normally distributed.

25 ANOVA: 1. State the Hypotheses
Step 1. State the hypotheses.
Null hypothesis: µ_control = µ_trt1 = µ_trt2
Alternative hypothesis: at least two of the means (µ_control, µ_trt1, µ_trt2) are not equal.
ANOVA will identify whether at least two of the means are significantly different.

26 Identify the test and statistic
The test for comparing more than 2 means is ANOVA. Before proceeding with ANOVA, check that the variances are approximately equal by calculating the sample SD for each group. If the ratio of the largest SD to the smallest SD is < 2, the variances are approximately equal. Here the ratio is 5.33 / 4.86 = 1.1, so ANOVA is appropriate. The test statistic is the ANOVA F-statistic.

27 Significance Level and Critical Value
Set the significance level: use the conventional α = 0.05. Identify the critical value from the F-distribution with 2, 15 df:
- Numerator df = number of groups minus 1 = 3 − 1 = 2
- Denominator df = total N minus number of groups = 18 − 3 = 15

28 F-distribution for 2, 15 df
The F-distribution for the F-statistic for 18 observations and 3 groups has 2, 15 df. The critical value for α = 0.05 in the F(2, 15) distribution is 3.68.
[Figure: density curve of the F(2, 15) distribution with the critical value 3.68 marked.]

29 F distribution in R
To get the critical value: qf(probability, dfnum, dfden), e.g. qf(0.95, 2, 15) returns 3.68.
To get the p-value: 1 - pf(F, dfnum, dfden), e.g. 1 - pf(3.68, 2, 15) returns 0.05.

30 ANOVA step 2: Calculate the F-statistic
First calculate the grand mean, the group means, and the group variances (SD²). Grand mean = sum of all 18 observations divided by 18 = 41.7.

| Group | Mean |
|---|---|
| Control | 36 |
| Trt. 1 | 44 |
| Trt. 2 | 45 |

Notice that none of the three group means equal the grand mean and that there are differences between the group means. ANOVA will identify whether any of these differences are significant.

31 Calculate SS_A and MS_A

SS_A = Σ_j n_j (X̄_j − X̄)²

SS_A = 6·(36 − 41.7)² + 6·(44 − 41.7)² + 6·(45 − 41.7)² = 292

MS_A: divide SS_A by its degrees of freedom (j − 1):

MS_A = 292 / (3 − 1) = 292 / 2 = 146

32 Calculate SS_E and MS_E

SS_E = Σ_ij (X_ij − X̄_j)² = Σ_j (n_j − 1)·s_j²

SS_E = 5·s²_control + 5·s²_trt1 + 5·s²_trt2 = 382

MS_E = SS_E divided by its degrees of freedom (N − j):

MS_E = 382 / (18 − 3) = 382 / 15 = 25.47

33 Step 3: Calculate the p-value
The ANOVA F-statistic = MS_A / MS_E. The F-statistic for this example = 146 / 25.47 = 5.73. The p-value is the area beyond 5.73 under the F-distribution with 2, 15 df.
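The whole calculation can be reproduced in R from the summary numbers the slides provide (group means 36, 44, 45 with n = 6 each, grand mean 41.7, SS_E = 382); this is a sketch of the arithmetic, not a general ANOVA routine:

```r
# Reproducing the example's F-statistic from the slides' summary numbers.
n_j   <- 6
means <- c(36, 44, 45)
grand <- 41.7
ss_a <- sum(n_j * (means - grand)^2)  # ~292
ms_a <- ss_a / (3 - 1)                # ~146
ms_e <- 382 / (18 - 3)                # ~25.47
f    <- ms_a / ms_e                   # ~5.73
1 - pf(f, 2, 15)                      # p-value, ~0.014
```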

34 F-distribution for 2, 15 df
1 - pf(5.73, 2, 15) returns the p-value, 0.014.
[Figure: density curve of the F(2, 15) distribution with the area beyond 5.73 shaded.]

35 ANOVA Step 4: Conclusion
The null hypothesis of equal means is rejected by the critical-value method (5.73 > 3.68) or the p-value method (0.014 < 0.05). We can conclude that AT LEAST two of the means are significantly different.

36 ANOVA table for Example

SUMMARY
| Groups | Count | Sum | Average |
|---|---|---|---|
| Control | 6 | 216 | 36 |
| Trt. 1 | 6 | 264 | 44 |
| Trt. 2 | 6 | 270 | 45 |

ANOVA
| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between Groups | 292 (SS_A) | 2 | 146 (MS_A = SS_A/df) | 5.73 | 0.014 | 3.68 |
| Within Groups | 382 (SS_E) | 15 | 25.47 (MS_E = SS_E/df) | | | |
| Total | 674 | 17 | | | | |

37 ANOVA: Post-hoc tests
A significant ANOVA F-test is evidence that not all group means are equal, but it does not identify where the differences exist. Methods used to find group differences after the ANOVA null hypothesis has been rejected are called post-hoc tests. (Post hoc is Latin for "after this.") Post-hoc comparisons should only be done when the ANOVA F-test is significant.

38 Adjusting the α-level for post-hoc comparisons
When post-hoc multiple comparisons are being done, it is customary to adjust the α-level of each individual comparison so that the overall experiment significance level remains at 0.05.

39 Bonferroni post-hoc comparisons
For an ANOVA with 3 groups, there are 3 possible pairwise t-tests. A conservative adjustment is to divide 0.05 by the number of comparisons. For our example with 3 groups, the adjusted α-level = 0.05 / 3 = 0.0167. A difference will only be considered significant if the p-value for the t-test statistic is < 0.0167.
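A sketch of the adjustment in R, written for a general number of groups k (choose(k, 2) counts the pairwise comparisons):

```r
# Bonferroni adjustment: overall alpha divided by the number of comparisons.
alpha <- 0.05
k <- 3                       # number of groups
m <- choose(k, 2)            # 3 pairwise comparisons for 3 groups
alpha_adj <- alpha / m
round(alpha_adj, 4)          # 0.0167
```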

40 Post-hoc comparisons for the example
Our example data had 3 groups, so there are three two-sample post-hoc t-tests:
- Control group compared to Treatment 1 group
- Control group compared to Treatment 2 group
- Treatment 1 group compared to Treatment 2 group

41 Post-hoc comparisons for the example
We know from the significant F-test (p = 0.014) that at least two of the groups have significantly different means. A two-sample t-test will be done for each comparison, and the result will be considered significant if the p-value is < 0.0167.

42 Multiple Comparison Procedure

H₀: µ_group = µ_another group vs. H_A: µ_group ≠ µ_another group

t = (x̄_group − x̄_another group) / √( MS_E · (1/n_group + 1/n_another group) ), where df = N − j
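A worked sketch of this MS_E-based statistic for Control vs. Trt. 2 (means 36 and 45, n = 6 each, MS_E = 25.47, df = N − j = 15). Note that the slides' quoted post-hoc p-values come from separate two-sample t-tests, so they will not match this calculation exactly:

```r
# MS_E-based post-hoc t-statistic, Control vs. Trt. 2.
ms_e   <- 25.47
t_stat <- (45 - 36) / sqrt(ms_e * (1/6 + 1/6))  # ~3.09
p_val  <- 2 * (1 - pt(abs(t_stat), df = 15))    # two-sided p-value
p_val < 0.0167                                  # TRUE: significant after Bonferroni
```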

43 Post-hoc comparison: Control and Treatment 1
Results of the two-sample t-test assuming equal variance to compare mean mobility scores between the control group and the treatment 1 group: control mean score = 36, treatment 1 mean score = 44. The p-value for the two-sample t-test is not significant at the adjusted α-level of 0.0167, but could be considered marginally significant since it is close to 0.0167. Conclusion: after adjusting for multiple comparisons, the control group mean mobility score is marginally significantly different from the mean mobility score for the group that received physical therapy.

44 Post-hoc comparison: Control and Treatment 2
A two-sample t-test assuming equal variance is done to compare mean mobility scores between the control group and the treatment 2 group: control mean score = 36, treatment 2 mean score = 45. P-value for the two-sample t-test = 0.012. This is a significant difference at the adjusted α-level of 0.0167. Conclusion: after adjusting for multiple comparisons, the control group mean mobility score is significantly different from the mean mobility score for the group that received physical therapy and a support group (p = 0.012).

45 Post-hoc comparison: Treatment 1 and Treatment 2
A two-sample t-test assuming equal variance is done to compare mean mobility scores between the two treatment groups: treatment 1 mean score = 44, treatment 2 mean score = 45. P-value for the two-sample t-test = 0.743. There is not a significant difference between these two treatment groups. Conclusion: there is no significant difference in mean mobility score between the two treatment groups (p = 0.743).

46 ANOVA results summary
ANOVA was done to evaluate differences in mean mobility score between three groups: a control group, a group that received physical therapy only, and a group that received physical therapy and a support group. The significant ANOVA F-test result indicated that at least two of the mean mobility scores were significantly different.

47 ANOVA results summary
Post-hoc t-tests with an adjusted α-level of 0.0167 were done. Results of the post-hoc comparisons indicated that the treatment group with both physical therapy and a support group had a significantly different mean mobility score than the control group (p = 0.012); the treatment group with physical therapy only had a marginally significantly different mean mobility score than the control group; and there was no significant difference in mean mobility score between the two treatment groups.

48 Post-hoc test procedures
There are other procedures for adjusting the α-level for multiple comparisons, which you may see referenced in the medical literature and which are described in the text:
- Bonferroni procedure
- Dunnett's procedure
- Newman-Keuls procedure
- Scheffé's procedure
- Tukey's HSD procedure
- Duncan multiple range procedure
Many statistical software packages can accommodate these adjustments, but we won't use them in this class.

49 Other ANOVA Procedures
More than one factor can be included, giving two-, three-, or four-way ANOVA (PH 6415). A continuous variable can be added to the model; this is analysis of covariance (ANCOVA) (PH 6415). Repeated-measures ANOVA can handle replicated measurements on the same subject.

50 What if ANOVA is used to compare the means between two groups?
The F-statistic for ANOVA with only two groups is equal to the t-statistic squared: F = t² when both tests (ANOVA and the t-test) are applied to the same two-group comparison of means.
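This equivalence is easy to demonstrate in R on made-up data (not the course data):

```r
# Demonstrating F = t^2 for a two-group comparison: the one-way ANOVA F-statistic
# equals the squared pooled two-sample t-statistic.
a <- c(1, 2, 3)
b <- c(4, 5, 6)
t_stat <- t.test(a, b, var.equal = TRUE)$statistic      # pooled t
x <- c(a, b)
g <- factor(rep(c("a", "b"), each = 3))
f_stat <- anova(lm(x ~ g))[["F value"]][1]              # one-way ANOVA F
all.equal(unname(t_stat)^2, f_stat)                     # TRUE: F = t^2 = 13.5
```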


### General Regression Formulae ) (N-2) (1 - r 2 YX General Regression Formulae Single Predictor Standardized Parameter Model: Z Yi = β Z Xi + ε i Single Predictor Standardized Statistical Model: Z Yi = β Z Xi Estimate of Beta (Beta-hat: β = r YX (1 Standard

### SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### 3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

### Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption Last time, we used the mean of one sample to test against the hypothesis that the true mean was a particular

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

### CS 147: Computer Systems Performance Analysis CS 147: Computer Systems Performance Analysis One-Factor Experiments CS 147: Computer Systems Performance Analysis One-Factor Experiments 1 / 42 Overview Introduction Overview Overview Introduction Finding

### Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015 Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

### Analysis of Variance. MINITAB User s Guide 2 3-1 3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

### Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses Introduction to Hypothesis Testing 1 Hypothesis Testing A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population Hypothesis is stated in terms of the

### Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

### CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### Testing for Lack of Fit Chapter 6 Testing for Lack of Fit How can we tell if a model fits the data? If the model is correct then ˆσ 2 should be an unbiased estimate of σ 2. If we have a model which is not complex enough to fit

### Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

### Two-sample inference: Continuous data Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

### The ANOVA for 2x2 Independent Groups Factorial Design The ANOVA for 2x2 Independent Groups Factorial Design Please Note: In the analyses above I have tried to avoid using the terms "Independent Variable" and "Dependent Variable" (IV and DV) in order to emphasize Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

### MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Sample Practice problems - chapter 12-1 and 2 proportions for inference - Z Distributions Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide

### Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Examining Differences (Comparing Groups) using SPSS Inferential statistics (Part I) Dwayne Devonish Statistics Statistics are quantitative methods of describing, analysing, and drawing inferences (conclusions)

### THE KRUSKAL WALLLIS TEST THE KRUSKAL WALLLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23 rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON

### UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA) UNDERSTANDING ANALYSIS OF COVARIANCE () In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design

### General Method: Difference of Means. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n 1, n 2 ) 1. General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either Welch-Satterthwaite formula or simpler df = min(n

### An analysis method for a quantitative outcome and two categorical explanatory variables. Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

### Linear Models in STATA and ANOVA Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

### POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

### UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

### Introduction to Quantitative Methods Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

### Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

### An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10- TWO-SAMPLE TESTS Practice

### Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Elementary Statistics Sample Exam #3 Elementary Statistics Sample Exam #3 Instructions. No books or telephones. Only the supplied calculators are allowed. The exam is worth 100 points. 1. A chi square goodness of fit test is considered to Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences