CS 147: Computer Systems Performance Analysis

Save this PDF as:
 WORD  PNG  TXT  JPG

Size: px
Start display at page:

Download "CS 147: Computer Systems Performance Analysis"

Transcription

1 CS 147: Computer Systems Performance Analysis One-Factor Experiments CS 147: Computer Systems Performance Analysis One-Factor Experiments 1 / 42

2 Overview Introduction Overview Overview Introduction Finding Effects Calculating Errors ANOVA Allocation Analysis Verifying Assumptions Unequal Sample Sizes Finding Effects Calculating Errors ANOVA Allocation Analysis Verifying Assumptions Unequal Sample Sizes 2 / 42

3 3 / 42 Introduction Characteristics of One-Factor Experiments Introduction Characteristics of One-Factor Experiments Characteristics of One-Factor Experiments Useful if there s only one important categorical factor with more than two interesting alternatives Methods reduce to 2 1 factorial designs if only two choices If single variable isn t categorical, should use regression instead Method allows multiple replications Useful if there s only one important categorical factor with more than two interesting alternatives Methods reduce to 2 1 factorial designs if only two choices If single variable isn t categorical, should use regression instead Method allows multiple replications

4 Introduction Comparing Truly Comparable Options Introduction Comparing Truly Comparable Options Comparing Truly Comparable Options Evaluating single workload on multiple machines Trying different options for single component Applying single suite of programs to different compilers Evaluating single workload on multiple machines Trying different options for single component Applying single suite of programs to different compilers 4 / 42

5 5 / 42 Introduction When to Avoid It Introduction When to Avoid It When to Avoid It Incomparable factors E.g., measuring vastly different workloads on single system Numerical factors Won t predict any untested levels Regression usually better choice Related entries across level Use two-factor design instead Incomparable factors E.g., measuring vastly different workloads on single system Numerical factors Won t predict any untested levels Regression usually better choice Related entries across level Use two-factor design instead

6 6 / 42 An Example One-Factor Experiment An Example One-Factor Experiment An Example One-Factor Experiment Choosing authentication server for single-sized messages Four different servers are available Performance measured by response time Lower is better Choosing authentication server for single-sized messages Four different servers are available Performance measured by response time Lower is better

7 7 / 42 The One-Factor Model The One-Factor Model The One-Factor Model yij = µ + αj + eij yij is i th response with factor set at level j µ is mean response αj is effect of alternative j αj = 0 eij is error term eij = 0 y ij = µ + α j + e ij y ij is i th response with factor set at level j µ is mean response α j is effect of alternative j αj = 0 e ij is error term eij = 0

8 One-Factor Experiments With Replications One-Factor Experiments With Replications One-Factor Experiments With Replications Initially, assume r replications at each alternative of factor Assuming a alternatives, we have a total of ar observations Model is thus r a a r a yij = arµ + r αj + eij i=1 j=1 j=1 i=1 j=1 Initially, assume r replications at each alternative of factor Assuming a alternatives, we have a total of ar observations Model is thus r a y ij = arµ + r i=1 j=1 a r α j + a j=1 i=1 j=1 e ij 8 / 42

9 Sample Data for Our Example Sample Data for Our Example Sample Data for Our Example Four alternatives, with four replications each (measured in seconds) A B C D Four alternatives, with four replications each (measured in seconds) A B C D / 42

10 10 / 42 Finding Effects Computing Effects Finding Effects Computing Effects Computing Effects Need to figure out µ and αj We have various yij s Errors should add to zero: r a eij = 0 i=1 j=1 Similarly, effects should add to zero: a αj = 0 j=1 Need to figure out µ and α j We have various y ij s Errors should add to zero: r a e ij = 0 i=1 j=1 Similarly, effects should add to zero: a α j = 0 j=1

11 11 / 42 Finding Effects Calculating µ Finding Effects Calculating µ Calculating µ By definition, sum of errors and sum of effects are both zero: r a yij = arµ i=1 j=1 And thus, µ is equal to grand mean of all responses µ = 1 r a yij = y ar i=1 j=1 By definition, sum of errors and sum of effects are both zero: r a y ij = arµ i=1 j=1 And thus, µ is equal to grand mean of all responses µ = 1 ar r a y ij = y i=1 j=1

12 12 / 42 Finding Effects Calculating µ for Our Example Finding Effects Calculating µ for Our Example Calculating µ for Our Example Thus, µ = yij 4 4 i=1 j=1 = = Thus, µ = i=1 j=1 = = y ij

13 13 / 42 Finding Effects Calculating α j Finding Effects Calculating α j Calculating αj αj is vector of responses One for each alternative of the factor To find vector, find column means y j = 1 r yij r i=1 Separate mean for each j Can calculate directly from observations α j is vector of responses One for each alternative of the factor To find vector, find column means y j = 1 r r i=1 y ij Separate mean for each j Can calculate directly from observations

14 14 / 42 Finding Effects Calculating Column Mean Finding Effects Calculating Column Mean Calculating Column Mean We know that yij is defined to be yij = µ + αj + eij So, y j = 1 r (µ + αj + eij) r i=1 ( ) = 1 r rµ + rαj + eij r i=1 We know that y ij is defined to be y ij = µ + α j + e ij So, y j = 1 r = 1 r r (µ + α j + e ij ) i=1 ( rµ + rα j + ) r e ij i=1

15 15 / 42 Finding Effects Calculating Parameters Finding Effects Calculating Parameters Calculating Parameters Sum of errors for any given row is zero, so y j = 1 (rµ + rαj + 0) r = µ + αj So we can solve for αj: αj = y j µ = y j y Sum of errors for any given row is zero, so y j = 1 r (rµ + rα j + 0) = µ + α j So we can solve for α j : α j = y j µ = y j y

16 Finding Effects Parameters for Our Example Finding Effects Parameters for Our Example Parameters for Our Example Server A B C D Col. Mean Subtract µ from column means to get parameters: Parameters Server A B C D Col. Mean Subtract µ from column means to get parameters: Parameters / 42

17 17 / 42 Calculating Errors Estimating Experimental Errors Calculating Errors Estimating Experimental Errors Estimating Experimental Errors Estimated response is ŷij = µ + αij But we measured actual responses Multiple responses per alternative So we can estimate amount of error in estimated response Use methods similar to those used in other types of experiment designs Estimated response is ŷ ij = µ + α ij But we measured actual responses Multiple responses per alternative So we can estimate amount of error in estimated response Use methods similar to those used in other types of experiment designs

18 Calculating Errors Sum of Squared Errors Calculating Errors Sum of Squared Errors Sum of Squared Errors SSE estimates variance of the errors: r a SSE = eij 2 i=1 j=1 We can calculate SSE directly from model and observations Also can find indirectly from its relationship to other error terms SSE estimates variance of the errors: SSE = r a i=1 j=1 e 2 ij We can calculate SSE directly from model and observations Also can find indirectly from its relationship to other error terms 18 / 42

19 Calculating Errors SSE for Our Example Calculating Errors SSE for Our Example SSE for Our Example Calculated directly: SSE = (.96 ( )) 2 + (1.05 ( )) (.75 ( )) 2 + (1.22 ( )) (.93 ( )) 2 =.3425 Calculated directly: SSE = (.96 ( )) 2 + (1.05 ( )) (.75 ( )) 2 + (1.22 ( )) (.93 ( )) 2 = / 42

20 20 / 42 ANOVA Allocation Allocating Variation ANOVA Allocation Allocating Variation Allocating Variation To allocate variation for model, start by squaring both sides of model equation yij 2 = µ 2 + αj 2 + eij 2 + 2µαj + 2µeij + 2αjeij yij 2 = µ 2 + αj 2 + eij 2 i,j i,j i,j i,j + cross-products Cross-product terms add up to zero To allocate variation for model, start by squaring both sides of model equation i,j y 2 ij = µ 2 + α 2 j + e 2 ij + 2µα j + 2µe ij + 2α j e ij yij 2 = µ 2 + αj 2 + i,j i,j i,j + cross-products Cross-product terms add up to zero e 2 ij

21 21 / 42 ANOVA Allocation Variation In Sum of Squares Terms ANOVA Allocation Variation In Sum of Squares Terms Variation In Sum of Squares Terms SSY = SS0 + SSA + SSE SSY = yij 2 i,j SS0 = SSA = r i=1 r i=1 a µ 2 = arµ 2 j=1 a αj 2 j=1 = r a αj 2 j=1 Gives another way to calculate SSE SSY = SS0 + SSA + SSE SSY = i,j y 2 ij SS0 = SSA = r i=1 j=1 r a µ 2 = arµ 2 a a αj 2 = r αj 2 i=1 j=1 j=1 Gives another way to calculate SSE

22 22 / 42 ANOVA Allocation Sum of Squares Terms for Our Example ANOVA Allocation Sum of Squares Terms for Our Example Sum of Squares Terms for Our Example SSY = SS0 = SSA = So SSE must equal = Matches our earlier SSE calculation SSY = SS0 = SSA = So SSE must equal = Matches our earlier SSE calculation

23 ANOVA Allocation Assigning Variation ANOVA Allocation Assigning Variation Assigning Variation SST is total variation SST = SSY SS0 = SSA + SSE Part of total variation comes from model Part comes from experimental errors A good model explains a lot of variation SST is total variation SST = SSY SS0 = SSA + SSE Part of total variation comes from model Part comes from experimental errors A good model explains a lot of variation 23 / 42

24 ANOVA Allocation Assigning Variation in Our Example ANOVA Allocation Assigning Variation in Our Example Assigning Variation in Our Example SST = SSY SS0 = SSA = SSE =.3425 Percentage of variation explained by server choice: = = 8.97% SST = SSY SS0 = SSA = SSE =.3425 Percentage of variation explained by server choice: = = 8.97% 24 / 42

25 25 / 42 ANOVA Analysis Analysis of Variance ANOVA Analysis Analysis of Variance Analysis of Variance Percentage of variation explained can be large or small Regardless of size, may or may not be statistically significant To determine significance, use ANOVA procedure Assumes normally distributed errors Percentage of variation explained can be large or small Regardless of size, may or may not be statistically significant To determine significance, use ANOVA procedure Assumes normally distributed errors

26 26 / 42 ANOVA Analysis Running ANOVA ANOVA Analysis Running ANOVA Running ANOVA Easiest to set up tabular method Like method used in regression models Only slight differences Basically, determine ratio of Mean Squared of A (parameters) to Mean Squared Errors Then check against F-table value for number of degrees of freedom Easiest to set up tabular method Like method used in regression models Only slight differences Basically, determine ratio of Mean Squared of A (parameters) to Mean Squared Errors Then check against F-table value for number of degrees of freedom

27 27 / 42 Sum of Squares F- Computed Component % of Variation ij N y SS0 = Nµ 2 1 y y SST = SSY SS0 100 N 1 A e SSE = SST SSA SSA SST SSE SST N = ar a 1 N a se = MSE Mean Square Degrees of Freedom F- Computed MSA MSE F- Table ANOVA Analysis ANOVA Table for One-Factor Experiments ANOVA Analysis ANOVA Table for One-Factor Experiments ANOVA Table for One-Factor Experiments y SSY = y 2 SSA = r α 2 j MSA = SSA a 1 MSE = SSE N a F[ 1 α; a 1, N a] Component Sum of Squares y SSY = y 2 ij % of Variation Degrees of Freedom y SS0 = Nµ 2 1 y y SST = SSY SS0 100 N 1 A e SSA = r α 2 j SSE = SST SSA SSA SST SSE SST N = ar N a 1 N a s e = MSE Mean Square MSA = SSA a 1 MSE = SSE N a MSA MSE F- Table F[ 1 α; a 1, N a]

28 28 / 42 ANOVA Analysis ANOVA Procedure for Our Example ANOVA Analysis ANOVA Procedure for Our Example ANOVA Procedure for Our Example Com- % of Degrees F- po- Sum of Varia- of Free- Mean Com- F- nent Squares tion dom Square puted Table y y y y A e Component Sum of Squares % of Variation Degrees of Freedom Mean Square F- Computed F- Table y y y y A e

29 ANOVA Analysis Interpretation of Sample ANOVA ANOVA Analysis Interpretation of Sample ANOVA Interpretation of Sample ANOVA Done at 90% level F-computed is.394 Table entry at 90% level with n = 3 and m = 12 is 2.61 Thus, servers are not significantly different Done at 90% level F-computed is.394 Table entry at 90% level with n = 3 and m = 12 is 2.61 Thus, servers are not significantly different 29 / 42

30 30 / 42 Verifying Assumptions One-Factor Experiment Assumptions Verifying Assumptions One-Factor Experiment Assumptions One-Factor Experiment Assumptions Analysis of one-factor experiments makes the usual assumptions: Effects of factors are additive Errors are additive Errors are independent of factor alternatives Errors are normally distributed Errors have same variance at all alternatives How do we tell if these are correct? Analysis of one-factor experiments makes the usual assumptions: Effects of factors are additive Errors are additive Errors are independent of factor alternatives Errors are normally distributed Errors have same variance at all alternatives How do we tell if these are correct?

31 31 / 42 Verifying Assumptions Visual Diagnostic Tests Verifying Assumptions Visual Diagnostic Tests Visual Diagnostic Tests Similar to those done before Residuals vs. predicted response Normal quantile-quantile plot Residuals vs. experiment number Similar to those done before Residuals vs. predicted response Normal quantile-quantile plot Residuals vs. experiment number

32 32 / 42 Verifying Assumptions Residuals vs. Predicted for Example Verifying Assumptions Residuals vs. Predicted for Example Residuals vs. Predicted for Example

33 33 / 42 Verifying Assumptions Residuals vs. Predicted, Slightly Revised Verifying Assumptions Residuals vs. Predicted, Slightly Revised Residuals vs. Predicted, Slightly Revised In the alternate rendering, the predictions for server D are shown in blue so they can be distinguished from server C

34 34 / 42 Verifying Assumptions What Does The Plot Tell Us? Verifying Assumptions What Does The Plot Tell Us? What Does The Plot Tell Us? Analysis assumed size of errors was unrelated to factor alternatives Plot tells us something entirely different Very different spread of residuals for different factors Thus, one-factor analysis is not appropriate for this data Compare individual alternatives instead Use pairwise confidence intervals Analysis assumed size of errors was unrelated to factor alternatives Plot tells us something entirely different Very different spread of residuals for different factors Thus, one-factor analysis is not appropriate for this data Compare individual alternatives instead Use pairwise confidence intervals

35 Verifying Assumptions Could We Have Figured This Out Sooner? Verifying Assumptions Could We Have Figured This Out Sooner? Could We Have Figured This Out Sooner? Yes! Look at original data Look at calculated parameters Model says C & D are identical Even cursory examination of data suggests otherwise Yes! Look at original data Look at calculated parameters Model says C & D are identical Even cursory examination of data suggests otherwise 35 / 42

36 Verifying Assumptions Looking Back at the Data Verifying Assumptions Looking Back at the Data Looking Back at the Data A B C D Parameters: A B C D Parameters: / 42

37 37 / 42 Verifying Assumptions Quantile-Quantile Plot for Example Verifying Assumptions Quantile-Quantile Plot for Example Quantile-Quantile Plot for Example

38 Verifying Assumptions What Does This Plot Tell Us? Verifying Assumptions What Does This Plot Tell Us? What Does This Plot Tell Us? Overall, errors are normally distributed If we only did quantile-quantile plot, we d think everything was fine The lesson: test ALL assumptions, not just one or two Overall, errors are normally distributed If we only did quantile-quantile plot, we d think everything was fine The lesson: test ALL assumptions, not just one or two 38 / 42

39 39 / 42 Verifying Assumptions One-Factor Confidence Intervals Verifying Assumptions One-Factor Confidence Intervals One-Factor Confidence Intervals Estimated parameters are random variables Thus, can compute confidence intervals Basic method is same as for confidence intervals on 2 k r design effects Find standard deviation of parameters Use that to calculate confidence intervals Possible typo in book, p. 336, example 20.6, in formula for calculating αj Also might be typo on p. 335: degrees of freedom is a(r 1), not r(a 1) Estimated parameters are random variables Thus, can compute confidence intervals Basic method is same as for confidence intervals on 2 k r design effects Find standard deviation of parameters Use that to calculate confidence intervals Possible typo in book, p. 336, example 20.6, in formula for calculating α j Also might be typo on p. 335: degrees of freedom is a(r 1), not r(a 1)

40 Verifying Assumptions Confidence Intervals For Example Parameters Verifying Assumptions Confidence Intervals For Example Parameters Confidence Intervals For Example Parameters se =.158 Standard deviation of µ =.040 Standard deviation of αj = % confidence interval for µ = (.932, 1.10) 95% CI for α1 = (.225,.074) 95% CI for α2 = (.148,.151) 95% CI for α3 = (.113,.186) 95% CI for α4 = (.113,.186) s e =.158 Standard deviation of µ =.040 Standard deviation of α j = % confidence interval for µ = (.932, 1.10) 95% CI for α 1 = (.225,.074) 95% CI for α 2 = (.148,.151) 95% CI for α 3 = (.113,.186) 95% CI for α 4 = (.113,.186) 40 / 42

41 Unequal Sample Sizes Unequal Sample Sizes in One-Factor Experiments Unequal Sample Sizes Unequal Sample Sizes in One-Factor Experiments Unequal Sample Sizes in One-Factor Experiments Don t really need identical replications for all alternatives Only slight extra difficulty See book example for full details Don t really need identical replications for all alternatives Only slight extra difficulty See book example for full details 41 / 42

42 42 / 42 Unequal Sample Sizes Changes To Handle Unequal Sample Sizes Unequal Sample Sizes Changes To Handle Unequal Sample Sizes Changes To Handle Unequal Sample Sizes Model is the same Effects are weighted by number of replications for that alternative: a rjaj = 0 j=1 Slightly different formulas for degrees of freedom Model is the same Effects are weighted by number of replications for that alternative: a r j a j = 0 j=1 Slightly different formulas for degrees of freedom

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Chapter 7. One-way ANOVA

Chapter 7. One-way ANOVA Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

More information

One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

More information

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ 1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

More information

An example ANOVA situation. 1-Way ANOVA. Some notation for ANOVA. Are these differences significant? Example (Treating Blisters)

An example ANOVA situation. 1-Way ANOVA. Some notation for ANOVA. Are these differences significant? Example (Treating Blisters) An example ANOVA situation Example (Treating Blisters) 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

More information

CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13. Experimental Design and Analysis of Variance CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

More information

Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA. Analysis Of Variance Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

More information

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate 1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

More information

One-Way Analysis of Variance

One-Way Analysis of Variance One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

More information

Chapter 11: Two Variable Regression Analysis

Chapter 11: Two Variable Regression Analysis Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

The Analysis of Variance ANOVA

The Analysis of Variance ANOVA -3σ -σ -σ +σ +σ +3σ The Analysis of Variance ANOVA Lecture 0909.400.0 / 0909.400.0 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S -3σ -σ -σ +σ +σ +3σ Analysis

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

0.1 Multiple Regression Models

0.1 Multiple Regression Models 0.1 Multiple Regression Models We will introduce the multiple Regression model as a mean of relating one numerical response variable y to two or more independent (or predictor variables. We will see different

More information

ABSORBENCY OF PAPER TOWELS

ABSORBENCY OF PAPER TOWELS ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?

More information

1.5 Oneway Analysis of Variance

1.5 Oneway Analysis of Variance Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

More information

Notes on Applied Linear Regression

Notes on Applied Linear Regression Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

More information

1 Basic ANOVA concepts

1 Basic ANOVA concepts Math 143 ANOVA 1 Analysis of Variance (ANOVA) Recall, when we wanted to compare two population means, we used the 2-sample t procedures. Now let s expand this to compare k 3 population means. As with the

More information

Inferential Statistics

Inferential Statistics Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

More information

Simple Linear Regression

Simple Linear Regression Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression Statistical model for linear regression Estimating

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology Regression in SPSS Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology John P. Bentley Department of Pharmacy Administration University of

More information

An analysis method for a quantitative outcome and two categorical explanatory variables.

An analysis method for a quantitative outcome and two categorical explanatory variables. Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

More information

Fixed vs. Random Effects

Fixed vs. Random Effects Statistics 203: Introduction to Regression and Analysis of Variance Fixed vs. Random Effects Jonathan Taylor - p. 1/19 Today s class Implications for Random effects. One-way random effects ANOVA. Two-way

More information

Robust Design: Statistical Analysis for Taguchi Methods

Robust Design: Statistical Analysis for Taguchi Methods Robust Design: Statistical Analysis for Taguchi Methods In the previous lecture, we saw how to set up the design of a series of experiments to test different configurations of potential designs under different

More information

Multiple Regression Analysis in Minitab 1

Multiple Regression Analysis in Minitab 1 Multiple Regression Analysis in Minitab 1 Suppose we are interested in how the exercise and body mass index affect the blood pressure. A random sample of 10 males 50 years of age is selected and their

More information

Residuals. Residuals = ª Department of ISM, University of Alabama, ST 260, M23 Residuals & Minitab. ^ e i = y i - y i

Residuals. Residuals = ª Department of ISM, University of Alabama, ST 260, M23 Residuals & Minitab. ^ e i = y i - y i A continuation of regression analysis Lesson Objectives Continue to build on regression analysis. Learn how residual plots help identify problems with the analysis. M23-1 M23-2 Example 1: continued Case

More information

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarificationof zonationprocedure described onpp. 238-239

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarificationof zonationprocedure described onpp. 238-239 STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. by John C. Davis Clarificationof zonationprocedure described onpp. 38-39 Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent

More information

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur

Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Regression Analysis Prof. Soumen Maity Department of Mathematics Indian Institute of Technology, Kharagpur Lecture - 7 Multiple Linear Regression (Contd.) This is my second lecture on Multiple Linear Regression

More information

Full Factorial Design of Experiments

Full Factorial Design of Experiments Full Factorial Design of Experiments 0 Module Objectives Module Objectives By the end of this module, the participant will: Generate a full factorial design Look for factor interactions Develop coded orthogonal

More information

Analysis of Variance. MINITAB User s Guide 2 3-1

Analysis of Variance. MINITAB User s Guide 2 3-1 3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

More information

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

More information

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis. Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

ANOVA MULTIPLE CHOICE QUESTIONS. In the following multiple-choice questions, select the best answer.

ANOVA MULTIPLE CHOICE QUESTIONS. In the following multiple-choice questions, select the best answer. ANOVA MULTIPLE CHOICE QUESTIONS In the following multiple-choice questions, select the best answer. 1. Analysis of variance is a statistical method of comparing the of several populations. a. standard

More information

ANOVA Analysis of Variance

ANOVA Analysis of Variance ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,

More information

Statistics II Final Exam - January Use the University stationery to give your answers to the following questions.

Statistics II Final Exam - January Use the University stationery to give your answers to the following questions. Statistics II Final Exam - January 2012 Use the University stationery to give your answers to the following questions. Do not forget to write down your name and class group in each page. Indicate clearly

More information

Analysis of Variance and Design of Experiments

Analysis of Variance and Design of Experiments CHAPTER 11 Analysis of Variance and Design of Experiments LEARNING OBJECTIVES The focus of this chapter is the design of experiments and the analysis of variance, thereby enabling you to: 1. Describe an

More information

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.

More information

In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a

In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a Math 143 Inference on Regression 1 Review of Linear Regression In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a bivariate data set (i.e., a list of cases/subjects

More information

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

Factorial Analysis of Variance

Factorial Analysis of Variance Chapter 560 Factorial Analysis of Variance Introduction A common task in research is to compare the average response across levels of one or more factor variables. Examples of factor variables are income

More information

Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design 26 October 2012

Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design 26 October 2012 Petter Mostad Matematisk Statistik Chalmers Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design 26 October 2012 1. (a) We first compute the probability that all bottles

More information

Lecture - 32 Regression Modelling Using SPSS

Lecture - 32 Regression Modelling Using SPSS Applied Multivariate Statistical Modelling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Lecture - 32 Regression Modelling Using SPSS (Refer

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

MULTIPLE REGRESSION WITH CATEGORICAL DATA

MULTIPLE REGRESSION WITH CATEGORICAL DATA DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

More information

, then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients (

, then the form of the model is given by: which comprises a deterministic component involving the three regression coefficients ( Multiple regression Introduction Multiple regression is a logical extension of the principles of simple linear regression to situations in which there are several predictor variables. For instance if we

More information

5 Analysis of Variance models, complex linear models and Random effects models

5 Analysis of Variance models, complex linear models and Random effects models 5 Analysis of Variance models, complex linear models and Random effects models In this chapter we will show any of the theoretical background of the analysis. The focus is to train the set up of ANOVA

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Chapter 19 Split-Plot Designs

Chapter 19 Split-Plot Designs Chapter 19 Split-Plot Designs Split-plot designs are needed when the levels of some treatment factors are more difficult to change during the experiment than those of others. The designs have a nested

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

Sub Module Design of experiments

Sub Module Design of experiments Sub Module 1.6 6. Design of experiments Goal of experiments: Experiments help us in understanding the behavior of a (mechanical) system Data collected by systematic variation of influencing factors helps

More information

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

More information

N-Way Analysis of Variance

N-Way Analysis of Variance N-Way Analysis of Variance 1 Introduction A good example when to use a n-way ANOVA is for a factorial design. A factorial design is an efficient way to conduct an experiment. Each observation has data

More information

Experiment Design and Analysis of a Mobile Aerial Wireless Mesh Network for Emergencies

Experiment Design and Analysis of a Mobile Aerial Wireless Mesh Network for Emergencies Experiment Design and Analysis of a Mobile Aerial Wireless Mesh Network for Emergencies 1. Introduction 2. Experiment Design 2.1. System Components 2.2. Experiment Topology 3. Experiment Analysis 3.1.

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

Experimental Designs (revisited)

Experimental Designs (revisited) Introduction to ANOVA Copyright 2000, 2011, J. Toby Mordkoff Probably, the best way to start thinking about ANOVA is in terms of factors with levels. (I say this because this is how they are described

More information

MAT140: Applied Statistical Methods Summary of Calculating Confidence Intervals and Sample Sizes for Estimating Parameters

MAT140: Applied Statistical Methods Summary of Calculating Confidence Intervals and Sample Sizes for Estimating Parameters MAT140: Applied Statistical Methods Summary of Calculating Confidence Intervals and Sample Sizes for Estimating Parameters Inferences about a population parameter can be made using sample statistics for

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Topic 9. Factorial Experiments [ST&D Chapter 15]

Topic 9. Factorial Experiments [ST&D Chapter 15] Topic 9. Factorial Experiments [ST&D Chapter 5] 9.. Introduction In earlier times factors were studied one at a time, with separate experiments devoted to each factor. In the factorial approach, the investigator

More information

Review on Univariate Analysis of Variance (ANOVA) Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016

Review on Univariate Analysis of Variance (ANOVA) Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016 Review on Univariate Analysis of Variance (ANOVA) Pekka Malo 30E00500 Quantitative Empirical Research Spring 2016 Analysis of Variance Analysis of Variance (ANOVA) ~ special case of multiple regression,

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Multiple-Comparison Procedures

Multiple-Comparison Procedures Multiple-Comparison Procedures References A good review of many methods for both parametric and nonparametric multiple comparisons, planned and unplanned, and with some discussion of the philosophical

More information

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 08/11/2016 Structure This Week What is a linear model? How

More information

ANOVA - Analysis of Variance

ANOVA - Analysis of Variance ANOVA - Analysis of Variance ANOVA - Analysis of Variance Extends independent-samples t test Compares the means of groups of independent observations Don t be fooled by the name. ANOVA does not compare

More information

Lecture 7: Binomial Test, Chisquare

Lecture 7: Binomial Test, Chisquare Lecture 7: Binomial Test, Chisquare Test, and ANOVA May, 01 GENOME 560, Spring 01 Goals ANOVA Binomial test Chi square test Fisher s exact test Su In Lee, CSE & GS suinlee@uw.edu 1 Whirlwind Tour of One/Two

More information

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

More information

Null Hypothesis H 0. The null hypothesis (denoted by H 0

Null Hypothesis H 0. The null hypothesis (denoted by H 0 Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

More information

Difference of Means and ANOVA Problems

Difference of Means and ANOVA Problems Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

More information

Chapter 4 and 5 solutions

Chapter 4 and 5 solutions Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory,

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Comparing three or more groups (one-way ANOVA...)

Comparing three or more groups (one-way ANOVA...) Page 1 of 36 Comparing three or more groups (one-way ANOVA...) You've measured a variable in three or more groups, and the means (and medians) are distinct. Is that due to chance? Or does it tell you the

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

More information

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test... Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................

More information

individualdifferences

individualdifferences 1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,

More information

Multiple Regression in SPSS STAT 314

Multiple Regression in SPSS STAT 314 Multiple Regression in SPSS STAT 314 I. The accompanying data is on y = profit margin of savings and loan companies in a given year, x 1 = net revenues in that year, and x 2 = number of savings and loan

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

Inference for Regression

Inference for Regression Simple Linear Regression Inference for Regression The simple linear regression model Estimating regression parameters; Confidence intervals and significance tests for regression parameters Inference about

More information

Discrete with many values are often treated as continuous: Minnesota math scores, years of education.

Discrete with many values are often treated as continuous: Minnesota math scores, years of education. Lecture 9 1. Continuous and categorical predictors 2. Indicator variables 3. General linear model: child s IQ example 4. Proc GLM and Proc Reg 5. ANOVA example: Rat diets 6. Factorial design: 2 2 7. Interactions

More information

Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

More information

Repeatability and Reproducibility

Repeatability and Reproducibility Repeatability and Reproducibility Copyright 1999 by Engineered Software, Inc. Repeatability is the variability of the measurements obtained by one person while measuring the same item repeatedly. This

More information

Factor B: Curriculum New Math Control Curriculum (B (B 1 ) Overall Mean (marginal) Females (A 1 ) Factor A: Gender Males (A 2) X 21

Factor B: Curriculum New Math Control Curriculum (B (B 1 ) Overall Mean (marginal) Females (A 1 ) Factor A: Gender Males (A 2) X 21 1 Factorial ANOVA The ANOVA designs we have dealt with up to this point, known as simple ANOVA or oneway ANOVA, had only one independent grouping variable or factor. However, oftentimes a researcher has

More information

12.1 Inference for Linear Regression

12.1 Inference for Linear Regression 12.1 Inference for Linear Regression Least Squares Regression Line y = a + bx You might want to refresh your memory of LSR lines by reviewing Chapter 3! 1 Sample Distribution of b p740 Shape Center Spread

More information

ACTM State Exam-Statistics

ACTM State Exam-Statistics ACTM State Exam-Statistics For the 25 multiple-choice questions, make your answer choice and record it on the answer sheet provided. Once you have completed that section of the test, proceed to the tie-breaker

More information

Lecture 23 Multiple Comparisons & Contrasts

Lecture 23 Multiple Comparisons & Contrasts Lecture 23 Multiple Comparisons & Contrasts STAT 512 Spring 2011 Background Reading KNNL: 17.3-17.7 23-1 Topic Overview Linear Combinations and Contrasts Pairwise Comparisons and Multiple Testing Adjustments

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Lecture 18 Linear Regression

Lecture 18 Linear Regression Lecture 18 Statistics Unit Andrew Nunekpeku / Charles Jackson Fall 2011 Outline 1 1 Situation - used to model quantitative dependent variable using linear function of quantitative predictor(s). Situation

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information