# CS 147: Computer Systems Performance Analysis

1 One-Factor Experiments

2 Overview

- Introduction
- Finding Effects
- Calculating Errors
- ANOVA
  - Allocation
  - Analysis
- Verifying Assumptions
- Unequal Sample Sizes

3 Introduction: Characteristics of One-Factor Experiments

- Useful if there's only one important categorical factor with more than two interesting alternatives
  - Methods reduce to $2^1$ factorial designs if only two choices
- If single variable isn't categorical, should use regression instead
- Method allows multiple replications

4 Introduction: Comparing Truly Comparable Options

- Evaluating single workload on multiple machines
- Trying different options for single component
- Applying single suite of programs to different compilers

5 Introduction: When to Avoid It

- Incomparable factors
  - E.g., measuring vastly different workloads on single system
- Numerical factors
  - Won't predict any untested levels
  - Regression usually better choice
- Related entries across level
  - Use two-factor design instead

6 An Example One-Factor Experiment

- Choosing authentication server for single-sized messages
- Four different servers are available
- Performance measured by response time
  - Lower is better

7 The One-Factor Model

$$y_{ij} = \mu + \alpha_j + e_{ij}$$

- $y_{ij}$ is $i$th response with factor set at level $j$
- $\mu$ is mean response
- $\alpha_j$ is effect of alternative $j$, with $\sum_j \alpha_j = 0$
- $e_{ij}$ is error term, with $\sum_{i,j} e_{ij} = 0$

8 One-Factor Experiments With Replications

- Initially, assume $r$ replications at each alternative of factor
- Assuming $a$ alternatives, we have a total of $ar$ observations
- Model is thus

$$\sum_{i=1}^{r}\sum_{j=1}^{a} y_{ij} = ar\mu + r\sum_{j=1}^{a}\alpha_j + \sum_{i=1}^{r}\sum_{j=1}^{a} e_{ij}$$

9 Sample Data for Our Example

Four alternatives (A, B, C, D), with four replications each (measured in seconds).

[Table: response times for servers A, B, C, D; values missing from transcription]

10 Finding Effects: Computing Effects

- Need to figure out $\mu$ and $\alpha_j$
- We have various $y_{ij}$'s
- Errors should add to zero:

$$\sum_{i=1}^{r}\sum_{j=1}^{a} e_{ij} = 0$$

- Similarly, effects should add to zero:

$$\sum_{j=1}^{a} \alpha_j = 0$$

11 Finding Effects: Calculating $\mu$

- By definition, sum of errors and sum of effects are both zero:

$$\sum_{i=1}^{r}\sum_{j=1}^{a} y_{ij} = ar\mu$$

- And thus, $\mu$ is equal to grand mean of all responses:

$$\mu = \frac{1}{ar}\sum_{i=1}^{r}\sum_{j=1}^{a} y_{ij} = \bar{y}_{..}$$

12 Finding Effects: Calculating $\mu$ for Our Example

Thus,

$$\mu = \frac{1}{4 \times 4}\sum_{i=1}^{4}\sum_{j=1}^{4} y_{ij}$$

[numeric result missing from transcription]

13 Finding Effects: Calculating $\alpha_j$

- $\alpha_j$ is vector of responses
  - One for each alternative of the factor
- To find vector, find column means:

$$\bar{y}_{.j} = \frac{1}{r}\sum_{i=1}^{r} y_{ij}$$

- Separate mean for each $j$
- Can calculate directly from observations

14 Finding Effects: Calculating Column Mean

- We know that $y_{ij}$ is defined to be

$$y_{ij} = \mu + \alpha_j + e_{ij}$$

- So,

$$\bar{y}_{.j} = \frac{1}{r}\sum_{i=1}^{r}\left(\mu + \alpha_j + e_{ij}\right) = \frac{1}{r}\left(r\mu + r\alpha_j + \sum_{i=1}^{r} e_{ij}\right)$$

15 Finding Effects: Calculating Parameters

- Sum of errors for any given column (fixed $j$) is zero, so

$$\bar{y}_{.j} = \frac{1}{r}\left(r\mu + r\alpha_j + 0\right) = \mu + \alpha_j$$

- So we can solve for $\alpha_j$:

$$\alpha_j = \bar{y}_{.j} - \mu = \bar{y}_{.j} - \bar{y}_{..}$$
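
The two formulas above (grand mean for $\mu$, column mean minus grand mean for $\alpha_j$) can be sketched in a few lines of Python. This is only an illustration: the function name `one_factor_effects` and the data are made up here and are not the lecture's measurements, and the sketch assumes equal replications per alternative.

```python
def one_factor_effects(columns):
    """Estimate mu and the effects alpha_j for a balanced one-factor design.

    `columns` maps each alternative's name to its list of r responses.
    """
    replications = {len(ys) for ys in columns.values()}
    if len(replications) != 1:
        raise ValueError("this sketch assumes equal replications per alternative")

    all_responses = [y for ys in columns.values() for y in ys]
    mu = sum(all_responses) / len(all_responses)  # grand mean

    # alpha_j = (column mean of alternative j) - mu
    alphas = {name: sum(ys) / len(ys) - mu for name, ys in columns.items()}
    return mu, alphas


# Hypothetical responses, invented for illustration only.
data = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [6.0, 7.0, 8.0]}
mu, alphas = one_factor_effects(data)
print(mu)      # 4.0
print(alphas)  # {'A': -2.0, 'B': -1.0, 'C': 3.0}
```

The effects sum to zero by construction, which is a quick sanity check on any hand calculation.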

16 Finding Effects: Parameters for Our Example

Subtract $\mu$ from column means to get parameters.

[Table: column means and parameters for servers A, B, C, D; values missing from transcription]

17 Calculating Errors: Estimating Experimental Errors

- Estimated response is $\hat{y}_{ij} = \mu + \alpha_j$
- But we measured actual responses
  - Multiple responses per alternative
- So we can estimate amount of error in estimated response
- Use methods similar to those used in other types of experiment designs

18 Calculating Errors: Sum of Squared Errors

- SSE estimates variance of the errors:

$$\text{SSE} = \sum_{i=1}^{r}\sum_{j=1}^{a} e_{ij}^2$$

- We can calculate SSE directly from model and observations
- Also can find indirectly from its relationship to other error terms

19 Calculating Errors: SSE for Our Example

Calculated directly (the predicted values inside the inner parentheses are missing from the transcription):

$$\text{SSE} = (.96 - (\ldots))^2 + (1.05 - (\ldots))^2 + \cdots + (.75 - (\ldots))^2 + (1.22 - (\ldots))^2 + \cdots + (.93 - (\ldots))^2 = .3425$$
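
The direct calculation amounts to subtracting each alternative's predicted response $\mu + \alpha_j$ from every observation and summing the squares. A minimal Python sketch follows; the function name `sse_direct` and the data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
def sse_direct(columns):
    """Sum of squared errors, computed from the model and the observations."""
    all_responses = [y for ys in columns.values() for y in ys]
    mu = sum(all_responses) / len(all_responses)

    sse = 0.0
    for ys in columns.values():
        alpha_j = sum(ys) / len(ys) - mu
        predicted = mu + alpha_j  # estimated response for this alternative
        sse += sum((y - predicted) ** 2 for y in ys)
    return sse


# Hypothetical responses, invented for illustration only.
data = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [6.0, 7.0, 8.0]}
sse = sse_direct(data)
print(sse)  # 6.0
```

Each column here deviates from its own mean by -1, 0, and +1, so every alternative contributes 2 to the total.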

20 ANOVA Allocation: Allocating Variation

- To allocate variation for model, start by squaring both sides of model equation:

$$y_{ij}^2 = \mu^2 + \alpha_j^2 + e_{ij}^2 + 2\mu\alpha_j + 2\mu e_{ij} + 2\alpha_j e_{ij}$$

$$\sum_{i,j} y_{ij}^2 = \sum_{i,j} \mu^2 + \sum_{i,j} \alpha_j^2 + \sum_{i,j} e_{ij}^2 + \text{cross-products}$$

- Cross-product terms add up to zero

21 ANOVA Allocation: Variation In Sum of Squares Terms

$$\text{SSY} = \text{SS0} + \text{SSA} + \text{SSE}$$

$$\text{SSY} = \sum_{i,j} y_{ij}^2$$

$$\text{SS0} = \sum_{i=1}^{r}\sum_{j=1}^{a} \mu^2 = ar\mu^2$$

$$\text{SSA} = \sum_{i=1}^{r}\sum_{j=1}^{a} \alpha_j^2 = r\sum_{j=1}^{a} \alpha_j^2$$

Gives another way to calculate SSE.

22 ANOVA Allocation: Sum of Squares Terms for Our Example

- SSY, SS0, and SSA are computed from the data [values missing from transcription]
- So SSE must equal SSY - SS0 - SSA
- Matches our earlier SSE calculation

23 ANOVA Allocation: Assigning Variation

- SST is total variation:

$$\text{SST} = \text{SSY} - \text{SS0} = \text{SSA} + \text{SSE}$$

- Part of total variation comes from model
- Part comes from experimental errors
- A good model explains a lot of variation

24 ANOVA Allocation: Assigning Variation in Our Example

- SST = SSY - SS0 [value missing from transcription]
- SSA [value missing from transcription]
- SSE = .3425
- Percentage of variation explained by server choice: $100 \times \text{SSA}/\text{SST} = 8.97\%$
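
The allocation of variation can be checked numerically. The sketch below computes SSY, SS0, SSA, SST, and SSE for a balanced design and reports the percentage explained by the factor; `allocate_variation` is a hypothetical helper and the data are invented, so the printed numbers are not the lecture's.

```python
def allocate_variation(columns):
    """Return the sum-of-squares terms for a balanced one-factor design."""
    a = len(columns)                        # number of alternatives
    r = len(next(iter(columns.values())))   # replications per alternative
    all_responses = [y for ys in columns.values() for y in ys]
    mu = sum(all_responses) / (a * r)
    alphas = [sum(ys) / r - mu for ys in columns.values()]

    ssy = sum(y * y for y in all_responses)
    ss0 = a * r * mu ** 2
    ssa = r * sum(alpha ** 2 for alpha in alphas)
    sst = ssy - ss0
    sse = sst - ssa  # the indirect route to SSE
    return {"SSY": ssy, "SS0": ss0, "SSA": ssa, "SST": sst, "SSE": sse,
            "explained_pct": 100.0 * ssa / sst}


# Hypothetical responses, invented for illustration only.
data = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [6.0, 7.0, 8.0]}
terms = allocate_variation(data)
print(terms["SSY"], terms["SS0"], terms["SSA"], terms["SSE"])  # 192.0 144.0 42.0 6.0
print(terms["explained_pct"])  # 87.5
```

For this made-up data the factor explains 87.5% of the variation, far more than the 8.97% reported for the server example.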

25 ANOVA Analysis: Analysis of Variance

- Percentage of variation explained can be large or small
- Regardless of size, may or may not be statistically significant
- To determine significance, use ANOVA procedure
  - Assumes normally distributed errors

26 ANOVA Analysis: Running ANOVA

- Easiest to set up tabular method
  - Like method used in regression models
  - Only slight differences
- Basically, determine ratio of Mean Squared of A (parameters) to Mean Squared Errors
- Then check against F-table value for number of degrees of freedom

27 ANOVA Analysis: ANOVA Table for One-Factor Experiments

| Component | Sum of Squares | % of Variation | Degrees of Freedom | Mean Square | F-Computed | F-Table |
|---|---|---|---|---|---|---|
| $y$ | $\text{SSY} = \sum y_{ij}^2$ | | $N$ | | | |
| $\bar{y}_{..}$ | $\text{SS0} = N\mu^2$ | | $1$ | | | |
| $y - \bar{y}_{..}$ | $\text{SST} = \text{SSY} - \text{SS0}$ | $100$ | $N - 1$ | | | |
| $A$ | $\text{SSA} = r\sum \alpha_j^2$ | $100\left(\frac{\text{SSA}}{\text{SST}}\right)$ | $a - 1$ | $\text{MSA} = \frac{\text{SSA}}{a-1}$ | $\frac{\text{MSA}}{\text{MSE}}$ | $F[1-\alpha;\ a-1,\ N-a]$ |
| $e$ | $\text{SSE} = \text{SST} - \text{SSA}$ | $100\left(\frac{\text{SSE}}{\text{SST}}\right)$ | $N - a$ | $\text{MSE} = \frac{\text{SSE}}{N-a}$ | | |

Here $N = ar$ and $s_e = \sqrt{\text{MSE}}$.

28 ANOVA Analysis: ANOVA Procedure for Our Example

[Table: components $y$, $\bar{y}_{..}$, $y - \bar{y}_{..}$, $A$, $e$ against Sum of Squares, % of Variation, Degrees of Freedom, Mean Square, F-Computed, F-Table; numeric entries missing from transcription]

29 ANOVA Analysis: Interpretation of Sample ANOVA

- Done at 90% level
- F-computed is .394
- Table entry at 90% level with $n = 3$ and $m = 12$ is 2.61
- Thus, servers are not significantly different
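
The F-computed value can be reproduced from numbers that survive in this transcript. The sketch below takes SSE = .3425 and the 8.97% explained variation, back-derives SSA from them (an assumption made here, since the SSA value itself is missing), and forms MSA/MSE with $a = 4$ and $N = 16$. The critical value 2.61 is the table entry quoted on the slide; `f_computed` is a hypothetical helper name.

```python
def f_computed(ssa, sse, a, n_total):
    """Ratio of mean squares, MSA / MSE, for a one-factor experiment."""
    msa = ssa / (a - 1)        # a - 1 degrees of freedom for the factor
    mse = sse / (n_total - a)  # N - a degrees of freedom for the errors
    return msa / mse


sse = 0.3425        # from the lecture's example
explained = 0.0897  # fraction of variation explained (8.97%)

# Assumption: SSA is recovered from SSA / SST = explained and SST = SSA + SSE.
sst = sse / (1.0 - explained)
ssa = sst - sse

f = f_computed(ssa, sse, a=4, n_total=16)
f_table = 2.61      # F[0.90; 3, 12] as quoted on the slide
significant = f > f_table

print(round(f, 3))  # 0.394
print(significant)  # False
```

Because F-computed falls well below the table value, the sketch reaches the same conclusion as the slide: the servers are not significantly different at the 90% level.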

30 Verifying Assumptions: One-Factor Experiment Assumptions

- Analysis of one-factor experiments makes the usual assumptions:
  - Effects of factors are additive
  - Errors are additive
  - Errors are independent of factor alternatives
  - Errors are normally distributed
  - Errors have same variance at all alternatives
- How do we tell if these are correct?

31 Verifying Assumptions: Visual Diagnostic Tests

- Similar to those done before
  - Residuals vs. predicted response
  - Normal quantile-quantile plot
  - Residuals vs. experiment number

32 Verifying Assumptions: Residuals vs. Predicted for Example

[Figure: residuals vs. predicted response for the example]

33 Verifying Assumptions: Residuals vs. Predicted, Slightly Revised

[Figure: residuals vs. predicted response, revised] In the alternate rendering, the predictions for server D are shown in blue so they can be distinguished from server C.

34 Verifying Assumptions: What Does The Plot Tell Us?

- Analysis assumed size of errors was unrelated to factor alternatives
- Plot tells us something entirely different
  - Very different spread of residuals for different alternatives
- Thus, one-factor analysis is not appropriate for this data
  - Compare individual alternatives instead
  - Use pairwise confidence intervals
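
The lecture's check is visual, but the same question (do residuals spread equally across alternatives?) can be given a rough numerical form. The sketch below swaps the plot for the sample standard deviation of residuals per alternative; `residual_spread` is a hypothetical helper, and the two columns are invented to mimic a pair of alternatives with identical means but very different spread.

```python
from statistics import mean, stdev


def residual_spread(columns):
    """Sample standard deviation of the residuals for each alternative."""
    spread = {}
    for name, ys in columns.items():
        predicted = mean(ys)  # mu + alpha_j equals the column mean
        residuals = [y - predicted for y in ys]
        spread[name] = stdev(residuals)
    return spread


# Hypothetical responses, invented for illustration only.
data = {"C": [9.0, 10.0, 11.0], "D": [2.0, 10.0, 18.0]}
spread = residual_spread(data)
print(spread)  # {'C': 1.0, 'D': 8.0}
```

Both columns have the same mean, so the model assigns them the same effect, yet the residual spread differs by a factor of eight. A gap of that kind is the signal that the equal-variance assumption does not hold.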

35 Verifying Assumptions: Could We Have Figured This Out Sooner?

- Yes!
  - Look at original data
  - Look at calculated parameters
- Model says C & D are identical
- Even cursory examination of data suggests otherwise

36 Verifying Assumptions: Looking Back at the Data

[Table: original data and parameters for servers A, B, C, D; values missing from transcription]

37 Verifying Assumptions: Quantile-Quantile Plot for Example

[Figure: normal quantile-quantile plot of residuals for the example]

38 Verifying Assumptions: What Does This Plot Tell Us?

- Overall, errors are normally distributed
- If we only did quantile-quantile plot, we'd think everything was fine
- The lesson: test ALL assumptions, not just one or two

39 Verifying Assumptions: One-Factor Confidence Intervals

- Estimated parameters are random variables
  - Thus, can compute confidence intervals
- Basic method is same as for confidence intervals on $2^k r$ design effects
- Find standard deviation of parameters
  - Use that to calculate confidence intervals
- Possible typo in book, p. 336, example 20.6, in formula for calculating $\alpha_j$
- Also might be typo on p. 335: degrees of freedom is $a(r - 1)$, not $r(a - 1)$

40 Verifying Assumptions: Confidence Intervals For Example Parameters

- $s_e = .158$
- Standard deviation of $\mu$ = .040
- Standard deviation of $\alpha_j$ [value missing from transcription]
- 95% confidence interval for $\mu$ = (.932, 1.10)
- 95% CI for $\alpha_1$ = (-.225, .074)
- 95% CI for $\alpha_2$ = (-.148, .151)
- 95% CI for $\alpha_3$ = (-.113, .186)
- 95% CI for $\alpha_4$ = (-.113, .186)
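
A short sketch of how such intervals can be formed from $s_e$. It assumes the standard one-factor formulas, which are not written out on the slide: $s_\mu = s_e/\sqrt{ar}$ and $s_{\alpha_j} = s_e\sqrt{(a-1)/(ar)}$, with a $t$ quantile at $a(r-1)$ degrees of freedom. The value 2.179 is the usual table entry for $t[0.975; 12]$; `ci_half_widths` is a hypothetical helper name.

```python
import math


def ci_half_widths(s_e, a, r, t_value):
    """Standard deviations and CI half-widths for mu and alpha_j."""
    sd_mu = s_e / math.sqrt(a * r)
    sd_alpha = s_e * math.sqrt((a - 1) / (a * r))
    return sd_mu, sd_alpha, t_value * sd_mu, t_value * sd_alpha


# s_e, a, and r are taken from the lecture's example; t_value is t[0.975; 12].
sd_mu, sd_alpha, hw_mu, hw_alpha = ci_half_widths(0.158, a=4, r=4, t_value=2.179)
print(round(sd_mu, 4))     # 0.0395
print(round(hw_mu, 3))     # 0.086
print(round(hw_alpha, 3))  # 0.149
```

Each interval is then the estimate plus or minus its half-width. The half-width for $\alpha_j$ is about .149, close to the width of the intervals listed above. Every one of those intervals contains zero, which agrees with the F-test's verdict.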

41 Unequal Sample Sizes in One-Factor Experiments

- Don't really need identical replications for all alternatives
- Only slight extra difficulty
- See book example for full details

42 Unequal Sample Sizes: Changes To Handle Unequal Sample Sizes

- Model is the same
- Effects are weighted by number of replications for that alternative:

$$\sum_{j=1}^{a} r_j \alpha_j = 0$$

- Slightly different formulas for degrees of freedom
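
A minimal sketch of the weighted constraint. It assumes $\mu$ is the mean over all $N = \sum_j r_j$ observations, which is what $\sum_j r_j \alpha_j = 0$ implies when $\alpha_j$ is the column mean minus $\mu$. The helper name `weighted_effects` and the data are hypothetical.

```python
def weighted_effects(columns):
    """Estimate mu and alpha_j when alternatives have different replication counts."""
    n_total = sum(len(ys) for ys in columns.values())
    mu = sum(sum(ys) for ys in columns.values()) / n_total  # mean of all observations
    alphas = {name: sum(ys) / len(ys) - mu for name, ys in columns.items()}

    # The effects satisfy the replication-weighted constraint, not the plain sum.
    weighted_sum = sum(len(ys) * alphas[name] for name, ys in columns.items())
    return mu, alphas, weighted_sum


# Hypothetical responses with 2, 3, and 1 replications.
data = {"A": [2.0, 4.0], "B": [6.0, 7.0, 8.0], "C": [9.0]}
mu, alphas, weighted_sum = weighted_effects(data)
print(mu)                    # 6.0
print(alphas)                # {'A': -3.0, 'B': 1.0, 'C': 3.0}
print(weighted_sum)          # 0.0
print(sum(alphas.values()))  # 1.0
```

The unweighted sum of effects is 1.0 rather than zero, which shows why the weights matter once sample sizes differ.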

### One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Chapter 7. One-way ANOVA

Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks

### One-Way Analysis of Variance (ANOVA) Example Problem

One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

### The Analysis of Variance ANOVA

-3σ -σ -σ +σ +σ +3σ The Analysis of Variance ANOVA Lecture 0909.400.0 / 0909.400.0 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S -3σ -σ -σ +σ +σ +3σ Analysis

### One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

### INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### One-Way Analysis of Variance

One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

### Notes on Applied Linear Regression

Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:

### 1 Basic ANOVA concepts

Math 143 ANOVA 1 Analysis of Variance (ANOVA) Recall, when we wanted to compare two population means, we used the 2-sample t procedures. Now let s expand this to compare k 3 population means. As with the

### Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science j.nerbonne@rug.nl

Dept of Information Science j.nerbonne@rug.nl October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic

### MULTIPLE REGRESSION WITH CATEGORICAL DATA

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting

### ABSORBENCY OF PAPER TOWELS

ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

### Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

### Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Multiple-Comparison Procedures

Multiple-Comparison Procedures References A good review of many methods for both parametric and nonparametric multiple comparisons, planned and unplanned, and with some discussion of the philosophical

### Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

### Coefficient of Determination

Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed

### An analysis method for a quantitative outcome and two categorical explanatory variables.

Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

### individualdifferences

1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

### Analysis of Variance. MINITAB User s Guide 2 3-1

3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

### STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarificationof zonationprocedure described onpp. 238-239

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. by John C. Davis Clarificationof zonationprocedure described onpp. 38-39 Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent

### Topic 9. Factorial Experiments [ST&D Chapter 15]

Topic 9. Factorial Experiments [ST&D Chapter 5] 9.. Introduction In earlier times factors were studied one at a time, with separate experiments devoted to each factor. In the factorial approach, the investigator

### Simple Regression Theory II 2010 Samuel L. Baker

SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### N-Way Analysis of Variance

N-Way Analysis of Variance 1 Introduction A good example when to use a n-way ANOVA is for a factorial design. A factorial design is an efficient way to conduct an experiment. Each observation has data

### Experimental Designs (revisited)

Introduction to ANOVA Copyright 2000, 2011, J. Toby Mordkoff Probably, the best way to start thinking about ANOVA is in terms of factors with levels. (I say this because this is how they are described

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Testing Research and Statistical Hypotheses

Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### 5 Analysis of Variance models, complex linear models and Random effects models

5 Analysis of Variance models, complex linear models and Random effects models In this chapter we will show any of the theoretical background of the analysis. The focus is to train the set up of ANOVA

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Chapter 19 Split-Plot Designs

Chapter 19 Split-Plot Designs Split-plot designs are needed when the levels of some treatment factors are more difficult to change during the experiment than those of others. The designs have a nested

### Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

### International Statistical Institute, 56th Session, 2007: Phil Everson

Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

### Lecture 19: Chapter 8, Section 1 Sampling Distributions: Proportions

Lecture 19: Chapter 8, Section 1 Sampling Distributions: Proportions Typical Inference Problem Definition of Sampling Distribution 3 Approaches to Understanding Sampling Dist. Applying 68-95-99.7 Rule

### Experiment Design and Analysis of a Mobile Aerial Wireless Mesh Network for Emergencies

Experiment Design and Analysis of a Mobile Aerial Wireless Mesh Network for Emergencies 1. Introduction 2. Experiment Design 2.1. System Components 2.2. Experiment Topology 3. Experiment Analysis 3.1.

### Module 5: Multiple Regression Analysis

Using Statistical Data to Make Decisions: Multiple Regression Analysis Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)

UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA) In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design

### The ANOVA for 2x2 Independent Groups Factorial Design

The ANOVA for 2x2 Independent Groups Factorial Design Please Note: In the analyses above I have tried to avoid using the terms "Independent Variable" and "Dependent Variable" (IV and DV) in order to emphasize

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction Principal Components Regression is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### Repeatability and Reproducibility

Repeatability and Reproducibility Copyright 1999 by Engineered Software, Inc. Repeatability is the variability of the measurements obtained by one person while measuring the same item repeatedly. This

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 2014 Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### Quadratic forms: Cochran's theorem, degrees of freedom, and all that

Quadratic forms: Cochran's theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran's theorem tells us

### Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Confidence Intervals for the Difference Between Two Means

Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

### 4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

### Solutions to Homework 10 Statistics 302 Professor Larget

Solutions to Homework 10 Statistics 302 Professor Larget Textbook Exercises 7.14 Rock-Paper-Scissors (Graded for Accurateness) In Data 6.1 on page 367 we see a table, reproduced in the table below that shows the

### Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model

Assumptions Assumptions of linear models Apply to response variable within each group if predictor categorical Apply to error terms from linear model check by analysing residuals Normality Homogeneity

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in an engineering setting.

### Statistical Confidence Calculations

Statistical Confidence Calculations Statistical Methodology Omniture Test&Target utilizes standard statistics to calculate confidence, confidence intervals, and lift for each campaign. The Student's T

### Difference of Means and ANOVA Problems

Difference of Means and ANOVA Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firms. It is interested in evaluating its fee structure, particularly

Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models

### Chapter 4 and 5 solutions

Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory,

### THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.

THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM

### ANOVA. February 12, 2015

ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

### Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

### Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Chi-square Goodness of Fit Test The chi-square test is designed to test whether one frequency is different from another frequency. The chi-square test is designed for use with data on a nominal

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Linear Models in STATA and ANOVA

Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples

### Analysis of Variance ANOVA

Analysis of Variance ANOVA Overview We've used the t-test to compare the means from two independent groups. Now we've come to the final topic of the course: how to compare means from more than two populations.

### Multivariate Analysis of Variance (MANOVA)

Multivariate Analysis of Variance (MANOVA) Aaron French, Marcelo Macedo, John Poulsen, Tyler Waterson and Angela Yu Keywords: MANCOVA, special cases, assumptions, further reading, computations Introduction

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### Simple treatment structure

Chapter 3 Simple treatment structure 3.1 Replication of control treatments Suppose that treatment 1 is a control treatment and that treatments 2,..., t are new treatments which we want to compare with

### Regression step-by-step using Microsoft Excel

Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Multiple regression - Matrices

Multiple regression - Matrices This handout will present various matrices which are substantively interesting and/or provide useful means of summarizing the data for analytical purposes. As we will see,

### Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

### Confidence Intervals for Cp

Chapter 296 Confidence Intervals for Cp Introduction This routine calculates the sample size needed to obtain a specified width of a Cp confidence interval at a stated confidence level. Cp is a process

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not an OLS requirement. Therefore we have: Interpretation

### Simple Linear Regression

STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze

### Randomized Block Analysis of Variance

Chapter 565 Randomized Block Analysis of Variance Introduction This module analyzes a randomized block analysis of variance with up to two treatment factors and their interaction. It provides tables of

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

### Estimation of σ², the variance of ɛ

Estimation of σ², the variance of ɛ The variance of the errors σ² indicates how much observations deviate from the fitted surface. If σ² is small, parameters β₀, β₁, ..., βₖ will be reliably estimated

### How To Check For Differences In The One Way Anova

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHI-SQUARE TESTS OF INDEPENDENCE (SECTION 11.1 OF UNDERSTANDABLE STATISTICS) In chi-square tests of independence we use the hypotheses. H0: The variables are independent