CS 147: Computer Systems Performance Analysis
|
|
|
- Thomasina Williamson
- 9 years ago
- Views:
Transcription
1 CS 147: Computer Systems Performance Analysis One-Factor Experiments CS 147: Computer Systems Performance Analysis One-Factor Experiments 1 / 42
2 Overview Introduction Overview Overview Introduction Finding Effects Calculating Errors ANOVA Allocation Analysis Verifying Assumptions Unequal Sample Sizes Finding Effects Calculating Errors ANOVA Allocation Analysis Verifying Assumptions Unequal Sample Sizes 2 / 42
3 3 / 42 Introduction Characteristics of One-Factor Experiments Introduction Characteristics of One-Factor Experiments Characteristics of One-Factor Experiments Useful if there s only one important categorical factor with more than two interesting alternatives Methods reduce to 2 1 factorial designs if only two choices If single variable isn t categorical, should use regression instead Method allows multiple replications Useful if there s only one important categorical factor with more than two interesting alternatives Methods reduce to 2 1 factorial designs if only two choices If single variable isn t categorical, should use regression instead Method allows multiple replications
4 Introduction Comparing Truly Comparable Options Introduction Comparing Truly Comparable Options Comparing Truly Comparable Options Evaluating single workload on multiple machines Trying different options for single component Applying single suite of programs to different compilers Evaluating single workload on multiple machines Trying different options for single component Applying single suite of programs to different compilers 4 / 42
5 5 / 42 Introduction When to Avoid It Introduction When to Avoid It When to Avoid It Incomparable factors E.g., measuring vastly different workloads on single system Numerical factors Won t predict any untested levels Regression usually better choice Related entries across level Use two-factor design instead Incomparable factors E.g., measuring vastly different workloads on single system Numerical factors Won t predict any untested levels Regression usually better choice Related entries across level Use two-factor design instead
6 6 / 42 An Example One-Factor Experiment An Example One-Factor Experiment An Example One-Factor Experiment Choosing authentication server for single-sized messages Four different servers are available Performance measured by response time Lower is better Choosing authentication server for single-sized messages Four different servers are available Performance measured by response time Lower is better
7 7 / 42 The One-Factor Model The One-Factor Model The One-Factor Model yij = µ + αj + eij yij is i th response with factor set at level j µ is mean response αj is effect of alternative j αj = 0 eij is error term eij = 0 y ij = µ + α j + e ij y ij is i th response with factor set at level j µ is mean response α j is effect of alternative j αj = 0 e ij is error term eij = 0
8 One-Factor Experiments With Replications One-Factor Experiments With Replications One-Factor Experiments With Replications Initially, assume r replications at each alternative of factor Assuming a alternatives, we have a total of ar observations Model is thus r a a r a yij = arµ + r αj + eij i=1 j=1 j=1 i=1 j=1 Initially, assume r replications at each alternative of factor Assuming a alternatives, we have a total of ar observations Model is thus r a y ij = arµ + r i=1 j=1 a r α j + a j=1 i=1 j=1 e ij 8 / 42
9 Sample Data for Our Example Sample Data for Our Example Sample Data for Our Example Four alternatives, with four replications each (measured in seconds) A B C D Four alternatives, with four replications each (measured in seconds) A B C D / 42
10 10 / 42 Finding Effects Computing Effects Finding Effects Computing Effects Computing Effects Need to figure out µ and αj We have various yij s Errors should add to zero: r a eij = 0 i=1 j=1 Similarly, effects should add to zero: a αj = 0 j=1 Need to figure out µ and α j We have various y ij s Errors should add to zero: r a e ij = 0 i=1 j=1 Similarly, effects should add to zero: a α j = 0 j=1
11 11 / 42 Finding Effects Calculating µ Finding Effects Calculating µ Calculating µ By definition, sum of errors and sum of effects are both zero: r a yij = arµ i=1 j=1 And thus, µ is equal to grand mean of all responses µ = 1 r a yij = y ar i=1 j=1 By definition, sum of errors and sum of effects are both zero: r a y ij = arµ i=1 j=1 And thus, µ is equal to grand mean of all responses µ = 1 ar r a y ij = y i=1 j=1
12 12 / 42 Finding Effects Calculating µ for Our Example Finding Effects Calculating µ for Our Example Calculating µ for Our Example Thus, µ = yij 4 4 i=1 j=1 = = Thus, µ = i=1 j=1 = = y ij
13 13 / 42 Finding Effects Calculating α j Finding Effects Calculating α j Calculating αj αj is vector of responses One for each alternative of the factor To find vector, find column means y j = 1 r yij r i=1 Separate mean for each j Can calculate directly from observations α j is vector of responses One for each alternative of the factor To find vector, find column means y j = 1 r r i=1 y ij Separate mean for each j Can calculate directly from observations
14 14 / 42 Finding Effects Calculating Column Mean Finding Effects Calculating Column Mean Calculating Column Mean We know that yij is defined to be yij = µ + αj + eij So, y j = 1 r (µ + αj + eij) r i=1 ( ) = 1 r rµ + rαj + eij r i=1 We know that y ij is defined to be y ij = µ + α j + e ij So, y j = 1 r = 1 r r (µ + α j + e ij ) i=1 ( rµ + rα j + ) r e ij i=1
15 15 / 42 Finding Effects Calculating Parameters Finding Effects Calculating Parameters Calculating Parameters Sum of errors for any given row is zero, so y j = 1 (rµ + rαj + 0) r = µ + αj So we can solve for αj: αj = y j µ = y j y Sum of errors for any given row is zero, so y j = 1 r (rµ + rα j + 0) = µ + α j So we can solve for α j : α j = y j µ = y j y
16 Finding Effects Parameters for Our Example Finding Effects Parameters for Our Example Parameters for Our Example Server A B C D Col. Mean Subtract µ from column means to get parameters: Parameters Server A B C D Col. Mean Subtract µ from column means to get parameters: Parameters / 42
17 17 / 42 Calculating Errors Estimating Experimental Errors Calculating Errors Estimating Experimental Errors Estimating Experimental Errors Estimated response is ŷij = µ + αij But we measured actual responses Multiple responses per alternative So we can estimate amount of error in estimated response Use methods similar to those used in other types of experiment designs Estimated response is ŷ ij = µ + α ij But we measured actual responses Multiple responses per alternative So we can estimate amount of error in estimated response Use methods similar to those used in other types of experiment designs
18 Calculating Errors Sum of Squared Errors Calculating Errors Sum of Squared Errors Sum of Squared Errors SSE estimates variance of the errors: r a SSE = eij 2 i=1 j=1 We can calculate SSE directly from model and observations Also can find indirectly from its relationship to other error terms SSE estimates variance of the errors: SSE = r a i=1 j=1 e 2 ij We can calculate SSE directly from model and observations Also can find indirectly from its relationship to other error terms 18 / 42
19 Calculating Errors SSE for Our Example Calculating Errors SSE for Our Example SSE for Our Example Calculated directly: SSE = (.96 ( )) 2 + (1.05 ( )) (.75 ( )) 2 + (1.22 ( )) (.93 ( )) 2 =.3425 Calculated directly: SSE = (.96 ( )) 2 + (1.05 ( )) (.75 ( )) 2 + (1.22 ( )) (.93 ( )) 2 = / 42
20 20 / 42 ANOVA Allocation Allocating Variation ANOVA Allocation Allocating Variation Allocating Variation To allocate variation for model, start by squaring both sides of model equation yij 2 = µ 2 + αj 2 + eij 2 + 2µαj + 2µeij + 2αjeij yij 2 = µ 2 + αj 2 + eij 2 i,j i,j i,j i,j + cross-products Cross-product terms add up to zero To allocate variation for model, start by squaring both sides of model equation i,j y 2 ij = µ 2 + α 2 j + e 2 ij + 2µα j + 2µe ij + 2α j e ij yij 2 = µ 2 + αj 2 + i,j i,j i,j + cross-products Cross-product terms add up to zero e 2 ij
21 21 / 42 ANOVA Allocation Variation In Sum of Squares Terms ANOVA Allocation Variation In Sum of Squares Terms Variation In Sum of Squares Terms SSY = SS0 + SSA + SSE SSY = yij 2 i,j SS0 = SSA = r i=1 r i=1 a µ 2 = arµ 2 j=1 a αj 2 j=1 = r a αj 2 j=1 Gives another way to calculate SSE SSY = SS0 + SSA + SSE SSY = i,j y 2 ij SS0 = SSA = r i=1 j=1 r a µ 2 = arµ 2 a a αj 2 = r αj 2 i=1 j=1 j=1 Gives another way to calculate SSE
22 22 / 42 ANOVA Allocation Sum of Squares Terms for Our Example ANOVA Allocation Sum of Squares Terms for Our Example Sum of Squares Terms for Our Example SSY = SS0 = SSA = So SSE must equal = Matches our earlier SSE calculation SSY = SS0 = SSA = So SSE must equal = Matches our earlier SSE calculation
23 ANOVA Allocation Assigning Variation ANOVA Allocation Assigning Variation Assigning Variation SST is total variation SST = SSY SS0 = SSA + SSE Part of total variation comes from model Part comes from experimental errors A good model explains a lot of variation SST is total variation SST = SSY SS0 = SSA + SSE Part of total variation comes from model Part comes from experimental errors A good model explains a lot of variation 23 / 42
24 ANOVA Allocation Assigning Variation in Our Example ANOVA Allocation Assigning Variation in Our Example Assigning Variation in Our Example SST = SSY SS0 = SSA = SSE =.3425 Percentage of variation explained by server choice: = = 8.97% SST = SSY SS0 = SSA = SSE =.3425 Percentage of variation explained by server choice: = = 8.97% 24 / 42
25 25 / 42 ANOVA Analysis Analysis of Variance ANOVA Analysis Analysis of Variance Analysis of Variance Percentage of variation explained can be large or small Regardless of size, may or may not be statistically significant To determine significance, use ANOVA procedure Assumes normally distributed errors Percentage of variation explained can be large or small Regardless of size, may or may not be statistically significant To determine significance, use ANOVA procedure Assumes normally distributed errors
26 26 / 42 ANOVA Analysis Running ANOVA ANOVA Analysis Running ANOVA Running ANOVA Easiest to set up tabular method Like method used in regression models Only slight differences Basically, determine ratio of Mean Squared of A (parameters) to Mean Squared Errors Then check against F-table value for number of degrees of freedom Easiest to set up tabular method Like method used in regression models Only slight differences Basically, determine ratio of Mean Squared of A (parameters) to Mean Squared Errors Then check against F-table value for number of degrees of freedom
27 27 / 42 Sum of Squares F- Computed Component % of Variation ij N y SS0 = Nµ 2 1 y y SST = SSY SS0 100 N 1 A e SSE = SST SSA SSA SST SSE SST N = ar a 1 N a se = MSE Mean Square Degrees of Freedom F- Computed MSA MSE F- Table ANOVA Analysis ANOVA Table for One-Factor Experiments ANOVA Analysis ANOVA Table for One-Factor Experiments ANOVA Table for One-Factor Experiments y SSY = y 2 SSA = r α 2 j MSA = SSA a 1 MSE = SSE N a F[ 1 α; a 1, N a] Component Sum of Squares y SSY = y 2 ij % of Variation Degrees of Freedom y SS0 = Nµ 2 1 y y SST = SSY SS0 100 N 1 A e SSA = r α 2 j SSE = SST SSA SSA SST SSE SST N = ar N a 1 N a s e = MSE Mean Square MSA = SSA a 1 MSE = SSE N a MSA MSE F- Table F[ 1 α; a 1, N a]
28 28 / 42 ANOVA Analysis ANOVA Procedure for Our Example ANOVA Analysis ANOVA Procedure for Our Example ANOVA Procedure for Our Example Com- % of Degrees F- po- Sum of Varia- of Free- Mean Com- F- nent Squares tion dom Square puted Table y y y y A e Component Sum of Squares % of Variation Degrees of Freedom Mean Square F- Computed F- Table y y y y A e
29 ANOVA Analysis Interpretation of Sample ANOVA ANOVA Analysis Interpretation of Sample ANOVA Interpretation of Sample ANOVA Done at 90% level F-computed is.394 Table entry at 90% level with n = 3 and m = 12 is 2.61 Thus, servers are not significantly different Done at 90% level F-computed is.394 Table entry at 90% level with n = 3 and m = 12 is 2.61 Thus, servers are not significantly different 29 / 42
30 30 / 42 Verifying Assumptions One-Factor Experiment Assumptions Verifying Assumptions One-Factor Experiment Assumptions One-Factor Experiment Assumptions Analysis of one-factor experiments makes the usual assumptions: Effects of factors are additive Errors are additive Errors are independent of factor alternatives Errors are normally distributed Errors have same variance at all alternatives How do we tell if these are correct? Analysis of one-factor experiments makes the usual assumptions: Effects of factors are additive Errors are additive Errors are independent of factor alternatives Errors are normally distributed Errors have same variance at all alternatives How do we tell if these are correct?
31 31 / 42 Verifying Assumptions Visual Diagnostic Tests Verifying Assumptions Visual Diagnostic Tests Visual Diagnostic Tests Similar to those done before Residuals vs. predicted response Normal quantile-quantile plot Residuals vs. experiment number Similar to those done before Residuals vs. predicted response Normal quantile-quantile plot Residuals vs. experiment number
32 32 / 42 Verifying Assumptions Residuals vs. Predicted for Example Verifying Assumptions Residuals vs. Predicted for Example Residuals vs. Predicted for Example
33 33 / 42 Verifying Assumptions Residuals vs. Predicted, Slightly Revised Verifying Assumptions Residuals vs. Predicted, Slightly Revised Residuals vs. Predicted, Slightly Revised In the alternate rendering, the predictions for server D are shown in blue so they can be distinguished from server C
34 34 / 42 Verifying Assumptions What Does The Plot Tell Us? Verifying Assumptions What Does The Plot Tell Us? What Does The Plot Tell Us? Analysis assumed size of errors was unrelated to factor alternatives Plot tells us something entirely different Very different spread of residuals for different factors Thus, one-factor analysis is not appropriate for this data Compare individual alternatives instead Use pairwise confidence intervals Analysis assumed size of errors was unrelated to factor alternatives Plot tells us something entirely different Very different spread of residuals for different factors Thus, one-factor analysis is not appropriate for this data Compare individual alternatives instead Use pairwise confidence intervals
35 Verifying Assumptions Could We Have Figured This Out Sooner? Verifying Assumptions Could We Have Figured This Out Sooner? Could We Have Figured This Out Sooner? Yes! Look at original data Look at calculated parameters Model says C & D are identical Even cursory examination of data suggests otherwise Yes! Look at original data Look at calculated parameters Model says C & D are identical Even cursory examination of data suggests otherwise 35 / 42
36 Verifying Assumptions Looking Back at the Data Verifying Assumptions Looking Back at the Data Looking Back at the Data A B C D Parameters: A B C D Parameters: / 42
37 37 / 42 Verifying Assumptions Quantile-Quantile Plot for Example Verifying Assumptions Quantile-Quantile Plot for Example Quantile-Quantile Plot for Example
38 Verifying Assumptions What Does This Plot Tell Us? Verifying Assumptions What Does This Plot Tell Us? What Does This Plot Tell Us? Overall, errors are normally distributed If we only did quantile-quantile plot, we d think everything was fine The lesson: test ALL assumptions, not just one or two Overall, errors are normally distributed If we only did quantile-quantile plot, we d think everything was fine The lesson: test ALL assumptions, not just one or two 38 / 42
39 39 / 42 Verifying Assumptions One-Factor Confidence Intervals Verifying Assumptions One-Factor Confidence Intervals One-Factor Confidence Intervals Estimated parameters are random variables Thus, can compute confidence intervals Basic method is same as for confidence intervals on 2 k r design effects Find standard deviation of parameters Use that to calculate confidence intervals Possible typo in book, p. 336, example 20.6, in formula for calculating αj Also might be typo on p. 335: degrees of freedom is a(r 1), not r(a 1) Estimated parameters are random variables Thus, can compute confidence intervals Basic method is same as for confidence intervals on 2 k r design effects Find standard deviation of parameters Use that to calculate confidence intervals Possible typo in book, p. 336, example 20.6, in formula for calculating α j Also might be typo on p. 335: degrees of freedom is a(r 1), not r(a 1)
40 Verifying Assumptions Confidence Intervals For Example Parameters Verifying Assumptions Confidence Intervals For Example Parameters Confidence Intervals For Example Parameters se =.158 Standard deviation of µ =.040 Standard deviation of αj = % confidence interval for µ = (.932, 1.10) 95% CI for α1 = (.225,.074) 95% CI for α2 = (.148,.151) 95% CI for α3 = (.113,.186) 95% CI for α4 = (.113,.186) s e =.158 Standard deviation of µ =.040 Standard deviation of α j = % confidence interval for µ = (.932, 1.10) 95% CI for α 1 = (.225,.074) 95% CI for α 2 = (.148,.151) 95% CI for α 3 = (.113,.186) 95% CI for α 4 = (.113,.186) 40 / 42
41 Unequal Sample Sizes Unequal Sample Sizes in One-Factor Experiments Unequal Sample Sizes Unequal Sample Sizes in One-Factor Experiments Unequal Sample Sizes in One-Factor Experiments Don t really need identical replications for all alternatives Only slight extra difficulty See book example for full details Don t really need identical replications for all alternatives Only slight extra difficulty See book example for full details 41 / 42
42 42 / 42 Unequal Sample Sizes Changes To Handle Unequal Sample Sizes Unequal Sample Sizes Changes To Handle Unequal Sample Sizes Changes To Handle Unequal Sample Sizes Model is the same Effects are weighted by number of replications for that alternative: a rjaj = 0 j=1 Slightly different formulas for degrees of freedom Model is the same Effects are weighted by number of replications for that alternative: a r j a j = 0 j=1 Slightly different formulas for degrees of freedom
One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate
1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,
1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
Chapter 7. One-way ANOVA
Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks
One-Way Analysis of Variance (ANOVA) Example Problem
One-Way Analysis of Variance (ANOVA) Example Problem Introduction Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means
Section 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
CHAPTER 13. Experimental Design and Analysis of Variance
CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection
The Analysis of Variance ANOVA
-3σ -σ -σ +σ +σ +3σ The Analysis of Variance ANOVA Lecture 0909.400.0 / 0909.400.0 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S -3σ -σ -σ +σ +σ +3σ Analysis
One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups
One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
One-Way Analysis of Variance
One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We
Statistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
Introduction to Analysis of Variance (ANOVA) Limitations of the t-test
Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only
Notes on Applied Linear Regression
Notes on Applied Linear Regression Jamie DeCoster Department of Social Psychology Free University Amsterdam Van der Boechorststraat 1 1081 BT Amsterdam The Netherlands phone: +31 (0)20 444-8935 email:
1 Basic ANOVA concepts
Math 143 ANOVA 1 Analysis of Variance (ANOVA) Recall, when we wanted to compare two population means, we used the 2-sample t procedures. Now let s expand this to compare k 3 population means. As with the
Statistiek II. John Nerbonne. October 1, 2010. Dept of Information Science [email protected]
Dept of Information Science [email protected] October 1, 2010 Course outline 1 One-way ANOVA. 2 Factorial ANOVA. 3 Repeated measures ANOVA. 4 Correlation and regression. 5 Multiple regression. 6 Logistic
MULTIPLE REGRESSION WITH CATEGORICAL DATA
DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 86 MULTIPLE REGRESSION WITH CATEGORICAL DATA I. AGENDA: A. Multiple regression with categorical variables. Coding schemes. Interpreting
ABSORBENCY OF PAPER TOWELS
ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
UNDERSTANDING THE TWO-WAY ANOVA
UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables
Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
Session 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
Univariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
Recall this chart that showed how most of our course would be organized:
Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
Multiple-Comparison Procedures
Multiple-Comparison Procedures References A good review of many methods for both parametric and nonparametric multiple comparisons, planned and unplanned, and with some discussion of the philosophical
Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY
Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to
Coefficient of Determination
Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed
An analysis method for a quantitative outcome and two categorical explanatory variables.
Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that
individualdifferences
1 Simple ANalysis Of Variance (ANOVA) Oftentimes we have more than two groups that we want to compare. The purpose of ANOVA is to allow us to compare group means from several independent samples. In general,
Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means
Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
Module 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
Analysis of Variance. MINITAB User s Guide 2 3-1
3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced
STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarificationof zonationprocedure described onpp. 238-239
STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. by John C. Davis Clarificationof zonationprocedure described onpp. 38-39 Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent
Topic 9. Factorial Experiments [ST&D Chapter 15]
Topic 9. Factorial Experiments [ST&D Chapter 5] 9.. Introduction In earlier times factors were studied one at a time, with separate experiments devoted to each factor. In the factorial approach, the investigator
Simple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
Regression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
How To Run Statistical Tests in Excel
How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting
Chapter 5 Analysis of variance SPSS Analysis of variance
Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,
Study Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
N-Way Analysis of Variance
N-Way Analysis of Variance 1 Introduction A good example when to use a n-way ANOVA is for a factorial design. A factorial design is an efficient way to conduct an experiment. Each observation has data
Experimental Designs (revisited)
Introduction to ANOVA Copyright 2000, 2011, J. Toby Mordkoff Probably, the best way to start thinking about ANOVA is in terms of factors with levels. (I say this because this is how they are described
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
Testing Research and Statistical Hypotheses
Testing Research and Statistical Hypotheses Introduction In the last lab we analyzed metric artifact attributes such as thickness or width/thickness ratio. Those were continuous variables, which as you
Exercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
5 Analysis of Variance models, complex linear models and Random effects models
5 Analysis of Variance models, complex linear models and Random effects models In this chapter we will show any of the theoretical background of the analysis. The focus is to train the set up of ANOVA
Simple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
Final Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
Chapter 19 Split-Plot Designs
Chapter 19 Split-Plot Designs Split-plot designs are needed when the levels of some treatment factors are more difficult to change during the experiment than those of others. The designs have a nested
Section 14 Simple Linear Regression: Introduction to Least Squares Regression
Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship
International Statistical Institute, 56th Session, 2007: Phil Everson
Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: [email protected] 1. Introduction
Lecture 19: Chapter 8, Section 1 Sampling Distributions: Proportions
Lecture 19: Chapter 8, Section 1 Sampling Distributions: Proportions Typical Inference Problem Definition of Sampling Distribution 3 Approaches to Understanding Sampling Dist. Applying 68-95-99.7 Rule
Experiment Design and Analysis of a Mobile Aerial Wireless Mesh Network for Emergencies
Experiment Design and Analysis of a Mobile Aerial Wireless Mesh Network for Emergencies 1. Introduction 2. Experiment Design 2.1. System Components 2.2. Experiment Topology 3. Experiment Analysis 3.1.
Module 5: Multiple Regression Analysis
Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College
business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar
business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel
UNDERSTANDING ANALYSIS OF COVARIANCE (ANCOVA)
UNDERSTANDING ANALYSIS OF COVARIANCE () In general, research is conducted for the purpose of explaining the effects of the independent variable on the dependent variable, and the purpose of research design
The ANOVA for 2x2 Independent Groups Factorial Design
The ANOVA for 2x2 Independent Groups Factorial Design Please Note: In the analyses above I have tried to avoid using the terms "Independent Variable" and "Dependent Variable" (IV and DV) in order to emphasize
NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
Repeatability and Reproducibility
Repeatability and Reproducibility Copyright 1999 by Engineered Software, Inc. Repeatability is the variability of the measurements obtained by one person while measuring the same item repeatedly. This
Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
Quadratic forms Cochran s theorem, degrees of freedom, and all that
Quadratic forms Cochran s theorem, degrees of freedom, and all that Dr. Frank Wood Frank Wood, [email protected] Linear Regression Models Lecture 1, Slide 1 Why We Care Cochran s theorem tells us
Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?
Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky [email protected] This is the first case in what I expect will be a series of case studies. While I mention
Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
Data Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
Confidence Intervals for the Difference Between Two Means
Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means
4. Continuous Random Variables, the Pareto and Normal Distributions
4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random
Solutions to Homework 10 Statistics 302 Professor Larget
s to Homework 10 Statistics 302 Professor Larget Textbook Exercises 7.14 Rock-Paper-Scissors (Graded for Accurateness) In Data 6.1 on page 367 we see a table, reproduced in the table below that shows the
Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model
Assumptions Assumptions of linear models Apply to response variable within each group if predictor categorical Apply to error terms from linear model check by analysing residuals Normality Homogeneity
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.
Statistical Confidence Calculations
Statistical Confidence Calculations Statistical Methodology Omniture Test&Target utilizes standard statistics to calculate confidence, confidence intervals, and lift for each campaign. The student s T
Difference of Means and ANOVA Problems
Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly
Regression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
Chapter 4 and 5 solutions
Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory,
THE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.
THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM
ANOVA. February 12, 2015
ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R
Two-Sample T-Tests Assuming Equal Variance (Enter Means)
Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of
SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.
SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation
Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.
Chi-square Goodness of Fit Test The chi-square test is designed to test differences whether one frequency is different from another frequency. The chi-square test is designed for use with data on a nominal
Statistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
Linear Models in STATA and ANOVA
Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 4-2 A Note on Non-Linear Relationships 4-4 Multiple Linear Regression 4-5 Removal of Variables 4-8 Independent Samples
Analysis of Variance ANOVA
Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.
Multivariate Analysis of Variance (MANOVA)
Multivariate Analysis of Variance (MANOVA) Aaron French, Marcelo Macedo, John Poulsen, Tyler Waterson and Angela Yu Keywords: MANCOVA, special cases, assumptions, further reading, computations Introduction
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS
MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance
Regression step-by-step using Microsoft Excel
Step 1: Regression step-by-step using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression
Projects Involving Statistics (& SPSS)
Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,
Multiple regression - Matrices
Multiple regression - Matrices This handout will present various matrices which are substantively interesting and/or provide useful means of summarizing the data for analytical purposes. As we will see,
Analysing Questionnaires using Minitab (for SPSS queries contact -) [email protected]
Analysing Questionnaires using Minitab (for SPSS queries contact -) [email protected] Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:
Confidence Intervals for Cp
Chapter 296 Confidence Intervals for Cp Introduction This routine calculates the sample size needed to obtain a specified width of a Cp confidence interval at a stated confidence level. Cp is a process
Simple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
Simple Linear Regression
STAT 101 Dr. Kari Lock Morgan Simple Linear Regression SECTIONS 9.3 Confidence and prediction intervals (9.3) Conditions for inference (9.1) Want More Stats??? If you have enjoyed learning how to analyze
Randomized Block Analysis of Variance
Chapter 565 Randomized Block Analysis of Variance Introduction This module analyzes a randomized block analysis of variance with up to two treatment factors and their interaction. It provides tables of
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
Estimation of σ 2, the variance of ɛ
Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated
How To Check For Differences In The One Way Anova
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way
CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS
CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHI-SQUARE TESTS OF INDEPENDENCE (SECTION 11.1 OF UNDERSTANDABLE STATISTICS) In chi-square tests of independence we use the hypotheses. H0: The variables are independent
Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.
1 Commands in JMP and Statcrunch Below are a set of commands in JMP and Statcrunch which facilitate a basic statistical analysis. The first part concerns commands in JMP, the second part is for analysis
