Chapter 4 and 5 solutions

Save this PDF as:

Size: px
Start display at page:

Transcription

1 Chapter 4 and 5 solutions 4.4. Three different washing solutions are being compared to study their effectiveness in retarding bacteria growth in five gallon milk containers. The analysis is done in a laboratory, and only three trials can be run on any day. Because days could represent a potential source of variability, the experimenter decides to use a randomized block design. Observations are taken for four days, and the data are shown here. Analyze the data from this experiment (use α = 0.05) and draw conclusions. Days Solution Design Expert Output Response: Growth ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Block Model significant A Residual Cor Total The Model F value of 40.7 implies the model is significant. There is only a 0.03% chance that a "Model F Value" this large could occur due to noise. Std. Dev..94 R Squared

2 Mean Adj R Squared C.V Pred R Squared PRESS Adeq Precision Treatment Means (Adjusted, If Necessary) Estimated Mean Standard Error Mean Standard t for H0 Treatment Difference DF Error Coeff=0 Prob > t 1 vs vs vs There is a difference between the means of the three solutions. The Fisher LSD procedure indicates that solution 3 is significantly different than the other two Consider the hardness testing experiment described in Section 4.1. Suppose that the experiment was conducted as described and the following Rockwell C scale data (coded by subtracting 40 units) obtained: Coupon Tip

3 (a) Analyize the data from this experiment. There is a difference between the means of the four tips. Design Expert Output Response: Hardness ANOVA for Selected Factorial Model Analysis of variance table [Terms added sequentially (first to last)] Sum of Mean F Source Squares DF Square Value Prob > F Bock Model significant A Residual E 003 Cor Total The Model F value of implies the model is significant. There is only a 0.09% chance that a "Model F Value" this large could occur due to noise. Std. Dev R Squared Mean 9.63 Adj R Squared C.V Pred R Squared PRESS 0.5 Adeq Precision Treatment Means (Adjusted, If Necessary) Estimated Mean Standard Error

4 Mean Standard t for H0 Treatment Difference DF Error Coeff=0 Prob > t 1 vs vs vs vs vs vs (b) Use the Fisher LSD method to make comparisons among the four tips to determine specifically which tips differ in mean hardness readings. Based on the LSD bars in the Design Expert plot below, the mean of tip 4 differs from the means of tips 1,, and 3. The LSD method identifies a marginal difference between the means of tips and One Factor Plot 9.95 Hardness A: Tip

5 (c) Analyze the residuals from this experiment. The residual plots below do not identify any violations to the assumptions. Normal Plot of vs. Predicted Normal % Probability Residual Predicted vs. Tip Tip 4.8. A consumer products company relies on direct mail marketing pieces as a major component of its advertising campaigns. The company has three different designs for a new brochure and want to evaluate their effectiveness, as there are substantial differences in costs between the three designs. The company decides to test the three designs by mailing 5,000 samples of each to potential customers in four different regions of the country. Since there are known regional differences in the customer base, regions are considered as blocks. The number of responses to each mailing is shown below.

6 Region Design NE NW SE SW (a) Analyze the data from this experiment. The residuals of the analsysis below identify concerns with the normality and equality of variance assumptions. As a result, a square root transformation was applied as shown in the second ANOVA table. The residuals of both analysis are presented for comparison in part (c) of this problem. The analysis concludes that there is a difference between the mean number of responses for the three designs. Design Expert Output Response: Number of responses ANOVA for Selected Factorial Model Analysis of variance table [Terms added sequentially (first to last)] Sum of Mean F Source Squares DF Square Value Prob > F Block Model significant A Residual Cor Total 1.45E The Model F value of implies the model is significant. There is only a 0.0% chance that a "Model F Value" this large could occur due to noise.

7 Std. Dev R Squared Mean Adj R Squared C.V Pred R Squared PRESS Adeq Precision Treatment Means (Adjusted, If Necessary) Estimated Mean Standard Error Mean Standard t for H0 Treatment Difference DF Error Coeff=0 Prob > t 1 vs vs vs Design Expert Output for Transformed Data Response: Number of responsestransform:square root Constant: 0 ANOVA for Selected Factorial Model Analysis of variance table [Terms added sequentially (first to last)] Sum of Mean F Source Squares DF Square Value Prob > F Block Model significant A Residual Cor Total

8 The Model F value of implies the model is significant. There is only a 0.01% chance that a "Model F Value" this large could occur due to noise. Std. Dev R Squared Mean 18.5 Adj R Squared C.V Pred R Squared PRESS 1.05 Adeq Precision Treatment Means (Adjusted, If Necessary) Estimated Mean Standard Error Mean Standard t for H0 Treatment Difference DF Error Coeff=0 Prob > t 1 vs vs vs < (b) Use the Fisher LSD method to make comparisons among the three designs to determine specifically which designs differ in mean response rate. Based on the LSD bars in the Design Expert plot below, designs 1 and 3 do not differ; however, design is different than designs 1 and 3.

9 One Factor Plot Sqrt(Number of responses) A: Design (c) Analyze the residuals from this experiment. The first set of residual plots presented below represent the untransformed data. Concerns with normality as well as inequality of variance are presented. The second set of residual plots represent transformed data and do not identify significant violations of the assumptions. The residuals vs. design plot indicates a slight inequality of variance; however, not a strong violation and an improvement over the non transformed data. Normal Plot of vs. Predicted Normal % Probability Residual Predicted

10 vs. Design Design The following are the square root transformed data residual plots. Normal Plot of vs. Predicted Normal % Probability Residual Predicted

11 vs. Design Design 4.1. An aluminum master alloy manufacturer produces grain refiners in ingot form. The company produces the product in four furnaces. Each furnace is known to have its own unique operating characteristics, so any experiment run in the foundry that involves more than one furnace will consider furnaces as a nuisance variable. The process engineers suspect that stirring rate impacts the grain size of the product. Each furnace can be run at four different stirring rates. A randomized block design is run for a particular refiner and the resulting grain size data is as follows. Furnace Stirring Rate (a) Is there any evidence that stirring rate impacts grain size? Design Expert Output Response: Grain Size ANOVA for Selected Factorial Model

12 Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Block Model not significant A Residual Cor Total The "Model F value" of 0.85 implies the model is not significant relative to the noise. There is a % chance that a "Model F value" this large could occur due to noise. Std. Dev..95 R Squared 0.13 Mean 7.69 Adj R Squared C.V Pred R Squared PRESS 46.7 Adeq Precision Treatment Means (Adjusted, If Necessary) Estimated Mean Standard Error Mean Standard t for H0 Treatment Difference DF Error Coeff=0 Prob > t 1 vs vs vs

13 vs vs vs The analysis of variance shown above indicates that there is no difference in mean grain size due to the different stirring rates. (b) Graph the residuals from this experiment on a normal probability plot. Interpret this plot. Normal plot of residuals 99 Normal % probability Residual The plot indicates that normality assumption is valid. (c) Plot the residuals versus furnace and stirring rate. Does this plot convey any useful information?

14 vs. Stirring Rate vs. Furnace Stirring Rate Furnace The variance is consistent at different stirring rates. Not only does this validate the assumption of uniform variance, it also identifies that the different stirring rates do not affect variance. (d) What should the process engineers recommend concerning the choice of stirring rate and furnace for this particular grain refiner if small grain size is desirable? There really is no effect due to the stirring rate The effect of five different ingredients (A, B, C, D, E) on reaction time of a chemical process is being studied. Each batch of new material is only large enough to permit five runs to be made. Furthermore, each run requires approximately 1 1/ hours, so only five runs can be made in one day. The experimenter decides to run the experiment as a Latin square so that day and batch effects can be systematically controlled. She obtains the data that follow. Analyze the data from this experiment (use α = 0.05) and draw conclusions. Day Batch A=8 B=7 D=1 C=7 E=3 C=11 E= A=7 D=3 B=8 3 B=4 A=9 C=10 E=1 D=5

15 4 D=6 C=8 E=6 B=6 A=10 5 E=4 D= B=3 A=8 C=8 The Minitab output below identifies the ingredients as having a significant effect on reaction time. Minitab Output General Linear Model Factor Type Levels Values Batch random Day random Catalyst fixed 5 A B C D E Analysis of Variance for Time, using Adjusted SS for Tests Source DF Seq SS Adj SS Adj MS F P Catalyst Batch Day Error Total The yield of a chemical process is being studied. The two most important variables are thought to be the pressure and the temperature. Three levels of each factor are selected, and a factorial experiment with two replicates is performed. The yield data follow: Pressure Temperature

16 (a) Analyze the data and draw conclusions. Use α = Both pressure (A) and temperature (B) are significant, the interaction is not. Response:Surface Finish Design Expert Output ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model significant A B AB Residual Lack of Fit Pure Error Cor Total The Model F value of 8.00 implies the model is significant. There is only a 0.6% chance that a "Model F Value" this large could occur due to noise. Values of "Prob > F" less than indicate model terms are significant. In this case A, B are significant model terms.

17 Values greater than indicate the model terms are not significant. If there are many insignificant model terms (not counting those required to support hierarchy), model reduction may improve your model. (b) Prepare appropriate residual plots and comment on the model s adequacy. The residual plots show no serious deviations from the assumptions. vs. Predicted Normal plot of residuals E Normal % probability E Predicted Residual vs. Temperature vs. Pressure E E Temperature Pressure

18 (c) Under what conditions would you operate this process? DESIGN-EXPERT Plot Yield Interaction Graph Temperature X = A: Pressure Y = B: Temperature Design Points B1 150 B 160 B3 170 Yield Pressure Set pressure at 15 and Temperature at the high level, 170 degrees C, as this gives the highest yield. The standard analysis of variance treats all design factors as if they were qualitative. In this case, both factors are quantitative, so some further analysis can be performed. In Section 5.5, we show how response curves and surfaces can be fit to the data from a factorial experiment with at least one quantative factor. Since both factors in this problem are quantitative and have three levels, we can fit linear and quadratic effects of both temperature and pressure, exactly as in Example 5.5 in the text. The Design Expert output, including the response surface plots, now follows. Response:Surface Finish Design Expert Output ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model < significant A B

19 A < B AB Residual Lack of Fit 7.639E E not significant Pure Error Cor Total The Model F value of implies the model is significant. There is only a 0.01% chance that a "Model F Value" this large could occur due to noise. Values of "Prob > F" less than indicate model terms are significant. In this case A, B, A, B are significant model terms. Values greater than indicate the model terms are not significant. If there are many insignificant model terms (not counting those required to support hierarchy), model reduction may improve your model. Std. Dev. 0.1 R-Squared Mean Adj R-Squared C.V Pred R-Squared PRESS 0.4 Adeq Precision Coefficient Standard 95% CI 95% CI Factor Estimate DF Error Low High VIF Intercept A Pressure B Temperature E A B AB E

20 Final Equation in Terms of Coded Factors: Yield = * A * B 0.41 * A +0.4 * B * A * B Final Equation in Terms of Actual Factors: Yield = * Pressure * Temperature E 003 * Pressure E 003 * Temperature E 004 * Pressure * Temperature

21 Yield 90.7 B: Temperature Yield A: Pressure A: Pres s ure B: Temperature An engineer suspects that the surface finish of a metal part is influenced by the feed rate and the depth of cut. She selects three feed rates and four depths of cut. She then conducts a factorial experiment and obtains the following data: Depth of Cut (in) Feed Rate (in/min)

22 (a) Analyze the data and draw conclusions. Use α = The depth (A) and feed rate (B) are significant, as is the interaction (AB). Response: Surface Finish Design Expert Output ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model < significant A < B < AB Residual Lack of Fit Pure Error Cor Total The Model F value of implies the model is significant. There is only a 0.01% chance that a "Model F Value" this large could occur due to noise. Values of "Prob > F" less than indicate model terms are significant. In this case A, B, AB are significant model terms. (b) Prepare appropriate residual plots and comment on the model s adequacy.

23 The residual plots shown indicate nothing unusual. vs. Predicted Normal plot of residuals Normal % probability Predicted Residual vs. Feed Rate vs. Depth of Cut Feed Rate Depth of Cut (c) Obtain point estimates of the mean surface finish at each feed rate. Feed Rate Average

24 DESIGN-EXPERT Plot Surface Finish X = B: Feed Rate 114 One Factor Plot Warning! Factor involved in an interaction. Actual Factor A: Depth of Cut = Average Surface Finish Feed Rate (d) Find P values for the tests in part (a). The P values are given in the computer output in part (a) For the data in Problem 5.3, compute a 95 percent interval estimate of the mean difference in response for feed rates of 0.0 and 0.5 in/min. We wish to find a confidence interval on μ 1 μ, where μ 1 is the mean surface finish for 0.0 in/min and μ is the mean surface finish for 0.5 in/min. y y t MS E ( n 1) μ 1 μ y1.. y.. + tα, ab( n 1) n α, ab ) MS n E (8.7) ( ) ± (.064) = 16 ± Therefore, the 95% confidence interval for μ μ is ±

25 5.6. An article in Industrial Quality Control (1956, pp. 5 8) describes an experiment to investigate the effect of the type of glass and the type of phosphor on the brightness of a television tube. The response variable is the current necessary (in microamps) to obtain a specified brightness level. The data are as follows: Glass Phosphor Type Type (a) Is there any indication that either factor influences brightness? Use α = Both factors, phosphor type (A) and Glass type (B) influence brightness. Response: Current in microamps Design Expert Output ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model < significant A B <

26 AB Residual Lack of Fit Pure Error Cor Total The Model F value of implies the model is significant. There is only a 0.01% chance that a "Model F Value" this large could occur due to noise. Values of "Prob > F" less than indicate model terms are significant. In this case A, B are significant model terms. (b) Do the two factors interact? Use α = There is no interaction effect. (c) Analyze the residuals from this experiment. The residual plot of residuals versus phosphor content indicates a very slight inequality of variance. It is not serious enough to be of concern, however.

27 vs. Predicted Normal plot of residuals Normal % probability Predicted Residual vs. Glass Type vs. Phosphor Type Glass Type Phosphor Type 5.7. Johnson and Leone (Statistics and Experimental Design in Engineering and the Physical Sciences, Wiley 1977) describe an experiment to investigate the warping of copper plates. The two factors studied were the temperature and the copper content of the plates. The response variable was a measure of the amount of warping. The data were as follows: Copper Content (%) Temperature ( C)

28 50 17,0 16,1 4, 8,7 75 1,9 18,13 17,1 7, ,1 18,1 5,3 30,3 15 1,17 3,1 3, 9,31 (a) Is there any indication that either factor affects the amount of warping? Is there any interaction between the factors? Use α = Both factors, copper content (A) and temperature (B) affect warping, the interaction does not. Response: Warping Design Expert Output ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model < significant A < B AB Residual Lack of Fit Pure Error Cor Total The Model F value of 9.5 implies the model is significant. There is only a 0.01% chance that a "Model F Value" this large could occur due to noise. Values of "Prob > F" less than indicate model terms are significant. In this case A, B are significant model terms.

29 (b) Analyze the residuals from this experiment. There is nothing unusual about the residual plots. Normal plot of residuals vs. Predicted Normal % probability E E Residual Predicted vs. Copper Content vs. Temperature E E Copper Content Temperature (c) Plot the average warping at each level of copper content and compare them to an appropriately scaled t distribution. Describe the differences in the effects of the different levels of copper content on warping. If low warping is desirable, what level of copper content would you specify?

30 Design Expert Output Factor Name Level Low Level High Level A Copper Content B Temperature Average Prediction SE Mean 95% CI low 95% CI high SE Pred 95% PI low 95% PI high Warping Factor Name Level Low Level High Level A Copper Content B Temperature Average Prediction SE Mean 95% CI low 95% CI high SE Pred 95% PI low 95% PI high Warping Factor Name Level Low Level High Level A Copper Content B Temperature Average Prediction SE Mean 95% CI low 95% CI high SE Pred 95% PI low 95% PI high Warping Factor Name Level Low Level High Level A Copper Content B Temperature Average Prediction SE Mean 95% CI low 95% CI high SE Pred 95% PI low 95% PI high

31 Warping Use a copper content of 40 for the lowest warping. MS S = E = = 0.9 b 8 Scaled t Distribution Cu=40 Cu=60 Cu=80 Cu= Warping (d) Suppose that temperature cannot be easily controlled in the environment in which the copper plates are to be used. Does this change your answer for part (c)? Use a copper of content of 40. This is the same as for part (c).

32 DESIGN-EXPERT Plot Warping X = A: Copper Content Y = B: Temperature Design Points Interaction Graph Temperature B1 50 B 75 B3 100 B4 15 Warping Copper Content 5.9. A mechanical engineer is studying the thrust force developed by a drill press. He suspects that the drilling speed and the feed rate of the material are the most important factors. He selects four feed rates and uses a high and low drill speed chosen to represent the extreme operating conditions. He obtains the following results. Analyze the data and draw conclusions. Use α = (A) Feed Rate (B) Drill Speed

33 Design Expert Output Response: Force ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model significant A < B AB Residual E 003 Lack of Fit Pure Error E 003 Cor Total The Model F value of implies the model is significant. There is only a 0.05% chance that a "Model F Value" this large could occur due to noise. Values of "Prob > F" less than indicate model terms are significant. In this case A, B, AB are significant model terms. The factors speed and feed rate, as well as the interaction is important.

34 DESIGN-EXPERT Plot Force Interaction Graph Drill Speed X = B: Feed Rate Y = A: Drill Speed Design Points A1 15 A Force Feed Rate The standard analysis of variance treats all design factors as if they were qualitative. In this case, both factors are quantitative, so some further analysis can be performed. In Section 5.5, we show how response curves and surfaces can be fit to the data from a factorial experiment with at least one quantative factor. Since both factors in this problem are quantitative, we can fit polynomial effects of both speed and feed rate, exactly as in Example 5.5 in the text. The Design Expert output with only the significant terms retained, including the response surface plots, now follows. Response: Force Design Expert Output ANOVA for Selected Factorial Model Analysis of variance table [Partial sum of squares] Sum of Mean F Source Squares DF Square Value Prob > F Model significant A B B Residual E 003 Lack of Fit significant Pure Error E 003 Cor Total

35 The Model F value of implies the model is significant. There is only a 0.08% chance that a "Model F Value" this large could occur due to noise. Values of "Prob > F" less than indicate model terms are significant. In this case A, B are significant model terms. Values greater than indicate the model terms are not significant. If there are many insignificant model terms (not counting those required to support hierarchy), model reduction may improve your model. Std. Dev R-Squared Mean.77 Adj R-Squared C.V..9 Pred R-Squared PRESS 0.14 Adeq Precision 9.69 Coefficient Standard 95% CI 95% CI Factor Estimate DF Error Low High VIF Intercept A Drill Speed B Feed Rate B Final Equation in Terms of Coded Factors: Force = * A * B * B Final Equation in Terms of Actual Factors:

36 Force = E 003 * Drill Speed * Feed Rate * Feed Rate Force B: Feed Rate Force A: Drill Speed B: Feed Rate A: Drill Speed

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

Chapter 2 Simple Comparative Experiments Solutions

Solutions from Montgomery, D. C. () Design and Analysis of Experiments, Wiley, NY Chapter Simple Comparative Experiments Solutions - The breaking strength of a fiber is required to be at least 5 psi. Past

Data and Regression Analysis. Lecturer: Prof. Duane S. Boning. Rev 10

Data and Regression Analysis Lecturer: Prof. Duane S. Boning Rev 10 1 Agenda 1. Comparison of Treatments (One Variable) Analysis of Variance (ANOVA) 2. Multivariate Analysis of Variance Model forms 3.

Multiple Regression YX1 YX2 X1X2 YX1.X2

Multiple Regression Simple or total correlation: relationship between one dependent and one independent variable, Y versus X Coefficient of simple determination: r (or r, r ) YX YX XX Partial correlation:

Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

Full Factorial Design of Experiments

Full Factorial Design of Experiments 0 Module Objectives Module Objectives By the end of this module, the participant will: Generate a full factorial design Look for factor interactions Develop coded orthogonal

Using Minitab for Regression Analysis: An extended example

Using Minitab for Regression Analysis: An extended example The following example uses data from another text on fertilizer application and crop yield, and is intended to show how Minitab can be used to

Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005

Multiple Optimization Using the JMP Statistical Software Kodak Research Conference May 9, 2005 Philip J. Ramsey, Ph.D., Mia L. Stephens, MS, Marie Gaudard, Ph.D. North Haven Group, http://www.northhavengroup.com/

e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Centre for Excellence in Epidemiology University of Manchester 08/11/2016 Structure This Week What is a linear model? How

Unit 6: A Latin Square Design

Minitab Notes for STAT 605 Dept. of Statistics CSU East Bay Unit 6: A Latin Square Design 6.. The Data An oil company tested four different blends of gasoline for fuel efficiency according to a Latin square

1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

Ch. 12 The Analysis of Variance. can be modelled as normal random variables. come from populations having the same variance.

Ch. 12 The Analysis of Variance Residual Analysis and Model Assessment Model Assumptions: The measurements X i,j can be modelled as normal random variables. are statistically independent. come from populations

In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a

Math 143 Inference on Regression 1 Review of Linear Regression In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a bivariate data set (i.e., a list of cases/subjects

The Simple Linear Regression Model: Specification and Estimation

Chapter 3 The Simple Linear Regression Model: Specification and Estimation 3.1 An Economic Model Suppose that we are interested in studying the relationship between household income and expenditure on

Using R for Linear Regression

Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

INTERPRETING THE REPEATED-MEASURES ANOVA

INTERPRETING THE REPEATED-MEASURES ANOVA USING THE SPSS GENERAL LINEAR MODEL PROGRAM RM ANOVA In this scenario (based on a RM ANOVA example from Leech, Barrett, and Morgan, 2005) each of 12 participants

Coefficient of Determination

Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed

IAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results

IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the

Residuals. Residuals = ª Department of ISM, University of Alabama, ST 260, M23 Residuals & Minitab. ^ e i = y i - y i

A continuation of regression analysis Lesson Objectives Continue to build on regression analysis. Learn how residual plots help identify problems with the analysis. M23-1 M23-2 Example 1: continued Case

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

12.13 RESIDUAL ANALYSIS IN MULTIPLE REGRESSION (OPTIONAL)

12.13 Analysis in Multiple Regression (Optional) 1 Although Excel and MegaStat are emphasized in Business Statistics in Practice, Second Canadian Edition, some examples in the additional material on Connect

One-Way ANOVA using SPSS 11.0. SPSS ANOVA procedures found in the Compare Means analyses. Specifically, we demonstrate

1 One-Way ANOVA using SPSS 11.0 This section covers steps for testing the difference between three or more group means using the SPSS ANOVA procedures found in the Compare Means analyses. Specifically,

Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

Minitab Tutorials for Design and Analysis of Experiments. Table of Contents

Table of Contents Introduction to Minitab...2 Example 1 One-Way ANOVA...3 Determining Sample Size in One-way ANOVA...8 Example 2 Two-factor Factorial Design...9 Example 3: Randomized Complete Block Design...14

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

Topic 9. Factorial Experiments [ST&D Chapter 15]

Topic 9. Factorial Experiments [ST&D Chapter 5] 9.. Introduction In earlier times factors were studied one at a time, with separate experiments devoted to each factor. In the factorial approach, the investigator

Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

HOW TO USE MINITAB: DESIGN OF EXPERIMENTS. Noelle M. Richard 08/27/14

HOW TO USE MINITAB: DESIGN OF EXPERIMENTS 1 Noelle M. Richard 08/27/14 CONTENTS 1. Terminology 2. Factorial Designs When to Use? (preliminary experiments) Full Factorial Design General Full Factorial Design

Name: Student ID#: Serial #:

STAT 22 Business Statistics II- Term3 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS Department Of Mathematics & Statistics DHAHRAN, SAUDI ARABIA STAT 22: BUSINESS STATISTICS II Third Exam July, 202 9:20

Design and analysis of experiments

Tropical animal feeding: a manual for research workers 155 Chapter 8 Design and analysis of experiments This chapter was contributed by Andrew Speedy, University of Oxford, UK. The objective is to assist

Simple Linear Regression

Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression Statistical model for linear regression Estimating

Demonstrating Systematic Sampling

Demonstrating Systematic Sampling Julie W. Pepe, University of Central Florida, Orlando, Florida Abstract A real data set involving the number of reference requests at a university library, will be used

Multiple Regression Analysis in Minitab 1

Multiple Regression Analysis in Minitab 1 Suppose we are interested in how the exercise and body mass index affect the blood pressure. A random sample of 10 males 50 years of age is selected and their

How To Run Statistical Tests in Excel

How To Run Statistical Tests in Excel Microsoft Excel is your best tool for storing and manipulating data, calculating basic descriptive statistics such as means and standard deviations, and conducting

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

Inference for Regression

Simple Linear Regression Inference for Regression The simple linear regression model Estimating regression parameters; Confidence intervals and significance tests for regression parameters Inference about

Analysis of Covariance

Analysis of Covariance 1. Introduction The Analysis of Covariance (generally known as ANCOVA) is a technique that sits between analysis of variance and regression analysis. It has a number of purposes

POLYNOMIAL AND MULTIPLE REGRESSION. Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model.

Polynomial Regression POLYNOMIAL AND MULTIPLE REGRESSION Polynomial regression used to fit nonlinear (e.g. curvilinear) data into a least squares linear regression model. It is a form of linear regression

12-1 Multiple Linear Regression Models

12-1.1 Introduction Many applications of regression analysis involve situations in which there are more than one regressor variable. A regression model that contains more than one regressor variable is

CHAPTER 9: SERIAL CORRELATION

Serial correlation (or autocorrelation) is the violation of Assumption 4 (observations of the error term are uncorrelated with each other). Pure Serial Correlation This type of correlation tends to be

Lecture - 32 Regression Modelling Using SPSS

Applied Multivariate Statistical Modelling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Lecture - 32 Regression Modelling Using SPSS (Refer

Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.

Multiple Regression in SPSS STAT 314

Multiple Regression in SPSS STAT 314 I. The accompanying data is on y = profit margin of savings and loan companies in a given year, x 1 = net revenues in that year, and x 2 = number of savings and loan

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

International Statistical Institute, 56th Session, 2007: Phil Everson

Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

SELF-TEST: SIMPLE REGRESSION

ECO 22000 McRAE SELF-TEST: SIMPLE REGRESSION Note: Those questions indicated with an (N) are unlikely to appear in this form on an in-class examination, but you should be able to describe the procedures

Confidence Intervals for the Difference Between Two Means

Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

REGRESSION LINES IN STATA

REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

Analysis of Variance. MINITAB User s Guide 2 3-1

3 Analysis of Variance Analysis of Variance Overview, 3-2 One-Way Analysis of Variance, 3-5 Two-Way Analysis of Variance, 3-11 Analysis of Means, 3-13 Overview of Balanced ANOVA and GLM, 3-18 Balanced

Stat 5303 (Oehlert): Tukey One Degree of Freedom 1

Stat 5303 (Oehlert): Tukey One Degree of Freedom 1 > catch

Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design October 2009

Petter Mostad Matematisk Statistik Chalmers Suggested solution for exam in MSA830: Statistical Analysis and Experimental Design October 2009 1. (a) To use a t-test, one must assume that both groups of

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Open book and note Calculator OK Multiple Choice 1 point each MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Find the mean for the given sample data.

Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

Parameter Setting for Rod Push Production Process on Multiple Responses : A Case Study in Motorcycle Parts Factory

Research rticle Parameter Setting for Rod Push Production Process on Multiple Responses : Case Study in Motorcycle Parts Factory Erawin Thavorn * and Prapaisri Sudasna-na-yudthya epartment of Industrial

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

0.1 Multiple Regression Models

0.1 Multiple Regression Models We will introduce the multiple Regression model as a mean of relating one numerical response variable y to two or more independent (or predictor variables. We will see different

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

PARAMETER OPTIMIZATION IN VERTICAL MACHINING CENTER CNC FOR EN45 (STEEL ALLOY) USING RESPONSE SURFACE METHODOLOGY

International Journal of Mechanical Engineering and Technology (IJMET) Volume 7, Issue 2, March-April 2016, pp. 288 299, Article ID: IJMET_07_02_031 Available online at http://www.iaeme.com/ijmet/issues.asp?jtype=ijmet&vtype=7&itype=2

Multicollinearity in Regression Models

00 Jeeshim and KUCC65. (003-05-09) Multicollinearity.doc Introduction Multicollinearity in Regression Models Multicollinearity is a high degree of correlation (linear dependency) among several independent

Determining Optimal Moulding Process Parameters by Two Level Factorial Design with Center Points

Determining Optimal Moulding Process Parameters by Two Level Factorial Design with Center Points S.Rajalingam 1* Awang Bono 2 and Jumat bin Sulaiman 3 1 School of Engineering and Science, Curtin University,

Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

Factor B: Curriculum New Math Control Curriculum (B (B 1 ) Overall Mean (marginal) Females (A 1 ) Factor A: Gender Males (A 2) X 21

1 Factorial ANOVA The ANOVA designs we have dealt with up to this point, known as simple ANOVA or oneway ANOVA, had only one independent grouping variable or factor. However, oftentimes a researcher has

ACTM State Exam-Statistics

ACTM State Exam-Statistics For the 25 multiple-choice questions, make your answer choice and record it on the answer sheet provided. Once you have completed that section of the test, proceed to the tie-breaker

Multiple Linear Regression

Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

A. Karpinski

Chapter 3 Multiple Linear Regression Page 1. Overview of multiple regression 3-2 2. Considering relationships among variables 3-3 3. Extending the simple regression model to multiple predictors 3-4 4.

Soci708 Statistics for Sociologists

Soci708 Statistics for Sociologists Module 11 Multiple Regression 1 François Nielsen University of North Carolina Chapel Hill Fall 2009 1 Adapted from slides for the course Quantitative Methods in Sociology

Correlation and Regression

Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look

4.4. Further Analysis within ANOVA

4.4. Further Analysis within ANOVA 1) Estimation of the effects Fixed effects model: α i = µ i µ is estimated by a i = ( x i x) if H 0 : µ 1 = µ 2 = = µ k is rejected. Random effects model: If H 0 : σa

Paired Differences and Regression

Paired Differences and Regression Students sometimes have difficulty distinguishing between paired data and independent samples when comparing two means. One can return to this topic after covering simple

Testing for Lack of Fit

Chapter 6 Testing for Lack of Fit How can we tell if a model fits the data? If the model is correct then ˆσ 2 should be an unbiased estimate of σ 2. If we have a model which is not complex enough to fit

31. SIMPLE LINEAR REGRESSION VI: LEVERAGE AND INFLUENCE

31. SIMPLE LINEAR REGRESSION VI: LEVERAGE AND INFLUENCE These topics are not covered in the text, but they are important. Leverage If the data set contains outliers, these can affect the leastsquares fit.

This paper describes an interactive spreadsheet-based tool that can be used to generate data representative

Vol. 8, No. 2, January 2008, pp. 55 64 issn 1532-0545 08 0802 0055 informs I N F O R M S Transactions on Education An Interactive Spreadsheet-Based Tool to Support Teaching Design of Experiments S. T.

10. Analysis of Longitudinal Studies Repeat-measures analysis

Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.

Assessing Forecasting Error: The Prediction Interval. David Gerbing School of Business Administration Portland State University

Assessing Forecasting Error: The Prediction Interval David Gerbing School of Business Administration Portland State University November 27, 2015 Contents 1 Prediction Intervals 1 1.1 Modeling Error..................................

August 2012 EXAMINATIONS Solution Part I

August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

An analysis method for a quantitative outcome and two categorical explanatory variables.

Chapter 11 Two-Way ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that

Week 5: Multiple Linear Regression

BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

CRJ Doctoral Comprehensive Exam Statistics Friday August 23, :00pm 5:30pm

CRJ Doctoral Comprehensive Exam Statistics Friday August 23, 23 2:pm 5:3pm Instructions: (Answer all questions below) Question I: Data Collection and Bivariate Hypothesis Testing. Answer the following

Topic 6: Randomized Complete Block Designs (RCBD's) [ST&D sections (except 9.6) and section 15.8]

Topic 6: Randomized Complete Block Designs (RCBD's) [ST&D sections 9. 9.7 (except 9.6) and section 5.8] 6.. Variability in the completely randomized design (CRD) In the CRD, it is assumed that all experimental

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

GLOSSARY FOR DESIGN OF EXPERIMENTS

GLOSSARY FOR DESIGN OF EXPERIMENTS Adaptation provenant du site Internet http://www.mathoptions.com ANOVA -- An acronym for "ANalysis Of VAriance," a statistical technique that separates the variation

Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

Two-level designs. Table 1. Factors and Factor Levels

Two-level designs In this exercise, we will focus on the analysis of an unreplicated full factorial twolevel design, typically referred to as a 2 k design k factors, all crossed, with two levels each.

Lecture Notes Module 1

Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific

CHAPTER 2 AND 10: Least Squares Regression

CHAPTER 2 AND 0: Least Squares Regression In chapter 2 and 0 we will be looking at the relationship between two quantitative variables measured on the same individual. General Procedure:. Make a scatterplot

Regression Analysis: Basic Concepts

The simple linear model Regression Analysis: Basic Concepts Allin Cottrell Represents the dependent variable, y i, as a linear function of one independent variable, x i, subject to a random disturbance

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

An example ANOVA situation. 1-Way ANOVA. Some notation for ANOVA. Are these differences significant? Example (Treating Blisters)

An example ANOVA situation Example (Treating Blisters) 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Subjects: 25 patients with blisters Treatments: Treatment A, Treatment