CHAPTER 2 AND 10: Least Squares Regression


 Adrian Pope
In Chapters 2 and 10 we look at the relationship between two quantitative variables measured on the same individuals.

General Procedure:

1. Make a scatterplot and describe the form, direction, and strength of the relationship. Note: fitting a line only makes sense if the overall pattern of the scatterplot is roughly linear.
2. Look for outliers and influential observations on the scatterplot.
   a. Note: Inference is not safe if there are influential points, because the results depend strongly on these few points. It is often helpful to rework the data without the influential observations and compare the results.
3. Find the correlation $r$ to get a numerical measure of the direction and strength of the linear relationship.
4. Find $r^2$, the fraction of the variation in the values of y that is explained by the least squares regression of y on x.
5. If the data are reasonably linear, find the least squares regression line for the data. Note: The line can be used to predict y for a given x.
6. Make a residual plot and a normal probability plot to check the regression assumptions.
7. If and only if your data were collected using random sampling techniques, you can look at the hypothesis tests and confidence intervals for the correlation, slope, and intercept.
8. If and only if your data were collected using random sampling techniques, you can look at the confidence intervals for the mean response and the prediction intervals.
Association Between Variables:

Two variables measured on the same individuals are associated if some values of one variable tend to occur more often with some values of the second variable than with other values of that variable.

Just because two variables are associated doesn't mean that a change in one variable causes a change in the other. (See Section 2.6 on causation.) Also, the relationship between two variables might not tell the whole story. Other variables may affect the relationship. These other variables are called lurking variables.

Positive association: Above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.

Negative association: Above-average values of one variable tend to accompany below-average values of the other, and vice versa.

No association: It is hard to find a pattern showing a relationship between the variables.

Response variable: Measures an outcome of a study. Also called the dependent variable, Y.

Explanatory variable: Explains or causes changes in the response variable. Also called the independent variable, X.
Example 1: A forester has become adept at estimating the volume (in cubic feet) of trees on a particular site prior to a timber sale. Since his operation has now expanded, he would like to train another person to assist in estimating the cubic foot volume of trees. He decides to create a model that will allow him to obtain the actual tree volume from his assistant's estimate. The forester selects a random sample of trees to be felled. For each tree, the assistant guesses the cubic foot volume of the tree. The forester also obtains the actual cubic foot volume after the tree has been chopped down. Below is his data:

[Table: Tree, Estimated Volume, Actual Volume; the data values are not preserved in this transcription.]

STEP 1: Make a scatterplot; describe the form, direction, and strength of the relationship.

Before making the scatterplot you need to decide which variable is the explanatory variable and which is the response variable. For Example 1, identify the explanatory and the response variables.

Explanatory Variable:
Response Variable:

A scatterplot shows the relationship between two quantitative variables measured on the same individuals. The explanatory variable is plotted on the x axis; the response variable is plotted on the y axis.

Look at the overall pattern. The overall pattern can be described by form, direction, and strength.

Form: Is the scatterplot linear, quadratic, etc.?
Direction: Is the association positive or negative?
Strength: How strong is the relationship?

Describe the scatterplot in Example 1.

Form:
Strength:
Direction:
Scatterplot Using SPSS: Graphs > Scatter/Dot. Select Simple and click Define. Pull estimate into the X Axis box and actual into the Y Axis box, then click OK.

Note: To get the fitted line, double-click on your graph to bring up the chart editor. Then click the button that looks like a scatterplot with a fitted line through it. Select Linear and then close.

[Figure: scatterplot "Estimated Volume versus Actual Volume of Trees" with fitted line; x axis: Estimate, y axis: Actual; the R Sq Linear value is not preserved in this transcription.]

STEP 2: Look for outliers and influential observations on the scatterplot.

Look for striking deviations from the overall pattern.

Outlier: An observation that lies outside the overall pattern of the other observations. Points that are outliers in the y direction of a scatterplot have large regression residuals, but other outliers need not have large residuals.

Influential observation: An observation that, if removed, would markedly change the result of the regression calculation. Points that are outliers in the x direction of a scatterplot are often influential for the least squares regression line.

Are there any outliers or influential observations in our data?

Note: To add a categorical variable to a scatterplot, use a different plot color or symbol for each category.
STEP 3: Find the correlation r.

The correlation measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$$

Properties of correlation:

- It makes no difference which variable you call x and which you call y, since correlation does not make use of the distinction between the explanatory variable and the response variable.
- Both variables need to be quantitative to calculate correlation.
- The correlation r does not change if we change the units of measurement of x, y, or both.
- A positive r corresponds to a positive relationship between the variables. A negative r corresponds to a negative relationship between the variables.
- $-1 \le r \le 1$. Values near 0 indicate a weak relationship, and values close to $-1$ or $1$ indicate a strong relationship.
- Correlation measures the strength of only a linear relationship.
- Like the mean and standard deviation, the correlation is not resistant. The correlation r is strongly affected by a few outlying observations. Use r with caution when outliers appear in the scatterplot.
- Correlation is not a complete description of two-variable data. You should give the means and standard deviations of both x and y along with the correlation.

For Example 1, find the correlation between estimated volume and actual volume. We will use the Pearson Correlation output from SPSS here.
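For readers without SPSS at hand, the correlation formula in STEP 3 can be sketched in plain Python. This is a minimal illustration, not the SPSS procedure: the function name `pearson_r` and the five data pairs are assumptions made up for the example (the forester's actual data are not available here).

```python
import math

def pearson_r(xs, ys):
    """Correlation r = (1/(n-1)) * sum of products of standardized x and y."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical (estimated, actual) volumes in cubic feet; NOT the forester's data.
estimate = [8.0, 10.0, 12.0, 15.0, 18.0]
actual = [9.0, 11.0, 12.0, 17.0, 19.0]

r = pearson_r(estimate, actual)
print(f"r = {r:.4f}")
```

Swapping the two arguments, or rescaling one variable (for example, multiplying every estimate by 12), should leave r unchanged, which matches the first and third properties listed above.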
Note: The SPSS manual tells you where to find r using the least squares regression output, but this r is actually the ABSOLUTE VALUE OF r, so you need to figure out the sign yourself by looking at the association (positive or negative) of your data.

Correlations (SPSS output; the Sig. (2-tailed) and N values are not preserved in this transcription)

                                  Estimate   Actual
Estimate   Pearson Correlation    1          .936**
Actual     Pearson Correlation    .936**     1

**. Correlation is significant at the 0.01 level (2-tailed).

STEP 4: Find $r^2$.

$r^2$ is the percent of the variation in y explained by the regression line (the closer to 100%, the better). We can get this from the regression output or by squaring the correlation r.

For Example 1, find the percent of variation in actual volume of trees explained by the regression line.

STEP 5: Find the least squares regression line for the data.

The least squares regression line of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.

We have data on an explanatory variable x and a response variable y for n individuals. The means and standard deviations of the data are $\bar{x}$ and $s_x$ for x, and $\bar{y}$ and $s_y$ for y; the correlation between x and y is r.

The regression model for the population is:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$

The sample prediction equation of the least squares regression is:

$$\hat{y} = b_0 + b_1 x$$

The slope is:

$$b_1 = r \frac{s_y}{s_x}$$

The slope measures the predicted change in the response variable when the explanatory variable increases by one unit.

The intercept is:

$$b_0 = \bar{y} - b_1 \bar{x}$$
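The slope and intercept formulas in STEP 5 can be checked by hand with a short Python sketch. The function name `least_squares_line` and the data values are assumptions for illustration only; SPSS computes the same quantities from the real data.

```python
import math

def least_squares_line(xs, ys):
    """Return (b0, b1, r) using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_x = math.sqrt(sxx / (n - 1))          # sample standard deviation of x
    s_y = math.sqrt(syy / (n - 1))          # sample standard deviation of y
    r = sxy / ((n - 1) * s_x * s_y)         # correlation
    b1 = r * s_y / s_x                      # slope
    b0 = y_bar - b1 * x_bar                 # intercept
    return b0, b1, r

# Hypothetical (estimated, actual) volumes; NOT the forester's data.
estimate = [8.0, 10.0, 12.0, 15.0, 18.0]
actual = [9.0, 11.0, 12.0, 17.0, 19.0]

b0, b1, r = least_squares_line(estimate, actual)
print(f"y-hat = {b0:.3f} + {b1:.3f} x")
print(f"r^2 = {r * r:.3f}")  # fraction of variation in y explained by the line
```

One consequence of the intercept formula is that the fitted line always passes through the point $(\bar{x}, \bar{y})$.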
Using SPSS: Analyze > Regression > Linear. Put estimate into the Independent box and actual into the Dependent box. Click OK.

[SPSS output: Model Summary (R = .936; R Square, Adjusted R Square, and Std. Error of the Estimate are not preserved in this transcription); ANOVA table (Sum of Squares, df, Mean Square, F, Sig. for Regression, Residual, Total); Coefficients table (unstandardized B and Std. Error, standardized Beta, t, Sig. for (Constant) and Estimate). Predictors: (Constant), Estimate. Dependent Variable: Actual.]

For Example 1, find the least squares regression line.

Based on the $r^2$ value you found previously, do you think this line will be useful for predicting actual tree volumes?

For Example 1, use the regression line to predict the actual volume of a tree with an estimated volume of 3 cubic feet.

We can use a regression line to make predictions as long as we follow these rules:

- Only use the least squares regression line to find y for a specific value of x. (Don't use it to find x for a specific value of y!)
- Extrapolation means using the line to find y values corresponding to x values that are outside the range of the x values in our data. Typically we want to avoid this, since the line may not be valid over a wide range of x values.
Example 2: (From Moore and McCabe, fourth edition) During the period after birth, a male white rat gains 40 grams (g) per week until about 10 weeks of age. (This rat is unusually regular in his growth, but 40 g per week is a realistic rate.)

a. If the rat weighed 100 g at birth, give an equation for his weight after x weeks. What is the slope of this line?
b. Would you be willing to use this line to predict the rat's weight at age 2 years? Do the prediction and think about the reasonableness of the result. (There are 454 g in a pound. To help you assess the result, a large cat weighs about 10 pounds.)

Prediction Intervals

A prediction interval predicts a future observation under conditions similar to those used in the study. Since there is variability involved in using a model created from sample data, a prediction interval is better than a single prediction. Prediction intervals are related to confidence intervals. Use SPSS to calculate these intervals.

Residual: The vertical distance between the observed y value and the corresponding predicted y value.

$$\text{residual} = e_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_i)$$

For Example 1, find the residual for tree number [number missing] and tree number 7.

Assumptions for Regression Inference and the Regression Model:

1. Repeated responses of y are independent of each other. This basically means the data come from a simple random sample. (To check this assumption, examine the way in which the units were selected.)
2. For any fixed value of x, the response y varies according to a normal distribution. (To check the assumption of normality, you can make a normal probability plot of the residuals in SPSS.)
3. The relationship is linear. (To check the linearity assumption, you can make a scatterplot or a residual plot of the data.)
4. The standard deviation $\sigma$ of y is the same for all values of x. The value of $\sigma$ is unknown. (To check for constant variability, you can look at the residual plot of the data.)
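The residual definition on this page (observed y minus predicted y) can be sketched in Python as follows. The helper names `fit_line` and `residuals` and the data are hypothetical; the slope is computed as $S_{xy}/S_{xx}$, which is algebraically the same as $r\,s_y/s_x$.

```python
def fit_line(xs, ys):
    """Least squares intercept b0 and slope b1 for y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                 # equals r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

def residuals(xs, ys, b0, b1):
    """e_i = y_i - (b0 + b1 * x_i) for each observation."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Hypothetical (estimated, actual) volumes; NOT the forester's data.
estimate = [8.0, 10.0, 12.0, 15.0, 18.0]
actual = [9.0, 11.0, 12.0, 17.0, 19.0]

b0, b1 = fit_line(estimate, actual)
e = residuals(estimate, actual, b0, b1)
for tree, e_i in enumerate(e, start=1):
    print(f"tree {tree}: residual = {e_i:+.3f}")
```

The residuals from a least squares line always sum to zero (up to rounding), which is why a residual plot is centered on the horizontal line at 0.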
STEP 6: Make a residual plot and a normal probability plot to check the regression assumptions.

It is always important to check that the assumptions of the regression model have been met to determine whether your results are valid. This is also important to do before you proceed with inference.

Normal Probability Plots: If your points fall in a relatively straight line, then you can assume that your response is approximately normal and the second assumption has been met. To check the normality assumption, make a normal probability plot as follows.

Using SPSS: Analyze > Regression > Linear. Put estimate into the Independent box and actual into the Dependent box. Then select Plots, check the box for Normal probability plot, and click Continue followed by OK.

[Figure: Normal P-P Plot of Regression Standardized Residual; Dependent Variable: Actual; x axis: Observed Cum Prob, y axis: Expected Cum Prob.]

For Example 1, has the normality assumption been met?

Residual Plots: A residual plot is a scatterplot of the regression residuals against the explanatory variable. It is used to assess the fit of a regression line and to check for constant variability. The residual plot magnifies the deviations from the line to make the pattern easy to read.

- If the points are random with no pattern, and approximately the same number of points lie above and below the center line, you can feel confident that assumptions three and four have been met.
- If you see a funnel shape, the assumption of constant variance has not been met.
- If you see some other pattern, such as a parabola, the linearity assumption has not been met.
Using SPSS: Analyze > Regression > Linear. Put estimate into the Independent box and actual into the Dependent box. Click the Save button, check Unstandardized residuals, and click Continue. The residuals will appear in the data editor. Make a scatterplot with the residuals on the y axis against the estimated volume on the x axis. To get the line at y = 0, right-click on the graph in the chart editor and select Add Y Axis Reference Line. Then select Reference Line and enter zero for the y axis position.

[Figure: Residual Plot; x axis: Estimate, y axis: Unstandardized Residual.]

For Example 1, have the assumptions of linearity and constant variability been met?

Lastly, it is important to check for outliers and influential observations. Look for large residuals or points that are far from the other points. Often we will want to do the analysis both with and without the outliers, particularly if they are also influential observations.

Are there any outliers or influential observations in Example 1?
Scatterplot Without the Influential Observation

[Figure: scatterplot with fitted line; x axis: Estimate, y axis: Actual; the R Sq Linear value is not preserved in this transcription.]

[SPSS output: Model Summary (R = .86 as transcribed; R Square, Adjusted R Square, and Std. Error of the Estimate are not preserved); Coefficients table (unstandardized B and Std. Error, standardized Beta, t, Sig. for (Constant) and Estimate; values not preserved). Predictors: (Constant), Estimate. Dependent Variable: Actual.]
STEP 7: Look at the hypothesis tests and confidence intervals.

Up to this point we have looked at regression concepts that can be used in exploratory data analysis as well as in a more formal setting. We now turn to inference for regression. Before we do, it must be understood that the tests and confidence intervals from here on are valid only for data collected using a random sampling technique such as simple random sampling. If we did not collect our data using a random sample, or if we have conducted a census, these techniques are meaningless.

Test for a Zero Population Correlation:

- State the null and alternative hypotheses: $H_0: \rho = 0$ versus $H_a: \rho > 0$, $H_a: \rho < 0$, or $H_a: \rho \neq 0$.
- Find the test statistic

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

  where n is the sample size and r is the sample correlation.
- Calculate the P-value in terms of a random variable T having the $t(n-2)$ distribution. The P-value for a test of $H_0$ against
  $H_a: \rho > 0$ is $P(T \ge t)$
  $H_a: \rho < 0$ is $P(T \le t)$
  $H_a: \rho \neq 0$ is $2P(T \ge |t|)$
- Compare the P-value to the α level.
  If P-value ≤ α, then reject $H_0$.
  If P-value > α, then fail to reject $H_0$.
- State your conclusions in terms of the problem.

For Example 1, test $H_0: \rho = 0$ versus $H_a$ [direction of the alternative not preserved in this transcription].
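As a rough illustration of the test statistic for a zero population correlation, the following Python sketch evaluates $t = r\sqrt{n-2}/\sqrt{1-r^2}$. The value r = 0.936 is taken from the SPSS output for the tree data, but the sample size n = 10 is a placeholder assumption because N is not available here; the function name `correlation_t` is also hypothetical. The P-value would still come from a t table or from SPSS.

```python
import math

def correlation_t(r, n):
    """Return (t, df) for testing H0: rho = 0, where t = r*sqrt(n-2)/sqrt(1-r^2)."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return t, n - 2

# r from the SPSS correlation output; n = 10 is an assumed placeholder sample size.
t, df = correlation_t(0.936, 10)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The sign of t always matches the sign of r, so a one-sided alternative is supported only when r points in the direction stated by $H_a$.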
Confidence Intervals for Regression Slope and Intercept:

A level C confidence interval for the intercept $\beta_0$ is

$$b_0 \pm t^* SE_{b_0}$$

A level C confidence interval for the slope $\beta_1$ is

$$b_1 \pm t^* SE_{b_1}$$

SPSS will give you these confidence intervals at the 95% level, but for other confidence levels you may have to compute them yourself from the coefficient estimates and their standard errors. (Use the t table with n − 2 degrees of freedom to get $t^*$.)

Using SPSS: Analyze > Regression > Linear. Put estimate into the Independent box and actual into the Dependent box. Click Statistics, select Confidence intervals, and click Continue followed by OK.

[SPSS output: Coefficients table (unstandardized B and Std. Error, standardized Beta, t, Sig., and 95% Confidence Interval for B with Lower Bound and Upper Bound, for (Constant) and Estimate; values not preserved in this transcription). Dependent Variable: Actual.]

For Example 1, find the 95% and 99% confidence intervals for the slope and the y intercept.
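The interval $b_1 \pm t^* SE_{b_1}$ can be sketched in Python under a few stated assumptions. The notes read $SE_{b_1}$ from the SPSS output; here it is computed with the standard formula $SE_{b_1} = s/\sqrt{\sum (x_i - \bar{x})^2}$, where $s = \sqrt{\sum e_i^2/(n-2)}$, which is not spelled out in the notes. The data are hypothetical, and $t^* = 3.182$ is the 95% critical value for 3 degrees of freedom taken from a t table.

```python
import math

def slope_confidence_interval(xs, ys, t_star):
    """Return (lower, upper) for the slope: b1 +/- t_star * SE_b1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # s estimates the standard deviation of the responses about the line.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    se_b1 = s / math.sqrt(sxx)
    return b1 - t_star * se_b1, b1 + t_star * se_b1

# Hypothetical (estimated, actual) volumes; NOT the forester's data.
estimate = [8.0, 10.0, 12.0, 15.0, 18.0]
actual = [9.0, 11.0, 12.0, 17.0, 19.0]

# n = 5, so df = n - 2 = 3; t* = 3.182 for 95% confidence (from a t table).
lower, upper = slope_confidence_interval(estimate, actual, t_star=3.182)
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

A 99% interval uses the same estimate and standard error with a larger $t^*$, so it is always wider than the 95% interval.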
Hypothesis Test for the Regression Slope:

- State the null and alternative hypotheses: $H_0: \beta_1 = 0$ versus $H_a: \beta_1 > 0$, $H_a: \beta_1 < 0$, or $H_a: \beta_1 \neq 0$.
- Find the test statistic

$$t = \frac{b_1}{SE_{b_1}}, \quad \text{with } df = n - 2$$

  (SPSS will give you the test statistic.)
- Calculate the P-value. (SPSS will give you the two-sided P-value. If you have a one-sided test and the sign of $b_1$ agrees with $H_a$, divide the P-value by 2.)
- Compare the P-value to the α level.
  If P-value ≤ α, then reject $H_0$.
  If P-value > α, then fail to reject $H_0$.
- State your conclusions in terms of the problem.

The test statistic for the correlation is numerically identical to the test statistic used to test the slope. Therefore, you can read the test statistic and P-value off the SPSS output for the slope when doing a test for the correlation.

For Example 1, perform a significance test to see whether the slope of the regression line is positive.
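The statement that the slope test and the correlation test give the same statistic can be checked numerically. The sketch below computes both on hypothetical data; the function name `slope_and_correlation_t` is made up for the example, and the standard error uses the standard formula $SE_{b_1} = s/\sqrt{\sum (x_i - \bar{x})^2}$, which the notes leave to SPSS.

```python
import math

def slope_and_correlation_t(xs, ys):
    """Return (t_slope, t_corr): b1 / SE_b1 and r*sqrt(n-2)/sqrt(1-r^2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    # Slope test: t = b1 / SE_b1.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se_b1 = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    t_slope = b1 / se_b1

    # Correlation test: t = r * sqrt(n - 2) / sqrt(1 - r^2).
    r = sxy / math.sqrt(sxx * syy)
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return t_slope, t_corr

# Hypothetical (estimated, actual) volumes; NOT the forester's data.
estimate = [8.0, 10.0, 12.0, 15.0, 18.0]
actual = [9.0, 11.0, 12.0, 17.0, 19.0]

t_slope, t_corr = slope_and_correlation_t(estimate, actual)
print(f"t from slope: {t_slope:.4f}")
print(f"t from correlation: {t_corr:.4f}")
```

The two values agree up to floating-point rounding, so either row of output can be used for the test with n − 2 degrees of freedom.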
Example 4: This example uses data that are part of a data set from Dr. T.N.K. Raju, Department of Neonatology, University of Illinois at Chicago.

IMR = Infant Mortality Rate
PQLI = Physical Quality of Life Index (indicator of average wealth)

[Table: Case, PQLI, IMR; the data values are not preserved in this transcription.]

How does the physical quality of life index affect infant mortality rate? Answer the questions below based on the output that follows.

a. Describe the form, direction, and strength of the relationship.
b. What is the correlation?
c. What percent of the variation in infant mortality rate is explained by the regression line?
d. Give an estimate for the standard deviation of the model. (Find s.)
e. Do a hypothesis test of $H_0$ versus $H_a$ [the parameter and the direction of the alternative are not preserved in this transcription].
f. What is the equation of the least squares regression line?
g. Use the regression line to predict the IMR for a PQLI of 25.
h. Is the prediction in part g good? Why?
i. Find the residual for case [number missing].
j. Find a 99% confidence interval for the slope.
k. What assumptions need to be met for the above to be of use?

[Figure: scatterplot "How Physical Quality of Life affects Infant Mortality Rate"; x axis: PQLI, y axis: IMR.]

[SPSS output: Model Summary (R = .300; R Square, Adjusted R Square, and Std. Error of the Estimate are not preserved in this transcription); Coefficients table (unstandardized B and Std. Error, standardized Beta, t, Sig., and 95% Confidence Interval for B with Lower Bound and Upper Bound, for (Constant) and PQLI; values not preserved). Predictors: (Constant), PQLI. Dependent Variable: IMR.]
[Figure: Residual Plot; x axis: PQLI, y axis: Unstandardized Residual.]

[Figure: Normal P-P Plot of Regression Standardized Residual; Dependent Variable: IMR; x axis: Observed Cum Prob, y axis: Expected Cum Prob.]

Does the relationship make sense? For example, does it make sense that the infant mortality rate will go up as the physical quality of life index gets better? What could be a potential lurking variable here?

Now let's look at what happens if we add a categorical variable to the picture.

[Table: Case, PQLI, IMR, Location (rural or urban); the data values are not preserved in this transcription.]
[Figure: scatterplot "How Physical Quality of Life Affects Infant Mo[rtality Rate]" with points marked by LOCATION (urban, rural); x axis: PQLI, y axis: IMR.]
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationThe importance of graphing the data: Anscombe s regression examples
The importance of graphing the data: Anscombe s regression examples Bruce Weaver Northern Health Research Conference Nipissing University, North Bay May 3031, 2008 B. Weaver, NHRC 2008 1 The Objective
More information1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material
More informationX X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)
CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.
More informationRegression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.
Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between
More informationPearson s correlation
Pearson s correlation Introduction Often several quantitative variables are measured on each member of a sample. If we consider a pair of such variables, it is frequently of interest to establish if there
More informationInferential Statistics
Inferential Statistics Sampling and the normal distribution Zscores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are
More informationIntroductory Statistics Notes
Introductory Statistics Notes Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 354870348 Phone: (205) 3484431 Fax: (205) 3488648 August
More informationThe aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree
PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and
More informationChapter 9. Section Correlation
Chapter 9 Section 9.1  Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation
More informationLinear Models in STATA and ANOVA
Session 4 Linear Models in STATA and ANOVA Page Strengths of Linear Relationships 42 A Note on NonLinear Relationships 44 Multiple Linear Regression 45 Removal of Variables 48 Independent Samples
More informationPremaster Statistics Tutorial 4 Full solutions
Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationDirections for using SPSS
Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...
More informationSimple Regression and Correlation
Simple Regression and Correlation Today, we are going to discuss a powerful statistical technique for examining whether or not two variables are related. Specifically, we are going to talk about the ideas
More informationChapter 10  Practice Problems 1
Chapter 10  Practice Problems 1 1. A researcher is interested in determining if one could predict the score on a statistics exam from the amount of time spent studying for the exam. In this study, the
More informationDEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9
DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,
More informationHomework 8 Solutions
Math 17, Section 2 Spring 2011 Homework 8 Solutions Assignment Chapter 7: 7.36, 7.40 Chapter 8: 8.14, 8.16, 8.28, 8.36 (ad), 8.38, 8.62 Chapter 9: 9.4, 9.14 Chapter 7 7.36] a) A scatterplot is given below.
More informationCorrelation key concepts:
CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)
More informationThe Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
More informationRelationships Between Two Variables: Scatterplots and Correlation
Relationships Between Two Variables: Scatterplots and Correlation Example: Consider the population of cars manufactured in the U.S. What is the relationship (1) between engine size and horsepower? (2)
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationABSORBENCY OF PAPER TOWELS
ABSORBENCY OF PAPER TOWELS 15. Brief Version of the Case Study 15.1 Problem Formulation 15.2 Selection of Factors 15.3 Obtaining Random Samples of Paper Towels 15.4 How will the Absorbency be measured?
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationOnce saved, if the file was zipped you will need to unzip it.
1 Commands in SPSS 1.1 Dowloading data from the web The data I post on my webpage will be either in a zipped directory containing a few files or just in one file containing data. Please learn how to unzip
More informationCourse Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.
SPSS Regressions Social Science Research Lab American University, Washington, D.C. Web. www.american.edu/provost/ctrl/pclabs.cfm Tel. x3862 Email. SSRL@American.edu Course Objective This course is designed
More informationWe are often interested in the relationship between two variables. Do people with more years of fulltime education earn higher salaries?
Statistics: Correlation Richard Buxton. 2008. 1 Introduction We are often interested in the relationship between two variables. Do people with more years of fulltime education earn higher salaries? Do
More informationUsing R for Linear Regression
Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationREGRESSION LINES IN STATA
REGRESSION LINES IN STATA THOMAS ELLIOTT 1. Introduction to Regression Regression analysis is about eploring linear relationships between a dependent variable and one or more independent variables. Regression
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationTechnology StepbyStep Using StatCrunch
Technology StepbyStep Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate
More informationChapter 7 Section 7.1: Inference for the Mean of a Population
Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used
More informationAn analysis method for a quantitative outcome and two categorical explanatory variables.
Chapter 11 TwoWay ANOVA An analysis method for a quantitative outcome and two categorical explanatory variables. If an experiment has a quantitative outcome and two categorical explanatory variables that
More informationEPS 625 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM
EPS 6 ANALYSIS OF COVARIANCE (ANCOVA) EXAMPLE USING THE GENERAL LINEAR MODEL PROGRAM ANCOVA One Continuous Dependent Variable (DVD Rating) Interest Rating in DVD One Categorical/Discrete Independent Variable
More informationAP Statistics 2001 Solutions and Scoring Guidelines
AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for noncommercial use by AP teachers for course and exam preparation; permission for any other use
More informationCOMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
More informationFEV1 (litres) Figure 1: Models for gas consumption and lung capacity
Simple Linear Regression: Reliability of predictions Richard Buxton. 2008. 1 Introduction We often use regression models to make predictions. In Figure 1 (a), we ve fitted a model relating a household
More informationChapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS
Chapter Seven Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS Section : An introduction to multiple regression WHAT IS MULTIPLE REGRESSION? Multiple
More informationRegression stepbystep using Microsoft Excel
Step 1: Regression stepbystep using Microsoft Excel Notes prepared by Pamela Peterson Drake, James Madison University Type the data into the spreadsheet The example used throughout this How to is a regression
More informationIndependent t Test (Comparing Two Means)
Independent t Test (Comparing Two Means) The objectives of this lesson are to learn: the definition/purpose of independent ttest when to use the independent ttest the use of SPSS to complete an independent
More informationMULTIPLE REGRESSION EXAMPLE
MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if
More informationStatistics and research
Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,
More informationSPSS Explore procedure
SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stemandleaf plots and extensive descriptive statistics. To run the Explore procedure,
More informationPredictability Study of ISIP Reading and STAAR Reading: Prediction Bands. March 2014
Predictability Study of ISIP Reading and STAAR Reading: Prediction Bands March 2014 Chalie Patarapichayatham 1, Ph.D. William Fahle 2, Ph.D. Tracey R. Roden 3, M.Ed. 1 Research Assistant Professor in the
More informationKSTAT MINIMANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINIMANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
More informationFactors affecting online sales
Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4
More informationNonlinear Regression Functions. SW Ch 8 1/54/
Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General
More informationAP Statistics Section :12.2 Transforming to Achieve Linearity
AP Statistics Section :12.2 Transforming to Achieve Linearity In Chapter 3, we learned how to analyze relationships between two quantitative variables that showed a linear pattern. When twovariable data
More information