# Regression in SPSS: Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology


## Transcription

Regression in SPSS. John P. Bentley, Department of Pharmacy Administration, University of Mississippi. October 8, 2009.

Outline:

- Overview, background, and some terms
- Simple regression in SPSS
- Correlation analysis in SPSS
- Overview of multiple regression in SPSS

This talk is intended as an overview and obviously leaves out a number of very important concepts and issues.

### Overview, background, and some terms

Introduction to regression analysis: linear regression vs. other types of regression. There are many varieties of regression analysis. When used without qualification, "regression" usually refers to linear regression with estimation performed using ordinary least squares (OLS) procedures. Linear regression analysis is used for evaluating the relationship between one or more IVs (classically continuous, although in practice they can be discrete) and a single, continuous DV. The basics of conducting linear regression with SPSS are the focus of this talk.

Regression analysis is a statistical tool for evaluating the relationship of one or more independent variables X_1, X_2, …, X_k to a single, continuous dependent variable Y. It is a very useful tool with many applications.

Association versus causality: the finding of a statistically significant association in a particular study (no matter how well done) does not establish a causal relationship. Although association is necessary for causality, it is not sufficient for causality. Association does not imply causality! The true cause may be another (unmeasured?) variable. Much work has been published on causal inference; this lecture is not directly about that topic.

A model describes the relationship between variables; when we use regression analysis, we are developing statistical models. In a study relating blood pressure to age, it is unlikely that persons of the same age will have exactly the same observed blood pressure. This doesn't mean that we can't conclude that, on average, blood pressure increases with age, or that we can't predict the expected blood pressure for a given age with an associated amount of variability. Statistical models versus deterministic models.

Formal representation of the mathematical model:

Y = f(X) + ε

where Y is the response (dependent) variable, f(X) is the model function of the independent variables X (explanatory variables, predictors), and ε is the random error, which makes this a statistical model.

Introduction to the GLM (General Linear Model):

- General = wide applicability of the model to problems of estimation and testing of hypotheses about parameters
- Linear = the regression function is a linear function of the parameters
- Model = provides a description of the relationship between one response and at least one predictor variable (technically, with one response we are dealing with the General Linear Univariate Model, GLUM)

Simple linear model in scalar form:

y_i = β_0 + β_1 x_i + ε_i,  i = 1, 2, …, n

Multiple regression in scalar form:

y_i = β_0 + β_1 x_i1 + β_2 x_i2 + … + β_p x_ip + ε_i,  i = 1, 2, …, n

Multivariate: multiple Y's. Multivariable: multiple X's.

### Simple regression in SPSS

The simplest (but by no means trivial) form of the general regression problem deals with one dependent variable Y and one independent variable X. Given a sample of n individuals (or other study units), we observe for each a value of X and a value of Y. We thus have n pairs of observations, denoted (X_1, Y_1), (X_2, Y_2), …, (X_n, Y_n), where subscripts refer to different individuals. These can (and should!) be plotted on a graph: the scatterplot.

The data points in all four panels have identical best-fitting straight lines, but each panel tells a different story. It is always helpful to plot your data. (From Myers and Well, 2003.)

Review of the mathematical properties of a straight line. Equation: y = β_0 + β_1 x, where β_0 is the y-intercept (the value of y when x = 0) and β_1 is the slope (the amount of change in y for each 1-unit increase in x; the rate of change is constant for a straight line). (From Kleinbaum et al., 2008.)

Inference in the simple linear model: we usually compute confidence intervals and/or test statistical hypotheses about the unknown parameters. The estimates of β_0 and β_1, together with estimates of their variances, can be used to form confidence intervals and test statistics based on the t distribution.

Inference about the slope and intercept. The test for zero slope is the most important test of hypothesis in the simple linear model:

H_0: β_1 = 0

This is a test of whether X helps to predict Y using a straight-line model.
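As a check on the arithmetic behind this test, the slope and intercept estimates and the t statistic for H_0: β_1 = 0 can be sketched in plain Python. The data here are made up for illustration; they are not the workshop's SBP/age example, and this is a hand-rolled sketch of what SPSS computes internally, not SPSS itself:

```python
import math

# Illustrative data (not from the workshop example)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

ssx = sum((x - xbar) ** 2 for x in X)                      # SSX
ssxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))  # sum of cross-products

b1 = ssxy / ssx        # slope estimate (beta1-hat)
b0 = ybar - b1 * xbar  # intercept estimate (beta0-hat)

# Residual mean square and the t statistic for H0: beta1 = 0 (df = n - 2)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
s2_yx = sse / (n - 2)
se_b1 = math.sqrt(s2_yx / ssx)
t = b1 / se_b1

print(round(b0, 4), round(b1, 4), round(t, 4))  # 2.2 0.6 2.1213
```

The t value would be compared against the t distribution with n − 2 degrees of freedom, exactly as in the SPSS Coefficients table.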

Interpreting the test for zero slope (from Kleinbaum et al., 2008):

- FTR (fail to reject) H_0: X provides little or no help in predicting Y; the naive model Ȳ is essentially as good as Ȳ + β̂_1(X − X̄) for predicting Y.
- FTR H_0: The true underlying relationship between X and Y is not linear.
- Reject H_0: X provides significant information for predicting Y; the model Ȳ + β̂_1(X − X̄) is far better than the naive model Ȳ for predicting Y.
- Reject H_0: Although there is statistical evidence of a linear component, a better model might include a curvilinear term.

Just because the null is rejected does not mean that the straight-line model is the best model.

Inference about the slope and intercept. Test for zero intercept:

H_0: β_0 = 0

This is a test of whether the Y-intercept is zero. This test is rarely scientifically meaningful. However, most recommend leaving the intercept in the model even if you FTR this null.

(Figure from Kleinbaum et al., 2008.)

One way to do simple linear regression in SPSS (shown as a series of screenshots in the original slides): Analyze > Regression > Linear.

One way to do simple linear regression in SPSS: the Statistics options in the Linear Regression dialog let you request CIs for β_0 and β_1.

SPSS output (screenshots in the original slides; dependent variable SBP_Y, predictor AGE_X):

- Model Summary table: R = .658, along with R Square, Adjusted R Square, and the Std. Error of the Estimate (S_{Y|X}).
- ANOVA table: Regression, Residual, and Total sums of squares, df, mean squares, the F statistic, and its p-value (Sig.).
- Coefficients table: the unstandardized coefficients B (β̂_0 and β̂_1) with their standard errors S_{β̂_0} and S_{β̂_1}, the standardized coefficient Beta, t tests and p-values for the intercept and slope, and the 95% confidence intervals for β_0 and β_1.

The ANalysis Of VAriance (ANOVA) summary table. Included in the linear regression output is a table labeled ANOVA. It is called an ANOVA table primarily because the basic information in the table consists of several estimates of variance. We can use these estimates to address the inferential questions of regression analysis. People usually associate the table with a statistical procedure called analysis of variance.

Regression and analysis of variance are closely related. Analysis of variance problems can be expressed in a regression framework. Therefore, we can use an ANOVA table to summarize the results from analysis of variance and from regression.

The ANOVA summary table for the simple linear model. There are slightly different ways to present the ANOVA summary table; we will work with the most common form. In general, the mean square terms are obtained by dividing the sum of squares by its degrees of freedom. The F statistic is obtained by dividing the regression mean square (i.e., the model mean square) by the residual mean square.

In the SPSS ANOVA output (dependent variable SBP_Y, predictor AGE_X):

- Regression row: sum of squares SSY − SSE; df = # of estimated parameters − 1 (or just the # of predictor variables); the regression mean square.
- Residual row: sum of squares SSE; df = sample size (n) − number of estimated parameters (or n − # of predictor variables − 1); the residual mean square.
- Total row: sum of squares SSY.
- The F statistic (F value) and its p-value (Sig.).

The ANOVA summary table for the simple linear model.

r² = (SSY − SSE) / SSY

This quantity varies between 0 and 1 and represents the proportionate reduction in SSY due to using X to predict Y instead of using Ȳ. r² indicates the % of the variation in Y explained with the help of X.

Here SSY = Σ_{i=1}^{n} (Y_i − Ȳ)² is the sum of the squared deviations of the observed Y's from the mean Ȳ, and SSE = Σ_{i=1}^{n} (Y_i − Ŷ_i)² is the sum of the squared deviations of the observed Y's from the fitted regression line.

Because SSY represents the total variation of Y before accounting for the linear effect of X, SSY is called the total unexplained variation, or the total sum of squares about (or corrected for) the mean. Because SSE represents the amount of variation in the observed Y's that remains after accounting for the linear effect of X, SSY − SSE is called the sum of squares due to (or explained by) regression (SSR).

With a little math, it turns out that SSY − SSE = SSR = Σ_{i=1}^{n} (Ŷ_i − Ȳ)², which represents the sum of the squared deviations of the predicted values from the mean Ȳ.

Total variation (SSY, a.k.a. SST) = Variation due to regression (SSR) + Unexplained residual variation (SSE)

Or symbolically:

Σ_{i=1}^{n} (Y_i − Ȳ)² = Σ_{i=1}^{n} (Ŷ_i − Ȳ)² + Σ_{i=1}^{n} (Y_i − Ŷ_i)²

This is the fundamental equation of regression analysis.
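The sum-of-squares partition and r² are easy to verify numerically. A minimal sketch, again with made-up illustrative data rather than the workshop's example:

```python
# Verify the fundamental equation SSY = SSR + SSE and r-squared from scratch
# (illustrative data, not the workshop's SBP/age example).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

b1 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / sum((x - xbar) ** 2 for x in X)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * x for x in X]

ssy = sum((y - ybar) ** 2 for y in Y)               # total SS (SST)
sse = sum((y - yh) ** 2 for y, yh in zip(Y, yhat))  # residual (unexplained)
ssr = sum((yh - ybar) ** 2 for yh in yhat)          # due to regression (explained)
r2 = ssr / ssy

print(round(ssy, 4), round(ssr, 4), round(sse, 4), round(r2, 4))  # 6.0 3.6 2.4 0.6
```

Note that 6.0 = 3.6 + 2.4, as the fundamental equation requires, and r² = SSR/SSY = 0.6.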

The ANOVA summary table for the simple linear model. The residual mean square (SSE/df) is the estimate S²_{Y|X} referred to earlier and is given by:

S²_{Y|X} = (1 / (n − r)) Σ_{i=1}^{n} (Y_i − Ŷ_i)² = SSE / (n − r), where r = number of estimated parameters (in the simple linear model case, r = 2).

If we assume the straight-line model is appropriate, this is an estimate of σ². The regression mean square (SSR/df) provides an estimate of σ² only if X does not help to predict Y, that is, only if H_0: β_1 = 0 is true. If β_1 ≠ 0, the regression mean square will be inflated in proportion to the magnitude of β_1² and will correspondingly overestimate σ².

With a little statistical theory, we can show that the residual mean square (SSE/df) and the regression mean square (SSR/df) are statistically independent of one another. Thus, if H_0: β_1 = 0 is true, their ratio represents the ratio of two independent estimates of the same variance σ². Under the normality and independence assumptions, such a ratio has the F distribution, and the calculated value of F (the F statistic) can be used to test H_0: "no significant straight-line relationship of Y on X" (i.e., H_0: β_1 = 0 or H_0: ρ = 0). This test is equivalent to the two-sided t test discussed earlier, because, for v degrees of freedom:

F_{1,v} = T²_v, so F_{1,v,α} = t²_{v,α/2}
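The F = t² relationship can be confirmed numerically. A sketch with the same made-up data used above (not the workshop's example):

```python
import math

# Confirm that the ANOVA F statistic equals the square of the slope's t statistic
# in simple linear regression (illustrative data).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
ssx = sum((x - xbar) ** 2 for x in X)
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / ssx
b0 = ybar - b1 * xbar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y))
ssy = sum((y - ybar) ** 2 for y in Y)
msr = (ssy - sse) / 1          # regression mean square (1 df in simple regression)
mse = sse / (n - 2)            # residual mean square
F = msr / mse

t = b1 / math.sqrt(mse / ssx)  # t statistic for the slope
print(round(F, 4), round(t ** 2, 4))  # 4.5 4.5
```

The two quantities agree exactly, which is the F_{1,v} = T²_v identity stated above.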

In the SPSS output for the simple linear model, the F statistic in the ANOVA table equals the square of the t statistic reported for AGE_X in the Coefficients table.

NOTE: The F-test from the ANOVA summary table will test a different H_0 in multiple regression; as we shall see, it is a test for significant overall regression.

Inferences about the regression line. In addition to making inferences about the slope and the intercept, we may also want to perform tests and/or compute CIs concerning the regression line itself. For a given X = X_0, we may want a confidence interval for μ_{Y|X_0}, the mean of Y at X_0. We may also want to test the hypothesis H_0: μ_{Y|X_0} = μ^(0)_{Y|X_0}, where μ^(0)_{Y|X_0} is some hypothesized value of interest. In addition to drawing inferences about specific points on the regression line, researchers find it useful to construct a CI for the regression line over the entire range of X-values: confidence bands for the regression line.

Prediction of a new value. We have just dealt with estimating the mean μ_{Y|X_0} at X = X_0. We may also (or instead) want to estimate the response Y of a single individual; that is, predict an individual's Y given his or her X = X_0. The point estimate remains the same: Ŷ_{X_0} = β̂_0 + β̂_1 X_0. However, to construct a prediction interval (technically it is not a CI, since Y is not a parameter), we need to include the additional variability of the Y scores around their conditional means. An estimator of an individual's response should naturally have more variability than an estimator of a group's mean response, and thus prediction intervals are wider than confidence intervals given the same 1 − α coverage. Like confidence bands for the regression line, we can construct prediction bands; these will be wider than the confidence bands. (Figure from Kleinbaum et al., 2008.)
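The difference between the two intervals comes down to their standard errors: the PI's standard error has an extra "+1" term for the individual's variability around the conditional mean. A sketch with made-up data (the common t multiplier is omitted, since both intervals use the same one):

```python
import math

# Standard error of the estimated MEAN response at X0 (for the CI) vs. the
# standard error for PREDICTING an individual's response at X0 (for the PI).
# Illustrative data, not the workshop's example.
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
ssx = sum((x - xbar) ** 2 for x in X)
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / ssx
b0 = ybar - b1 * xbar
s2 = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(X, Y)) / (n - 2)  # S^2_{Y|X}

x0 = 3.0
se_mean = math.sqrt(s2 * (1 / n + (x0 - xbar) ** 2 / ssx))      # CI for the mean
se_pred = math.sqrt(s2 * (1 + 1 / n + (x0 - xbar) ** 2 / ssx))  # PI for one person
print(round(se_mean, 4), round(se_pred, 4))  # 0.4 0.9798
```

Both standard errors grow as X_0 moves away from X̄, which is why confidence and prediction bands flare out at the edges of the data.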

Getting CIs and PIs in SPSS: the Save options in the Linear Regression dialog can save the predicted values together with the 95% CI (mean) and PI (individual) bounds. The slides annotate the output with the fitted-line form Ŷ = Ȳ + β̂_1(X − X̄) and S_{Y|X}, and the prompt "Why this?"

Scatterplots, confidence bands, and prediction bands in SPSS (shown as screenshots in the original slides).

### Correlation analysis in SPSS

The correlation coefficient (r). The correlation coefficient is an often-used statistic that provides a measure of how two random variables are linearly associated in a sample, and it has properties closely related to those of straight-line regression. We are generally referring to the Pearson product-moment correlation coefficient (there are other measures of correlation).

r = Σ_{i=1}^{n} (X_i − X̄)(Y_i − Ȳ) / sqrt( Σ_{i=1}^{n} (X_i − X̄)² · Σ_{i=1}^{n} (Y_i − Ȳ)² ) = SSXY / sqrt(SSX · SSY)

where SSXY = Σ (X_i − X̄)(Y_i − Ȳ) is the sum of cross-products and SSX, SSY are the sums of squares. Equivalently:

r = Cov(X, Y) / (S_X S_Y), where Cov(X, Y) = Σ (X_i − X̄)(Y_i − Ȳ) / (n − 1), S²_X = SSX / (n − 1), and S²_Y = SSY / (n − 1)

and, in terms of the estimated slope,

r = β̂_1 · sqrt(SSX / SSY) = β̂_1 · (S_X / S_Y)
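These identities are easy to verify numerically. A sketch with made-up data showing that the definitional formula and the slope-based formula agree:

```python
import math

# Pearson r two ways: from its definition and from the regression slope
# (illustrative data, not the workshop's example).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n

ssx = sum((x - xbar) ** 2 for x in X)
ssy = sum((y - ybar) ** 2 for y in Y)
ssxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))

r_def = ssxy / math.sqrt(ssx * ssy)        # definitional formula
b1 = ssxy / ssx                            # regression slope (beta1-hat)
r_slope = b1 * math.sqrt(ssx / ssy)        # slope-based formula
print(round(r_def, 4), round(r_slope, 4))  # 0.7746 0.7746
```

Note also that r² here (0.7746² = 0.6) matches the r² obtained from the sums of squares, tying correlation back to the regression output.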

Hypothesis tests for r. The test of H_0: ρ = 0 is mathematically equivalent to the test of the null hypothesis H_0: β_1 = 0 described earlier. Test statistic:

T = r √(n − 2) / √(1 − r²), which has the t distribution with df = n − 2 when the null hypothesis is true.

This will give the same numerical answer as T = (β̂_1 − β_1^(0)) / S_{β̂_1} when β_1^(0) = 0.

How to get a correlation coefficient in SPSS (shown as screenshots in the original slides).

How to get a correlation coefficient in SPSS (continued).

Output from correlation analysis: the Correlations table reports, for SBP_Y and AGE_X, the Pearson correlation (.658, flagged ** as significant at the 0.01 level, 2-tailed), the Sig. (2-tailed) p-value, and N.

Compare with the output from simple linear regression (see the earlier Coefficients table): T = r √(n − 2) / √(1 − r²) = 4.68, the same t value reported for AGE_X in the regression output.

### Overview of multiple regression in SPSS

Multiple regression analysis can be looked upon as an extension of straight-line regression analysis (the simple linear model, which involves only one independent variable) to the situation in which more than one independent variable must be considered. We can no longer think in terms of a line in two-dimensional space; we now have to think of a hypersurface in multidimensional space. This is more difficult to visualize, so we will rely on the multiple regression equation, the GLM.

The general linear model. Simple linear model in scalar form:

y_i = β_0 + β_1 x_i + ε_i,  i = 1, 2, …, n

Multiple regression in scalar form:

y_i = β_0 + β_1 x_i1 + β_2 x_i2 + … + β_p x_ip + ε_i,  i = 1, 2, …, n

where β_0, β_1, β_2, …, β_p are the regression coefficients that need to be estimated, and X_1, X_2, …, X_p may all be separate basic variables, or some may be functions of a few basic variables.

Interpretation of β_p: in the simple linear model, β_1 was the slope, the amount of change in Y for each 1-unit change in X. In multiple regression, the regression coefficient β_p refers to the amount of change in Y for each 1-unit change in X_p, holding all of the other variables in the equation constant.

Overview of hypothesis testing in multiple regression. All of these tests can also be interpreted as a comparison of two models. One of these models will be referred to as the full or complete model, and the other will be called the reduced model (i.e., the model to which the complete model reduces under the null hypothesis).

Full model: y_i = β_0 + β_1 x_i1 + β_2 x_i2 + ε_i
Reduced model: y_i = β_0 + β_1 x_i1 + ε_i

Under H_0: β_2 = 0, the larger (full) model reduces to the smaller (reduced) model. Thus, a test of H_0: β_2 = 0 is essentially equivalent to determining which of these two models is more appropriate. The set of IVs in the reduced model is a subset of the IVs in the full model; the smaller of the two models is said to be nested within the larger model.

Test for significant overall regression. Consider this model: y_i = β_0 + β_1 x_i1 + β_2 x_i2 + … + β_p x_ip + ε_i.

H_0: β_1 = β_2 = … = β_p = 0, or equivalently H_0: all p independent variables considered together do not explain a significant amount of the variation in Y. The full model is reduced to a model that contains only the intercept term, β_0.

F = Regression MS / Residual MS = [(SSY − SSE) / p] / [SSE / (n − p − 1)]

where SSY = Σ_{i=1}^{n} (Y_i − Ȳ)² and SSE = Σ_{i=1}^{n} (Y_i − Ŷ_i)². The computed value of F can then be compared with the critical value F_{p, n−p−1, α} from an F table, or you can look at the p-value from your output.
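The overall F test can be sketched from scratch for a two-predictor model. This is not SPSS and not the workshop's height/weight example; the data and the small normal-equations solver below are illustrative assumptions only (SPSS does all of this internally):

```python
# Overall regression F test for y = b0 + b1*x1 + b2*x2, computed from scratch
# via the normal equations (illustrative, made-up data).

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(rows, y):
    """Least-squares fit with an intercept; returns (coefficients, SSE)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    XtX = [[sum(xi[a] * xi[b] for xi in X) for b in range(k)] for a in range(k)]
    Xty = [sum(xi[a] * yi for xi, yi in zip(X, y)) for a in range(k)]
    beta = solve(XtX, Xty)
    yhat = [sum(xa * ba for xa, ba in zip(xi, beta)) for xi in X]
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    return beta, sse

x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 3, 3]
y = [6, 9, 12, 16, 19, 22]

beta, sse = fit_ols(list(zip(x1, x2)), y)
n, p = len(y), 2
ybar = sum(y) / n
ssy = sum((yi - ybar) ** 2 for yi in y)      # SSE of the intercept-only (reduced) model
F = ((ssy - sse) / p) / (sse / (n - p - 1))  # overall F with (p, n - p - 1) df
print([round(b, 3) for b in beta], round(F, 2))
```

The reduced model here is the intercept-only model, whose SSE is just SSY; the F statistic measures how much the two predictors together reduce that unexplained variation.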

Test for significant overall regression (continued). If you reject the null, you can conclude that, based on the observed data, the set of independent variables significantly helps to predict Y. It means that one or more individual regression coefficients may be significantly different from 0 (it is possible to have a significant overall regression when none of the individual predictors are significant in the given model). The conclusion does not mean that all IVs are needed for significant prediction of Y; we need further testing. If you fail to reject the null, then none of the individual regression coefficients will be significantly different from zero.

ANOVA table for multiple regression (q = # of estimated parameters, which is p + 1 in the terminology used earlier in this lecture; under the null, F has the F distribution with (q − 1, n − q) df):

| Source | SS | df | MS | F |
| --- | --- | --- | --- | --- |
| Regression (Model) | SSR = Σ (Ŷ_i − Ȳ)² | q − 1 | SSR / (q − 1) | MSR / MSE |
| Error | SSE = Σ (Y_i − Ŷ_i)² | n − q | SSE / (n − q) | |
| Total | SSY (or SST) = Σ (Y_i − Ȳ)² | n − 1 | | |

Note that SSY − SSE = SSR.

An example (from Kleinbaum et al., 2008). One way to do multiple regression in SPSS (shown as screenshots in the original slides).

SPSS output: ANOVA table (dependent variable WGT_Y; predictors AGE_X and HGT_X). As before, the Regression df = # of estimated parameters − 1 (the # of predictor variables), and the Residual df = n − # of estimated parameters.

The critical value for α = 0.01 is F_{2,9,0.99} = 8.02. Since the computed F exceeds 8.02, we reject the null at the α = 0.01 level. Note that this is the same as saying: since the p-value shown in the output is < 0.01, we reject the null at the α = 0.01 level.

SPSS output: parameter estimates and some significance tests. The Coefficients table reports, for the intercept (Constant), HGT_X, and AGE_X, the unstandardized coefficients B with their standard errors, the standardized coefficients Beta, and the t statistics with p-values (Sig.). What do the results tell us?

Partial F tests. One word of caution: it is possible for an independent variable to be highly correlated with a DV but to have a non-significant regression coefficient in a multiple regression with other variables included in the model (i.e., a non-significant variable-added-last partial F test, or t test). Why?


DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### , has mean A) 0.3. B) the smaller of 0.8 and 0.5. C) 0.15. D) which cannot be determined without knowing the sample results.

BA 275 Review Problems - Week 9 (11/20/06-11/24/06) CD Lessons: 69, 70, 16-20 Textbook: pp. 520-528, 111-124, 133-141 An SRS of size 100 is taken from a population having proportion 0.8 of successes. An
