Linear Regression in SPSS

Data: mangunkill.sav

Goals:
- Examine the relation between the number of handguns registered (nhandgun) and the number of people killed (mankill)
- Predict the number of people killed from the number of handguns registered

I. View the Data with a Scatter Plot

To create a scatter plot, click through Graphs\Scatter. In the Scatterplot dialog box that appears, click the Simple option and then click the Define button. In the Simple Scatterplot dialog box, move mankill into the Y-axis box and nhandgun into the X-axis box, then hit OK. To see the fit line, double click on the scatter plot, click Chart\Options, and check the Total box under Fit Line in the upper right corner.

[Scatter plots of Number of People Killed against Number of Handguns Registered, without and with the fitted line.]
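If you prefer to work from a syntax window, a command along the following lines (a sketch, using the nhandgun and mankill variable names from mangunkill.sav) draws the same plot:

* Scatter plot of mankill against nhandgun (legacy Graphs dialog equivalent; details may vary by SPSS version).
GRAPH
  /SCATTERPLOT(BIVAR)=nhandgun WITH mankill
  /MISSING=LISTWISE.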

Click the target button at the left end of the toolbar; the mouse pointer will change shape. Move the target pointer to a data point and click the left mouse button, and the case number of that point will appear on the chart. This will help you identify the data point in your data sheet.

II. Regression Analysis

To perform the regression, click Analyze\Regression\Linear. Place mankill in the Dependent box and nhandgun in the Independent(s) box. To obtain the 95% confidence interval for the slope, click the Statistics button at the bottom and put a check in the box for Confidence Intervals. Hit Continue and then hit OK. The independent variable (nhandgun) is said to be useful in predicting the dependent variable (mankill) when the level of significance (the P-value, labeled Sig. in the output) is below 0.05.

Model Summary (b)
    R        R Square    Adjusted R Square    Std. Error of the Estimate
  .941(a)      .886            .877                   4.2764
  a. Predictors: (Constant), Number of Handguns Registered
  b. Dependent Variable: Number of People Killed

ANOVA (b)
                  Sum of Squares    df    Mean Square       F        Sig.
  Regression         1711.979        1     1711.979      93.615    .000(a)
  Residual            219.450       12       18.287
  Total              1931.429       13
  a. Predictors: (Constant), Number of Handguns Registered
  b. Dependent Variable: Number of People Killed

Coefficients (a)
                                    Unstandardized Coefficients    Standardized Coefficients                       95% Confidence Interval for B
                                      B           Std. Error           Beta                     t        Sig.      Lower Bound     Upper Bound
  (Constant)                        -41.430         7.412                                     -5.589     .000        -57.580        -25.281
  Number of Handguns Registered        .125          .013               .941                   9.675     .000           .097           .153
  a. Dependent Variable: Number of People Killed
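The same regression can be requested with syntax. A sketch, using the variable names above (CI(95) asks for the 95% confidence intervals for the coefficients):

* Simple linear regression of mankill on nhandgun with 95% confidence intervals for the coefficients.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA CI(95)
  /DEPENDENT mankill
  /METHOD=ENTER nhandgun.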

III. Residual Analysis

To assess the fit of the model, when performing the regression also click the Save button at the bottom of the dialogue box. In the boxes labeled Predicted Values and Residuals, click Unstandardized in both. Hit Continue and then hit OK. There will now be a column labeled pre_1 and a column labeled res_1 in the data window. To make a residual plot, click Graphs\Scatter\Simple\Define, put res_1 in the Y-axis box and pre_1 in the X-axis box, and hit OK. The goal of a residual plot is to see a random scatter of residuals. The normality test in the Explore procedure can be used to check the residuals for normality.

[Residual plot: Unstandardized Residual against Unstandardized Predicted Value.]

IV. Prediction Intervals

To calculate the mean prediction intervals and the individual prediction intervals, use the Save button that appears after clicking Analyze\Regression\Linear. In the box labeled Predicted Values, click Unstandardized. This gives the predicted Y-values from the model; the data window will have a column labeled pre_1. For the prediction intervals, in the boxes near the bottom labeled Prediction Intervals, put check marks in front of Mean and Individual. The data window will now contain columns labeled lmci_1, umci_1, lici_1, and uici_1. The first two are the lower and upper bounds of the 95% mean prediction interval (we are 95% sure that the average y for the given x value is within that interval); the second two are the bounds of the 95% individual prediction interval. To make a prediction, simply enter the value of the predictor variable in the last row of the data sheet under the predictor variable and rerun the model. For instance, to predict the number of people killed when the number of handguns registered is 700: the confidence interval for the mean response is (41.49, 50.45), and the prediction interval for an individual response is (35.63, 56.31).
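In syntax, the saved predicted values, residuals, and prediction intervals correspond to the /SAVE subcommand. A sketch, assuming the same variable names:

* Save unstandardized predicted values (PRE_1), residuals (RES_1), and the 95% mean (LMCI_1, UMCI_1) and individual (LICI_1, UICI_1) prediction intervals.
REGRESSION
  /STATISTICS COEFF R ANOVA CI(95)
  /DEPENDENT mankill
  /METHOD=ENTER nhandgun
  /SAVE PRED RESID MCIN ICIN.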

Multiple Regression in SPSS

Data: WeightbyAgeHeight.sav

Goals:
- Examine the relation between weight (response) and age and height (explanatory variables)
- Predict weight

I. View the Data with a Scatter Plot

To create a scatterplot matrix, click Graphs\Scatter. In the Scatterplot dialog box that appears, click the Matrix option and then click the Define button. In the Scatterplot Matrix dialog box, move the age, weight and height variables into the Matrix Variables box and hit OK.

[Scatterplot matrix of Age, Height and Weight.]

II. Regression Analysis

To perform the regression, click Analyze\Regression\Linear. Place weight in the Dependent box and place age and height in the Independent(s) box. Click the Statistics button, check Collinearity diagnostics and Confidence Intervals (the latter gives the 95% confidence intervals for the slopes), click Continue, and then hit OK. An independent variable (age, height) is useful in predicting the dependent variable (weight) when its level of significance (the P-value labeled Sig. in the output) is below 0.05.

Coefficients (a)
                Unstandardized Coefficients    Standardized Coefficients                        Collinearity Statistics
                  B           Std. Error           Beta                     t        Sig.      Tolerance      VIF
  (Constant)   -127.820         12.099                                   -10.565     .000
  Age              .240           .055              .228                    4.360     .000        .579        1.727
  Height          3.090           .257              .627                   12.008     .000        .579        1.727
  a. Dependent Variable: Weight
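A syntax sketch of the same analysis (TOL and COLLIN request the collinearity statistics shown above):

* Multiple regression of weight on age and height with confidence intervals and collinearity diagnostics.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA CI(95) COLLIN TOL
  /DEPENDENT weight
  /METHOD=ENTER age height.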

Stepwise Regression

To perform stepwise regression, which automatically selects significant variables, click the Method drop-down list in the Linear Regression dialog box, choose the desired method, and click OK. SPSS will produce an output table presenting the final model with its coefficients table.

Interaction Term

To examine the interaction between the age and height variables, first create the interaction variable (intageht). Click Transform\Compute and, in the Compute Variable dialog box, enter a name for the interaction term, intageht, as the target variable. Build the expression in the Numeric Expression box by clicking in the age variable, clicking * to multiply, and then clicking in the height variable; then click OK. To perform the regression, click Analyze\Regression\Linear. Place weight in the Dependent box and place age, height and intageht in the Independent(s) box. Click the Statistics button to select the collinearity diagnostics, click Continue, and then hit OK. A syntax sketch for computing the interaction term and fitting this model follows.
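The same steps in syntax might look roughly like this (a sketch; intageht is simply the name chosen above for the interaction term):

* Compute the age-by-height interaction and refit the model including it.
COMPUTE intageht = age * height.
EXECUTE.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA CI(95) COLLIN TOL
  /DEPENDENT weight
  /METHOD=ENTER age height intageht.

For stepwise selection, replace the /METHOD subcommand with, for example, /METHOD=STEPWISE age height.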

In the coefficients table, the VIF values for the predictors are all greater than 10, which indicates collinearity.

Coefficients (a)
                Unstandardized Coefficients    Standardized Coefficients                        Collinearity Statistics
                  B           Std. Error           Beta                     t        Sig.      Tolerance      VIF
  (Constant)    66.996         106.189                                      .631     .529
  Age             -.973            .660             -.923                 -1.476     .141        .004       250.009
  Height      -3.13E-02           1.710             -.006                  -.018     .985        .013        77.061
  INTAGEHT    1.936E-02            .010             1.636                  1.847     .066        .002       501.996
  a. Dependent Variable: Weight

If categorical variables are to be included in the model, indicator variables will need to be created. Recode, under the Transform menu, is one way to generate the indicator variables; a syntax sketch is given at the end of this section.

III. Residual Analysis

To assess the fit of the model, when performing the regression also click the Save button at the bottom of the dialogue box. In the boxes labeled Predicted Values and Residuals, click Unstandardized in both. Hit Continue and then hit OK. There will now be a column labeled pre_1 and a column labeled res_1 in the data window. To make a residual plot, click Graphs\Scatter\Simple\Define, put res_1 in the Y-axis box and pre_1 in the X-axis box, and hit OK. The goal of a residual plot is to see a random scatter of residuals. The normality test in the Explore procedure can be used to check the residuals for normality.

IV. Prediction Intervals

To calculate the mean prediction intervals and the individual prediction intervals, use the Save button that appears after clicking Analyze\Regression\Linear. In the box labeled Predicted Values, click Unstandardized. This gives the predicted Y-values from the model; the data window will have a column labeled pre_1. For the prediction intervals, in the boxes near the bottom labeled Prediction Intervals, put check marks in front of Mean and Individual. The data window will now contain columns labeled lmci_1, umci_1, lici_1, and uici_1. The first two are the lower and upper bounds of the 95% mean prediction interval (we are 95% sure that the average y for the given x values is within that interval); the second two are the bounds of the 95% individual prediction interval. To make a prediction, simply enter the values of the predictor variables in the last row of the data sheet and rerun the model. The procedure is the same as for simple linear regression.
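As an illustration of creating an indicator variable with Recode syntax (the variable group and its category coding here are hypothetical, not part of WeightbyAgeHeight.sav):

* Hypothetical example: create an indicator that is 1 when group equals 2 and 0 otherwise.
RECODE group (2=1) (ELSE=0) INTO group2.
EXECUTE.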

Logistic Regression in SPSS

Data: logdisea.sav

Goals:
- Examine the relation between disease (binary response) and explanatory variables such as age, socioeconomic status, sector, and savings account
- Predict the probability of getting the disease and estimate the odds ratios

To perform the regression, click Analyze\Regression\Binary Logistic. Place disease in the Dependent box and place age, sciostat, sector and savings in the Covariates box. Click the Categorical button to create indicator variables for the categorical covariates. To obtain the predicted probability of getting the disease, click the Save button at the bottom, put a check in the Probabilities box under Predicted Values, and click Continue. Also, click the Options button, check the goodness-of-fit box to see how well the model fits the data, and check CI for exp(B) to obtain confidence intervals for the odds ratios. Hit Continue and then hit OK. An independent variable (here age and sector) is significant in predicting the dependent variable (disease) when its level of significance (the P-value labeled Sig. in the output) is below 0.05.

Variables in the Equation
                                                                                  95.0% C.I. for EXP(B)
                      B        S.E.      Wald     df     Sig.     Exp(B)       Lower       Upper
  Step 1(a)
    AGE              .027      .009      8.646     1     .003      1.027       1.009       1.045
    SCIOSTAT                              .439     2     .803
    SCIOSTAT(1)     -.278      .434       .409     1     .522       .757        .323       1.775
    SCIOSTAT(2)     -.219      .459       .227     1     .634       .803        .327       1.976
    SECTOR(1)      -1.234      .357     11.970     1     .001       .291        .145        .586
    SAVINGS          .061      .386       .025     1     .874      1.063        .499       2.264
    Constant        -.814      .452      3.246     1     .072       .443
  a. Variable(s) entered on step 1: AGE, SCIOSTAT, SECTOR, SAVINGS.

Hosmer and Lemeshow Test
  Step    Chi-square    df    Sig.
   1        10.853       8    .210

Stepwise Regression

To perform stepwise regression, which automatically selects significant variables, click the Method drop-down list, choose the desired method, and click OK. SPSS will produce an output table presenting the final model with its coefficients table.
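A rough syntax equivalent of the analysis above (a sketch; the /CONTRAST lines correspond to what the Categorical button sets up):

* Binary logistic regression of disease on age, sciostat, sector and savings, saving predicted probabilities and printing the Hosmer-Lemeshow test and confidence intervals for Exp(B).
LOGISTIC REGRESSION VARIABLES disease
  /METHOD=ENTER age sciostat sector savings
  /CONTRAST (sciostat)=Indicator
  /CONTRAST (sector)=Indicator
  /SAVE=PRED
  /PRINT=GOODFIT CI(95).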