Using SPSS for Multiple Regression. UDP 520 Lab 7. Lin Lin. December 4th, 2007


Step 1: Define the Research Question. What factors are associated with BMI? Predict BMI.

Step 2: Conceptualizing the Problem (Theory). [Conceptual diagram: Individual Behaviors, Individual Characteristics, and Environment each linked to BMI.]

Step 2: Conceptualizing the Problem (Theory). Individual behaviors are associated with BMI. Individual characteristics are associated with BMI. Environment is associated with BMI.

Steps 3 & 4: Operationalizing and Hypothesizing.
Individual behaviors are associated with BMI:
- Eating behavior: daily calorie intake is positively associated with BMI.
- Exercising behavior: level of exercise is negatively associated with BMI.
Individual characteristics are associated with BMI:
- Sex, income, education level, occupation.
Environment is associated with BMI:
- Physical environment, social environment.

Step 5: Collecting Data. 1,000 adults aged 18+ (males and females) were recruited to study factors associated with BMI. Variables:
- BMI (before WLTP)
- Sex (female = 1): individual characteristic
- Calorie (daily calorie intake): individual behavior
- Exercise (minutes of exercise per week): individual behavior
- Income (monthly salary in dollars): individual characteristic
- Expenditure on food (monthly food expense in dollars): individual behavior
- Education (education level in years): individual characteristic
- Residential density (high, medium, low): physical environment
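A minimal syntax sketch for opening the data file and checking the continuous variables before modeling; adjust the GET FILE path to wherever you saved bmi.sav (linked at the end of this document). The variable names simply mirror those used in the output tables below.

GET FILE='bmi.sav'.
DESCRIPTIVES VARIABLES=BMI calorie exercise income education
  /STATISTICS=MEAN STDDEV MIN MAX.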

Step 6: Developing the OLS Equation. Multiple regression:
Y_BMI = β0 + β1·x_calorie + β2·x_exercise + β3·x_sex + β4·x_income + β5·x_education + β6·x_built_environment + ε

OLS Equation for SPSS. Multiple regression, Model 1:
Y_BMI = β0 + β1·x_calorie + β2·x_exercise + β4·x_income + β5·x_education + ε

Using SPSS for Multiple Regression
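In the menus, this model is specified through Analyze > Regression > Linear. The syntax below is a sketch of the equivalent pasted command (variable names as in the output tables that follow); the COLLIN and TOL keywords request the collinearity diagnostics used in Step 7, and older SPSS releases may expect CI without the (95) argument.

REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI(95) R ANOVA COLLIN TOL
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT BMI
  /METHOD=ENTER calorie exercise income education.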

SPSS Output Tables

Descriptive Statistics
            Mean        Std. Deviation   N
BMI         24.0674     1.28663          1000
calorie     2017.7167   513.71981        1000
exercise    21.7947     7.66196          1000
income      2005.1981   509.49088        1000
education   19.95       3.820            1000

Correlations
Pearson Correlation   BMI     calorie   exercise   income   education
  BMI                 1.000   .784      -.310      .033     .011
  calorie             .784    1.000     -.193      -.009    .004
  exercise            -.310   -.193     1.000      -.030    -.046
  income              .033    -.009     -.030      1.000    .069
  education           .011    .004      -.046      .069     1.000
Sig. (1-tailed)
  BMI                 .       .000      .000       .148     .361
  calorie             .000    .         .000       .391     .451
  exercise            .000    .000      .          .175     .072
  income              .148    .391      .175       .        .014
  education           .361    .451      .072       .014     .
N = 1000 for every cell.

Variables Entered/Removed(b)
Model 1: Variables Entered: education, calorie, income, exercise(a); Variables Removed: (none); Method: Enter
a. All requested variables entered.
b. Dependent Variable: BMI

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .801(a)   .642       .641                .77095
a. Predictors: (Constant), education, calorie, income, exercise
b. Dependent Variable: BMI

ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   1062.377         4     265.594       446.853   .000(a)
Residual     591.394          995   .594
Total        1653.771         999
a. Predictors: (Constant), education, calorie, income, exercise
b. Dependent Variable: BMI

Coefficients(a)
Model 1       B           Std. Error   Beta    t        Sig.   95% CI for B (Lower, Upper)   Tolerance   VIF
(Constant)    20.693      .208                 99.404   .000   (20.285, 21.102)
calorie       .002        .000         .753    38.969   .000   (.002, .002)                  .962        1.039
exercise      -.027       .003         -.163   -8.434   .000   (-.034, -.021)                .960        1.042
income        8.82E-005   .000         .035    1.837    .067   (.000, .000)                  .994        1.006
education     -.001       .006         -.002   -.086    .932   (-.013, .012)                 .993        1.007
a. Dependent Variable: BMI

Collinearity Diagnostics(a)
Model 1                                        Variance Proportions
Dimension   Eigenvalue   Condition Index   (Constant)   calorie   exercise   income   education
1           4.778        1.000             .00          .00       .00        .00      .00
2           .110         6.584             .00          .10       .72        .02      .01
3           .060         8.924             .00          .41       .03        .56      .00
4           .041         10.842            .01          .21       .05        .26      .55
5           .011         21.197            .99          .28       .19        .16      .44
a. Dependent Variable: BMI

Residuals Statistics(a)
                       Minimum    Maximum   Mean      Std. Deviation   N
Predicted Value        21.8115    26.9475   24.0674   1.03123          1000
Residual               -3.36145   4.91952   .00000    .76941           1000
Std. Predicted Value   -2.188     2.793     .000      1.000            1000
Std. Residual          -4.360     6.381     .000      .998             1000
a. Dependent Variable: BMI

Step 7: Checking for Multicollinearity

Correlations
Pearson Correlation   BMI     calorie   exercise   income   education
  BMI                 1.000   .784      -.310      .033     .011
  calorie             .784    1.000     -.193      -.009    .004
  exercise            -.310   -.193     1.000      -.030    -.046
  income              .033    -.009     -.030      1.000    .069
  education           .011    .004      -.046      .069     1.000
Sig. (1-tailed)
  BMI                 .       .000      .000       .148     .361
  calorie             .000    .         .000       .391     .451
  exercise            .000    .000      .          .175     .072
  income              .148    .391      .175       .        .014
  education           .361    .451      .072       .014     .
N = 1000 for every cell.

Check the independent variables for multicollinearity. If the absolute value of the Pearson correlation between two independent variables is greater than 0.8, collinearity is very likely to exist. If the absolute value is close to 0.8 (roughly 0.7 ± 0.1), collinearity is likely to exist.
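The same matrix can also be produced directly, without running the regression; a sketch (one-tailed significance, matching the output above):

CORRELATIONS
  /VARIABLES=BMI calorie exercise income education
  /PRINT=ONETAIL NOSIG
  /MISSING=PAIRWISE.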

Step 7: Checking for Multicollinearity (cont.)

Collinearity Diagnostics(a)
Model 1                                        Variance Proportions
Dimension   Eigenvalue   Condition Index   (Constant)   calorie   exercise   income   education
1           4.778        1.000             .00          .00       .00        .00      .00
2           .110         6.584             .00          .10       .72        .02      .01
3           .060         8.924             .00          .41       .03        .56      .00
4           .041         10.842            .01          .21       .05        .26      .55
5           .011         21.197            .99          .28       .19        .16      .44
a. Dependent Variable: BMI

A condition index greater than 15 indicates a possible collinearity problem; an index greater than 30 suggests a serious problem with collinearity.
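For reference, each condition index is the square root of the ratio of the largest eigenvalue to that dimension's eigenvalue: condition index_i = sqrt(λ_max / λ_i). For example, sqrt(4.778 / .110) ≈ 6.6 for the second dimension above; small differences from the printed values come from rounding of the displayed eigenvalues.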

Step 8: Statistics. Goodness of fit of the model.

Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .801(a)   .642       .641                .77095
a. Predictors: (Constant), education, calorie, income, exercise
b. Dependent Variable: BMI

R² = 0.642 means that 64.2% of the variation in BMI is explained by the model. The adjusted R² adjusts for the number of explanatory terms (independent variables) in the model and increases only if a new independent variable improves the model more than would be expected by chance.
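The adjusted R² printed above can be reproduced from R² by hand; with n = 1000 cases and k = 4 predictors:
adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1) = 1 - (1 - 0.642)(999/995) ≈ 0.641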

Step 8: Statistics (cont.). Coefficient of each independent variable.

Coefficients(a)
Model 1       B           Std. Error   Beta    t        Sig.   95% CI for B (Lower, Upper)   Tolerance   VIF
(Constant)    20.693      .208                 99.404   .000   (20.285, 21.102)
calorie       .002        .000         .753    38.969   .000   (.002, .002)                  .962        1.039
exercise      -.027       .003         -.163   -8.434   .000   (-.034, -.021)                .960        1.042
income        8.82E-005   .000         .035    1.837    .067   (.000, .000)                  .994        1.006
education     -.001       .006         -.002   -.086    .932   (-.013, .012)                 .993        1.007
a. Dependent Variable: BMI

Unstandardized coefficients (B) are used for prediction and interpretation; standardized coefficients (Beta) are used for comparing the effects of the independent variables. Compare Sig. with alpha = 0.05: if Sig. < 0.05, the coefficient is statistically significantly different from zero.
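Each t statistic is simply the unstandardized coefficient divided by its standard error, t = B / SE(B). The printed values are rounded, so, for example, exercise's B of about -0.027 and an unrounded standard error of roughly 0.0032 give the reported t of -8.434.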

Step 9: Interpreting the Estimated Coefficients
Y_BMI = 20.693 + 0.002·x_calorie + (-0.027)·x_exercise + 0.0000882·x_income + (-0.001)·x_education
Holding the other variables constant, if a person increases calorie intake by 1 calorie per day, the person's BMI is predicted to increase by 0.002. Please explain the estimated coefficient of exercise.
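As an illustration (the input values here are made up for the example, not taken from the dataset), the fitted equation predicts the BMI of a person consuming 2,000 calories per day, exercising 30 minutes per week, earning $2,000 per month, and having 16 years of education as:
Y_BMI = 20.693 + 0.002·2000 + (-0.027)·30 + 0.0000882·2000 + (-0.001)·16 = 20.693 + 4.000 - 0.810 + 0.176 - 0.016 ≈ 24.04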

Steps in Model Development and Model Selection. First, include the theoretically important variables. Second, include variables that are strongly associated with the dependent variable (to identify them, a Pearson r test can be run between each interval-ratio independent variable and the dependent variable). Third, compare adjusted R² values to determine whether the new independent variables improve the model; at the same time, check for multicollinearity. A hierarchical-entry syntax sketch is shown below.
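One way to make this comparison in SPSS is to enter the candidate variables as a second block and request R² change statistics; this is a sketch under the same variable-name assumptions as earlier, with the CHANGE keyword reporting whether the added block improves the model.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA CHANGE COLLIN TOL
  /DEPENDENT BMI
  /METHOD=ENTER calorie exercise
  /METHOD=ENTER income education.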

Notes on the Regression Model. It is VERY important to have a theory before starting to develop any regression model. If the theory tells you that certain variables are too important to exclude from the model, you should include them even if their estimated coefficients are not significant. (This is, of course, the more conservative way to develop a regression model.)

BMI data: http://courses.washington.edu/urbdp520/udp520/bmi.sav
As an exercise, you can develop your own conceptual frameworks (theories), create different OLS models, and examine different independent variables.