Wednesday PM. Multiple regression. Multiple regression in SPSS. Presentation of AM results Multiple linear regression. Logistic regression


Wednesday PM

Presentation of AM results
Multiple linear regression
- Simultaneous
- Stepwise
- Hierarchical
Logistic regression

Multiple regression

Multiple regression extends simple linear regression to consider the effects of multiple independent variables (controlling for each other) on the dependent variable.

The line fit is: $Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \cdots$

The coefficients ($b_i$) tell you the independent effect of a change in one independent variable on the dependent variable, in natural units.

Multiple regression in SPSS

Same as simple linear regression, but put more than one variable into the independent box. The equation output has a line for each variable:

Coefficients: Predicting Q2 from Q3, Q4, Q5

              Unstandardized      Standardized
              B        SE         Beta         t         Sig.
(Constant)    .407     .582                    .700      .485
Q3            .679     .060       .604         11.345    .000
Q4            -.028    .095       -.017        -.295     .768
Q5            .112     .066       .095         1.695     .091

Unstandardized coefficients give the average change in the dependent variable for a one-unit change in each independent variable, controlling for all other variables.
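As a rough illustration of how to read the unstandardized coefficients, the fitted equation can be evaluated by hand. The sketch below copies B and SE from the table above; the respondent's answers and the helper names (`predict_q2`, `t_statistic`) are made up for the example and are not part of the course material.

```python
# Unstandardized coefficients (B) and standard errors (SE) from the table above.
INTERCEPT = 0.407
COEFS = {"Q3": (0.679, 0.060), "Q4": (-0.028, 0.095), "Q5": (0.112, 0.066)}

def predict_q2(answers):
    """Predicted Q2 for a dict of answers such as {"Q3": 4, "Q4": 2, "Q5": 3}."""
    return INTERCEPT + sum(b * answers[name] for name, (b, _se) in COEFS.items())

def t_statistic(b, se):
    """t is the coefficient divided by its standard error."""
    return b / se

base = {"Q3": 3, "Q4": 3, "Q5": 3}   # hypothetical respondent
bumped = dict(base, Q3=4)            # same respondent, Q3 one unit higher

# Raising Q3 by one unit, holding Q4 and Q5 fixed, shifts the prediction by b for Q3.
print(round(predict_q2(bumped) - predict_q2(base), 3))   # 0.679

# t recomputed from the rounded B and SE is close to, but not exactly, the table value.
print(round(t_statistic(*COEFS["Q3"]), 2))
```

The recomputed t differs slightly from the 11.345 in the output because the table shows B and SE rounded to three decimals.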

Standardized coefficients

Standardized coefficients can be used to compare effect sizes of the independent variables within the regression analysis. In the preceding analysis, a change of 1 standard deviation in Q3 has over 6 times the effect of a change of 1 SD in Q5 and over 30 times the effect of a change of 1 SD in Q4. However, βs are not stable across analyses and can't be compared between them.

Stepwise regression

In simultaneous regression, all independent variables are entered in the regression equation. In stepwise regression, an algorithm decides which variables to include. The goal of stepwise regression is to develop the model that predicts best with the fewest variables. It is ideal for creating scoring rules, but it is atheoretical and can capitalize on chance (post-hoc modeling).

Stepwise algorithms

In forward stepwise regression, the equation starts with no variables, and the variable that accounts for the most variance is added first. The variable that adds the most new variance is added next, provided it adds a significant amount, and so on. In backward stepwise regression, the equation starts with all variables; variables that don't add significant variance are removed. There are also hybrid algorithms that both add and remove variables.
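The forward algorithm described above can be sketched in a few lines. This is a minimal illustration, not SPSS's implementation: it uses a fixed F-to-enter threshold of 3.84 (roughly the 5% critical value of F with 1 numerator degree of freedom in large samples) instead of a p-value, and the data, variable names, and helper functions are all invented for the example.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on a constant plus columns."""
    n = len(y)
    design = [[1.0] * n] + [list(c) for c in columns]   # one list per column
    k = len(design)
    xtx = [[sum(p * q for p, q in zip(design[i], design[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(design[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[j] * design[j][t] for j in range(k)) for t in range(n)]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))

def forward_stepwise(predictors, y, f_to_enter=3.84):
    """predictors: dict of name -> values. Returns variable names in order of entry."""
    selected, remaining = [], list(predictors)
    current_rss = rss([], y)            # constant-only model
    n = len(y)
    while remaining:
        # Fit one candidate model per remaining variable; keep the lowest RSS.
        trials = {name: rss([predictors[s] for s in selected + [name]], y)
                  for name in remaining}
        best = min(trials, key=trials.get)
        df_resid = n - (len(selected) + 1) - 1
        f_stat = (current_rss - trials[best]) / (trials[best] / df_resid)
        if f_stat < f_to_enter:         # nothing left adds enough variance
            break
        selected.append(best)
        remaining.remove(best)
        current_rss = trials[best]
    return selected

# Synthetic data: y depends strongly on x1, weakly on x2, and not at all on x3.
rng = random.Random(0)
n_cases = 200
x1 = [rng.gauss(0, 1) for _ in range(n_cases)]
x2 = [rng.gauss(0, 1) for _ in range(n_cases)]
x3 = [rng.gauss(0, 1) for _ in range(n_cases)]
y = [2.0 * a + 0.5 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

order = forward_stepwise({"x1": x1, "x2": x2, "x3": x3}, y)
print(order)
```

The strongest predictor enters first and the weaker one second. An unrelated variable such as x3 will still enter occasionally by chance, which is the "capitalizing on chance" risk noted above.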

Stepwise regression in SPSS

Analyze > Regression > Linear
- Enter the dependent variable, and the independent variables in the independents box, as before
- Change Method in the independents box from Enter to: Forward, Backward, or Stepwise

Hierarchical regression

In hierarchical regression, we fit a hierarchy of regression models, adding variables according to theory and checking to see if they contribute additional variance.
- You control the order in which variables are added
- Used for analyzing the effect of independent variables on dependent variables in the presence of moderating variables
- Also called path analysis, and equivalent to analysis of covariance (ANCOVA)

Hierarchical regression in SPSS

Analyze > Regression > Linear
- Enter the dependent variable, and the independent variables you want in the smallest model
- Click Next in the independents box
- Enter additional independent variables
- Repeat as required
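One common way to check whether an added block of variables "contributes additional variance" is an F test on the change in R² between two nested models. The sketch below shows that arithmetic only; the R² values, predictor counts, and sample size are hypothetical, and `r2_change_f` is a name chosen for the example.

```python
def r2_change_f(r2_small, r2_large, k_small, k_large, n):
    """F statistic for the R^2 gained by the larger of two nested models.

    k_small, k_large: number of predictors (excluding the constant) in each model.
    n: number of cases. Returns (F, df1, df2).
    """
    df1 = k_large - k_small          # predictors added in this step
    df2 = n - k_large - 1            # residual degrees of freedom of the larger model
    f_stat = ((r2_large - r2_small) / df1) / ((1.0 - r2_large) / df2)
    return f_stat, df1, df2

# Hypothetical step: adding one predictor raises R^2 from .30 to .40 with 100 cases.
f_stat, df1, df2 = r2_change_f(r2_small=0.30, r2_large=0.40, k_small=1, k_large=2, n=100)
print(f"F({df1}, {df2}) = {f_stat:.2f}")   # F(1, 97) = 16.17
```

A large F relative to its degrees of freedom suggests the added variables explain variance beyond what the smaller model already captured.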

Hierarchical regression example

In the hyp data, there is a correlation of -0.7 between case-based course and final exam. Is the relationship between final exam score and course format moderated by midterm exam score?

[Path diagram: Case-based, Midterm, Final]

To answer the question, we:
- Predict final exam from midterm and format (gives us the effect of format, controlling for midterm, and the effect of midterm, controlling for format)
- Predict midterm from format (gives us the effect of format on midterm)

After running each regression, write the βs on the path diagram.

Predict final from midterm, format

Coefficients
                      B        SE       Beta     t         Sig.
(Constant)            50.68    4.415             11.479    .000
Case-based course     -26.3    3.563    -.597    -7.380    .000
midterm exam score    .156     .061     .207     2.566     .012

[Path diagram: Case-based → Final, β = -.597*; Midterm → Final, β = .207*]

Predict midterm from format

Coefficients
                     B        SE       Beta     t         Sig.
(Constant)           63.43    3.606             17.59     .000
Case-based course    -29.2    5.152    -.496    -5.662    .000

[Path diagram: Case-based → Final, β = -.597*; Case-based → Midterm, β = -.496*; Midterm → Final, β = .207*]

Conclusions: The course format affects the final exam both directly (β = -.597) and through an effect on the midterm exam (-.496 × .207 ≈ -.103). The two paths sum to about -.70, matching the overall correlation. In both cases, lecture courses yielded higher scores.

Logistic regression

- Linear regression fits a line.
- Logistic regression fits a cumulative logistic function:
  - S-shaped
  - Bounded by [0,1]
- This function provides a better fit to binomial dependent variables (e.g. pass/fail).
- The predicted dependent variable represents the probability of one category (e.g. pass) based on the values of the independent variables.

Logistic regression in SPSS

Analyze > Regression > Binary logistic (or multinomial logistic)
- Enter the dependent variable and independent variables
- Output will include:
  - Goodness of model fit (tests of misfit)
  - Classification table
  - Estimates for effects of independent variables

Example: Voting for Clinton vs. Bush in the 1992 US election, based on sex, age, and college graduate status
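The S-shaped, bounded function that logistic regression fits can be written down directly. The sketch below is a generic illustration: the intercept and coefficient in the usage lines are invented, and `predicted_probability` is a helper name chosen for the example.

```python
import math

def logistic(z):
    """Cumulative logistic function: maps any real z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)              # avoids overflow for large negative z
    return e / (1.0 + e)

def predicted_probability(intercept, coefs, values):
    """Probability of the modeled category given coefficients on the log-odds scale."""
    z = intercept + sum(b * x for b, x in zip(coefs, values))
    return logistic(z)

# The curve is flat near 0 and 1 and steepest around z = 0.
for z in (-6, -2, 0, 2, 6):
    print(z, round(logistic(z), 3))

# Hypothetical pass/fail model with one predictor (hours studied).
p = predicted_probability(intercept=-3.0, coefs=[0.5], values=[8])
print(round(p, 3))
```

Because the output never leaves the 0 to 1 range, it can be read as a probability, which a straight line fitted to a pass/fail outcome cannot guarantee.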

Logistic regression output

Goodness of fit measures:

  -2 Log Likelihood    2116.474    (lower is better)
  Goodness of Fit      1568.282    (lower is better)
  Cox & Snell R^2      .012        (higher is better)
  Nagelkerke R^2       .016        (higher is better)

           Chi-Square    df    Significance
  Model    18.482        3     .0003

(The model chi-square tests the fitted model against the constant-only model, so a significant value means the predictors improve prediction. Most models on large data sets will have a significant chi-square.)

Logistic regression output

Classification Table (the cut value is .50)

                      Predicted
  Observed      Bush    Clinton    Percent correct
  Bush          0       661        .00%
  Clinton       0       907        100.00%
  Overall                          57.84%

Logistic regression output

  Variable    B         S.E.     Wald    df    Sig      R        Exp(B)
  FEMALE      .4312     .1041    17.2    1     .0000    .0843    1.5391
  OVER65      .1227     .1329    .85     1     .3557    .0000    1.1306
  COLLGRAD    .0818     .1115    .53     1     .4631    .0000    1.0852
  Constant    -.4153    .1791    5.4     1     .0204

B is the coefficient in log-odds; Exp(B) = $e^B$ gives the effect size as an odds ratio. Your odds of voting for Clinton are 1.54 times as high if you're a woman as if you're a man.
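The Exp(B) and Wald columns and the overall percent correct can be recomputed from the other numbers in the output. The sketch below copies B, S.E., and the classification counts from the tables above; the function names are chosen for the example, and small differences from the printed values come from rounding in the table.

```python
import math

# (B, S.E.) pairs copied from the coefficient table above.
estimates = {
    "FEMALE": (0.4312, 0.1041),
    "OVER65": (0.1227, 0.1329),
    "COLLGRAD": (0.0818, 0.1115),
}

def odds_ratio(b):
    """Exp(B): multiplicative change in the odds for a one-unit change in the predictor."""
    return math.exp(b)

def wald(b, se):
    """Wald statistic for a single coefficient: (B / S.E.) squared."""
    return (b / se) ** 2

for name, (b, se) in estimates.items():
    print(f"{name:9s} Exp(B) = {odds_ratio(b):.4f}   Wald = {wald(b, se):.2f}")

# Classification table: every respondent was predicted to vote for Clinton,
# so only the 907 observed Clinton voters are classified correctly.
correct = 0 + 907
total = 661 + 907
accuracy = correct / total
print(f"Overall percent correct = {100 * accuracy:.2f}%")
```

Only FEMALE has an odds ratio clearly different from 1. The overall accuracy equals the share of Clinton voters in the sample, because the model never predicts Bush at the .50 cut value.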

Wednesday PM assignment

Using the semantic data set:
- Perform a regression to predict total score from semantic classification. Interpret the results.
- Perform a one-way ANOVA to predict total score from semantic classification. Are the results different?
- Perform a stepwise regression to predict total score. Include semantic classification, number of distinct semantic qualifiers, reasoning, and knowledge.
- Perform a logistic regression to predict correct diagnosis from total score and number of distinct semantic qualifiers. Interpret the results.