Multiple Imputation and Multiple Regression with SAS and IBM SPSS
See IntroQ Questionnaire for a description of the survey used to generate the data used here.

*** Mult-Imput_M-Reg.sas ***;
options pageno=min nodate formdlim='-';
title 'Multiple Imputation of Missing Data then Multiple Regression.';
run;
PROC IMPORT OUT= WORK.IntroQuest
  DATAFILE= "C:\Users\Vati\Documents\StatData\IntroQ\IntroQ.sav"
  DBMS=SPSS REPLACE;
RUN;
Data Priapus; set IntroQuest;
SATM_Miss = 0; If SATM =. then SATM_Miss = 1;
proc means n nmiss; run;
proc corr nosimple; var SATM_Miss; with statoph gender ideal nucoph year; run;

The data are imported from an SPSS .sav file. SATM_Miss is a dummy variable coded 1 when SATM is missing and 0 otherwise; correlating it with the other variables shows whether missingness on SATM is related to them.

The MEANS Procedure
Columns: Variable, Label, N, N Miss
Rows: Gender, Ideal, Eye, Statoph, Nucoph, SATM, Year

Pearson Correlation Coefficients
Prob > |r| under H0: Rho=0
Number of Observations
SATM_Miss with: Statoph, Gender, Ideal, Nucoph, Year

MultReg_Mult-Imputation.docx
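The missingness check above (a 0/1 indicator for missing SATM, correlated with the other variables) can also be sketched outside SAS. The following Python fragment is only an illustration and is not part of the original program: the function names are hypothetical, the data rows are made up, and pairwise deletion of missing values is assumed.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def missingness_correlations(rows, target, others):
    """Correlate a 0/1 'target is missing' flag with each other variable.

    Cases missing on the other variable are dropped for that pair only
    (pairwise deletion, an assumption of this sketch).
    """
    result = {}
    for name in others:
        pairs = [(1 if row[target] is None else 0, row[name])
                 for row in rows if row[name] is not None]
        flags, values = zip(*pairs)
        result[name] = pearson_r(flags, values)
    return result

# Made-up illustrative cases; None marks a missing value.
rows = [
    {"SATM": 500,  "statoph": 3,    "year": 1},
    {"SATM": None, "statoph": 8,    "year": 3},
    {"SATM": 600,  "statoph": 2,    "year": 1},
    {"SATM": None, "statoph": 9,    "year": 4},
    {"SATM": 550,  "statoph": 4,    "year": 2},
    {"SATM": 480,  "statoph": None, "year": 2},
]

corrs = missingness_correlations(rows, "SATM", ["statoph", "year"])
for name, r in corrs.items():
    print(f"SATM_Miss with {name}: r = {r:.3f}")
```

A clearly nonzero correlation between the indicator and another variable suggests that the data are not missing completely at random with respect to that variable.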
Note that missingness on SATM is associated with statophobia and year.

Proc MI seed=69301 out=midata;
var statoph gender ideal nucoph SATM year;
run;

Proc MI is used to create five imputations.

Model Information
Data Set: WORK.INTROQUEST
Method: MCMC
Multiple Imputation Chain: Single Chain
Initial Estimates for MCMC: EM Posterior Mode
Start: Starting Value
Prior: Jeffreys
Number of Imputations: 5
Number of Burn-in Iterations: 200
Number of Iterations: 100
Seed for random number generator:

Missing Data Patterns
Columns: Group, Statoph, Gender, Ideal, Nucoph, SATM, Year, Freq, Percent, followed by the Group Means for Statoph, Gender, Ideal, Nucoph, SATM, and Year.
In each pattern, an X marks a variable that is observed and a period marks one that is missing.
The most common pattern (aside from complete data) is missingness only on SATM. We have means for each of the patterns. Those missing data on SATM do not appear to differ much from those with SATM data. Below we have Expectation Maximization estimates of means and covariances. Missingness on SATM is related to statophobia, by the way.

EM (Posterior Mode) Estimates
Columns: _TYPE_, _NAME_, Statoph, Gender, Ideal, Nucoph, SATM, Year
Rows: MEAN, then COV for Statoph, Gender, Ideal, Nucoph, SATM, and Year

Variance Information
Columns: Variable, Variance (Between, Within, Total), DF, Relative Increase in Variance, Fraction Missing Information, Relative Efficiency
Rows: Statoph, Ideal, Nucoph, SATM

Snip, snip. I have culled the rest of the text output from Proc MI.

Proc Reg outest=mrbyimput covout;
Model Statoph = gender ideal nucoph SATM year / stb;
By _Imputation_;
run; quit;
Proc MIAnalyze;
modeleffects intercept gender ideal nucoph SATM year;
run;

Here we used Proc Reg to conduct a multiple regression analysis on each of the five imputations.
Multiple Imputation of Missing Data then Multiple Regression

For each value of _Imputation_ (1 through 5), Proc Reg prints the same three blocks of output:

Analysis of Variance
Columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F
Rows: Model, Error, Corrected Total (Pr > F for the model is <.0001 in every imputation)

Fit statistics: Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var

Parameter Estimates
Columns: Variable, Label, DF, Parameter Estimate, Standard Error, t Value, Pr > |t|, Standardized Estimate
Rows: Intercept, Gender, Ideal, Nucoph, SATM, Year
Proc MIAnalyze is used to pool the results from the multiple imputations. The variance in the scores is partitioned between that among imputations (A) and that within imputations (W). The Relative Increase in Variance (r) is the increase in variance due to having missing data imputed (relative to the condition where no data are missing), $r = \left(1 + \frac{1}{m}\right)\frac{A}{W}$, where m is the number of imputations. A related statistic, Fraction of Missing Information, is an index of how much more precise the parameter estimate would have been if there had been no missing data. Power will, of course, be greater when the fraction of missing information and relative increase in variance are small. The greater the number of imputations, the less the error and the greater the power, ceteris paribus. Relative efficiency tells you how much power you have for the number of imputations you have employed relative to what you would have if you used an infinitely large number of imputations.

The MIANALYZE Procedure

Variance Information
Columns: Parameter, Variance (Between, Within, Total), DF, Relative Increase in Variance, Fraction Missing Information, Relative Efficiency
Rows: intercept, gender, ideal, nucoph, SATM, year

Parameter Estimates
Columns: Parameter, Estimate, Std Error, 95% Confidence Limits, DF, Minimum, Maximum, t, Pr > |t|
Rows: intercept, gender, ideal, nucoph, SATM (Pr > |t| <.0001), year

Multiple Imputation for Missing Data: Concepts and New Development (Version 9.0)
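The pooling statistics described above for Proc MIAnalyze can be worked through by hand. The Python sketch below is an illustration only, not the SAS implementation: the function name is hypothetical, the five estimates and standard errors are made-up numbers, and the fraction of missing information uses the standard Rubin formulas with the large-sample degrees of freedom, which is assumed here to match what the procedure reports by default.

```python
from statistics import mean, variance

def pool_estimates(estimates, std_errors):
    """Pool one parameter across m imputations (Rubin's rules)."""
    m = len(estimates)
    pooled = mean(estimates)                     # pooled point estimate
    w = mean([se ** 2 for se in std_errors])     # within-imputation variance (W)
    a = variance(estimates)                      # among-imputation variance (A)
    total = w + (1 + 1 / m) * a                  # total variance
    r = (1 + 1 / m) * a / w                      # relative increase in variance
    if r > 0:
        df = (m - 1) * (1 + 1 / r) ** 2          # large-sample degrees of freedom
        fmi = (r + 2 / (df + 3)) / (r + 1)       # fraction of missing information
    else:
        df, fmi = float("inf"), 0.0              # imputations agree exactly
    rel_eff = 1 / (1 + fmi / m)                  # efficiency relative to infinite m
    return {"estimate": pooled, "within": w, "between": a, "total": total,
            "r": r, "df": df, "fmi": fmi, "rel_eff": rel_eff}

# Made-up slope estimates and standard errors from m = 5 imputations.
result = pool_estimates([1.0, 1.2, 0.8, 1.1, 0.9], [0.5] * 5)
for key, value in result.items():
    print(f"{key:>8}: {value:.4f}")
```

With these made-up inputs W = 0.25 and A = 0.025, so r = (1 + 1/5)(0.025)/0.25 = 0.12: the sampling variance is 12% larger than it would be with no missing data. Increasing m shrinks the (1 + 1/m) multiplier and pushes relative efficiency toward 1.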
Multiple Imputation with IBM SPSS

Analyze, Multiple Imputation, Impute Missing Data Values

*Impute Missing Data Values.
DATASET DECLARE IntroQ_Imputed.
MULTIPLE IMPUTATION Statoph Gender Ideal Nucoph SATM Year
  /IMPUTE METHOD=AUTO NIMPUTATIONS=5 MAXPCTMISSING=NONE
  /MISSINGSUMMARIES NONE
  /IMPUTATIONSUMMARIES MODELS
  /OUTFILE IMPUTATIONS=IntroQ_Imputed.

Multiple Imputation
[DataSet] C:\Users\Vati\Documents\StatData\IntroQ\IntroQ.sav
Imputation Specifications
Imputation Method: Automatic
Number of Imputations: 5
Model for Scale Variables: Linear Regression
Interactions Included in Models: (none)
Maximum Percentage of Missing Values: 100.0%
Maximum Number of Parameters in Imputation Model: 100

Imputation Results
Imputation Method: Fully Conditional Specification
Fully Conditional Specification Method Iterations: 10
Dependent Variables Imputed: Statoph, Ideal, Nucoph, SATM
Dependent Variables Not Imputed (Too Many Missing Values):
Dependent Variables Not Imputed (No Missing Values): Gender, Year
Imputation Sequence: Gender, Year, Nucoph, Ideal, Statoph, SATM

Imputation Models
Nucoph: Linear Regression; Effects: Gender, Year, Ideal, Statoph, SATM; Missing Values: 2; Imputed Values: 10
Ideal: Linear Regression; Effects: Gender, Year, Nucoph, Statoph, SATM; Missing Values: 5; Imputed Values: 25
Statoph: Linear Regression; Effects: Gender, Year, Nucoph, Ideal, SATM; Missing Values: 9; Imputed Values: 45
SATM: Linear Regression; Effects: Gender, Year, Nucoph, Ideal, Statoph
At this point SPSS has created a new data set with the original data (imputation 0) and the imputed data (in this case, imputations 1 through 5). The cells with imputed scores are highlighted. All you need do now is run the desired analysis. If that analysis is supported, SPSS will automatically analyze the original data and each imputed set of data and give you convenient summaries of the results.

DATASET ACTIVATE IntroQ_MultipleImputation.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT Statoph
  /METHOD=ENTER Gender Ideal Nucoph SATM Year.
Model Summary
Columns: Imputation_, Model, R, R Square, Adjusted R Square, Std. Error of the Estimate
Rows: Original data, then imputations 1 through 5
a. Predictors: (Constant), Year, Nucoph, Ideal, SATM, Gender
b. Predictors: (Constant), Year, Gender, Nucoph, SATM, Ideal

ANOVA
Columns: Imputation_, Model, Sum of Squares, df, Mean Square, F, Sig.
Rows: Regression, Residual, and Total for the original data and for each of imputations 1 through 5
a. Dependent Variable: Statoph
b. Predictors: (Constant), Year, Nucoph, Ideal, SATM, Gender
c. Predictors: (Constant), Year, Gender, Nucoph, SATM, Ideal
Coefficients
Columns: Imputation_, Model, Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig.
Rows: (Constant), Gender, Ideal, Nucoph, SATM, and Year for the original data, for each of imputations 1 through 5, and for the pooled results
Additional columns for the pooled results: Fraction Missing Info., Relative Increase Variance, Relative Efficiency
a. Dependent Variable: Statoph

Karl L. Wuensch, September, 2013
Return to Wuensch's Stats Lessons Page
Treatment of Missing Data: recommended reading, David Howell.
More informationAn Introduction to Path Analysis. nach 3
An Introduction to Path Analysis Developed by Sewall Wright, path analysis is a method employed to determine whether or not a multivariate set of nonexperimental data fits well with a particular (a priori)
More informationUsing Correlation and Regression: Mediation, Moderation, and More
Using Correlation and Regression: Mediation, Moderation, and More Part 2: Mediation analysis with regression Claremont Graduate University Professional Development Workshop August 22, 2015 Dale Berger,
More informationResearch Methods & Experimental Design
Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and
More informationCoefficient of Determination
Coefficient of Determination The coefficient of determination R 2 (or sometimes r 2 ) is another measure of how well the least squares equation ŷ = b 0 + b 1 x performs as a predictor of y. R 2 is computed
More informationECON 142 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE #2
University of California, Berkeley Prof. Ken Chay Department of Economics Fall Semester, 005 ECON 14 SKETCH OF SOLUTIONS FOR APPLIED EXERCISE # Question 1: a. Below are the scatter plots of hourly wages
More informationTopic 3. Chapter 5: Linear Regression in Matrix Form
Topic Overview Statistics 512: Applied Linear Models Topic 3 This topic will cover thinking in terms of matrices regression on multiple predictor variables case study: CS majors Text Example (NKNW 241)
More informationTesting for Lack of Fit
Chapter 6 Testing for Lack of Fit How can we tell if a model fits the data? If the model is correct then ˆσ 2 should be an unbiased estimate of σ 2. If we have a model which is not complex enough to fit
More informationUsing Excel for Statistical Analysis
Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure
More informationDATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University
DATA ANALYSIS QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University Quantitative Research What is Statistics? Statistics (as a subject) is the science
More informationOverview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)
Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and
More information" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationDealing with Missing Data
Res. Lett. Inf. Math. Sci. (2002) 3, 153-160 Available online at http://www.massey.ac.nz/~wwiims/research/letters/ Dealing with Missing Data Judi Scheffer I.I.M.S. Quad A, Massey University, P.O. Box 102904
More informationMultiple Regression Using SPSS
Multiple Regression Using SPSS The following sections have been adapted from Field (2009) Chapter 7. These sections have been edited down considerably and I suggest (especially if you re confused) that
More informationxtmixed & denominator degrees of freedom: myth or magic
xtmixed & denominator degrees of freedom: myth or magic 2011 Chicago Stata Conference Phil Ender UCLA Statistical Consulting Group July 2011 Phil Ender xtmixed & denominator degrees of freedom: myth or
More information