# Analyzing Structural Equation Models With Missing Data

## Transcription

Craig Enders, Arizona State University

AERA Extended Course, Structural Equation Modeling: A Second Course, San Francisco, April 6 & 7, 2006

Based on Enders, C. K. (2006). Analyzing structural equation models with missing data. In G. R. Hancock & R. O. Mueller (Eds.), Structural Equation Modeling: A Second Course (pp. 313-342). Greenwich, CT: Information Age Publishing.

### Overview

- Missing data theory, and assumptions pertaining to missing data analyses
- Full information maximum likelihood (FIML), and multiple imputation (MI)
- Incorporating information from auxiliary variables

### Running Example

- Concepts are illustrated using a single-factor CFA model with seven manifest indicators
- The indicators are a subset of items from the Eating Attitudes Test (EAT), a widely used inventory for assessing eating disorder risk
- The data also include body mass index (BMI) scores, where $\text{BMI} = \text{weight (kg)} / [\text{height (m)}]^2$
- The data are available upon request from Dr. Enders

### Running Example: Indicators

[Path diagram: the latent factor DRIVE FOR THINNESS measured by seven EAT items]

- EAT24: Like stomach to be empty
- EAT14: Am preoccupied with... fat on body
- EAT12: Think about burning calories...
- EAT11: Desire to be thinner
- EAT10: Feel extremely guilty after eating
- EAT2: Avoid eating when I am hungry
- EAT1: Am terrified about being overweight

### Missing Data Mechanisms: Overview

- Rubin (1976) provided a taxonomy for missing data problems:
  - Missing completely at random (MCAR)
  - Missing at random (MAR)
  - Missing not at random (MNAR)
- Missing data mechanisms describe how the missing values are related to the data, if at all
- Rubin's mechanisms can be viewed as assumptions that dictate the performance of a particular missing data technique

### Missing Data Mechanisms: Missing Completely At Random (MCAR)

- Missing values on Y are unrelated to other variables, as well as to the unobserved values of Y
- The observed data are a random sample of the hypothetically complete data
- For example, the EAT was not administered because of a clerical error or scheduling difficulties, or individuals who did respond happened not to make dark enough marks in places on the response sheet
- This is a very strict assumption that is unlikely to hold in most cases
- FIML and MI are unbiased, as are most traditional deletion methods (listwise; pairwise)

### Missing Data Mechanisms: Missing At Random (MAR)

- Missing values on Y can depend on other variables, but not on the unobserved values of Y
- For example, missing EAT item responses are related to BMI, such that individuals with very low BMI scores are more likely to skip certain items
- FIML and MI assume MAR, and are unbiased
- Traditional deletion methods (pairwise; listwise) are biased
- Warning: MAR is defined relative to the variables in the analysis. If the cause of the missing data is a measured variable (e.g., BMI) that does not appear in the analysis model, MAR does not hold

### Missing Data Mechanisms: Missing Not At Random (MNAR)

- Missing values on Y depend on the unobserved values of Y
- For example, missing EAT scores are related to the respondent's level of eating disorder risk, such that individuals who are highly preoccupied with food tend to skip those items
- FIML and MI will yield biased estimates under MNAR, although less biased than traditional methods
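The three mechanisms can be made concrete with a small simulation. The sketch below is an illustration only, not part of the course materials: `x` stands in for a standardized BMI score, `y` for an EAT score correlated .60 with it, and the deletion rules (a 30% random loss, and hard cutoffs at -0.5) are assumptions chosen for simplicity. It compares the mean of the observed `y` values, which is what listwise deletion would analyze, against the true mean of 0.

```python
import random
import statistics

def simulate(n=20000, rho=0.6, seed=1):
    """Generate (x, y) pairs and delete y under three missingness mechanisms."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)                                  # stand-in for BMI
        y = rho * x + (1 - rho**2) ** 0.5 * rng.gauss(0, 1)  # stand-in for an EAT score
        rows.append((x, y, rng.random()))
    return {
        # MCAR: every y has the same 30% chance of being missing
        "MCAR": [y for x, y, u in rows if u >= 0.30],
        # MAR: y is missing whenever the observed x is low
        "MAR": [y for x, y, u in rows if x >= -0.5],
        # MNAR: y is missing whenever y itself is low
        "MNAR": [y for x, y, u in rows if y >= -0.5],
    }

complete_case_means = {k: statistics.fmean(v) for k, v in simulate().items()}
for mechanism, mean_y in complete_case_means.items():
    print(f"{mechanism}: mean of observed y = {mean_y:.3f} (true mean = 0)")
```

Under these assumptions the MCAR mean stays close to 0, while the MAR and MNAR means are pulled upward because low scores are removed disproportionately.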

### The Multivariate Normal Probability Density Function (PDF)

- The shape of the multivariate normal curve is defined by the familiar equation

$$L_i = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}} \exp\left[-\tfrac{1}{2}(y_i - \mu)'\,\Sigma^{-1}(y_i - \mu)\right]$$

- Plugging in a person's raw data values gives the likelihood (i.e., the relative probability) of that set of scores, given the values of $\mu$ and $\Sigma$

### Bivariate Normal Distribution

- Suppose that $\mu = 0$, $\sigma = 1$, and $r = .60$
- Two cases:
  - Case 1: X = -.5, Y = 0, with L = .164
  - Case 2: X = -1.5, Y = -1, with L = .064
- Case 1 is closer to the parameter values, and thus has a higher likelihood

[Figure: bivariate normal density surface over X and Y, with the two cases marked]

### The Log Likelihood

- Likelihoods tend to be very small numbers
- Taking the natural log of the likelihood makes the math a bit more tractable

$$\log L_i = -\frac{p}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(y_i - \mu)'\,\Sigma^{-1}(y_i - \mu)$$

- The log likelihood still quantifies the same thing: the relative probability of a person's scores, given a normal distribution with a particular $\mu$ and $\Sigma$

### Casewise Log Likelihoods

- Two cases:
  - Case 1: X = -.5, Y = 0, with L = .164 and LL = -1.81
  - Case 2: X = -1.5, Y = -1, with L = .064 and LL = -2.75
- Case 1 has a higher log likelihood, given these parameters

[Figure: bivariate normal density surface over X and Y, with the two cases marked]
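The casewise values on the slides can be checked by evaluating the bivariate normal density directly. The sketch below assumes both variables have mean 0 and standard deviation 1 with r = .60; the function name is illustrative, and Case 2 is taken to be (X = -1.5, Y = -1), the pair that reproduces the reported likelihood of .064.

```python
import math

def bivariate_normal_likelihood(x, y, mu=0.0, sigma=1.0, r=0.60):
    """Bivariate normal density with a common mean, a common SD, and correlation r."""
    zx, zy = (x - mu) / sigma, (y - mu) / sigma
    quad = (zx**2 - 2 * r * zx * zy + zy**2) / (1 - r**2)   # (y - mu)' Sigma^-1 (y - mu)
    const = 1.0 / (2 * math.pi * sigma**2 * math.sqrt(1 - r**2))
    return const * math.exp(-0.5 * quad)

cases = {"Case 1": (-0.5, 0.0), "Case 2": (-1.5, -1.0)}
results = {}
for label, (x, y) in cases.items():
    lik = bivariate_normal_likelihood(x, y)
    results[label] = (lik, math.log(lik))
    print(f"{label}: L = {lik:.3f}, log L = {math.log(lik):.2f}")
# Case 1: L = 0.164, log L = -1.81
# Case 2: L = 0.064, log L = -2.75
```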

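### Log Likelihood Functions

- With complete data, each case's contribution to the log likelihood is

$$\log L_i = -\frac{p}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma| - \frac{1}{2}(y_i - \mu)'\,\Sigma^{-1}(y_i - \mu)$$

- In the missing data context, the ith case's contribution to the log likelihood is

$$\log L_i = -\frac{p_i}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma_i| - \frac{1}{2}(y_i - \mu_i)'\,\Sigma_i^{-1}(y_i - \mu_i)$$

### The Missing Data Log Likelihood

- The sample log likelihood is obtained by summing over the N cases, with the constant terms collected in $K$

$$\log L = K - \frac{1}{2}\sum_{i=1}^{N}\log|\Sigma_i| - \frac{1}{2}\sum_{i=1}^{N}(y_i - \mu_i)'\,\Sigma_i^{-1}(y_i - \mu_i)$$

- In the current CFA example, $\mu$ and $\Sigma$ are functions of the model parameters:

$$\Sigma = \Lambda\Phi\Lambda' + \Theta, \qquad \mu = \tau + \Lambda\kappa$$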
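To show how the CFA parameters generate the mean vector and covariance matrix that enter the log likelihood, here is a minimal sketch for a one-factor model. With a single factor, Φ and κ are scalars and Θ is taken to be diagonal. The function name and all parameter values are hypothetical and are not estimates from the EAT data.

```python
def implied_moments(loadings, factor_var, residual_vars, intercepts, factor_mean):
    """Model-implied covariance matrix and mean vector for a one-factor CFA.

    Sigma = Lambda * Phi * Lambda' + Theta   (Theta diagonal)
    mu    = tau + Lambda * kappa
    """
    p = len(loadings)
    sigma = [
        [loadings[j] * factor_var * loadings[k] + (residual_vars[j] if j == k else 0.0)
         for k in range(p)]
        for j in range(p)
    ]
    mu = [intercepts[j] + loadings[j] * factor_mean for j in range(p)]
    return sigma, mu

# Hypothetical values for three indicators
sigma, mu = implied_moments(
    loadings=[1.0, 0.8, 1.2],
    factor_var=0.5,
    residual_vars=[0.3, 0.4, 0.2],
    intercepts=[2.0, 2.5, 3.0],
    factor_mean=0.0,
)
for row in sigma:
    print([round(v, 2) for v in row])
print(mu)
```

An estimation routine would repeatedly adjust the parameters, recompute these moments, and evaluate the log likelihood until it is maximized.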

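### How Does ML Differ With Missing Data?

- The only difference between the complete data and missing data log likelihood is the $i$ subscript
- This allows the size and content of the data and parameter arrays to vary across cases
- The log likelihood is based only on those variables (and parameters) for which case $i$ has complete data

### Example (1)

- Consider a case with three variables, $Y_1$, $Y_2$, and $Y_3$
- The contribution to the log likelihood for cases who have complete data is

$$\log L_i = K_i - \frac{1}{2}\log\begin{vmatrix}\sigma_{11} & \sigma_{12} & \sigma_{13}\\ \sigma_{21} & \sigma_{22} & \sigma_{23}\\ \sigma_{31} & \sigma_{32} & \sigma_{33}\end{vmatrix} - \frac{1}{2}\begin{bmatrix}y_1 - \mu_1\\ y_2 - \mu_2\\ y_3 - \mu_3\end{bmatrix}'\begin{bmatrix}\sigma_{11} & \sigma_{12} & \sigma_{13}\\ \sigma_{21} & \sigma_{22} & \sigma_{23}\\ \sigma_{31} & \sigma_{32} & \sigma_{33}\end{bmatrix}^{-1}\begin{bmatrix}y_1 - \mu_1\\ y_2 - \mu_2\\ y_3 - \mu_3\end{bmatrix}$$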

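### Example (2)

- The contribution to the log likelihood for cases who are missing one of the three variables uses only the rows and columns for the observed variables. The index of the missing variable did not survive in this transcription; $Y_3$ is used here for illustration:

$$\log L_i = K_i - \frac{1}{2}\log\begin{vmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{vmatrix} - \frac{1}{2}\begin{bmatrix}y_1 - \mu_1\\ y_2 - \mu_2\end{bmatrix}'\begin{bmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{bmatrix}^{-1}\begin{bmatrix}y_1 - \mu_1\\ y_2 - \mu_2\end{bmatrix}$$

- The rows and columns for the missing variable were removed from the complete-data arrays

### Notes on FIML

- FIML is not much different from complete-data ML estimation, but incorporates a mean structure
- Missing values are not imputed!
- Parameters and standard errors are estimated directly using all the observed data
- Including the partially complete cases serves to steer the estimation algorithm toward a more accurate set of parameters, via the relations among the variables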
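The row-and-column removal described above can be sketched as a function that evaluates each case's log likelihood from its observed variables only. This is a plain-Python illustration, not the routine used by any SEM package: the function names are made up, missing values are coded as `None`, and the three-variable $\mu$ and $\Sigma$ are hypothetical. A Cholesky factor supplies both the log determinant and the quadratic form.

```python
import math

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def casewise_loglik(y, mu, sigma):
    """Log likelihood of one case, using only its observed variables (None = missing)."""
    obs = [j for j, v in enumerate(y) if v is not None]
    if not obs:
        return 0.0  # a case with no observed data contributes nothing
    dev = [y[j] - mu[j] for j in obs]                  # y_i - mu_i
    sub = [[sigma[j][k] for k in obs] for j in obs]    # Sigma_i: observed rows/columns only
    low = cholesky(sub)
    # Solve low @ z = dev by forward substitution; z'z is then the quadratic form
    z = []
    for i, d in enumerate(dev):
        z.append((d - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(len(obs)))
    quad = sum(v * v for v in z)
    return -0.5 * (len(obs) * math.log(2 * math.pi) + log_det + quad)

def sample_loglik(data, mu, sigma):
    """Sum the casewise contributions over all cases."""
    return sum(casewise_loglik(row, mu, sigma) for row in data)

# Hypothetical parameters for three variables
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.6, 0.5],
         [0.6, 1.0, 0.4],
         [0.5, 0.4, 1.0]]
data = [[-0.5, 0.0, 0.3],    # complete case
        [-0.5, 0.0, None]]   # third variable missing
print(round(casewise_loglik(data[1], mu, sigma), 2))  # -1.81
print(round(sample_loglik(data, mu, sigma), 2))
```

The incomplete case reduces to the bivariate example with r = .60, which is why its log likelihood matches the value of -1.81 computed for X = -.5, Y = 0.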

### Incorporating Auxiliary Variables

- In order for MAR to hold, the cause of the missing data must appear in the analysis model
- An auxiliary variable (AV) such as BMI is ancillary to one's substantive hypotheses
- Adopting an inclusive analysis strategy that incorporates AVs can make MAR more plausible
- A useful AV is either a potential cause or correlate of missingness, or a correlate of the variable that is missing

### The Saturated Correlates Model

- Graham (2003) outlined a saturated correlates model for including AVs in an SEM analysis
- Three rules apply:
  - AVs should be correlated with observed (not latent) predictors in the model
  - AVs should be correlated with the residual terms from observed (not latent) dependent variables
  - AVs should be correlated with one another
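The three rules amount to a bookkeeping exercise: for a given model, list every covariance that must be added. The sketch below is a hypothetical helper, not part of any SEM package, that enumerates those covariances. In the CFA example the seven EAT indicators are observed dependent variables and the only predictor is latent, so BMI is correlated with the seven indicator residuals.

```python
from itertools import combinations

def saturated_correlates(aux_vars, observed_predictors, observed_outcomes):
    """List the covariances that the three saturated correlates rules add to a model."""
    extra = []
    for av in aux_vars:
        # Rule 1: AVs correlate with observed (not latent) predictors
        extra += [(av, x, "AV with observed predictor") for x in observed_predictors]
        # Rule 2: AVs correlate with residuals of observed dependent variables
        extra += [(av, y, "AV with residual of observed outcome") for y in observed_outcomes]
    # Rule 3: AVs correlate with one another
    extra += [(a, b, "AV with AV") for a, b in combinations(aux_vars, 2)]
    return extra

# CFA example: one AV (BMI), no observed predictors, seven observed indicators
eat_items = ["EAT1", "EAT2", "EAT10", "EAT11", "EAT12", "EAT14", "EAT24"]
cfa_extra = saturated_correlates(["BMI"], [], eat_items)

# Path model example: observed predictor X, observed outcomes Y1 and Y2, two AVs
path_extra = saturated_correlates(["AV1", "AV2"], ["X"], ["Y1", "Y2"])

for a, b, kind in path_extra:
    print(f"{a} <-> {b}: {kind}")
```

Each listed pair would then be specified as a free covariance in whatever SEM software is used.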

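### The Saturated Correlates Model: EAT Analysis CFA Model

[Path diagram: DRIVE FOR THINNESS measured by EAT1, EAT2, EAT10, EAT11, EAT12, EAT14, and EAT24, with BMI as the auxiliary variable]

### The Saturated Correlates Model: Path Model Example

[Path diagram: X, Y1, and Y2 with residuals ζ1 and ζ2, plus auxiliary variables AV1 and AV2]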

### The Saturated Correlates Model: General Structural Model Example

[Path diagram: X, Y, and Z with residual ζ, indicators x1, x2, y1, and y2, plus auxiliary variables AV1 and AV2]

### FIML Miscellanea

- When given the choice, compute standard errors using the observed information matrix
- Standard errors based on the observed information are valid under MAR, while those based on the expected information require MCAR
- Rescaled test statistics for nonnormal data are available in Mplus and EQS (see Yuan and Bentler, 2000)
- Sandwich estimator (i.e., "robust") standard errors are also available, but assume MCAR data

### Overview Of Multiple Imputation

- Multiple copies (m) of the incomplete data set are created
- Each data set is imputed with different estimates of the missing values using multiple regression
- The SEM analysis is performed on each of the filled-in data sets
- A single point estimate and standard error are obtained by pooling the results from the m data sets

### Imputation And Analysis Phase

- Unlike FIML, missing data are handled in an imputation phase that is distinct from the analysis
- The imputation phase is used to create the m imputed data sets
- Once the data sets are constructed, any number of different analyses can be performed using the same set of imputations

### Imputation Phase: Imputation (I) Step

- Begin with an initial estimate of $\mu$ and $\Sigma$
- Use $\mu$ and $\Sigma$ to construct regression equations for each missing data pattern
- Impute missing values with predicted scores, and add a normally distributed residual to each to restore lost variability

### Imputation Phase: Posterior (P) Step

- Compute new estimates of $\mu$ and $\Sigma$ using the imputed data
- Conditional on the updated estimates of $\mu$ and $\Sigma$, randomly sample new values for $\mu$ and $\Sigma$ from a simulated sampling distribution
- Carry the new values of $\mu$ and $\Sigma$ forward to the next I step, and impute missing values using a new set of regression equations
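The alternation between the I step and the P step can be sketched for the simplest possible case: two variables, where `x` is complete and `y` is sometimes missing. This is an illustrative plain-Python sketch, not the algorithm as implemented in any particular package. It assumes a noninformative prior, under which one common choice is to draw $\Sigma$ from an inverse-Wishart distribution with n - 1 degrees of freedom and then $\mu$ from a normal distribution centered at the sample means. All function names and the simulated data are hypothetical.

```python
import math
import random

def mean_and_scatter(data):
    """Mean vector and sum-of-squares-and-cross-products matrix for two variables."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    sxx = sum((x - mx) ** 2 for x, _ in data)
    syy = sum((y - my) ** 2 for _, y in data)
    sxy = sum((x - mx) * (y - my) for x, y in data)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def draw_mvn(rng, mean, cov):
    """One draw from a bivariate normal via the 2x2 Cholesky factor."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    return (mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2)

def i_step(rng, rows, mu, sigma):
    """Impute each missing y from the regression of y on x implied by mu and sigma."""
    slope = sigma[0][1] / sigma[0][0]
    intercept = mu[1] - slope * mu[0]
    resid_sd = math.sqrt(sigma[1][1] - slope * sigma[0][1])
    return [(x, y if y is not None else intercept + slope * x + rng.gauss(0, resid_sd))
            for x, y in rows]

def p_step(rng, filled):
    """Draw new mu and sigma given the filled-in data (noninformative prior assumed)."""
    n = len(filled)
    ybar, scatter = mean_and_scatter(filled)
    # Sum n - 1 outer products of N(0, scatter^-1) draws to get a Wishart matrix;
    # its inverse is an inverse-Wishart draw for sigma
    prec = inverse_2x2(scatter)
    w = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n - 1):
        z = draw_mvn(rng, (0.0, 0.0), prec)
        for a in range(2):
            for b in range(2):
                w[a][b] += z[a] * z[b]
    sigma = inverse_2x2(w)
    mu = draw_mvn(rng, ybar, tuple(tuple(v / n for v in row) for row in sigma))
    return mu, sigma

def data_augmentation(rows, n_iter=200, seed=0):
    """Alternate I and P steps; return the data set from the final I step."""
    rng = random.Random(seed)
    complete = [(x, y) for x, y in rows if y is not None]
    mu, scatter = mean_and_scatter(complete)  # starting values from complete cases
    sigma = tuple(tuple(v / (len(complete) - 1) for v in row) for row in scatter)
    filled = None
    for _ in range(n_iter):
        filled = i_step(rng, rows, mu, sigma)
        mu, sigma = p_step(rng, filled)
    return filled

# Hypothetical data: y is missing whenever x is low (a MAR mechanism)
gen = random.Random(42)
rows = []
for _ in range(200):
    x = gen.gauss(0, 1)
    y = 0.6 * x + 0.8 * gen.gauss(0, 1)
    rows.append((x, None if x < -0.5 else y))
imputed = data_augmentation(rows)
print(sum(y is None for _, y in rows), "missing values imputed")
```

To create m imputed data sets, one would save the filled-in data at m widely spaced iterations of the chain (or run m separate chains), which is what makes each data set contain different imputed values.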

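### EAT Example

[Table: imputed values of an EAT item for cases with missing responses (shown as "?") across the m imputed data sets; the cell values are not recoverable from this transcription]

### Analysis Phase: Combining Estimates

- After the imputations are created, the model of interest is fit to each of the m data sets
- A point estimate for any parameter of interest is obtained by averaging the estimates from the m analyses

$$\bar{\theta} = \frac{1}{m}\sum_{i=1}^{m}\hat{\theta}_i$$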

### Analysis Phase: Combining S.E.s

- Within-imputation variance is the mean of the m squared S.E.s, where $\hat{V}_i$ is the squared S.E. from the ith data set

$$\bar{V} = \frac{1}{m}\sum_{i=1}^{m}\hat{V}_i$$

- Between-imputation variance is the variance of the parameter estimate across the m imputations

$$B = \frac{1}{m-1}\sum_{i=1}^{m}\left(\hat{\theta}_i - \bar{\theta}\right)^2$$

- The total variance (i.e., the squared S.E.) combines the two

$$T = \bar{V} + B + \frac{B}{m}$$

### Single Parameter Significance Tests

- Single parameter tests are obtained via a t statistic

$$t = \frac{\bar{\theta}}{\sqrt{T}}$$

- The degrees of freedom are complex, and depend on between- and within-imputation variance and m
- Use adjusted degrees of freedom (Barnard & Rubin, 1999) whenever possible
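The pooling rules are simple enough to compute by hand. The sketch below applies them to hypothetical estimates and standard errors from m = 3 imputed data sets; the function name and the input values are illustrative only. The degrees of freedom shown are the classic large-sample version, $(m - 1)\left[1 + \bar{V} / \left((1 + 1/m)B\right)\right]^2$, not the Barnard and Rubin (1999) adjustment, which additionally requires the complete-data degrees of freedom.

```python
import math
import statistics

def pool(estimates, std_errors):
    """Combine m estimates and standard errors using the pooling rules above."""
    m = len(estimates)
    theta_bar = statistics.fmean(estimates)                # pooled point estimate
    within = statistics.fmean([se ** 2 for se in std_errors])
    between = statistics.variance(estimates)               # divides by m - 1
    total = within + between + between / m
    se = math.sqrt(total)
    # Classic large-sample degrees of freedom; undefined when between == 0
    df = (m - 1) * (1 + within / ((1 + 1 / m) * between)) ** 2
    return {"estimate": theta_bar, "within": within, "between": between,
            "total": total, "se": se, "t": theta_bar / se, "df": df}

# Hypothetical loading estimates and S.E.s from three imputed data sets
pooled = pool([1.0, 1.2, 0.8], [0.1, 0.1, 0.1])
for key, value in pooled.items():
    print(f"{key}: {value:.4f}")
```

With these inputs the between-imputation variance (.04) is four times the within-imputation variance (.01), so the pooled S.E. of about .25 is much larger than the .10 reported by any single analysis.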

### Incorporating Auxiliary Variables

- AVs are incorporated as additional predictor variables in the imputation phase
- The AVs need not be included in the subsequent analyses, because the imputed values have already been conditioned on the AVs
- As a general rule, the imputation phase should include all variables and design effects (e.g., interaction terms) that appear in the subsequent analysis model

### MI Miscellanea

- Mplus and LISREL have functions that automate the process of estimating and combining parameter estimates and standard errors
- $\chi^2$ statistics can be combined, but this is more complex than a simple average (contact Dr. Enders if you want a SAS program to do this)
- It is unclear how to combine fit indices at this point
- Assessing MI convergence requires special graphical techniques (see the book chapter)

### Selected Analysis Results

Loading estimates with standard errors in parentheses. Some digits were lost in this transcription, so the entries below are shown as extracted and are incomplete.

| Loading | FIML | MI |
| --- | --- | --- |
| EAT1 | .85 (.09) | .64 (.07) |
| EAT2 | .54 (.059) | .54 (.059) |
| EAT10 | .745 (.065) | .77 (.066) |
| EAT11 | .096 (.077) | .09 (.077) |
| EAT12 | . (.0) | .099 (.0) |
| EAT14 | .98 (.07) | .980 (.07) |
| EAT24 | .7 (.077) | .707 (.074) |

### A Final Comparison of FIML and MI

- Both procedures assume MAR data and multivariate normality
- FIML and MI should yield very similar estimates if the same set of input variables is used
- MI is more complex to use, but the resulting analyses don't require specialized software
- If an estimation routine exists, it is probably more straightforward to use FIML and incorporate AVs into the analysis

### A Basic Introduction to Missing Data

John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item

### Craig K. Enders Arizona State University Department of Psychology craig.enders@asu.edu

Craig K. Enders Arizona State University Department of Psychology craig.enders@asu.edu Topic Page Missing Data Patterns And Missing Data Mechanisms 1 Traditional Missing Data Techniques 7 Maximum Likelihood

### An introduction to modern missing data analyses

Journal of School Psychology 48 (2010) 5 37 An introduction to modern missing data analyses Amanda N. Baraldi, Craig K. Enders Arizona State University, United States Received 19 October 2009; accepted

### APPLIED MISSING DATA ANALYSIS

APPLIED MISSING DATA ANALYSIS Craig K. Enders Series Editor's Note by Todd D. little THE GUILFORD PRESS New York London Contents 1 An Introduction to Missing Data 1 1.1 Introduction 1 1.2 Chapter Overview

### Missing Data Dr Eleni Matechou

1 Statistical Methods Principles Missing Data Dr Eleni Matechou matechou@stats.ox.ac.uk References: R.J.A. Little and D.B. Rubin 2nd edition Statistical Analysis with Missing Data J.L. Schafer and J.W.

### MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

### Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13 Overview Missingness and impact on statistical analysis Missing data assumptions/mechanisms Conventional

### Problem of Missing Data

VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;

### ZHIYONG ZHANG AND LIJUAN WANG

PSYCHOMETRIKA VOL. 78, NO. 1, 154 184 JANUARY 2013 DOI: 10.1007/S11336-012-9301-5 METHODS FOR MEDIATION ANALYSIS WITH MISSING DATA ZHIYONG ZHANG AND LIJUAN WANG UNIVERSITY OF NOTRE DAME Despite wide applications

### Handling missing data in Stata a whirlwind tour

Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled

### Handling attrition and non-response in longitudinal data

Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

### Modern Methods for Missing Data

Modern Methods for Missing Data Paul D. Allison, Ph.D. Statistical Horizons LLC www.statisticalhorizons.com 1 Introduction Missing data problems are nearly universal in statistical practice. Last 25 years

### Missing Data. A Typology Of Missing Data. Missing At Random Or Not Missing At Random

[Leeuw, Edith D. de, and Joop Hox. (2008). Missing Data. Encyclopedia of Survey Research Methods. Retrieved from http://sage-ereference.com/survey/article_n298.html] Missing Data An important indicator

### Review of the Methods for Handling Missing Data in. Longitudinal Data Analysis

Int. Journal of Math. Analysis, Vol. 5, 2011, no. 1, 1-13 Review of the Methods for Handling Missing Data in Longitudinal Data Analysis Michikazu Nakai and Weiming Ke Department of Mathematics and Statistics

ALLISON 1 TABLE OF CONTENTS 1. INTRODUCTION... 3 2. ASSUMPTIONS... 6 MISSING COMPLETELY AT RANDOM (MCAR)... 6 MISSING AT RANDOM (MAR)... 7 IGNORABLE... 8 NONIGNORABLE... 8 3. CONVENTIONAL METHODS... 10

### A REVIEW OF CURRENT SOFTWARE FOR HANDLING MISSING DATA

123 Kwantitatieve Methoden (1999), 62, 123-138. A REVIEW OF CURRENT SOFTWARE FOR HANDLING MISSING DATA Joop J. Hox 1 ABSTRACT. When we deal with a large data set with missing data, we have to undertake

### Multiple Imputation for Missing Data: A Cautionary Tale

Multiple Imputation for Missing Data: A Cautionary Tale Paul D. Allison University of Pennsylvania Address correspondence to Paul D. Allison, Sociology Department, University of Pennsylvania, 3718 Locust

### Missing Data in Educational Research: A Review of Reporting Practices and Suggestions for Improvement

Review of Educational Research Winter 2004, Vol. 74, No. 4, pp. 525-556 Missing Data in Educational Research: A Review of Reporting Practices and Suggestions for Improvement James L. Peugh University of

### Missing Data & How to Deal: An overview of missing data. Melissa Humphries Population Research Center

Missing Data & How to Deal: An overview of missing data Melissa Humphries Population Research Center Goals Discuss ways to evaluate and understand missing data Discuss common missing data methods Know

### Nonrandomly Missing Data in Multiple Regression Analysis: An Empirical Comparison of Ten Missing Data Treatments

Brockmeier, Kromrey, & Hogarty Nonrandomly Missing Data in Multiple Regression Analysis: An Empirical Comparison of Ten s Lantry L. Brockmeier Jeffrey D. Kromrey Kristine Y. Hogarty Florida A & M University

### Challenges in Longitudinal Data Analysis: Baseline Adjustment, Missing Data, and Drop-out

Challenges in Longitudinal Data Analysis: Baseline Adjustment, Missing Data, and Drop-out Sandra Taylor, Ph.D. IDDRC BBRD Core 23 April 2014 Objectives Baseline Adjustment Introduce approaches Guidance

### How To Use A Monte Carlo Study To Decide On Sample Size and Determine Power

How To Use A Monte Carlo Study To Decide On Sample Size and Determine Power Linda K. Muthén Muthén & Muthén 11965 Venice Blvd., Suite 407 Los Angeles, CA 90066 Telephone: (310) 391-9971 Fax: (310) 391-8971

### Missing Data Techniques for Structural Equation Modeling

Journal of Abnormal Psychology Copyright 2003 by the American Psychological Association, Inc. 2003, Vol. 112, No. 4, 545 557 0021-843X/03/\$12.00 DOI: 10.1037/0021-843X.112.4.545 Missing Data Techniques

### Reject Inference in Credit Scoring. Jie-Men Mok

Reject Inference in Credit Scoring Jie-Men Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business

### 2. Making example missing-value datasets: MCAR, MAR, and MNAR

Lecture 20 1. Types of missing values 2. Making example missing-value datasets: MCAR, MAR, and MNAR 3. Common methods for missing data 4. Compare results on example MCAR, MAR, MNAR data 1 Missing Data

### Published online: 25 Sep 2014.

This article was downloaded by: [Cornell University Library] On: 23 February 2015, At: 11:23 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office:

### Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza

Handling missing data in large data sets Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza The problem Often in official statistics we have large data sets with many variables and

### Statistical Analysis with Missing Data

Statistical Analysis with Missing Data Second Edition RODERICK J. A. LITTLE DONALD B. RUBIN WILEY- INTERSCIENCE A JOHN WILEY & SONS, INC., PUBLICATION Contents Preface PARTI OVERVIEW AND BASIC APPROACHES

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### Best Practices for Missing Data Management in Counseling Psychology

Journal of Counseling Psychology 2010 American Psychological Association 2010, Vol. 57, No. 1, 1 10 0022-0167/10/\$12.00 DOI: 10.1037/a0018082 Best Practices for Missing Data Management in Counseling Psychology

### HANDLING DROPOUT AND WITHDRAWAL IN LONGITUDINAL CLINICAL TRIALS

HANDLING DROPOUT AND WITHDRAWAL IN LONGITUDINAL CLINICAL TRIALS Mike Kenward London School of Hygiene and Tropical Medicine Acknowledgements to James Carpenter (LSHTM) Geert Molenberghs (Universities of

### Dealing with Missing Data

Res. Lett. Inf. Math. Sci. (2002) 3, 153-160 Available online at http://www.massey.ac.nz/~wwiims/research/letters/ Dealing with Missing Data Judi Scheffer I.I.M.S. Quad A, Massey University, P.O. Box 102904

### Imputing Missing Data using SAS

ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are

### A Review of Methods. for Dealing with Missing Data. Angela L. Cool. Texas A&M University 77843-4225

Missing Data 1 Running head: DEALING WITH MISSING DATA A Review of Methods for Dealing with Missing Data Angela L. Cool Texas A&M University 77843-4225 Paper presented at the annual meeting of the Southwest

### Analysis of Longitudinal Data with Missing Values.

Analysis of Longitudinal Data with Missing Values. Methods and Applications in Medical Statistics. Ingrid Garli Dragset Master of Science in Physics and Mathematics Submission date: June 2009 Supervisor:

### Applications of Structural Equation Modeling in Social Sciences Research

American International Journal of Contemporary Research Vol. 4 No. 1; January 2014 Applications of Structural Equation Modeling in Social Sciences Research Jackson de Carvalho, PhD Assistant Professor

### Factor Analysis. Factor Analysis

Factor Analysis Principal Components Analysis, e.g. of stock price movements, sometimes suggests that several variables may be responding to a small number of underlying forces. In the factor model, we

### How to Use a Monte Carlo Study to Decide on Sample Size and Determine Power

STRUCTURAL EQUATION MODELING, 9(4), 599 620 Copyright 2002, Lawrence Erlbaum Associates, Inc. TEACHER S CORNER How to Use a Monte Carlo Study to Decide on Sample Size and Determine Power Linda K. Muthén

### Comparison of Estimation Methods for Complex Survey Data Analysis

Comparison of Estimation Methods for Complex Survey Data Analysis Tihomir Asparouhov 1 Muthen & Muthen Bengt Muthen 2 UCLA 1 Tihomir Asparouhov, Muthen & Muthen, 3463 Stoner Ave. Los Angeles, CA 90066.

### Dealing with missing data: Key assumptions and methods for applied analysis

Technical Report No. 4 May 6, 2013 Dealing with missing data: Key assumptions and methods for applied analysis Marina Soley-Bori msoley@bu.edu This paper was published in fulfillment of the requirements

### Copyright 2010 The Guilford Press. Series Editor s Note

This is a chapter excerpt from Guilford Publications. Applied Missing Data Analysis, by Craig K. Enders. Copyright 2010. Series Editor s Note Missing data are a real bane to researchers across all social

### On Treatment of the Multivariate Missing Data

On Treatment of the Multivariate Missing Data Peter J. Foster, Ahmed M. Mami & Ali M. Bala First version: 3 September 009 Research Report No. 3, 009, Probability and Statistics Group School of Mathematics,

### Perform hypothesis testing

Multivariate hypothesis tests for fixed effects Testing homogeneity of level-1 variances In the following sections, we use the model displayed in the figure below to illustrate the hypothesis tests. Partial

### Combining Multiple Imputation and Inverse Probability Weighting

Combining Multiple Imputation and Inverse Probability Weighting Shaun Seaman 1, Ian White 1, Andrew Copas 2,3, Leah Li 4 1 MRC Biostatistics Unit, Cambridge 2 MRC Clinical Trials Unit, London 3 UCL Research

### Data Cleaning and Missing Data Analysis

Data Cleaning and Missing Data Analysis Dan Merson vagabond@psu.edu India McHale imm120@psu.edu April 13, 2010 Overview Introduction to SACS What do we mean by Data Cleaning and why do we do it? The SACS

### Item Imputation Without Specifying Scale Structure

Original Article Item Imputation Without Specifying Scale Structure Stef van Buuren TNO Quality of Life, Leiden, The Netherlands University of Utrecht, The Netherlands Abstract. Imputation of incomplete

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### Introduction to Multivariate Models: Modeling Multivariate Outcomes with Mixed Model Repeated Measures Analyses

Introduction to Multivariate Models: Modeling Multivariate Outcomes with Mixed Model Repeated Measures Analyses Applied Multilevel Models for Cross Sectional Data Lecture 11 ICPSR Summer Workshop University

### Imputing Attendance Data in a Longitudinal Multilevel Panel Data Set

Imputing Attendance Data in a Longitudinal Multilevel Panel Data Set April 2015 SHORT REPORT Baby FACES 2009 This page is left blank for double-sided printing. Imputing Attendance Data in a Longitudinal

### Missing data and net survival analysis Bernard Rachet

Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics Warwick, 27-29 July 2015 Missing data and net survival analysis Bernard Rachet General context Population-based,

### Missing Data. Paul D. Allison INTRODUCTION

4 Missing Data Paul D. Allison INTRODUCTION Missing data are ubiquitous in psychological research. By missing data, I mean data that are missing for some (but not all) variables and for some (but not all)

### MISSING DATA IMPUTATION IN CARDIAC DATA SET (SURVIVAL PROGNOSIS)

MISSING DATA IMPUTATION IN CARDIAC DATA SET (SURVIVAL PROGNOSIS) R.KAVITHA KUMAR Department of Computer Science and Engineering Pondicherry Engineering College, Pudhucherry, India DR. R.M.CHADRASEKAR Professor,

### ANALYSIS WITH MISSING DATA IN PREVENTION RESEARCH

10 ANALYSIS WITH MISSING DATA IN PREVENTION RESEARCH JOHN W. GRAHAM, SCOTT M. HOFER, STEWART I. DONALDSON, DAVID P. MAcKINNON, AND JOSEPH L. SCHAFER Missing data are pervasive in alcohol and drug abuse

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t

### Missing Data in Longitudinal Studies: To Impute or not to Impute? Robert Platt, PhD McGill University

Missing Data in Longitudinal Studies: To Impute or not to Impute? Robert Platt, PhD McGill University 1 Outline Missing data definitions Longitudinal data specific issues Methods Simple methods Multiple

### Missing Data: Patterns, Mechanisms & Prevention. Edith de Leeuw

Missing Data: Patterns, Mechanisms & Prevention Edith de Leeuw Thema middag Nonresponse en Missing Data, Universiteit Groningen, 30 Maart 2006 Item-Nonresponse Pattern General pattern: various variables

### The Latent Variable Growth Model In Practice. Individual Development Over Time

The Latent Variable Growth Model In Practice 37 Individual Development Over Time y i = 1 i = 2 i = 3 t = 1 t = 2 t = 3 t = 4 ε 1 ε 2 ε 3 ε 4 y 1 y 2 y 3 y 4 x η 0 η 1 (1) y ti = η 0i + η 1i x t + ε ti

### PATTERN MIXTURE MODELS FOR MISSING DATA. Mike Kenward. London School of Hygiene and Tropical Medicine. Talk at the University of Turku,

PATTERN MIXTURE MODELS FOR MISSING DATA Mike Kenward London School of Hygiene and Tropical Medicine Talk at the University of Turku, April 10th 2012 1 / 90 CONTENTS 1 Examples 2 Modelling Incomplete Data

### When to Use Which Statistical Test

When to Use Which Statistical Test Rachel Lovell, Ph.D., Senior Research Associate Begun Center for Violence Prevention Research and Education Jack, Joseph, and Morton Mandel School of Applied Social Sciences

### Introduction to mixed model and missing data issues in longitudinal studies

Introduction to mixed model and missing data issues in longitudinal studies Hélène Jacqmin-Gadda INSERM, U897, Bordeaux, France Inserm workshop, St Raphael Outline of the talk I Introduction Mixed models

### A General Approach to Variance Estimation under Imputation for Missing Survey Data

A General Approach to Variance Estimation under Imputation for Missing Survey Data J.N.K. Rao Carleton University Ottawa, Canada 1 2 1 Joint work with J.K. Kim at Iowa State University. 2 Workshop on Survey

### A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

### Indices of Model Fit STRUCTURAL EQUATION MODELING 2013

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013 Indices of Model Fit A recommended minimal set of fit indices that should be reported and interpreted when reporting the results of SEM analyses:

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### How to Use a Monte Carlo Study to Decide on Sample Size and Determine Power Linda K. Muthén & Bengt O. Muthén Published online: 19 Nov 2009.

This article was downloaded by: [University of California, Los Angeles (UCLA)] On: 19 February 2014, At: 05:09 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954

### Re-analysis using Inverse Probability Weighting and Multiple Imputation of Data from the Southampton Women s Survey

Re-analysis using Inverse Probability Weighting and Multiple Imputation of Data from the Southampton Women s Survey MRC Biostatistics Unit Institute of Public Health Forvie Site Robinson Way Cambridge

### Using Medical Research Data to Motivate Methodology Development among Undergraduates in SIBS Pittsburgh

Using Medical Research Data to Motivate Methodology Development among Undergraduates in SIBS Pittsburgh Megan Marron and Abdus Wahed Graduate School of Public Health Outline My Experience Motivation for

### Multivariate Normal Distribution

Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

### M. Ehren N. Shackleton. Institute of Education, University of London. June 2014. Grant number: 511490-2010-LLP-NL-KA1-KA1SCR

Impact of school inspections on teaching and learning in primary and secondary education in the Netherlands; Technical report ISI-TL project year 1-3 data M. Ehren N. Shackleton Institute of Education,

### Werner Wothke, SmallWaters Corp. Longitudinal and multi-group modeling with missing data

Werner Wothke, SmallWaters Corp. Longitudinal and multi-group modeling with missing data Reprinted with permission from T.D. Little, K.U. Schnabel and J. Baumert [Eds.] (2000) Modeling longitudinal and

### Factor analysis. Angela Montanari

Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

### Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng. LISREL for Windows: PRELIS User s Guide

Stephen du Toit Mathilda du Toit Gerhard Mels Yan Cheng LISREL for Windows: PRELIS User s Guide Table of contents INTRODUCTION... 1 GRAPHICAL USER INTERFACE... 2 The Data menu... 2 The Define Variables

### Covariance-based Structural Equation Modeling: Foundations and Applications

Covariance-based Structural Equation Modeling: Foundations and Applications Dr. Jörg Henseler University of Cologne Department of Marketing and Market Research & Radboud University Nijmegen Institute for

### Advances in Missing Data Methods and Implications for Educational Research. Chao-Ying Joanne Peng, Indiana University-Bloomington

Advances in Missing Data 1 Running head: MISSING DATA METHODS Advances in Missing Data Methods and Implications for Educational Research Chao-Ying Joanne Peng, Indiana University-Bloomington Michael Harwell,

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R² and the coefficient of correlation

### MISSING DATA IN NON-PARAMETRIC TESTS OF CORRELATED DATA

MISSING DATA IN NON-PARAMETRIC TESTS OF CORRELATED DATA Annie Green Howard A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements

### A Review of Methods for Missing Data

Educational Research and Evaluation 1380-3611/01/0704-353\$16.00 2001, Vol. 7, No. 4, pp. 353–383 © Swets & Zeitlinger A Review of Methods for Missing Data Therese D. Pigott Loyola University Chicago, Wilmette,

### Missing Data. Katyn & Elena

Missing Data Katyn & Elena What to do with Missing Data Standard is complete case analysis/listwise deletion, i.e., delete cases with missing data so only complete cases are left. Two other popular options: Multiple

### Presentation Outline. Structural Equation Modeling (SEM) for Dummies. What Is Structural Equation Modeling?

Structural Equation Modeling (SEM) for Dummies Joseph J. Sudano, Jr., PhD Center for Health Care Research and Policy Case Western Reserve University at The MetroHealth System Presentation Outline Conceptual

### The General Linear Model: Theory

Gregory Carey, 1998 General Linear Model - 1 The General Linear Model: Theory 1.0 Introduction In the discussion of multiple regression, we used the following equation to express the linear model for a

### Missing Data Part 1: Overview, Traditional Methods Page 1

Missing Data Part 1: Overview, Traditional Methods Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised January 17, 2015 This discussion borrows heavily from: Applied

### CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp. 1041–1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### lavaan: an R package for structural equation modeling

lavaan: an R package for structural equation modeling Yves Rosseel Department of Data Analysis Belgium Utrecht April 24, 2012 Yves Rosseel lavaan: an R package for structural equation modeling 1 / 20 Overview

### Lecture #2 Overview. Basic IRT Concepts, Models, and Assumptions. Lecture #2 ICPSR Item Response Theory Workshop

Basic IRT Concepts, Models, and Assumptions Lecture #2 ICPSR Item Response Theory Workshop Lecture #2: 1of 64 Lecture #2 Overview Background of IRT and how it differs from CFA Creating a scale An introduction

### Dealing with Missing Data

Dealing with Missing Data Roch Giorgi email: roch.giorgi@univ-amu.fr UMR 912 SESSTIM, Aix Marseille Université / INSERM / IRD, Marseille, France BioSTIC, APHM, Hôpital Timone, Marseille, France January

### Dr James Roger. GlaxoSmithKline & London School of Hygiene and Tropical Medicine.

American Statistical Association Biopharm Section Monthly Webinar Series: Sensitivity analyses that address missing data issues in Longitudinal studies for regulatory submission. Dr James Roger. GlaxoSmithKline

### Models for Count Data With Overdispersion

Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extra-poisson variation and the negative binomial model, with brief appearances

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### Regression Modeling Strategies

Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions

### Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

### CHAPTER 3 COMMONLY USED STATISTICAL TERMS

CHAPTER 3 COMMONLY USED STATISTICAL TERMS There are many statistics used in social science research and evaluation. The two main areas of statistics are descriptive and inferential. The third class of

### Missing Data Sensitivity Analysis of a Continuous Endpoint An Example from a Recent Submission

Missing Data Sensitivity Analysis of a Continuous Endpoint An Example from a Recent Submission Arno Fritsch Clinical Statistics Europe, Bayer November 21, 2014 ASA NJ Chapter / Bayer Workshop, Whippany

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way

### Missing Data: Our View of the State of the Art

Psychological Methods Copyright 2002 by the American Psychological Association, Inc. 2002, Vol. 7, No. 2, 147–177 1082-989X/02/\$5.00 DOI: 10.1037//1082-989X.7.2.147 Missing Data: Our View of the State