Statistical Analysis with Missing Data

Statistical Analysis with Missing Data, Second Edition. RODERICK J. A. LITTLE, DONALD B. RUBIN. WILEY-INTERSCIENCE, A JOHN WILEY & SONS, INC., PUBLICATION

Contents

Preface

PART I OVERVIEW AND BASIC APPROACHES

1. Introduction
The Problem of Missing Data; Missing-Data Patterns; Mechanisms That Lead to Missing Data; A Taxonomy of Missing-Data Methods

2. Missing Data in Experiments
Introduction; The Exact Least Squares Solution with Complete Data; The Correct Least Squares Analysis with Missing Data; Filling in Least Squares Estimates; Yates's Method; Using a Formula for the Missing Values; Iterating to Find the Missing Values; ANCOVA with Missing-Value Covariates; Bartlett's ANCOVA Method; Useful Properties of Bartlett's Method; Notation; The ANCOVA Estimates of Parameters and Missing Y Values; ANCOVA Estimates of the Residual Sums of Squares and the Covariance Matrix of β; Least Squares Estimates of Missing Values by ANCOVA Using Only Complete-Data Methods; Correct Least Squares Estimates of Standard Errors and One Degree of Freedom Sums of Squares; Correct Least Squares Sums of Squares with More Than One Degree of Freedom

3. Complete-Case and Available-Case Analysis, Including Weighting Methods
Introduction; Complete-Case Analysis; Weighted Complete-Case Analysis; Weighting Adjustments; Added Variance from Nonresponse Weighting; Post-Stratification and Raking To Known Margins; Inference from Weighted Data; Summary of Weighting Methods; Available-Case Analysis

4. Single Imputation Methods
Introduction; Imputing Means from a Predictive Distribution; Unconditional Mean Imputation; Conditional Mean Imputation; Imputing Draws from a Predictive Distribution; Draws Based on Explicit Models; Draws Based on Implicit Models; Conclusions

5. Estimation of Imputation Uncertainty
Introduction; Imputation Methods that Provide Valid Standard Errors from a Single Filled-in Data Set; Standard Errors for Imputed Data by Resampling; Bootstrap Standard Errors; Jackknife Standard Errors; Introduction to Multiple Imputation; Comparison of Resampling Methods and Multiple Imputation

PART II LIKELIHOOD-BASED APPROACHES TO THE ANALYSIS OF MISSING DATA

6. Theory of Inference Based on the Likelihood Function
Review of Likelihood-Based Estimation for Complete Data; Maximum Likelihood Estimation; Rudiments of Bayes Estimation; Large-Sample Maximum Likelihood and Bayes Inference; Bayes Inference Based on the Full Posterior Distribution; Simulating Draws from Posterior Distributions; Likelihood-Based Inference with Incomplete Data; A Generally Flawed Alternative to Maximum Likelihood: Maximizing Over the Parameters and the Missing Data; The Method; Background; Examples; Likelihood Theory for Coarsened Data

7. Factored Likelihood Methods, Ignoring the Missing-Data Mechanism
Introduction; Bivariate Normal Data with One Variable Subject to Nonresponse: ML Estimation; ML Estimates; Large-Sample Covariance Matrix; Bivariate Normal Monotone Data: Small-Sample Inference; Monotone Data With More Than Two Variables; Multivariate Data With One Normal Variable Subject to Nonresponse; Factorization of the Likelihood for a General Monotone Pattern; Computation for Monotone Normal Data via the Sweep Operator; Bayes Computation for Monotone Normal Data via the Sweep Operator; Factorizations for Special Nonmonotone Patterns

8. Maximum Likelihood for General Patterns of Missing Data: Introduction and Theory with Ignorable Nonresponse
Alternative Computational Strategies; Introduction to the EM Algorithm; The E and M Steps of EM; Theory of the EM Algorithm; Convergence Properties; EM for Exponential Families; Rate of Convergence of EM; Extensions of EM; ECM Algorithm; ECME and AECM Algorithms; PX-EM Algorithm; Hybrid Maximization Methods

9. Large-Sample Inference Based on Maximum Likelihood Estimates
Standard Errors Based on the Information Matrix; Standard Errors via Methods that do not Require Computing and Inverting an Estimate of the Observed Information Matrix; Supplemental EM Algorithm; Bootstrapping the Observed Data; Other Large Sample Methods; Posterior Standard Errors from Bayesian Methods

10. Bayes and Multiple Imputation
Bayesian Iterative Simulation Methods; Data Augmentation; The Gibbs' Sampler; Assessing Convergence of Iterative Simulations; Some Other Simulation Methods; Multiple Imputation; Large-Sample Bayesian Approximation of the Posterior Mean and Variance Based on a Small Number of Draws; Approximations Using Test Statistics; Other Methods for Creating Multiple Imputations

PART III LIKELIHOOD-BASED APPROACHES TO THE ANALYSIS OF INCOMPLETE DATA: SOME EXAMPLES

11. Multivariate Normal Examples, Ignoring the Missing-Data Mechanism
Introduction; Inference for a Mean Vector and Covariance Matrix with Missing Data Under Normality; The EM Algorithm for Incomplete Multivariate Normal Samples; Estimated Asymptotic Covariance Matrix of (θ − θ̂); Bayes Inference for the Normal Model via Data Augmentation; Estimation with a Restricted Covariance Matrix; Multiple Linear Regression; Linear Regression with Missing Values Confined to the Dependent Variable; More General Linear Regression Problems with Missing Data; A General Repeated-Measures Model with Missing Data; Time Series Models; Introduction; Autoregressive Models for Univariate Time Series with Missing Values; Kalman Filter Models

12. Robust Estimation
Introduction; Robust Estimation for a Univariate Sample; Robust Estimation of the Mean and Covariance Matrix; Multivariate Complete Data; Robust Estimation of the Mean and Covariance Matrix from Data with Missing Values; Adaptive Robust Multivariate Estimation; Bayes Inferences for the t Model; Further Extensions of the t Model

13. Models for Partially Classified Contingency Tables, Ignoring the Missing-Data Mechanism
Introduction; Factored Likelihoods for Monotone Multinomial Data; Introduction; ML Estimation for Monotone Patterns; Precision of Estimation; ML and Bayes Estimation for Multinomial Samples with General Patterns of Missing Data; Loglinear Models for Partially Classified Contingency Tables; The Complete-Data Case; Loglinear Models for Partially Classified Tables; Goodness-of-Fit Tests for Partially Classified Data

14. Mixed Normal and Non-normal Data with Missing Values, Ignoring the Missing-Data Mechanism
Introduction; The General Location Model; The Complete-Data Model and Parameter Estimates; ML Estimation with Missing Values; Details of the E Step Calculations; Bayes Computations for the Unrestricted General Location Model; The General Location Model with Parameter Constraints; Introduction; Restricted Models for the Cell Means; Loglinear Models for the Cell Probabilities; Modifications to the Algorithms of Sections and for Parameter Restrictions; Simplifications when the Categorical Variables are More Observed than the Continuous Variables; Regression Problems Involving Mixtures of Continuous and Categorical Variables; Normal Linear Regression with Missing Continuous or Categorical Covariates; Logistic Regression with Missing Continuous or Categorical Covariates; Further Extensions of the General Location Model

15. Nonignorable Missing-Data Models
Introduction; Likelihood Theory for Nonignorable Models; Models with Known Nonignorable Missing-Data Mechanisms: Grouped and Rounded Data; Normal Selection Models; Normal Pattern-Mixture Models; Univariate Normal Pattern-Mixture Models; Bivariate Normal Pattern-Mixture Models Identified via Parameter Restrictions; Nonignorable Models for Normal Repeated-Measures Data; Nonignorable Models for Categorical Data

References

Author Index

Subject Index

APPLIED MISSING DATA ANALYSIS

APPLIED MISSING DATA ANALYSIS. Craig K. Enders. Series Editor's Note by Todd D. Little. THE GUILFORD PRESS, New York, London. Contents: 1 An Introduction to Missing Data; 1.1 Introduction; 1.2 Chapter Overview

A Basic Introduction to Missing Data

A Basic Introduction to Missing Data. John Fox, Sociology 740, Winter 2014. Outline: Why Missing Data Arise. Global or unit non-response: In a survey, certain respondents may be unreachable or may refuse to participate. Item

Problem of Missing Data

Problem of Missing Data VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;

Statistics Graduate Courses

Statistics Graduate Courses. STAT 7002: Topics in Statistics-Biological/Physical/Mathematics (cr. arr.). Organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

Review of the Methods for Handling Missing Data in Longitudinal Data Analysis

Int. Journal of Math. Analysis, Vol. 5, 2011, no. 1, 1-13. Review of the Methods for Handling Missing Data in Longitudinal Data Analysis. Michikazu Nakai and Weiming Ke, Department of Mathematics and Statistics

Multiple Imputation for Missing Data: A Cautionary Tale

Multiple Imputation for Missing Data: A Cautionary Tale. Paul D. Allison, University of Pennsylvania. Address correspondence to Paul D. Allison, Sociology Department, University of Pennsylvania, 3718 Locust

Note on the EM Algorithm in Linear Regression Model

International Mathematical Forum, 4, 2009, no. 38, 1883-1889. Note on the EM Algorithm in Linear Regression Model. Ji-Xia Wang and Yu Miao, College of Mathematics and Information Science, Henan Normal University

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus. Tihomir Asparouhov and Bengt Muthén. Mplus Web Notes: No. 15, Version 8, August 5, 2014. Abstract: This paper discusses alternatives

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning. Russ Salakhutdinov, Department of Statistics, rsalakhu@utstat.toronto.edu, http://www.cs.toronto.edu/~rsalakhu/ Lecture 6. Three Approaches to Classification: Construct

Regression Modeling Strategies

Regression Modeling Strategies Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions

TABLE OF CONTENTS ALLISON 1 1. INTRODUCTION... 3

TABLE OF CONTENTS. 1. INTRODUCTION. 2. ASSUMPTIONS: MISSING COMPLETELY AT RANDOM (MCAR); MISSING AT RANDOM (MAR); IGNORABLE; NONIGNORABLE. 3. CONVENTIONAL METHODS

Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza

Handling missing data in large data sets. Agostino Di Ciaccio, Dept. of Statistics, University of Rome La Sapienza. The problem: Often in official statistics we have large data sets with many variables and

An extension of the factoring likelihood approach for non-monotone missing data

An extension of the factoring likelihood approach for non-monotone missing data. Jae Kwang Kim, Dong Wan Shin. January 14, 2010. ABSTRACT: We address the problem of parameter estimation in multivariate distributions

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai, Google, Rice University, caizhua@gmail.com. Syllabus: Bayesian ML Concepts (Today); Bayesian ML on MapReduce (Next morning); Bayesian

Analysis of Longitudinal Data with Missing Values.

Analysis of Longitudinal Data with Missing Values. Methods and Applications in Medical Statistics. Ingrid Garli Dragset. Master of Science in Physics and Mathematics. Submission date: June 2009. Supervisor:

Handling attrition and non-response in longitudinal data

Handling attrition and non-response in longitudinal data Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein

Item Imputation Without Specifying Scale Structure

Item Imputation Without Specifying Scale Structure Original Article Item Imputation Without Specifying Scale Structure Stef van Buuren TNO Quality of Life, Leiden, The Netherlands University of Utrecht, The Netherlands Abstract. Imputation of incomplete

Analyzing Structural Equation Models With Missing Data

Analyzing Structural Equation Models With Missing Data. Craig Enders, Arizona State University, cenders@asu.edu. Based on Enders, C. K. (2006). Analyzing structural equation models with missing data. In G.

INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS, FIFTH EDITION. Thomas H. Wonnacott, University of Western Ontario; Ronald J. Wonnacott, University of Western Ontario. WILEY, JOHN WILEY & SONS, New York, Chichester, Brisbane, Toronto, Singapore

A Review of Methods for Missing Data

A Review of Methods for Missing Data. Educational Research and Evaluation, 1380-3611/01/0704-353$16.00, 2001, Vol. 7, No. 4, pp. 353–383, © Swets & Zeitlinger. Therese D. Pigott, Loyola University Chicago, Wilmette,

Missing Data Dr Eleni Matechou

Missing Data. Dr Eleni Matechou, matechou@stats.ox.ac.uk. Statistical Methods Principles. References: R.J.A. Little and D.B. Rubin, Statistical Analysis with Missing Data, 2nd edition; J.L. Schafer and J.W.

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents: Overview of Model; Dispersion; Parameterization; Sigma-Restricted Model; Overparameterized Model; Reference Coding; Model Summary (Summary Tab); Summary

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes: 12-3 Logistic regression (5); 19-3 Building and applying logistic regression (6); 26-3 Generalizations of logistic regression (7); 2-4 Loglinear models (8); 5-4 15-17 hrs; 5B02 Building and

Bayesian networks - Time-series models - Apache Spark & Scala

Bayesian networks - Time-series models - Apache Spark & Scala. Dr John Sandiford, CTO, Bayes Server. Data Science London Meetup, November 2014. Contents: Introduction; Bayesian networks; Latent variables; Anomaly

Multivariate Statistical Inference and Applications

Multivariate Statistical Inference and Applications. ALVIN C. RENCHER, Department of Statistics, Brigham Young University. A Wiley-Interscience Publication, JOHN WILEY & SONS, INC., New York, Chichester, Weinheim

Applied Missing Data Analysis in the Health Sciences. Statistics in Practice

Applied Missing Data Analysis in the Health Sciences. Statistics in Practice Brochure More information from http://www.researchandmarkets.com/reports/2741464/ Applied Missing Data Analysis in the Health Sciences. Statistics in Practice Description: A modern and practical guide

arXiv:1701.01395v1 [math.ST] 5 Jan 2017

Sequential identification of nonignorable missing data mechanisms. Mauricio Sadinle and Jerome P. Reiter, Duke University. arXiv:1701.01395v1 [math.ST] 5 Jan 2017. January 6, 2017. Abstract: With nonignorable

MISSING DATA IMPUTATION IN CARDIAC DATA SET (SURVIVAL PROGNOSIS)

MISSING DATA IMPUTATION IN CARDIAC DATA SET (SURVIVAL PROGNOSIS). R.KAVITHA KUMAR, Department of Computer Science and Engineering, Pondicherry Engineering College, Pudhucherry, India. DR. R.M.CHADRASEKAR, Professor,

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model. Gordon Johnston, SAS Institute Inc., Cary, NC. Abstract: In recent years, the class of generalized linear models has gained popularity as a statistical modeling

Contents. List of Figures. List of Tables. List of Examples. Preface to Volume IV

Contents: List of Figures; List of Tables; List of Examples; Foreword; Preface to Volume IV; IV.1 Value at Risk and Other Risk Metrics; IV.1.1 Introduction; IV.1.2 An Overview of Market

Bayesian Multiple Imputation of Zero Inflated Count Data

Bayesian Multiple Imputation of Zero Inflated Count Data. Chin-Fang Weng, chin.fang.weng@census.gov, U.S. Census Bureau, 4600 Silver Hill Road, Washington, D.C. 20233-1912. Abstract: In government survey applications,

HANDLING DROPOUT AND WITHDRAWAL IN LONGITUDINAL CLINICAL TRIALS

HANDLING DROPOUT AND WITHDRAWAL IN LONGITUDINAL CLINICAL TRIALS. Mike Kenward, London School of Hygiene and Tropical Medicine. Acknowledgements to James Carpenter (LSHTM), Geert Molenberghs (Universities of

Data fusion with international large scale assessments: a case study using the OECD PISA and TALIS surveys

Data fusion with international large scale assessments: a case study using the OECD PISA and TALIS surveys Kaplan and McCarty Large-scale Assessments in Education 2013, 1:6 RESEARCH Open Access Data fusion with international large scale assessments: a case study using the OECD PISA and TALIS surveys David Kaplan

C: LEVEL 800 {MASTERS OF ECONOMICS (ECONOMETRICS)}

C: LEVEL 800 {MASTERS OF ECONOMICS (ECONOMETRICS)}. 1. EES 800: Econometrics I. Simple linear regression and correlation analysis. Specification and estimation of a regression model. Interpretation of regression

Analysis of Incomplete Survey Data Multiple Imputation via Bayesian Bootstrap Predictive Mean Matching

Analysis of Incomplete Survey Data: Multiple Imputation via Bayesian Bootstrap Predictive Mean Matching. Dissertation submitted for the academic degree of Doctor of Social and Economic Sciences

Subject Index. regression,

Subject Index: adaptive MCMC, 308; adding parameters to a model, 185–186; adequate summary, 217, 232; adolescent smoking survey, 148–150; AECM algorithm, 324, 348; airline fatalities, 59, 82; Akaike information

Missing Data Techniques for Structural Equation Modeling

Missing Data Techniques for Structural Equation Modeling. Journal of Abnormal Psychology, 2003, Vol. 112, No. 4, 545–557. Copyright 2003 by the American Psychological Association, Inc. 0021-843X/03/$12.00 DOI: 10.1037/0021-843X.112.4.545

Analysis of Microdata

Analysis of Microdata. Rainer Winkelmann, Stefan Boes. With 38 Figures and 41 Tables. Springer. Contents: 1 Introduction; 1.1 What Are Microdata?; 1.2 Types of Microdata; 1.2.1 Qualitative Data; 1.2.2

Linear regression methods for large n and streaming data

Linear regression methods for large n and streaming data. Large n and small or moderate p is a fairly simple problem. The sufficient statistic for β in OLS (and ridge) is: The concept of sufficiency is

On Treatment of the Multivariate Missing Data

On Treatment of the Multivariate Missing Data. Peter J. Foster, Ahmed M. Mami & Ali M. Bala. First version: 3 September 2009. Research Report No. 3, 2009, Probability and Statistics Group, School of Mathematics,

The Exponential Family

The Exponential Family. David M. Blei, Columbia University, November 3, 2015. Definition: A probability density in the exponential family has the form $p(x \mid \eta) = h(x) \exp\{\eta^\top t(x) - a(\eta)\}$ (1), where $\eta$ is the natural

Missing Data: Part 1 What to Do? Carol B. Thompson Johns Hopkins Biostatistics Center SON Brown Bag 3/20/13

Missing Data: Part 1 What to Do? Carol B. Thompson, Johns Hopkins Biostatistics Center, SON Brown Bag, 3/20/13. Overview: Missingness and impact on statistical analysis; Missing data assumptions/mechanisms; Conventional

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg

SPSS TRAINING SESSION 3: ADVANCED TOPICS (PASW STATISTICS 17.0). Sun Li, Centre for Academic Computing, lsun@smu.edu.sg. IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis; Group Comparison & One-way

Statistics in Geophysics: Linear Regression II

Statistics in Geophysics: Linear Regression II. Steffen Unkel, Department of Statistics, Ludwig-Maximilians-University Munich, Germany. Winter Term 2013/14. Model definition: Suppose we have the following

Dealing with missing data: Key assumptions and methods for applied analysis

Dealing with missing data: Key assumptions and methods for applied analysis Technical Report No. 4 May 6, 2013 Dealing with missing data: Key assumptions and methods for applied analysis Marina Soley-Bori msoley@bu.edu This paper was published in fulfillment of the requirements

Parametric fractional imputation for missing data analysis

Parametric fractional imputation for missing data analysis. Biometrika (????), ??, ?, pp. 1–14. © ???? Biometrika Trust. Printed in

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group. ROAD MAP FOR TODAY. To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation. 2. Issues that could

Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model

Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model Assumptions Assumptions of linear models Apply to response variable within each group if predictor categorical Apply to error terms from linear model check by analysing residuals Normality Homogeneity

IBM SPSS Missing Values 22

IBM SPSS Missing Values 22. Note: Before using this information and the product it supports, read the information in Notices on page 23. Product Information: This edition applies to version 22, release 0,

Handling missing data in Stata a whirlwind tour

Handling missing data in Stata: a whirlwind tour. 2012 Italian Stata Users Group Meeting. Jonathan Bartlett, www.missingdata.org.uk, 20th September 2012. Outline: The problem of missing data and a principled

AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS

AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS, Revised Edition. James Epperson, Mathematical Reviews. WILEY-INTERSCIENCE, A John Wiley & Sons, Inc.,

lavaan: an R package for structural equation modeling

lavaan: an R package for structural equation modeling. Yves Rosseel, Department of Data Analysis, Belgium. Utrecht, April 24, 2012. Overview

Reject Inference in Credit Scoring. Jie-Men Mok

Reject Inference in Credit Scoring. Jie-Men Mok. BMI paper, January 2009. Preface: In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business

Econometric Analysis of Cross Section and Panel Data Second Edition. Jeffrey M. Wooldridge. The MIT Press Cambridge, Massachusetts London, England

Econometric Analysis of Cross Section and Panel Data, Second Edition. Jeffrey M. Wooldridge. The MIT Press, Cambridge, Massachusetts; London, England. Contents: Preface; Acknowledgments; I INTRODUCTION AND BACKGROUND

Introduction to latent variable models

Introduction to latent variable models, lecture 1. Francesco Bartolucci, Department of Economics, Finance and Statistics, University of Perugia, IT, bart@stat.unipg.it. Outline: Latent variables and their

BayesX - Software for Bayesian Inference in Structured Additive Regression

BayesX - Software for Bayesian Inference in Structured Additive Regression. Thomas Kneib, Faculty of Mathematics and Economics, University of Ulm; Department of Statistics, Ludwig-Maximilians-University Munich

Centre for Central Banking Studies

Centre for Central Banking Studies. Technical Handbook No. 4: Applied Bayesian econometrics for central bankers. Andrew Blake and Haroon Mumtaz

Lab 8: Introduction to WinBUGS

Lab 8: Introduction to WinBUGS. 40.656 Lab 8, 2008. Goals: 1. Introduce the concepts of Bayesian data analysis. 2. Learn the basic syntax of WinBUGS. 3. Learn the basics of using WinBUGS in a simple example. Next

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model. 1 September 2004. A. Introduction and assumptions: The classical normal linear regression model can be written

Advanced Linear Modeling

Advanced Linear Modeling Ronald Christensen Advanced Linear Modeling Multivariate, Time Series, and Spatial Data; Nonparametric Regression and Response Surface Maximization Second Edition Springer Preface to the Second Edition

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS

CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS. Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships

Workpackage 11 Imputation and Non-Response. Deliverable 11.2

Workpackage 11 Imputation and Non-Response. Deliverable 11.2. 2004. List of contributors: Seppo Laaksonen, Statistics Finland; Ueli Oetliker, Swiss Federal Statistical Office; Susanne Rässler, University

Missing Data. A Typology Of Missing Data. Missing At Random Or Not Missing At Random

Missing Data. A Typology Of Missing Data. Missing At Random Or Not Missing At Random [Leeuw, Edith D. de, and Joop Hox. (2008). Missing Data. Encyclopedia of Survey Research Methods. Retrieved from http://sage-ereference.com/survey/article_n298.html] Missing Data An important indicator

Advanced Computer-Intensive Methods: Finite Mixture Models. Steffen Unkel, Manuel Eugster, Bettina Grün, Friedrich Leisch, Matthias Schmid

Advanced Computer-Intensive Methods: Finite Mixture Models. Steffen Unkel, Manuel Eugster, Bettina Grün, Friedrich Leisch, Matthias Schmid. Institut für Statistik, LMU München. Summer semester 2013. Outline

11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression

11 Linear and Quadratic Discriminant Analysis, Logistic Regression, and Partial Least Squares Regression. Chapter 11 of Frank C. Porter and Ilya Narsky, Statistical Analysis Techniques in Particle Physics, page 221.

Overview. Longitudinal Data Variation and Correlation Different Approaches. Linear Mixed Models Generalized Linear Mixed Models

Overview. 1 Introduction: Longitudinal Data; Variation and Correlation; Different Approaches. 2 Mixed Models: Linear Mixed Models; Generalized Linear Mixed Models. 3 Marginal Models: Linear Models; Generalized Linear

WHAT DO WE DO WITH MISSING DATA? SOME OPTIONS FOR ANALYSIS OF INCOMPLETE DATA

WHAT DO WE DO WITH MISSING DATA? SOME OPTIONS FOR ANALYSIS OF INCOMPLETE DATA. Annu. Rev. Public Health 2004. 25:99–117. doi: 10.1146/annurev.publhealth.25.102802.124410. Copyright © 2004 by Annual Reviews. All rights reserved.

Example: In credit card default, we may be more interested in predicting the probability of a default than in classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification. 4.1 Introduction: Supervised learning with a categorical (qualitative) response. Example: In credit card default, we may be more interested in predicting the probability of a default than in classifying individuals as default or not. Notation: feature vector X; qualitative response Y, taking values in C

More information

Missing Data. Paul D. Allison INTRODUCTION

Chapter 4. Introduction: Missing data are ubiquitous in psychological research. By missing data, I mean data that are missing for some (but not all) variables and for some (but not all)

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Course Catalog. In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

Dealing with Missing Data

Roch Giorgi, email: roch.giorgi@univ-amu.fr. UMR 912 SESSTIM, Aix Marseille Université / INSERM / IRD, Marseille, France; BioSTIC, APHM, Hôpital Timone, Marseille, France. January

Imputing Missing Data using SAS

Paper 3295-2015. Christopher Yim, California Polytechnic State University, San Luis Obispo. Abstract: Missing data is an unfortunate reality of statistics. However, there are

Missing data and net survival analysis Bernard Rachet

Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics, Warwick, 27-29 July 2015. General context: Population-based,

Graduate Programs in Statistics

Course Titles. STAT 100 CALCULUS AND MATRIX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra. STAT 195 INTRODUCTION TO MATHEMATICAL

Linear Threshold Units

A linear threshold unit computes $h(x)$ by comparing the weighted sum $w_1 x_1 + \dots + w_n x_n$ against a threshold $w_0$. We assume that each feature $x_j$ and each weight $w_j$ is a real number (we will relax this later). We will study three different algorithms for learning linear

Univariate and Multivariate Methods PEARSON. Addison Wesley

Time Series Analysis: Univariate and Multivariate Methods, Second Edition. William W. S. Wei, Department of Statistics, The Fox School of Business and Management, Temple University. Pearson Addison Wesley, Boston

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

For 2015 Examinations. Aim: The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

A Mixed Model Approach for Intent-to-Treat Analysis in Longitudinal Clinical Trials with Missing Values

Methods Report. Hrishikesh Chakraborty and Hong Gu. March 9, RTI Press. About the Author: Hrishikesh Chakraborty,

Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection

Directions in Statistical Methodology for Multivariable Predictive Modeling. Frank E. Harrell Jr., University of Virginia. Seattle, WA, 19 May 1998. Overview of Modeling Process: Model selection; Regression shape; Diagnostics

DATA MINING IN FINANCE

Advances in Relational and Hybrid Methods, by Boris Kovalerchuk (Central Washington University, USA) and Evgenii Vityaev (Institute of Mathematics, Russian Academy of Sciences, Russia)

A REVIEW OF CURRENT SOFTWARE FOR HANDLING MISSING DATA

Kwantitatieve Methoden (1999), 62, 123-138. Joop J. Hox. Abstract: When we deal with a large data set with missing data, we have to undertake

Lecture 3: Linear methods for classification

Rafael A. Irizarry and Hector Corrada Bravo, February 2010. Today we describe four specific algorithms useful for classification problems: linear regression,

Probabilistic user behavior models in online stores for recommender systems

Tomoharu Iwata. Abstract: Recommender systems are widely used in online stores because they are expected to improve both user

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Email: jiali@stat.psu.edu. Logistic regression preserves linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Recall that areal data, also known as lattice data, are data $Y(s)$, $s \in D$, where $D$ is a discrete index set. This usually corresponds to data

A THEORETICAL COMPARISON OF DATA MASKING TECHNIQUES FOR NUMERICAL MICRODATA

Krish Muralidhar, University of Kentucky; Rathindra Sarathy, Oklahoma State University.

Multivariate Normal Distribution

Lecture 4, July 21, 2011. Advanced Multivariate Statistical Methods, ICPSR Summer Session #2. Last Time: Matrices and vectors; Eigenvalues

Imputing Values to Missing Data

In federated data, between 30% and 70% of the data points will have at least one missing attribute, so ignoring all records with a missing value wastes data. Remaining data

Pattern Analysis. Logistic Regression. 12 May 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University

Outline. 1 Logistic Regression: Posteriors and the Logistic Function; Decision

Economic Order Quantity and Economic Production Quantity Models for Inventory Management

Inventory control is concerned with minimizing the total cost of inventory. In the U.K. the term often used is stock

Electronic Theses and Dissertations UC Riverside

Peer Reviewed. Title: Bayesian and Non-parametric Approaches to Missing Data Analysis. Author: Yu, Yao. Acceptance Date: 01. Series: UC Riverside Electronic

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com

SPSS-SA Training Brochure 2009. Table of Contents: 1 SPSS Training Courses Focusing

Comparison of Estimation Methods for Complex Survey Data Analysis

Tihomir Asparouhov, Muthen & Muthen, 3463 Stoner Ave., Los Angeles, CA 90066; Bengt Muthen, UCLA.

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

Preface. 1 Introduction: 1.1 Overview; 1.2 Definition; 1.3 Preparation; 1.3.1 Overview; 1.3.2 Accessing Tabular Data; 1.3.3 Accessing Unstructured Data; 1.3.4 Understanding the Variables and Observations

Mixture Models. Jia Li. Department of Statistics The Pennsylvania State University. Mixture Models

Email: jiali@stat.psu.edu. Clustering by Mixture Models: General background on clustering. Example method: K-means. Mixture model based

SAS Certificate Applied Statistics and SAS Programming

SAS Certificate: Applied Statistics and Advanced SAS Programming. Brigham Young University Department of Statistics offers an Applied Statistics and

Longitudinal Studies, The Institute of Education, University of London. Square, London, EC1 OHB, U.K. Email: R.D.Wiggins@city.ac.

A comparative evaluation of currently available software remedies to handle missing data in the context of longitudinal design and analysis. Wiggins, R.D., Ely, M. & Lynch, K. Department of Sociology,

Experimental data analysis Lecture 3: Confidence intervals. Dodo Das

Review of lecture 2: Nonlinear regression; iterative likelihood maximization; Levenberg-Marquardt algorithm (hybrid of steepest descent

MATHEMATICAL METHODS OF STATISTICS

By Harald Cramér, Professor in the University of Stockholm. Princeton University Press, Princeton, 1946. Table of Contents. First Part: Mathematical Introduction. Chapters

Simple Second Order Chi-Square Correction

Tihomir Asparouhov and Bengt Muthén, May 3, 2010. 1 Introduction: In this note we describe the second order correction for the chi-square statistic implemented