A brush-up on residuals


A brush-up on residuals

Going back to a single-level model:

y_i = β_0 + β_1 x_1i + ε_i

The residual for each observation is the difference between the actual value of y and the value of y predicted by the equation. In symbols, y_i − ŷ_i. So we can calculate the residuals by:

r_i = y_i − ŷ_i = y_i − (β̂_0 + β̂_1 x_1i)

Graphically, we can think of the residual as the distance between the data point and the regression line.
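The residual calculation above can be sketched numerically. This is a minimal illustration with made-up data; the variable names are mine, not from the lecture.

```python
import numpy as np

# Made-up data, just to illustrate the calculation
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)

# Fit y_i = b0 + b1*x_1i by least squares (np.polyfit returns [b1, b0])
b1, b0 = np.polyfit(x, y, 1)

y_hat = b0 + b1 * x    # predicted values
resid = y - y_hat      # residuals r_i = y_i - y_hat_i

# With an intercept in the model, least-squares residuals sum to
# (numerically) zero
print(abs(resid.sum()) < 1e-8)   # prints True
```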

Multilevel residuals

Back to multilevel modelling:

y_ij = β_0 + β_1 x_1ij + u_j + e_ij

Multilevel residuals follow the same basic idea, but now that we have two error terms, we have two residuals: the distance from the overall regression line to the line for the group (the level 2 residual), and the distance from the line for the group to the data point (the level 1 residual).
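The decomposition into two residuals can be checked with a few made-up numbers (these values are illustrative, not from any fitted model):

```python
# One observation in one school, to show how the two residuals
# partition the gap between the data point and the overall line
overall_fit = 5.0   # fixed part: b0 + b1*x_1ij
group_line = 5.8    # overall fit shifted up by this school's effect
y_obs = 5.3         # the observed value

u_j = group_line - overall_fit   # level 2 residual (approx.  0.8)
e_ij = y_obs - group_line        # level 1 residual (approx. -0.5)

# The total deviation from the overall line is exactly u_j + e_ij
print(abs((y_obs - overall_fit) - (u_j + e_ij)) < 1e-12)   # prints True
```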

Calculation of multilevel residuals: raw residuals

First, we must calculate the raw residual r_j. Define r_ij = y_ij − ŷ_ij, just as if we were treating it as a single-level model. Then r_j is the mean of r_ij for group j.
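A sketch of those two steps, with invented two-level data (the group structure and effect sizes are made up for illustration):

```python
import numpy as np

# Hypothetical two-level data: 4 groups of 10, with a random group effect
rng = np.random.default_rng(1)
group = np.repeat(np.arange(4), 10)          # group label for each of 40 obs
x = rng.normal(size=40)
y = 0.5 + 1.5 * x + rng.normal(size=4)[group] + rng.normal(size=40)

# Step 1: fit the overall regression line, ignoring the grouping
b1, b0 = np.polyfit(x, y, 1)
raw = y - (b0 + b1 * x)                      # r_ij = y_ij - y_hat_ij

# Step 2: the raw group residual r_j is the mean of r_ij within group j
r_bar = np.array([raw[group == j].mean() for j in range(4)])
print(r_bar.shape)   # prints (4,)
```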

Why can't we just use these raw residuals?

Suppose we have just two data points for one of our groups. These two data points might be unusual for their group, and then the r_j calculated using them will be wrong. But our points are always a sample, and we can't know whether they're typical of their group. We can combine the data from the group with information from the other groups to bring the residuals towards the overall average. Then the level 2 residuals will be less sensitive to outlying elements of the group.

Calculation of multilevel residuals

Raw residuals: r_ij = y_ij − ŷ_ij, and r_j is the mean of r_ij for group j.

Level 2 residual: the estimated level 2 residual is the shrinkage factor times the raw residual:

û_0j = [σ²_u0 / (σ²_u0 + σ²_e0 / n_j)] × r_j

Shrinkage factor: σ²_u0 / (σ²_u0 + σ²_e0 / n_j)

Level 1 residual: the level 1 residual is the observed value, minus the predicted value from the overall regression line, minus the level 2 residual:

ê_0ij = y_ij − (β̂_0 + β̂_1 x_1ij) − û_0j
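The shrinkage formula can be sketched as a small function. The variance values below are made up, standing in for the model's estimated σ²_u0 and σ²_e0:

```python
# Sketch of the shrinkage calculation with assumed variance estimates
sigma2_u = 0.155   # level 2 (between-group) variance
sigma2_e = 0.848   # level 1 (within-group) variance

def level2_residual(r_bar_j, n_j):
    """Shrunken level 2 residual: shrinkage factor times raw group mean."""
    shrink = sigma2_u / (sigma2_u + sigma2_e / n_j)
    return shrink * r_bar_j

# The same raw residual is pulled towards zero much harder for a small group
print(level2_residual(1.0, 2))     # heavy shrinkage
print(level2_residual(1.0, 200))   # little shrinkage
```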

How much shrinkage occurs?

          A lot of shrinkage...            Not much shrinkage...
n_j       when there are not many          when there are a lot of
          level 1 units in the group       level 1 units in the group
σ²_e0     when the level 1                 when the level 1
          variance is big                  variance is small
σ²_u0     when the level 2                 when the level 2
          variance is small                variance is big

Measuring dependency

The question of the relative sizes of σ²_u0 and σ²_e0 is actually quite important. Their relative sizes change according to how much dependency we have in our data. The dependency arises because observations from the same group are likely to be more similar than observations from different groups. The fact that we have dependent data is the whole reason we are using multilevel modelling. We use multilevel modelling partly in order to estimate standard errors correctly: if we use a single-level model for dependent data, the standard errors will be underestimated. So we would like to know how much dependency we have in our data.

Example

Parameter                          Single level      Multilevel
Intercept                          -0.098 (0.021)    -0.101 (0.070)
Boy school                          0.122 (0.049)     0.120 (0.149)
Girl school                         0.244 (0.034)     0.258 (0.117)
Between school variance (σ²_u)      . (.)             0.155 (0.030)
Between student variance (σ²_e)     0.985 (0.022)     0.848 (0.019)

(Standard errors in parentheses.)

Measuring dependency

We can use the variance partitioning coefficient (VPC), also called ρ or the intraclass correlation, to measure dependency. For the two-level random intercepts case:

ρ = σ²_u0 / (σ²_u0 + σ²_e0)

Beware! Note how the VPC is similar to, but not identical to, the shrinkage factor. The VPC is the proportion of the total variance that is at level 2. How can we interpret it?
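Plugging in the multilevel variance estimates from the example table above gives the VPC directly:

```python
# VPC from the variance estimates in the multilevel column of the example
sigma2_u = 0.155   # between-school variance
sigma2_e = 0.848   # between-student variance

vpc = sigma2_u / (sigma2_u + sigma2_e)
print(round(vpc, 3))   # about 15% of the total variance is between schools
```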

Correlation structure of single level model: y_i = β_0 + β_1 x_1i + e_0i

Correlation structure of two level random intercept model: y_ij = β_0 + β_1 x_1ij + u_0j + e_0ij