Centering Predictors and Variance Decomposition

1 Centering Predictors and Variance Decomposition
Applied Multilevel Models for Cross Sectional Data, Lecture 6
ICPSR Summer Workshop, University of Colorado Boulder

2 Covered this Section
We will expand on this example to cover a few more important concepts in multilevel models:
- The importance of centering variables
- Distinguishing within-cluster from between-cluster effects
- How total variation is partitioned by random effects
- Implications for how residuals are correlated
- Implications for hypothesis testing (Type I and Type II errors)
- Implications for modeling dependencies

3 AS SEEN LAST TIME

4 Guiding Example
Imagine you are interested in studying the effects of socioeconomic status (SES) on student achievement.
- What do you think the relationship between student achievement and SES happens to be?
- You are interested in predicting achievement from SES: this is your guiding research question.

5 Your Study
Let's imagine you are able to get data from 7 elementary schools around Boulder.
- You sample 50 students from each elementary school.
- You record a measure of their SES (a scale with a mean of 50).
- You record a measure of their achievement (a scale with a mean of 100).
- Both scales magically have absolutely perfect reliability.

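6 For Now Let's Model the School Intercept
Level 1 (the student level), for student $i$ in school $j$:
$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
Level 2 (the school level):
$\beta_{0j} = \gamma_{00} + \gamma_{01} \bar{X}_j + u_{0j}$
$\beta_{1j} = \gamma_{10}$
- $\gamma_{00}$ is the overall intercept (the predicted value when all X = 0).
- $\gamma_{01}$ is the slope for school mean SES $\bar{X}_j$ (the average intercept increase when school mean SES increases by 1).
- $\gamma_{10}$ is the fixed slope for student SES $X_{ij}$, meaning each school has the same slope (the increase in student score when student SES increases by 1).
- $u_{0j}$ is the error associated with school intercepts (called a random intercept). It is assumed to be normally distributed with mean 0 and variance $\tau_0^2$.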

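7 Putting the Model Together
We can substitute our level 2 model terms into our level 1 model equation to get an overall regression line:
$Y_{ij} = \gamma_{00} + \gamma_{01} \bar{X}_j + \gamma_{10} X_{ij} + u_{0j} + e_{ij}$
Here $\gamma_{00}$ is the overall intercept, $\gamma_{01}$ the slope for school mean SES $\bar{X}_j$, $\gamma_{10}$ the slope for student SES $X_{ij}$, $u_{0j}$ the school random intercept, and $e_{ij}$ the student residual.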

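8 The Analysis Results
Variances:
- $\tau_0^2$ (random intercept variance) = [missing]
- $\sigma_e^2$ (residual variance) = [missing]
Fixed effects:
- $\gamma_{00}$ (intercept) = 18.69
- $\gamma_{01}$ (school mean SES) = 2.62 (p = [missing])
- $\gamma_{10}$ (student SES) = -0.99 (p < [missing])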

9 By School Regression Lines
[Figure: by-school regression lines of Student Achievement versus Student SES]

10 Analysis Interpretation
$\tau_0^2$ = [missing]
- The variance of school random intercepts (how much schools vary from each other) after accounting for school SES
$\sigma_e^2$ = [missing]
- The variance of residuals for student scores (how much students vary after accounting for school mean SES and student SES)
- The error variance

11 Analysis Interpretation
$\gamma_{00}$ = 18.69: the overall intercept
- The predicted score for a student who has zero SES ($X_{ij} = 0$)
- At a school with a mean SES of zero ($\bar{X}_j = 0$)
$\gamma_{01}$ = 2.62 (p = [missing]): the slope for school mean SES
- The predicted score for a student increases by 2.62 for every one unit increase in the school mean, after controlling for student SES (the contextual or incremental effect).
- Average achievement for a school increases as average SES increases.
- Statistically significant (level 2 degrees of freedom)

12 Analysis Interpretation
$\gamma_{10}$ = -0.99 (p < [missing]): the slope for student SES
- The predicted score for a student decreases by 0.99 for every one unit increase in the student's SES.
- Within a school, SES is negatively related to achievement.
- Statistically significant (level 1 degrees of freedom)
So, what is the nature of the relationship between SES and achievement?
- Level 1: SES is negatively related to achievement.
- Level 2: SES is positively related to achievement.

13 CENTERING

14 A Closer Look at Our Parameters
Recall, the data analysis had the following results:
[Table of variance and fixed effect estimates with p values missing]
Take the intercept ($\gamma_{00} = 18.69$). This value is the predicted achievement for a student with:
- A zero value for his/her SES ($X_{ij} = 0$)
- A zero value for the school mean SES for the student ($\bar{X}_j = 0$)

15 However... About SES and Achievement
An intercept of 18.69 sounds all well and good until you look at whether or not it actually occurs in our data. Saying SES is zero is also unrealistic.
- The range of achievement scores is 81.1 to [missing]: 18.69 doesn't exist.
- The range of SES is 42.4 to 58.7 for students and 45.7 to 54.5 for school means: 0 doesn't exist in either.

16 Centering
Because our intercept is implausible, we may wish to center our data so as to bring the intercept more into line with the data we collected.
- To center the data, subtract a value from each of the predictor/independent variables.
Centering will alter the meaning of certain parameters:
- The intercept
- Some slopes (depending on the method of centering)
Two methods of centering are popular:
- Grand mean centering (centering by a constant)
- Cluster mean centering

17 Grand Mean Centering
Perhaps the easiest way to center the data is to subtract the grand mean from each observation.
- The grand mean ($\bar{X}$) is the mean of each X variable across all observations, regardless of sampling unit.
Our regression equation then becomes:
$Y_{ij} = \gamma_{00} + \gamma_{01}(\bar{X}_j - \bar{X}) + \gamma_{10}(X_{ij} - \bar{X}) + u_{0j} + e_{ij}$
The intercept $\gamma_{00}$ now reflects the predicted value of Y for a student who:
- Has an SES equal to the grand mean
- Attends a school with a mean SES equal to the grand mean

18 Grand Mean Centering
- The fixed slope for school mean SES ($\gamma_{01}$) now represents the change in Y for every unit of SES a school mean is above the grand mean.
- The fixed slope for student SES ($\gamma_{10}$) now represents the change in Y for every unit of SES a student is above the grand mean.
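As a minimal sketch of grand mean centering, the Python below subtracts the grand mean from every student SES value and from every school mean. The data values and the names `ses_by_school` and `grand_mean_center` are made up for illustration. Only the estimates 18.69, 2.62, and -0.99 come from the lecture, and the grand mean of 50 used to re-express the intercept is an assumption based on the description of the SES scale.

```python
from statistics import mean

# Hypothetical SES scores for three schools (illustrative values, not the lecture data).
ses_by_school = {
    "A": [44.0, 46.0, 48.0],
    "B": [49.0, 50.0, 51.0],
    "C": [52.0, 55.0, 55.0],
}

def grand_mean_center(groups):
    """Return the grand mean, plus student scores and school means minus the grand mean."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    centered_students = {
        school: [x - grand_mean for x in scores] for school, scores in groups.items()
    }
    centered_school_means = {
        school: mean(scores) - grand_mean for school, scores in groups.items()
    }
    return grand_mean, centered_students, centered_school_means

grand_mean, students_gmc, school_means_gmc = grand_mean_center(ses_by_school)

# Re-express the uncentered intercept at an ASSUMED grand mean of 50:
# the prediction for a student with SES 50 in a school whose mean SES is 50.
g00, g01, g10 = 18.69, 2.62, -0.99
assumed_grand_mean = 50.0
intercept_gmc = g00 + g01 * assumed_grand_mean + g10 * assumed_grand_mean
print(round(intercept_gmc, 2))
```

Under that assumed grand mean, the re-expressed intercept lands close to 100, the stated mean of the achievement scale. The slopes themselves are not changed by subtracting a constant.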

19 Our Results with and without Grand Mean Centering
[Tables of variance and fixed effect estimates for the uncentered model and the grand mean centered model missing]
- The only difference is in the intercept.
- The intercept now equals the mean of Y.

20 Cluster Mean Centering
Another popular method for centering is cluster mean centering:
- Take each person's independent variable(s) and subtract the mean(s) of their cluster/sampling unit.
- Here we subtract the school mean SES from each student's SES.
One issue with cluster mean centering: what do we do with the level 2 effect?
- It would be zero if we cluster mean centered it.
- We can leave it alone (what would happen to the intercept?).
- We can grand mean center it.
What would you choose?

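21 Cluster Mean Centering with our Data
Our model (with student SES $X_{ij}$ cluster mean centered and school mean SES $\bar{X}_j$ grand mean centered):
$Y_{ij} = \gamma_{00} + \gamma_{01}(\bar{X}_j - \bar{X}) + \gamma_{10}(X_{ij} - \bar{X}_j) + u_{0j} + e_{ij}$
- $\gamma_{00}$ is now the predicted value for a student with SES equal to the school mean, attending a school with mean SES equal to the grand mean.
- $\gamma_{01}$ is now the increase in Y for each unit of SES the school mean is above the grand mean.
- $\gamma_{10}$ is now the increase in Y for each unit of SES the student is above the school mean.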
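The two centering steps in this model can be sketched in a few lines of Python. The scores and the names `ses_by_school` and `cluster_mean_center` are hypothetical and only illustrate the operation: each student's SES is split into a grand mean, a between-school part, and a within-school part.

```python
from statistics import mean

# Hypothetical SES scores for two schools (illustrative values only).
ses_by_school = {
    "A": [44.0, 46.0, 48.0],
    "B": [52.0, 55.0, 58.0],
}

def cluster_mean_center(groups):
    """Subtract each school's own mean SES from its students' SES."""
    centered = {}
    for school, scores in groups.items():
        school_mean = mean(scores)
        centered[school] = [x - school_mean for x in scores]
    return centered

# Level 1 predictor: student SES minus school mean SES.
within = cluster_mean_center(ses_by_school)

# Level 2 predictor: school mean SES minus the grand mean.
grand_mean = mean(x for scores in ses_by_school.values() for x in scores)
between = {
    school: mean(scores) - grand_mean for school, scores in ses_by_school.items()
}

# Every raw score is recovered as grand mean + between part + within part.
rebuilt_a = [grand_mean + between["A"] + w for w in within["A"]]
print(within, between, rebuilt_a)
```

Because the within part sums to zero in every school, it carries no information about which school a student attends.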

22 Our Results with and without Cluster Mean Centering
[Tables of variance and fixed effect estimates for the uncentered model and the cluster mean centered model missing]
- There is a difference in the school mean slope. More on why the slope changes later in the lecture.
- There is a difference in the intercept. The intercept now equals the mean of Y.

23 Why Does the Slope for School Mean SES Change?
The slope for school mean SES changed from 2.62 in the no centering/GMC models to 1.63 in the cluster mean centered model.
- Remember, slopes in regression depend on the other variables in the model. If independent variables are correlated, regression weights will change.
- We changed the level 1 predictor from $X_{ij}$ to $X_{ij} - \bar{X}_j$.
The issue is with the types of information contained in $X_{ij}$:
- It contains both level 1 and level 2 information (each student's SES is related to their school's mean SES).
- The corresponding weight (2.62) represented the additional effect of school mean SES when controlling for student SES.
- Cluster mean centered student SES has only level 1 information.
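The change from 2.62 to 1.63 can be checked with a short worked derivation. The labels $\gamma^{W}$ (within school slope) and $\gamma^{B}$ (between school slope) are notation introduced here for illustration, not symbols from the lecture. $X_{ij}$ is student SES, $\bar{X}_j$ the school mean, and $\bar{X}$ the grand mean. Intercept and random terms are omitted because the rewriting does not touch them.

$$
\begin{aligned}
& \gamma^{B}(\bar{X}_j - \bar{X}) + \gamma^{W}(X_{ij} - \bar{X}_j) \\
={} & \gamma^{B}(\bar{X}_j - \bar{X}) + \gamma^{W}\left[(X_{ij} - \bar{X}) - (\bar{X}_j - \bar{X})\right] \\
={} & (\gamma^{B} - \gamma^{W})(\bar{X}_j - \bar{X}) + \gamma^{W}(X_{ij} - \bar{X})
\end{aligned}
$$

The first line is the cluster mean centered model and the last line is the grand mean centered model, so the school mean slope in the grand mean centered (or uncentered) model equals $\gamma^{B} - \gamma^{W}$. With the reported estimates, $1.63 - (-0.99) = 2.62$, which matches the two school mean slopes given in the lecture.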

24 Correlation and Centering
Correlations of student SES ($X_{ij}$) with:
- School mean SES ($\bar{X}_j$): [missing]
- School mean SES after grand mean centering ($\bar{X}_j - \bar{X}$): [missing]
Correlations of student SES after cluster mean centering ($X_{ij} - \bar{X}_j$) with:
- School mean SES ($\bar{X}_j$): [missing]
- School mean SES after grand mean centering ($\bar{X}_j - \bar{X}$): [missing]
The effect changes because of the correlation between school mean SES and student SES.
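The pattern behind these correlations can be illustrated with a small Python sketch. The SES scores are invented and `pearson` is a helper written here, so the numbers are not the lecture's. The point is structural: raw student SES is correlated with school mean SES, while cluster mean centered SES is uncorrelated with it by construction.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical SES scores for three schools (illustrative values only).
ses_by_school = {
    "A": [44.0, 46.0, 48.0],
    "B": [49.0, 50.0, 54.0],
    "C": [52.0, 55.0, 58.0],
}

raw, centered, school_mean = [], [], []
for scores in ses_by_school.values():
    m = mean(scores)
    for x in scores:
        raw.append(x)            # student SES
        centered.append(x - m)   # cluster mean centered student SES
        school_mean.append(m)    # school mean SES, repeated for each student

r_raw = pearson(raw, school_mean)
r_centered = pearson(centered, school_mean)
print(round(r_raw, 3), round(r_centered, 3))
```

Subtracting the grand mean from the school means would not change either correlation, since correlations are unaffected by adding or subtracting a constant.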

25 Centering Summary
The scale of variables may lead to parameter values that are not plausible.
- Sometimes interpretation changes (grand mean centering).
- Sometimes inference changes (cluster mean centering). Detailed shortly.
Centering helps to:
- Make parameter estimates understandable
- Help estimation of random effects in some types of models
- Disentangle types of effects (for cluster mean centering)

26 TYPES OF VARIANCES IN MULTILEVEL MODELS

27 The Goals of MLM: Variance Partitioning
MLMs control for observations that may be dependent by incorporating different types of variability into an analysis:
- Variability within clusters
- Variability between clusters
This section will discuss a general multilevel modeling framework for hierarchical data, indicating how different types of effects partition variability in different ways.
- Some of this will be technical, but well worth the time.

28 Multiple Components/Levels: HLM
Recall our running example: attempting to predict student achievement from student SES and school SES.
Hierarchical analysis made the results more informative:
- Within school: student SES is negatively associated with student achievement.
- Between schools: school mean SES is positively associated with student achievement.
If you will recall from last time, we started by taking a basic regression analysis:
$Y_i = \beta_0 + \beta_1 X_i + e_i$
and specifying a basic regression model for each possible school $j$:
$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$

29 Names of Levels and Analysis Heuristics
MLM presents a heuristic for formulating a model that is based on the level of the data/analysis.
- This heuristic is effective for many models, although it breaks down for certain types of models (e.g., crossed models).
The by-school regression model is called the level 1 model.
- It uses level 1 independent variables to predict the outcome.
- The residual term is one random component (analogous to the residual in regression/ANOVA).

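30 Level Two Model
The coefficients of the level 1 model (i.e., $\beta_{0j}$ and $\beta_{1j}$) were then modeled using a similar approach, which is called the level 2 model.
For the intercepts (intercept subscripts start with 0):
$\beta_{0j} = \gamma_{00} + \gamma_{01} \bar{X}_j + u_{0j}$
- Predictors are level 2 covariates (here school mean SES, $\bar{X}_j$).
- $u_{0j}$ is the random error term (called a random intercept).
For the slopes (slope subscripts start with 1):
$\beta_{1j} = \gamma_{10} + \gamma_{11} \bar{X}_j + u_{1j}$
- The intercept $\gamma_{10}$ multiplies the level 1 covariate.
- Predictors become cross level interactions.
- $u_{1j}$ is the random error term (called a random slope).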

31 The Combined Model
Although the MLM heuristic does a good job parsing which effects predict which portions of the model, combining the level 2 and level 1 models results in what is called a general linear mixed model (more standard terminology from statistics):
$Y_{ij} = \gamma_{00} + \gamma_{01}\bar{X}_j + \gamma_{10}X_{ij} + \gamma_{11}\bar{X}_j X_{ij} + u_{0j} + u_{1j}X_{ij} + e_{ij}$
Fixed effects:
- Level 2 predictor(s) ($\gamma_{01}$) come from predictors of the intercept model.
- Level 1 predictor(s) ($\gamma_{10}$) come from intercepts of the slope model(s).
- Cross level interactions ($\gamma_{11}$) come from predictors of the slope model(s).
Random effects: $u_{0j}$ and $u_{1j}$

32 Fixed Effects
The fixed effects represent model parameters that:
- Are assumed to be fixed (no prior distribution assumed)
- Apply to everyone, regardless of sampling unit/cluster
- Are used to test hypotheses about types of effects (degrees of freedom depend on the level of the effect)
- Constitute the predicted value for a given observation
As such, they are sometimes called the model for the means.

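33 Random Effects
The random effects represent model parameters that:
- Are random (assumed to follow a statistical distribution): a normal distribution with zero mean and variance/covariance parameters that are estimated
- Are the same only if subjects are in the same sampling unit
The variance and covariance contribute to the covariance of observations within a cluster:
$\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_0^2 & \tau_{01} \\ \tau_{01} & \tau_1^2 \end{bmatrix} \right)$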

34 More on Random Effects
The inclusion of random effects impacts the way hypothesis tests about fixed effects are constructed.
- They partition variability into segments that are due to cluster.
Specifying a model with random effects makes explicit the assumption that observations within a cluster are correlated (shown later).

35 BUILDING MODELS: FROM START TO FINISH

36 Building Multilevel Models
The process of model building begins with a very basic model and then adds predictors at each level.
We will be fitting a series of models, each attempting to answer a different question.
At the end of the process, we will evaluate the final model and make inferences regarding the nature of our variables.

37 Question #1: How Much Variability is there in Achievement?
To answer our first question, we will fit what is called an empty model:
- No predictors of achievement
- No random effects by school
We started with this model last lecture:
- The intercept was the mean achievement score.
- The error variance was the variance in achievement scores.
We will use this model as a baseline and build from it by adding random effects and predictors.

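38 Model #1: Empty Model
Model:
$Y_{ij} = \beta_0 + e_{ij}$, where $e_{ij} \sim N(0, \sigma_e^2)$
Model results:
- Fit: deviance (for comparing model fit) = 2,491.5
- Means: $\beta_0$ = [missing]
- Variances: $\sigma_e^2$ = 72.29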

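39 Question #2: Is there variability in achievement unique to schools?
The second question can be answered by using an extension of the empty model: an MLM with a random intercept.
- If the random intercept variance is greater than zero, then the answer is yes.
Level 1 model:
$Y_{ij} = \beta_{0j} + e_{ij}$, where $e_{ij} \sim N(0, \sigma_e^2)$
Level 2 model:
$\beta_{0j} = \gamma_{00} + u_{0j}$, where $u_{0j} \sim N(0, \tau_0^2)$; $\gamma_{00}$ is the fixed intercept (means)
Combined model:
$Y_{ij} = \gamma_{00} + u_{0j} + e_{ij}$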

40 Model #2 Results (Compared with Model #1)
Quantity                                  Model #1            Model #2
Fit: deviance                             2,491.5             2,202.5
Means: intercept, estimate (SE)           [missing] (0.45)    [missing] (2.50)
Variances: error, estimate (SE)           72.29 (5.46)        29.04 (2.22)
Variances: random intercept, est. (SE)    not in model        43.25 (23.43)

41 First: Is Model 2 Preferred?
To answer our question, we need to determine whether the random intercept variance is greater than zero:
$H_0: \tau_0^2 = 0$; $H_1: \tau_0^2 > 0$
- A nonzero variance would indicate variability due to schools, and dependencies between observations within schools.
We can use a deviance test:
- (Null model deviance - full model deviance) is *approximately* chi-square distributed.
- Degrees of freedom equal the difference in the number of parameters between the models (here only one new parameter: the random intercept variance).
- Note: this test is only approximate and is very conservative.
Deviance test: (2,491.5 - 2,202.5) = 289; df = 1; p < .0001
- Indicates there is variability in achievement due to school
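The deviance test above can be reproduced with the Python standard library. This is a sketch, not the software output from the lecture: the helper name `chi2_sf_df1` is made up here, and it relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
from math import erfc, sqrt

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # If Z is standard normal, Z**2 is chi-square(1), so
    # P(Z**2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return erfc(sqrt(x / 2.0))

deviance_model1 = 2491.5  # empty model
deviance_model2 = 2202.5  # adds the random intercept variance

lr_stat = deviance_model1 - deviance_model2
p_value = chi2_sf_df1(lr_stat)
print(lr_stat, p_value)
```

Because the null value of a variance sits on the boundary of its parameter space, the plain chi-square reference distribution tends to give p values that are too large, which is one reason the test is described as conservative. With a statistic of 289 the conclusion does not depend on that refinement.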

42 Second: What are the interpretations of the parameters of Model 2?
Now that we know Model 2 is preferred, we will deviate from our model fitting to demonstrate the interpretations of the model parameters.
- Shown for teaching purposes: you wouldn't do this until the end of the model building process.
First, the fixed intercept ($\gamma_{00}$):
- The value of the fixed intercept stayed constant from Model 1 to Model 2. The model for the means is unchanged when the model for the variances changes; Model 1 and Model 2 differed only by the random intercept for school.
- The standard error of the intercept changed because of the different variance partitioning in Model 2.
- Overall, the intercept still represents the predicted value of achievement when all predictors are zero. Since there are no predictors in the model, everyone's predicted value is the mean of achievement.

43 Interpretations of Variance Parameters
The variance parameters represent the variance of the random intercept and the variance of the level 1 error term.
- These parameters indicate how much variability is present at each level of the analysis.
- They also indicate the degree to which observations nested within a sampling unit/cluster are correlated.
To demonstrate, I will show what the model expects the dependency between observations within a cluster to be.

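44 Covariance of Observations Within Cluster
Using the algebra of expectations, we seek to determine the dependency (covariance) between two observations in the same cluster: observations $Y_{ij}$ and $Y_{i'j}$ (with $i \neq i'$), under the random intercept model $Y_{ij} = \gamma_{00} + u_{0j} + e_{ij}$.
$\mathrm{Cov}(Y_{ij}, Y_{i'j}) = \mathrm{Cov}(\gamma_{00} + u_{0j} + e_{ij},\ \gamma_{00} + u_{0j} + e_{i'j})$
$= \mathrm{Cov}(u_{0j}, u_{0j}) + \mathrm{Cov}(u_{0j}, e_{i'j}) + \mathrm{Cov}(e_{ij}, u_{0j}) + \mathrm{Cov}(e_{ij}, e_{i'j})$
$= \tau_0^2 + 0 + 0 + 0 = \tau_0^2$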

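45 Covariance of Observations Between Clusters
Using the algebra of expectations, we seek to determine the dependency (covariance) between two observations in different clusters: observations $Y_{ij}$ and $Y_{i'j'}$ (with $j \neq j'$), under the random intercept model $Y_{ij} = \gamma_{00} + u_{0j} + e_{ij}$.
$\mathrm{Cov}(Y_{ij}, Y_{i'j'}) = \mathrm{Cov}(\gamma_{00} + u_{0j} + e_{ij},\ \gamma_{00} + u_{0j'} + e_{i'j'})$
$= \mathrm{Cov}(u_{0j}, u_{0j'}) + \mathrm{Cov}(u_{0j}, e_{i'j'}) + \mathrm{Cov}(e_{ij}, u_{0j'}) + \mathrm{Cov}(e_{ij}, e_{i'j'})$
$= 0$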

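46 Variance of Observations Within or Between Clusters
Using the algebra of expectations, we seek to determine the variance of an observation $Y_{ij}$ under the random intercept model $Y_{ij} = \gamma_{00} + u_{0j} + e_{ij}$:
$\mathrm{Var}(Y_{ij}) = \mathrm{Var}(u_{0j}) + \mathrm{Var}(e_{ij}) + 2\,\mathrm{Cov}(u_{0j}, e_{ij}) = \tau_0^2 + \sigma_e^2 + 0 = \tau_0^2 + \sigma_e^2$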

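47 Correlation of Observations within Clusters
Because we know $\mathrm{Cov}(Y_{ij}, Y_{i'j}) = \tau_0^2$ and $\mathrm{Var}(Y_{ij}) = \tau_0^2 + \sigma_e^2$, we can determine the correlation between observations within a cluster.
- Also known as the intraclass correlation (ICC):
$\rho = \mathrm{Corr}(Y_{ij}, Y_{i'j}) = \dfrac{\mathrm{Cov}(Y_{ij}, Y_{i'j})}{\sqrt{\mathrm{Var}(Y_{ij})}\sqrt{\mathrm{Var}(Y_{i'j})}} = \dfrac{\tau_0^2}{\tau_0^2 + \sigma_e^2}$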
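A small simulation can make the intraclass correlation concrete. The sketch below generates data from a random intercept model and recovers the ICC with one-way ANOVA moment estimators, which is a swapped-in, simpler technique than the likelihood-based estimation a mixed model program would use. The function name `simulate_icc`, the number of clusters, the cluster size, the intercept of 100, and the seed are all assumptions. The variances 43.25 and 29.04 are the Model #2 values that appear later in the lecture.

```python
import random

def simulate_icc(n_clusters, cluster_size, tau2, sigma2, intercept=100.0, seed=1):
    """Simulate Y_ij = intercept + u_j + e_ij and estimate the ICC by ANOVA moments."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, tau2 ** 0.5)  # random intercept shared within the cluster
        clusters.append(
            [intercept + u + rng.gauss(0.0, sigma2 ** 0.5) for _ in range(cluster_size)]
        )
    grand = sum(y for c in clusters for y in c) / (n_clusters * cluster_size)
    cluster_means = [sum(c) / cluster_size for c in clusters]
    ms_between = (
        cluster_size * sum((m - grand) ** 2 for m in cluster_means) / (n_clusters - 1)
    )
    ms_within = sum(
        (y - m) ** 2 for c, m in zip(clusters, cluster_means) for y in c
    ) / (n_clusters * (cluster_size - 1))
    tau2_hat = (ms_between - ms_within) / cluster_size  # between-cluster variance
    return tau2_hat / (tau2_hat + ms_within)

icc_theory = 43.25 / (43.25 + 29.04)
icc_hat = simulate_icc(n_clusters=2000, cluster_size=10, tau2=43.25, sigma2=29.04)
print(round(icc_theory, 3), round(icc_hat, 3))
```

With many clusters the estimate should sit close to the theoretical ratio. With only 7 schools, as in the lecture's study, the between-school variance would be estimated far less precisely.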

48 Back to Our Data
From our data, we estimated:
- Error variance ($\sigma_e^2$): 29.04 (SE 2.22)
- Random intercept variance ($\tau_0^2$): 43.25 (SE 23.43)
Meaning, our intraclass correlation was 43.25 / (43.25 + 29.04) = .598.
This means:
- Students' achievement scores within a school had a correlation of .598.
- 59.8% of the total variability in achievement scores came from between-school variability.
Our linear regression assumption of uncorrelated residuals is violated.
- We should be using the mixed model with a random intercept.

49 Back to Model Building
Now that we have determined that there is variability at the school level, it is our job to explain both sources of variability using the independent variables we have collected:
- Student SES
- School level SES
The model building process now attempts to add variables to the baseline model (empty + random intercept).
- The question is one of process: at which level should we add our variables?
We will add variables to each level and determine how much variance is accounted for at each level by the new variables.

50 Model Building (part 1, using Cluster Mean Centering)
Given our choices of levels, we can add variables at:
- Level 1 only (add cluster mean centered student SES): new random intercept variance $\tau_0^2$ = 43.32; reduced level 1 variance (level 2 variance essentially unchanged, a slight increase from 43.25)
- Level 2 only (add grand mean centered school mean SES): new $\tau_0^2$ = 13.28; reduced level 2 variance (no reduction in level 1)
- Level 1 and level 2 simultaneously (add cluster mean centered student SES and grand mean centered school mean SES): new $\tau_0^2$ = 13.35; reduced both level 1 and level 2 variance

51 Which Path to Choose?
The path to choose (which level) depends on several factors:
- Types of variables
- Types of centering: do level 1 variables include *only* level 1 information?
- Research questions of interest
There is no true consensus as to which path to use.
We used cluster mean centered variables at level 1:
- A level 1 variable without any level 2 information
Let's examine what would have happened had we used grand mean centered variables at level 1.

52 Model Building (part 2, using GMC)
Given our choices of levels, we can add variables at:
- Level 1 only (add grand mean centered student SES): new random intercept variance $\tau_0^2$ = 88.27; reduced level 1 variance (HUGE increase in level 2 variance, because the level 1 variable included level 2 information)
- Level 2 only (add grand mean centered school mean SES): new $\tau_0^2$ = 13.28; reduced level 2 variance (no reduction in level 1); same as with cluster mean centered variables
- Level 1 and level 2 simultaneously (add grand mean centered student SES and grand mean centered school mean SES): new $\tau_0^2$ = 13.35; reduced both level 1 and level 2 variance; same as with cluster mean centered variables

53 For Us: We'll Pick Door #3
For our analysis, we'll choose to put level 1 and level 2 variables in simultaneously.
- This is due to understanding our variables: from inspection of the data, it appears that SES has a differential effect at different levels of our analysis.
So we'll go with #3 and add both student and school SES simultaneously.

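54 Our New Model (Called Model #3)
Adding cluster mean centered student SES and grand mean centered school mean SES yields the following model:
Level 1:
$Y_{ij} = \beta_{0j} + \beta_{1j}(X_{ij} - \bar{X}_j) + e_{ij}$, where $e_{ij} \sim N(0, \sigma_e^2)$
Level 2:
$\beta_{0j} = \gamma_{00} + \gamma_{01}(\bar{X}_j - \bar{X}) + u_{0j}$, where $u_{0j} \sim N(0, \tau_0^2)$
$\beta_{1j} = \gamma_{10}$
Combined:
$Y_{ij} = \gamma_{00} + \gamma_{01}(\bar{X}_j - \bar{X}) + \gamma_{10}(X_{ij} - \bar{X}_j) + u_{0j} + e_{ij}$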

55 Model #3 Results: Model Fit
Model fit:
- Model #2 (old model): deviance = 2,202.5
- Model #3 (new): deviance = 2,147.9
Question: Is Model #3 preferred to Model #2?
Answer: deviance test (two new parameters):
- Test statistic: 2,202.5 - 2,147.9 = 54.6
- Degrees of freedom = 2 (the new slopes for student SES and school mean SES)
- p value: < .0001
Conclusion: Model #3 is preferred.
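The two-parameter deviance test can be checked by hand, because the chi-square distribution with 2 degrees of freedom has a closed-form upper tail. The helper name `chi2_sf_df2` is made up for this sketch, and the deviances are the two values reported on the slide, whose difference is 54.6.

```python
from math import exp

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    # With 2 degrees of freedom the chi-square distribution is exponential with mean 2.
    return exp(-x / 2.0)

deviance_model2 = 2202.5  # random intercept only
deviance_model3 = 2147.9  # adds student SES and school mean SES

lr_stat = deviance_model2 - deviance_model3
p_value = chi2_sf_df2(lr_stat)
print(round(lr_stat, 1), p_value)
```

Unlike the test of a variance, the two added parameters are fixed slopes, so the null values are not on a boundary. Comparing deviances across models with different fixed effects does assume both models were fit by full maximum likelihood.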

56 Model #3 Results: Fixed Effects (Means)
Model parameter estimates:
$\gamma_{00}$ = [missing]: the overall intercept
- The value of achievement for a student with SES equal to their school mean SES, at a school with mean SES equal to the grand mean SES
- Is the average value of achievement
$\gamma_{10}$ = -0.99: the slope for student SES (minus school mean SES)
- Represents the change in achievement for each unit a student's SES differs from their school mean
- Given school mean SES is held constant
$\gamma_{01}$ = 1.63: the slope for school mean SES (minus grand mean SES)
- Represents the change in achievement for each unit the school mean SES differs from the grand mean
- Given student SES is held constant

57 Model #3 Results: Variance Parameters
Our estimated variance parameters were:
- Error variance ($\sigma_e^2$): 25.35 (SE 1.93)
- Random intercept variance ($\tau_0^2$): 13.35 (SE 7.41)
Meaning, our intraclass correlation was 13.35 / (13.35 + 25.35) = .345.
This means:
- Students' achievement scores within a school had a correlation of .345.
- 34.5% of the total variability in achievement scores came from between-school variability.
Our linear regression assumption of uncorrelated residuals is violated.
- We should be using the mixed model with a random intercept.

58 More on Variances
In comparing Model #3 to Model #2, an important quantity is how much each variance component is reduced by the addition of the predictors.
- Called a pseudo $R^2$
Level 2 variance ($\tau_0^2$):
- Model #2 = 43.25; Model #3 = 13.35
- Reduction: 43.25 - 13.35 = 29.9
- Proportion of Model #2 variance explained: 29.9/43.25 = .69
- Explanation: school mean SES explains 69% of the variance in school achievement (random intercept variance).

59 Level 1 Variance Reduction
Level 1 variance ($\sigma_e^2$):
- Model #2 = 29.04; Model #3 = 25.35
- Reduction: 29.04 - 25.35 = 3.69
- Proportion of Model #2 variance explained: 3.69/29.04 = .13
- Explanation: student SES explains 13% of the variance in student achievement.
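The pseudo R-squared arithmetic at both levels can be written as one small function. The name `pseudo_r2` is hypothetical. The Model #2 variances and the two reductions (29.9 and 3.69) are the values quoted in the lecture, and the Model #3 variances are derived here from those reductions.

```python
def pseudo_r2(variance_before, variance_after):
    """Proportional reduction in a variance component after adding predictors."""
    return (variance_before - variance_after) / variance_before

# Model #2 variance components (random intercept, error).
tau2_m2, sigma2_m2 = 43.25, 29.04

# Model #3 values implied by the reported reductions of 29.9 and 3.69.
tau2_m3 = tau2_m2 - 29.9
sigma2_m3 = sigma2_m2 - 3.69

r2_level2 = pseudo_r2(tau2_m2, tau2_m3)      # share of school-level variance explained
r2_level1 = pseudo_r2(sigma2_m2, sigma2_m3)  # share of student-level variance explained
print(round(r2_level2, 2), round(r2_level1, 2))
```

Unlike an ordinary R-squared, this quantity is not guaranteed to be positive: a variance component can grow when a predictor is added, as happened to the level 2 variance when grand mean centered student SES was entered alone.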

60 Wrapping Up
We discussed the process of fitting multilevel models in the context of our familiar example:
- Why and how to center, and the effects on parameter interpretations and estimates
- The model building process
- How variance is partitioned at each level
- How variance gets explained at each level
- The final results of determining which model to choose, and how to interpret the parameters

61 Up Next
Our analysis and lecture today left a few things up in the air:
- What about the slopes for SES? Are there school level effects on the slopes (random slopes)?
- Is there a cross level interaction between school mean SES and student SES?
Next we will investigate these questions.

More information

The Basic Two-Level Regression Model

The Basic Two-Level Regression Model 2 The Basic Two-Level Regression Model The multilevel regression model has become known in the research literature under a variety of names, such as random coefficient model (de Leeuw & Kreft, 1986; Longford,

More information

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5

Copyright 2007 by Laura Schultz. All rights reserved. Page 1 of 5 Using Your TI-83/84 Calculator: Linear Correlation and Regression Elementary Statistics Dr. Laura Schultz This handout describes how to use your calculator for various linear correlation and regression

More information

1.6 The Order of Operations

1.6 The Order of Operations 1.6 The Order of Operations Contents: Operations Grouping Symbols The Order of Operations Exponents and Negative Numbers Negative Square Roots Square Root of a Negative Number Order of Operations and Negative

More information

Longitudinal Meta-analysis

Longitudinal Meta-analysis Quality & Quantity 38: 381 389, 2004. 2004 Kluwer Academic Publishers. Printed in the Netherlands. 381 Longitudinal Meta-analysis CORA J. M. MAAS, JOOP J. HOX and GERTY J. L. M. LENSVELT-MULDERS Department

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013

Indices of Model Fit STRUCTURAL EQUATION MODELING 2013 Indices of Model Fit STRUCTURAL EQUATION MODELING 2013 Indices of Model Fit A recommended minimal set of fit indices that should be reported and interpreted when reporting the results of SEM analyses:

More information

One-Way Analysis of Variance

One-Way Analysis of Variance One-Way Analysis of Variance Note: Much of the math here is tedious but straightforward. We ll skim over it in class but you should be sure to ask questions if you don t understand it. I. Overview A. We

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

Module 5: Multiple Regression Analysis

Module 5: Multiple Regression Analysis Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

More information

Example: Boats and Manatees

Example: Boats and Manatees Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

More information

Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

More information

Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving

Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving Section 7 Algebraic Manipulations and Solving Part 1 Expressions, Equations, and Inequalities: Simplifying and Solving Before launching into the mathematics, let s take a moment to talk about the words

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

More information

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Review Jeopardy. Blue vs. Orange. Review Jeopardy

Review Jeopardy. Blue vs. Orange. Review Jeopardy Review Jeopardy Blue vs. Orange Review Jeopardy Jeopardy Round Lectures 0-3 Jeopardy Round $200 How could I measure how far apart (i.e. how different) two observations, y 1 and y 2, are from each other?

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

Homework 11. Part 1. Name: Score: / null

Homework 11. Part 1. Name: Score: / null Name: Score: / Homework 11 Part 1 null 1 For which of the following correlations would the data points be clustered most closely around a straight line? A. r = 0.50 B. r = -0.80 C. r = 0.10 D. There is

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS Systems of Equations and Matrices Representation of a linear system The general system of m equations in n unknowns can be written a x + a 2 x 2 + + a n x n b a

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Power and sample size in multilevel modeling

Power and sample size in multilevel modeling Snijders, Tom A.B. Power and Sample Size in Multilevel Linear Models. In: B.S. Everitt and D.C. Howell (eds.), Encyclopedia of Statistics in Behavioral Science. Volume 3, 1570 1573. Chicester (etc.): Wiley,

More information

Module 4 - Multiple Logistic Regression

Module 4 - Multiple Logistic Regression Module 4 - Multiple Logistic Regression Objectives Understand the principles and theory underlying logistic regression Understand proportions, probabilities, odds, odds ratios, logits and exponents Be

More information

A Hierarchical Linear Modeling Approach to Higher Education Instructional Costs

A Hierarchical Linear Modeling Approach to Higher Education Instructional Costs A Hierarchical Linear Modeling Approach to Higher Education Instructional Costs Qin Zhang and Allison Walters University of Delaware NEAIR 37 th Annual Conference November 15, 2010 Cost Factors Middaugh,

More information

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Section 14 Simple Linear Regression: Introduction to Least Squares Regression Slide 1 Section 14 Simple Linear Regression: Introduction to Least Squares Regression There are several different measures of statistical association used for understanding the quantitative relationship

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

College Readiness LINKING STUDY

College Readiness LINKING STUDY College Readiness LINKING STUDY A Study of the Alignment of the RIT Scales of NWEA s MAP Assessments with the College Readiness Benchmarks of EXPLORE, PLAN, and ACT December 2011 (updated January 17, 2012)

More information

Simple linear regression

Simple linear regression Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

More information

Multiple regression - Matrices

Multiple regression - Matrices Multiple regression - Matrices This handout will present various matrices which are substantively interesting and/or provide useful means of summarizing the data for analytical purposes. As we will see,

More information

Multivariate Normal Distribution

Multivariate Normal Distribution Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Schools Value-added Information System Technical Manual

Schools Value-added Information System Technical Manual Schools Value-added Information System Technical Manual Quality Assurance & School-based Support Division Education Bureau 2015 Contents Unit 1 Overview... 1 Unit 2 The Concept of VA... 2 Unit 3 Control

More information

Experimental Designs (revisited)

Experimental Designs (revisited) Introduction to ANOVA Copyright 2000, 2011, J. Toby Mordkoff Probably, the best way to start thinking about ANOVA is in terms of factors with levels. (I say this because this is how they are described

More information

Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM. Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών

Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM. Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών Εισαγωγή στην πολυεπίπεδη μοντελοποίηση δεδομένων με το HLM Βασίλης Παυλόπουλος Τμήμα Ψυχολογίας, Πανεπιστήμιο Αθηνών Το υλικό αυτό προέρχεται από workshop που οργανώθηκε σε θερινό σχολείο της Ευρωπαϊκής

More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II 17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

More information

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure?

Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Case Study in Data Analysis Does a drug prevent cardiomegaly in heart failure? Harvey Motulsky hmotulsky@graphpad.com This is the first case in what I expect will be a series of case studies. While I mention

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade Statistics Quiz Correlation and Regression -- ANSWERS 1. Temperature and air pollution are known to be correlated. We collect data from two laboratories, in Boston and Montreal. Boston makes their measurements

More information

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation Chapter 9 Two-Sample Tests Paired t Test (Correlated Groups t Test) Effect Sizes and Power Paired t Test Calculation Summary Independent t Test Chapter 9 Homework Power and Two-Sample Tests: Paired Versus

More information

Pearson's Correlation Tests

Pearson's Correlation Tests Chapter 800 Pearson's Correlation Tests Introduction The correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. The correlation

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

An introduction to hierarchical linear modeling

An introduction to hierarchical linear modeling Tutorials in Quantitative Methods for Psychology 2012, Vol. 8(1), p. 52-69. An introduction to hierarchical linear modeling Heather Woltman, Andrea Feldstain, J. Christine MacKay, Meredith Rocchi University

More information

2013 MBA Jump Start Program. Statistics Module Part 3

2013 MBA Jump Start Program. Statistics Module Part 3 2013 MBA Jump Start Program Module 1: Statistics Thomas Gilbert Part 3 Statistics Module Part 3 Hypothesis Testing (Inference) Regressions 2 1 Making an Investment Decision A researcher in your firm just

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

More information

Binary Logistic Regression

Binary Logistic Regression Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

More information

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. jn2@ecs.soton.ac.uk COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared jn2@ecs.soton.ac.uk Relationships between variables So far we have looked at ways of characterizing the distribution

More information

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or

More information

5. Multiple regression

5. Multiple regression 5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

More information

August 2012 EXAMINATIONS Solution Part I

August 2012 EXAMINATIONS Solution Part I August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,

More information

INTRODUCTION TO MULTIPLE CORRELATION

INTRODUCTION TO MULTIPLE CORRELATION CHAPTER 13 INTRODUCTION TO MULTIPLE CORRELATION Chapter 12 introduced you to the concept of partialling and how partialling could assist you in better interpreting the relationship between two primary

More information

Introduction to Path Analysis

Introduction to Path Analysis This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Multivariate Analysis of Variance (MANOVA)

Multivariate Analysis of Variance (MANOVA) Multivariate Analysis of Variance (MANOVA) Aaron French, Marcelo Macedo, John Poulsen, Tyler Waterson and Angela Yu Keywords: MANCOVA, special cases, assumptions, further reading, computations Introduction

More information