Centering Predictors and Variance Decomposition
1 Centering Predictors and Variance Decomposition Applied Multilevel Models for Cross Sectional Data Lecture 6 ICPSR Summer Workshop University of Colorado Boulder
2 Covered in this Section We will expand on this example to cover a few more important concepts in multilevel models: the importance of centering variables; distinguishing within from between cluster effects; how total variation is partitioned by random effects; implications for how residuals are correlated; implications for hypothesis testing (Type I and Type II errors); implications for modeling dependencies
3 AS SEEN LAST TIME
4 Guiding Example Imagine you are interested in studying the effects of socioeconomic status (SES) on student achievement What do you think the relationship between student achievement and SES happens to be? You are interested in predicting achievement from SES Your guiding research question
5 Your Study Let's imagine you are able to get data from 7 elementary schools around Boulder You sample 50 students from each elementary school You record a measure of their SES (scale with a mean of 50) You record a measure of their achievement (scale with a mean of 100) Both scales magically have absolutely perfect reliability
6 For Now Let's Model the School Intercept Level 1 (the student level): Y_ij = β_0j + β_1j X_ij + e_ij Level 2 (the school level): β_0j = γ_00 + γ_01 W_j + U_0j; β_1j = γ_10 γ_00 is the overall intercept (predicted value when all X = 0) γ_01 is the slope for school mean SES (indicates average intercept increase when school mean SES increases by 1) γ_10 is the fixed slope for student SES, meaning each school has the same increase (increase in student score when student SES increases by 1) U_0j is the error associated with school intercepts (called a random intercept); it is assumed to be normally distributed with mean 0 and variance τ_0²
7 Putting the Model Together We can substitute our level 2 model terms into our level 1 model equation to get an overall regression line: Y_ij = γ_00 + γ_01 W_j + γ_10 X_ij + U_0j + e_ij
8 The Analysis Results Fixed effects: γ_01 = 2.62 (p = …); γ_10 = −0.99 (p < …)
9 By School Regression Lines [Figure: by school regression lines, Student Achievement (y axis) vs. Student SES (x axis)]
10 Analysis Interpretation τ_0² = the variance of school random intercepts (how much schools vary from each other) after accounting for school SES σ_e² = the variance of residuals for student scores (how much students vary after accounting for school mean SES and student SES); the error variance
11 Analysis Interpretation γ_00 = the overall intercept: the predicted score for a student who has zero SES (X_ij = 0) at a school with a mean SES of zero (W_j = 0) γ_01 = 2.62 (p = …): the slope for school mean SES The predicted score for a student increases by 2.62 for every one unit increase in the school mean, after controlling for student SES (a contextual or incremental effect) Average achievement for a school increases as average SES increases Statistically significant (level 2 degrees of freedom)
12 Analysis Interpretation γ_10 = −0.99 (p < …): the slope for student SES The predicted score for a student decreases by 0.99 for every one unit increase in the student's SES Within a school, SES is negatively related to achievement Statistically significant (level 1 degrees of freedom) So, what is the nature of the relationship between SES and achievement? Level 1 SES is negatively related to achievement; level 2 SES is positively related to achievement
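The opposite signs at the two levels can be reproduced with a small simulation. This is a hypothetical sketch, not the lecture's data: the 7-schools-by-50-students design and the scale means follow the study description, but the effect sizes (+2.6 between, −1.0 within) and noise levels are made up for illustration.

```python
import random

random.seed(1)

# Hypothetical data: school mean SES raises achievement (between effect),
# while student SES relative to the school mean lowers it (within effect).
schools = []
for j in range(7):
    w = 50 + random.gauss(0, 3)            # school mean SES
    students = []
    for i in range(50):
        x = w + random.gauss(0, 3)         # student SES around school mean
        y = 100 + 2.6 * (w - 50) - 1.0 * (x - w) + random.gauss(0, 2)
        students.append((x, y))
    schools.append((w, students))

def slope(pairs):
    """OLS slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx

# Between effect: regress school mean achievement on school mean SES
between = slope([(w, sum(y for _, y in s) / len(s)) for w, s in schools])

# Within effect: pool students after cluster mean centering their SES
within = slope([(x - w, y) for w, s in schools for x, y in s])

print(round(between, 2), round(within, 2))  # opposite signs, as on the slide
```

The point of the sketch is only that a positive between-school slope and a negative within-school slope can coexist in the same data set, which is exactly the pattern the analysis found.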
13 CENTERING
14 A Closer Look at Our Parameters Recall, the data analysis had the following results (variances and fixed effects with their estimates and p values) Take the intercept (γ_00 = 18.69) This value is the predicted achievement for a student with: a zero value for his/her SES (X_ij = 0) and a zero value for the school mean SES for the student (W_j = 0)
15 However About SES and Achievement An intercept of 18.69 sounds all well and good until you look at whether or not it actually occurs in our data Also, saying SES is zero is unrealistic The range of achievement scores is 81.1 to …; 18.69 doesn't exist The range of SES is 42.4 to 58.7 for students and 45.7 to 54.5 for school means; 0 doesn't exist in either
16 Centering Because our intercept is implausible, we may wish to center our data so as to bring the intercept more into line with the data we collected To center the data, subtract a value from each of the predictor/independent variables Centering will alter the meaning of certain parameters The intercept Some slopes (depending on method of centering) Two methods of centering are popular: Grand mean centering/centering by a constant Cluster mean centering
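The two centering schemes just named can be sketched in a few lines. The schools and SES values below are hypothetical, chosen only to make the arithmetic easy to follow; they are not the study data.

```python
# Two small "schools" with three hypothetical SES values each
data = {"A": [48.0, 50.0, 52.0], "B": [53.0, 55.0, 57.0]}

all_x = [x for xs in data.values() for x in xs]
grand_mean = sum(all_x) / len(all_x)                  # 52.5

# Grand mean centering: subtract one constant from every student
gmc = {s: [x - grand_mean for x in xs] for s, xs in data.items()}

# Cluster mean centering: subtract each student's own school mean
school_means = {s: sum(xs) / len(xs) for s, xs in data.items()}
cmc = {s: [x - school_means[s] for x in xs] for s, xs in data.items()}

print(gmc["A"])   # [-4.5, -2.5, -0.5]
print(cmc["A"])   # [-2.0, 0.0, 2.0]
```

Note the difference: grand mean centering shifts everyone by the same constant, while cluster mean centering removes each school's mean, so the centered values carry only within-school information.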
17 Grand Mean Centering Perhaps the easiest way to center the data would be to subtract the grand mean from each observation The grand mean is the mean of each X variable across all observations, regardless of sampling unit Our regression equation then becomes: Y_ij = γ_00 + γ_01 (W_j − W̄) + γ_10 (X_ij − X̄) + U_0j + e_ij The intercept now reflects the predicted value of Y for a student who: has an SES equal to the grand mean and attends a school with a mean SES equal to the grand mean
18 Grand Mean Centering The fixed slope γ_01 now represents the change in Y for every unit of SES a school mean is above the grand mean The fixed slope γ_10 now represents the change in Y for every unit of SES a student is above the grand mean
19 Our Results with and without Grand Mean Centering Comparing the uncentered and grand mean centered models (variances and fixed effects side by side): the only difference is in the intercept The intercept now equals the mean of Y
20 Cluster Mean Centering Another popular method for centering is cluster mean centering: taking each person's independent variable(s) and subtracting the mean(s) from their cluster/sampling unit Here we subtract the school mean SES from each student's SES One issue with cluster mean centering: what do we do with the level 2 effect? It would be zero if we cluster mean centered it We can leave it alone (what would happen to the intercept?) or we can grand mean center it What would you choose?
21 Cluster Mean Centering with our Data Our model (with X cluster mean centered and W grand mean centered): Y_ij = γ_00 + γ_01 (W_j − W̄) + γ_10 (X_ij − X̄_j) + U_0j + e_ij γ_00 is now the predicted value for a student with: SES equal to the school mean, attending a school with mean SES equal to the grand mean γ_01 is now the increase in Y for each unit of SES the school mean is above the grand mean γ_10 is now the increase in Y for each unit of SES the student is above the school mean
22 Our Results with and without Cluster Mean Centering Comparing the models with and without cluster mean centering (variances and fixed effects side by side): there is a difference in the school mean slope (more on why later in lecture) and a difference in the intercept The intercept now equals the mean of Y
23 Why Does the Slope for School Mean SES Change? The slope for school mean SES changed from 2.62 in the no centering/GMC models to 1.63 in the cluster mean centered model Remember, slopes in regression are dependent on the other variables in the model If independent variables are correlated, regression weights will change We changed X_ij to (X_ij − X̄_j) The issue is with the types of information contained in X_ij: it contains both level 1 and level 2 information (each student's SES is related to their school's mean SES) The corresponding weight (2.62) represented the additional effect of school mean SES when controlling for student SES Cluster mean centered student SES has only level 1 information
24 Correlation and Centering Student SES (X_ij) is correlated with school mean SES (W_j) and with school mean SES after grand mean centering (W_j − W̄) Student SES after cluster mean centering (X_ij − X̄_j) is uncorrelated (r = 0) with school mean SES and with grand mean centered school mean SES The effect changes because of the correlation between school mean SES and student SES
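That correlation structure is easy to verify numerically. The sketch below uses simulated SES values (hypothetical schools, not the lecture data): raw student SES correlates with the school means, while cluster mean centered SES is uncorrelated with them by construction.

```python
import random

random.seed(7)

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical SES values: 7 schools x 50 students
raw, means = [], []
for _ in range(7):
    w = 50 + random.gauss(0, 2)
    xs = [w + random.gauss(0, 3) for _ in range(50)]
    m = sum(xs) / len(xs)               # observed school mean
    raw += xs
    means += [m] * len(xs)

cmc = [x - m for x, m in zip(raw, means)]  # cluster mean centered SES

print(round(corr(raw, means), 3))          # clearly nonzero
print(round(abs(corr(cmc, means)), 6))     # 0.0 up to floating point error
```

The zero correlation is exact, not a simulation accident: the cluster mean centered values sum to zero within every school, so their pooled covariance with any school-level variable vanishes.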
25 Centering Summary The scale of variables may lead to parameter values that are not plausible Sometimes interpretation changes (grand mean centering) Sometimes inference changes (cluster mean centering) Detailed shortly Centering helps to: Make parameter estimates understandable Help estimation of random effects in some types of models Disentangle types of effects (for cluster mean centering)
26 TYPES OF VARIANCES IN MULTILEVEL MODELS
27 The Goals of MLM: Variance Partitioning The way MLMs control for observations that may be dependent is to incorporate different types of variability into an analysis Variability within clusters Variability between clusters This section will discuss a general multilevel modeling framework for hierarchical data, indicating how different types of effects partition variability in different ways Some of this will be technical but well worth the time
28 Multiple Components/Levels: HLM Recall our running example: attempting to predict student achievement from student SES and school SES Hierarchical analysis made results more informative Within school: student SES is negatively associated with student achievement Between schools: school mean SES is positively associated with student achievement If you will recall from last time, we started by taking a basic regression analysis and specifying a basic regression model for each possible school: Y_ij = β_0j + β_1j X_ij + e_ij
29 Names of Levels and Analysis Heuristics MLM presents a heuristic for formulating a model that is based on the level of the data/analysis This heuristic is effective for many models, although it breaks down for certain types of models (e.g., crossed models) The by school regression model is called the level 1 model It uses level 1 independent variables to predict the outcome The residual term e_ij is one random component (analogous to the residual in regression/ANOVA)
30 Level Two Model The coefficients of the level 1 model (i.e., β_0j and β_1j) were then modeled using a similar modeling approach, which is called the level 2 model For the intercepts: β_0j = γ_00 + γ_01 W_j + U_0j (intercept subscripts start with 0; predictors are level 2 covariates; U_0j is the random error term, called a random intercept) For the slopes: β_1j = γ_10 + γ_11 W_j + U_1j (slope subscripts start with 1; the intercept γ_10 multiplies the level 1 covariate; predictors become cross level interactions; U_1j is the random error term, called a random slope)
31 The Combined Model Although the MLM heuristic does a good job parsing which effects predict which portions of the model, combining the level 2 and level 1 models results in what is called a general linear mixed model (more standard terminology from statistics): Y_ij = γ_00 + γ_01 W_j + γ_10 X_ij + γ_11 W_j X_ij + U_0j + U_1j X_ij + e_ij Fixed effects: level 2 predictor(s) come from the predictors of the intercept model; level 1 predictor(s) come from the intercepts of the slope model(s); cross level interactions come from the predictors of the slope model(s) Random effects: U_0j and U_1j X_ij
32 Fixed Effects The fixed effects represent model parameters that: are assumed to be fixed (no prior distribution assumed); apply to everyone, regardless of sampling unit/cluster; are used to test hypotheses about types of effects (degrees of freedom depend on the level of the effect); and constitute the predicted value for a given observation As such, they are sometimes called the model for the means
33 Random Effects The random effects represent model parameters that: are random (assumed to follow a statistical distribution) — a normal distribution with zero mean and variance/covariance parameters that are estimated; are the same only if subjects are in the same sampling unit The variances and covariance contribute to the covariance of observations within a cluster: [U_0j, U_1j]′ ~ N([0, 0]′, [[τ_0², τ_01], [τ_01, τ_1²]])
34 More on Random Effects The inclusion of random effects impacts the way hypothesis tests about fixed effects are constructed They partition variability into segments that are due to cluster Specifying models with random effects makes explicit the assumption that, within a cluster, observations are correlated (shown later)
35 BUILDING MODELS: FROM START TO FINISH
36 Building Multilevel Models The process of model building begins with a very basic model and then adds predictors at each level We will be fitting a series of models, each attempting to answer a different question At the end of the process, we will evaluate the final model and make inferences regarding the nature of our variables
37 Question #1: How Much Variability is there in Achievement? To answer our first question, we will fit what is called an empty model: No predictors of achievement No random effects by school We started with this model last lecture Intercept was the mean achievement score Error variance was the variance in achievement score We will use this model as a baseline And build from it by adding random effects and predictors
38 Model #1: Empty Model Model: Y_ij = γ_00 + e_ij, where e_ij ~ N(0, σ_e²) Model results: Fit: deviance (for comparing model fit): 2,491.5 Means: γ_00 = … Variances: σ_e² = 72.29
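As a quick check on what the empty model estimates, the sketch below computes them directly on toy scores (not the lecture data): with no predictors and no random effects, maximum likelihood returns just the sample mean and the divide-by-n variance.

```python
# ML estimates of the empty model Y_i = gamma_00 + e_i
# on five hypothetical achievement scores
scores = [96.0, 101.0, 99.0, 104.0, 100.0]

n = len(scores)
gamma_00 = sum(scores) / n                               # fixed intercept
sigma2_e = sum((y - gamma_00) ** 2 for y in scores) / n  # error variance (ML)

print(gamma_00)  # 100.0
print(sigma2_e)  # 6.8
```

Note the divide-by-n (rather than n − 1) variance: full maximum likelihood gives the biased ML estimator, which is one reason REML exists.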
39 Question #2: Is there variability in achievement unique to schools? The second question can be answered by using an extension of the empty model: a MLM with a random intercept If the random intercept variance is greater than zero, then the answer is yes Level 1 model: Y_ij = β_0j + e_ij, where e_ij ~ N(0, σ_e²) Level 2 model: β_0j = γ_00 + U_0j, where U_0j ~ N(0, τ_0²); γ_00 is the fixed intercept (means) Combined model: Y_ij = γ_00 + U_0j + e_ij
40 Model #2 Results (Compared with Model #1)
Model #1 fit: deviance 2,491.5 Means: intercept γ_00 (SE 0.45) Variances: σ_e² = 72.29 (SE 5.46)
Model #2 fit: deviance 2,202.5 Means: intercept γ_00 (SE 2.50) Variances: σ_e² = 29.04 (SE 2.22); τ_0² = 43.25 (SE 23.43)
41 First: Is Model 2 Preferred? To answer our question we need to determine if the random intercept variance is greater than zero H_0: τ_0² = 0; H_1: τ_0² > 0 Rejecting H_0 would indicate variability due to schools, and dependencies between observations within schools We can use a deviance test: (null model deviance − full model deviance) is *approximately* chi square distributed Degrees of freedom equal the difference in number of parameters between models (here only one new parameter: the random intercept variance) Note: this test is only approximate — and conservative Deviance test: (2,491.5 − 2,202.5) = 289; df = 1; p < .0001 This indicates there is variability in achievement due to school
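The deviance test on this slide can be sketched as follows, using the two reported deviances. The .05 critical value for χ²(1), 3.84, is standard; computing an exact p-value would need a chi-square CDF (e.g., from scipy), which is omitted here to keep the sketch stdlib-only.

```python
# Deviance (likelihood ratio) test for the random intercept variance
dev_null = 2491.5   # Model #1: empty model
dev_full = 2202.5   # Model #2: adds the random intercept (1 new parameter)

lr = dev_null - dev_full   # approximately chi-square with df = 1
critical_95 = 3.84         # chi-square .95 quantile, df = 1

print(lr)                  # 289.0
print(lr > critical_95)    # True -> reject H0: tau_0^2 = 0
```

A statistic of 289 on 1 degree of freedom is far beyond the critical value, matching the slide's conclusion that there is school-level variability.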
42 Second: What are the interpretations of the parameters of Model 2? Now that we know Model 2 is preferred, we will deviate from our model fitting to demonstrate the interpretations of the model parameters Shown for teaching purposes: you wouldn't do this until the end of the model building process First, the fixed intercept (γ_00): the value of the fixed intercept stayed constant from Model 1 to Model 2 — the model for the means is unchanged when the model for the variances changes Model 1 and Model 2 only differed by the random intercept for school The standard error of the intercept changed, because of the different variance partitioning in Model 2 Overall: the intercept still represents the predicted value of achievement when all predictors are zero Since there are no predictors in the model, everyone's predicted value is the mean of achievement
43 Interpretations of Variance Parameters The variance parameters represent the variance of the random intercept and the variance of the level 1 error term These parameters indicate how much variability is present at each level of the analysis They also indicate the degree to which observations nested within a sampling unit/cluster are correlated To demonstrate, I will show what the model expects the dependency between observations within a cluster to be
44 Covariance of Observations Within a Cluster Using the algebra of expectations, we seek to determine the dependency (covariance) between two observations i and i′ in the same cluster j: Cov(Y_ij, Y_i′j) = Cov(γ_00 + U_0j + e_ij, γ_00 + U_0j + e_i′j) = Var(U_0j) = τ_0² (the residuals e_ij and e_i′j are independent of each other and of U_0j)
45 Covariance of Observations Between Clusters Using the algebra of expectations, we seek to determine the dependency (covariance) between two observations in different clusters j and j′: Cov(Y_ij, Y_i′j′) = Cov(γ_00 + U_0j + e_ij, γ_00 + U_0j′ + e_i′j′) = 0 (random intercepts and residuals from different clusters are independent)
46 Variance of Observations Within or Between Clusters Using the algebra of expectations, we seek to determine the variance of an observation: Var(Y_ij) = Var(U_0j + e_ij) = Var(U_0j) + 2 Cov(U_0j, e_ij) + Var(e_ij) = τ_0² + σ_e² (the covariance term is zero)
47 Correlation of Observations within Clusters Because we know Cov(Y_ij, Y_i′j) = τ_0² and Var(Y_ij) = τ_0² + σ_e², we can determine the correlation between observations within a cluster, also known as the intraclass correlation: ρ = τ_0² / (τ_0² + σ_e²)
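The intraclass correlation formula is a one-liner. The example values below are the τ_0² and σ_e² estimates reported for Model #2 elsewhere in the lecture (43.25 and 29.04, which sum to the empty model's total variance of 72.29).

```python
def icc(tau0_sq, sigma2_e):
    """Intraclass correlation: share of total variance between clusters."""
    return tau0_sq / (tau0_sq + sigma2_e)

# Model #2 variance estimates from the lecture
print(round(icc(43.25, 29.04), 2))  # 0.6
```

When τ_0² = 0 the ICC is 0 (no clustering, ordinary regression is fine); as τ_0² grows relative to σ_e², the ICC approaches 1 and observations within a cluster become nearly interchangeable.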
48 Back to Our Data From our data, we estimated σ_e² = 29.04 (SE 2.22) and τ_0² = 43.25 (SE 23.43) Meaning, our intraclass correlation was ρ = 43.25 / (43.25 + 29.04) = .60 This means: students' achievement scores within a school had a correlation of .60 60% of the total variability in achievement scores came from between school variability Our linear regression assumption of uncorrelated residuals is violated We should be using the mixed model with a random intercept
49 Back to Model Building Now that we've determined that there is variability at the school level, it is our job to explain both sources of variability using the independent variables we have collected: student SES and school level SES The model building process now attempts to add variables to the baseline model (empty + random intercept) The question is at which level we should add our variables We will add variables to each level and determine how much variance is accounted for at each level by the new variables
50 Model Building (part 1 using Cluster Mean Centering) Given our choices of levels, we can add variables at: Level 1 only (add cluster mean centered student SES): new τ_0² = 43.32; reduced level 1 variance (slight reduction in level 2) Level 2 only (add grand mean centered school mean SES): new τ_0² = 13.28; reduced level 2 variance (no reduction in level 1) Level 1 and level 2 simultaneously (add cluster mean centered student SES and grand mean centered school mean SES): new τ_0² = 13.35; reduced both level 1 and level 2 variance
51 Which Path to Choose? The path to choose (which level) depends on several factors: the types of variables, the types of centering (do level 1 variables include *only* level 1 information?), and the research questions of interest There is no true consensus as to which path to use We used cluster mean centered variables at level 1 — a level 1 variable without any level 2 information Let's examine what would have happened had we used grand mean centered variables at level 1
52 Model Building (part 2 using GMC) Given our choices of levels, we can add variables at: Level 1 only (add grand mean centered student SES): new τ_0² = 88.27; reduced level 1 variance (HUGE increase in level 2 variance), because the level 1 variable included level 2 information Level 2 only (add grand mean centered school mean SES): new τ_0² = 13.28; reduced level 2 variance (no reduction in level 1) — same as with cluster mean centered variables Level 1 and level 2 simultaneously (add cluster mean centered student SES and grand mean centered school mean SES): new τ_0² = 13.35; reduced both level 1 and level 2 variance — same as with cluster mean centered variables
53 For Us We'll Pick Door #3 For our analysis, we'll choose to put the level 1 and level 2 variables in simultaneously This is due to our understanding of our variables: from inspection of the data, it appears that SES has a differential effect at different levels of our analysis So we'll go with #3 and add both student and school SES simultaneously
54 Our New Model (Called Model #3) Adding cluster mean centered student SES and grand mean centered school mean SES yields the following model: Level 1: Y_ij = β_0j + β_1j (X_ij − X̄_j) + e_ij, where e_ij ~ N(0, σ_e²) Level 2: β_0j = γ_00 + γ_01 (W_j − W̄) + U_0j, where U_0j ~ N(0, τ_0²); β_1j = γ_10 Combined: Y_ij = γ_00 + γ_01 (W_j − W̄) + γ_10 (X_ij − X̄_j) + U_0j + e_ij
55 Model #3 Results Model Fit Model fit: Model #2 (old model) deviance 2,202.5; Model #3 (new) deviance 2,147.9 Question: Is Model 3 preferred to Model 2? Answer: deviance test (two new parameters): test statistic 2,202.5 − 2,147.9 = 54.6 Degrees of freedom = 2 (new parameters γ_01 and γ_10) p < .0001 Conclusion: Model #3 is preferred
56 Model #3 Results Fixed Effects (Means) Model parameter estimates: γ_00, the overall intercept — the value of achievement for a student with SES equal to their school mean SES, at a school with mean SES equal to the grand mean SES; this is the average value of achievement γ_10, the slope for student SES (minus school mean SES) — represents the change in achievement for each unit a student's SES differs from their school mean, given school mean SES is held constant γ_01, the slope for school mean SES (minus grand mean SES) — represents the change in achievement for each unit the school mean SES differs from the grand mean, given student SES is held constant
57 Model #3 Results Variance Parameters Our estimated variance parameters were: σ_e² = 25.35 (SE 1.93) and τ_0² = 13.35 (SE 7.41) Meaning, our intraclass correlation was ρ = 13.35 / (13.35 + 25.35) = .34 This means: students' achievement scores within a school had a correlation of .34 34% of the remaining variability in achievement scores came from between school variability Our linear regression assumption of uncorrelated residuals is still violated We should be using the mixed model with a random intercept
58 More on Variances In comparing Model #3 to Model #2, an important consideration is how much each variance component is reduced by the addition of the predictors Called a pseudo R² Level 2 variance (τ_0²): Model #2 = 43.25; Model #3 = 13.35 Reduction: 43.25 − 13.35 = 29.9 Proportion of Model #2 variance explained: 29.9/43.25 = .69 Explanation: school mean SES explains 69% of the variance in school achievement (random intercept variance)
59 Level 1 Variance Reduction Level 1 variance (σ_e²): Model #2 = 29.04; Model #3 = 25.35 Reduction: 29.04 − 25.35 = 3.69 Proportion of Model #2 variance explained: 3.69/29.04 = .13 Explanation: student SES explains 13% of the variance in student achievement
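Both pseudo R² computations can be sketched with the variance components reported on these slides.

```python
def pseudo_r2(var_before, var_after):
    """Proportional reduction in a variance component (pseudo R^2)."""
    return (var_before - var_after) / var_before

# Variance components reported for Model #2 vs. Model #3
level2 = pseudo_r2(43.25, 13.35)   # random intercept variance
level1 = pseudo_r2(29.04, 25.35)   # level 1 error variance

print(round(level2, 2))  # 0.69 -> school mean SES explains 69%
print(round(level1, 2))  # 0.13 -> student SES explains 13%
```

Unlike an ordinary R², these proportions are computed separately per variance component, and adding predictors can occasionally make a component *increase* (as the grand mean centered level 1 example showed), so a pseudo R² can even come out negative.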
60 Wrapping Up We discussed the process of fitting multilevel models in the context of our familiar example Why and how to center Effects on parameter interpretations and estimates The model building process How variance is partitioned at each level How variance gets explained at each level The final results of determining which model to choose And how to interpret the parameters
61 Up Next Our analysis and lecture today left a few things up in the air: What about the slopes for SES? Are there school level effects on the slopes (random slopes) Is there a cross level interaction between school mean SES and student SES? Next we will investigate these questions
" Y. Notation and Equations for Regression Lecture 11/4. Notation:
Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through
More informationIntroduction to Longitudinal Data Analysis
Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction
More informationSimple Regression Theory II 2010 Samuel L. Baker
SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the
More informationAnalyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest
Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t
More informationSpecifications for this HLM2 run
One way ANOVA model 1. How much do U.S. high schools vary in their mean mathematics achievement? 2. What is the reliability of each school s sample mean as an estimate of its true population mean? 3. Do
More informationIllustration (and the use of HLM)
Illustration (and the use of HLM) Chapter 4 1 Measurement Incorporated HLM Workshop The Illustration Data Now we cover the example. In doing so we does the use of the software HLM. In addition, we will
More informationRegression III: Advanced Methods
Lecture 16: Generalized Additive Models Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the Lecture Introduce Additive Models
More informationIntroduction to Data Analysis in Hierarchical Linear Models
Introduction to Data Analysis in Hierarchical Linear Models April 20, 2007 Noah Shamosh & Frank Farach Social Sciences StatLab Yale University Scope & Prerequisites Strong applied emphasis Focus on HLM
More informationUnit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression
Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a
More informationAssociation Between Variables
Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi
More informationChapter 7: Simple linear regression Learning Objectives
Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -
More informationUse of deviance statistics for comparing models
A likelihood-ratio test can be used under full ML. The use of such a test is a quite general principle for statistical testing. In hierarchical linear models, the deviance test is mostly used for multiparameter
More informationHLM software has been one of the leading statistical packages for hierarchical
Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush
More informationIntroducing the Multilevel Model for Change
Department of Psychology and Human Development Vanderbilt University GCM, 2010 1 Multilevel Modeling - A Brief Introduction 2 3 4 5 Introduction In this lecture, we introduce the multilevel model for change.
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
More information11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationQualitative vs Quantitative research & Multilevel methods
Qualitative vs Quantitative research & Multilevel methods How to include context in your research April 2005 Marjolein Deunk Content What is qualitative analysis and how does it differ from quantitative
More informationSection 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
More informationA Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution
A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationThis chapter will demonstrate how to perform multiple linear regression with IBM SPSS
CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the
More informationSPSS Guide: Regression Analysis
SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar
More information13. Poisson Regression Analysis
136 Poisson Regression Analysis 13. Poisson Regression Analysis We have so far considered situations where the outcome variable is numeric and Normally distributed, or binary. In clinical work one often
More informationOutline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares
Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationCHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression
Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the
More informationElementary Statistics Sample Exam #3