# SAS Syntax and Output for Data Manipulation:

Size: px
Start display at page:

Transcription

1 Psyc 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (in preparation) chapter 5. We will be examining the extent to which change in a test score outcome across four annual occasions can be described with fixed and random linear effects of time (indexed by years in study, in which 0 is baseline) in a sample of 25 persons. SAS Syntax and Output for Data Manipulation: * Location for files to be saved - CHANGE THIS TO YOUR DIRECTORY; %LET filesave=f:\example Data\Chapter 5 Data; LIBNAME filesave "&example."; * Import data into work library, center time; DATA example5; SET filesave.example5; time = wave - 1; LABEL time= "time: Time in Study (0=1)"; The ANSWER KEY for both the model for the means (via saturated means) and the model for the variances (via unstructured R matrix of all possible variances and covariances) is possible to estimate in balanced data: TITLE1 "Saturated Means, Unstructured Variance Model -- the ANSWER KEY"; PROC MIXED DATA=example5 COVTEST NOITPRINT NOCLPRINT METHOD=REML; MODEL outcome = wave / SOLUTION DDFM=Satterthwaite; REPEATED wave / R RCORR TYPE=UN SUBJECT=IDnum; LSMEANS wave; Dimensions Covariance Parameters 10 Columns in X 5 Columns in 0 Subjects 25 Max Obs Per Subject Estimated R Correlation Matrix for IDnum Because this model uses REPEATED only (no RANDOM statement), the R matrix holds the total variances and covariances over waves directly. Likewise, RCORR holds the total correlations over waves directly. Cov Parm Subject Estimate Error Value Pr UN(1,1) IDnum UN(2,1) IDnum UN(2,2) IDnum UN(3,1) IDnum UN(3,2) IDnum UN(3,3) IDnum UN(4,1) IDnum UN(4,2) IDnum UN(4,3) IDnum UN(4,4) IDnum

2 Psyc 944 Example 5 page 2-2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) <.0001 Because we are in REML, only the variance model parameters count towards AIC and BIC. Thus, in REML we cannot use 2ΔLL to compare models that differ in their fixed effects (i.e., different models for the means). This is the test of whether we need anything beyond a constant residual variance σ (df=9) and we do. wave: Occasion Effect (1-4) Estimate Error DF t Value Pr > t Intercept <.0001 wave <.0001 wave <.0001 wave <.0001 wave Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F wave <.0001 This is the ANOVA test of omnibus mean differences across wave (note df=3 for the 4 means across waves), assuming an unstructured R matrix (multivariate ANOVA). Least Squares Means wave: Occasion Effect (1-4) Estimate Error DF t Value Pr > t wave <.0001 wave <.0001 wave <.0001 wave <.0001 Because wave is on the CLASS statement, the LSMEANS provides means per wave (as found from fixed intercept + difference for each wave in the solution for fixed effects). Test Score Outcome Years in Study These are the estimates from the Saturated Means, Unstructured Variance model, and here are the individual growth curves that these estimates summarize Means by Wave Variances by Wave

3 Psyc 944 Example 5 page 3 If an unstructured R matrix was not possible to estimate, I d still examine the answer key for the model for the means (via a saturated means model), but estimate a random intercept only (which should always be possible): TITLE1 "Saturated Means, Random Intercept Variance Model -- MEANS ANSWER KEY"; PROC MIXED DATA=example5 COVTEST NOITPRINT NOCLPRINT METHOD=REML; MODEL outcome = wave / SOLUTION DDFM=Satterthwaite; RANDOM INTERCEPT / G V VCORR TYPE=UN SUBJECT=IDnum; REPEATED wave / R TYPE=VC SUBJECT=IDnum; LSMEANS wave; Estimated V Matrix for IDnum Estimated V Correlation Matrix for IDnum Cov Parm Subject Estimate Error Value Pr > UN(1,1) IDnum wave IDnum < Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) <.0001 wave: Occasion Effect (1-4) Estimate Error DF t Value Pr > t Intercept <.0001 wave <.0001 wave <.0001 wave <.0001 wave Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F wave <.0001 This is the ANOVA test of omnibus mean differences across wave (note df=3 for the 4 means across waves), assuming a random intercept only (CS V matrix; univariate ANOVA). Least Squares Means wave: Occasion Effect (1-4) Estimate Error DF t Value Pr > t wave <.0001 wave <.0001 wave <.0001 wave <.0001

4 Psyc 944 Example 5 page 4 5.1: Empty Means, Random Intercept Model TITLE1 "Eq 5.1: Empty Means, Random Intercept Model"; PROC MIXED DATA=example5 COVTEST NOITPRINT NOCLPRINT METHOD=REML; MODEL outcome = / SOLUTION DDFM=Satterthwaite; RANDOM INTERCEPT / G V VCORR TYPE=UN SUBJECT=IDnum; REPEATED wave / R TYPE=VC SUBJECT=IDnum; Level 1: y e Level 2: ti 0i ti U 0i 00 0i Composite: y U e ti 00 0i ti Estimated G Matrix IDnum: Person ID Row Effect number Col1 1 Intercept Estimated V Matrix for IDnum Estimated V Correlation Matrix for IDnum Because this model uses the REPEATED and RANDOM statements, the V matrix holds the total variances and covariances over waves (from putting G and R back together through the matrix). Likewise, VCORR holds the total correlations over waves. VCORR provides the ICC as: IntVar/TotalVar Cov Parm Subject Estimate Error Value Pr > UN(1,1) IDnum Random intercept variance in G wave IDnum <.0001 Residual variance in R -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) This is the test of whether we need the random intercept variance (so df=1) and we do. Effect Estimate Error DF t Value Pr > t Intercept <.0001 This is gamma00

5 Psyc 944 Example 5 page 5 5.3: Fixed Linear Time, Random Intercept Model TITLE1 "Eq 5.3: Fixed Linear Time, Random Intercept Model"; PROC MIXED DATA=example5 COVTEST NOITPRINT NOCLPRINT METHOD=REML; MODEL outcome = time / SOLUTION DDFM=Satterthwaite; RANDOM INTERCEPT / G V VCORR TYPE=UN SUBJECT=IDnum; REPEATED wave / R TYPE=VC SUBJECT=IDnum; ESTIMATE "Intercept at Time 0" int 1 time 0; ESTIMATE "Intercept at Time 1" int 1 time 1; ESTIMATE "Intercept at Time 2" int 1 time 2; ESTIMATE "Intercept at Time 3" int 1 time 3; Level 1: y β + β Time + e Level 2: Note the two different versions of the time variable in the syntax. Both are necessary here because they do different things. Wave is treated as a categorical predictor, and its role is to structure the R matrix in the event of missing data. Therefore, wave goes on the CLASS and REPEATED statements. In contrast, time is treated as a continuous predictor, and its role is to index linear effects of time (and it is centered such that wave 1 = time 0). Accordingly, in the ESTIMATE statements, only one value after time is needed. ti 0i 1i ti ti U 0i 00 0i 1i 10 Composite: y U Time e ti 00 0i 10 ti ti Estimated G Matrix IDnum: Person ID Row Effect number Col1 1 Intercept Estimated V Matrix for IDnum Estimated V Correlation Matrix for IDnum After controlling for the fixed linear effect of time, the residual variance was reduced from σ = 7.06 in the empty means, random intercept model to σ = 2.17 in this model. This is a reduction of ( ) / 7.06 =.69 (or 69% of the residual variance is accounted for by a fixed linear time). However, the random intercept variance actually increased from 2.88 to This is because of how τ is found: true τ = observed τ (σ / n) So reducing σ will make τ increase. Cov Parm Subject Estimate Error Value Pr > UN(1,1) IDnum Random intercept variance in G wave IDnum <.0001 Residual variance in R -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) Are we allowed to examine the 2ΔLL to see if adding a fixed linear effect of time improved model fit in REML? If not, what do we do instead? <.0001 This tests whether we need the random intercept variance (so df=1) and we (still) do.

6 Psyc 944 Example 5 page 6 Effect Estimate Error DF t Value Pr > t Intercept <.0001 this is gamma00 time <.0001 this is gamma10 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F time <.0001 Estimates These are the predicted outcome means from a fixed linear model Label Estimate Error DF t Value Pr > t Intercept at Time <.0001 Intercept at Time <.0001 Intercept at Time <.0001 Intercept at Time <.0001 Level 1: yti 0i 1i Timeti eti 5.5: Random Linear Time Model Level 2: U 0i 00 0i U TITLE1 "Eq 5.5: Random Linear Time Model"; PROC MIXED DATA=example5 COVTEST NOITPRINT 1i 10 1i NOCLPRINT METHOD=REML; Composite: y ti 00 U0i 10 U1i Timeti eti MODEL outcome = time / SOLUTION DDFM=Satterthwaite; RANDOM INTERCEPT time / G V GCORR VCORR TYPE=UN SUBJECT=IDnum; REPEATED wave / R TYPE=VC SUBJECT=IDnum; ESTIMATE "Intercept at Time 0" int 1 time 0; ESTIMATE "Intercept at Time 1" int 1 time 1; ESTIMATE "Intercept at Time 2" int 1 time 2; ESTIMATE "Intercept at Time 3" int 1 time 3; Note that the time variable gets included in the RANDOM statement, not wave including wave would result in model non-convergence, because it would try to estimate a random slope variance for each possible difference between waves (instead of a single variance for a continuous random slope through all the waves) Estimated G Matrix IDnum: Person ID Row Effect number Col1 Col2 1 Intercept time Estimated G Correlation Matrix IDnum: Person ID Row Effect number Col1 Col2 1 Intercept time After adding a random linear effect of time, the residual variance is smaller, but it is not correct to say that it has been reduced. Random effects do not explain variance; they simply re-allocate it. Here, this means that part of what was residual is now individual differences in the linear effect of time as a new pile of variance in the G matrix below. The G matrix provides the variances and covariances of the individual random effects. Now G is a 2x2 matrix because we have 2 random effects (intercept, linear slope). The GCORR matrix provides the correlation(s) among the individual random effects. Here, the individual intercepts and slopes are correlated r =.04.

7 Psyc 944 Example 5 page 7 Estimated V Matrix for IDnum Estimated V Correlation Matrix for IDnum The V matrix holds the total variances and covariances over waves (from putting G and R back together through the matrix). Likewise, VCORR holds the total correlations over waves. Note that all of these are now predicted to differ as a function of which wave it is (see table 5.2 for a description of how this works). Cov Parm Subject Estimate Error Value Pr UN(1,1) IDnum Random intercept variance in G UN(2,1) IDnum Random intercept-slope covariance in G UN(2,2) IDnum Random linear slope variance in G wave IDnum <.0001 Residual variance in R -2 Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) Are we allowed to examine the 2ΔLL to see if adding a random linear effect of time improved model fit in REML? If so, how many model parameters have we added? <.0001 This tests whether we need anything in the G matrix (so df=3). Note this does NOT tell us if we need the random linear time slope specifically! Effect Estimate Error DF t Value Pr > t Intercept <.0001 this is gamma00 time <.0001 this is gamma10 Type 3 Tests of Fixed Effects Num Den Effect DF DF F Value Pr > F time <.0001 Estimates These are the predicted outcome means from a random linear model Label Estimate Error DF t Value Pr > t Intercept at Time <.0001 Intercept at Time <.0001 Intercept at Time <.0001 Intercept at Time <.0001 We can use the model estimates to calculate 95% random effects confidence intervals that describe the predicted range of individual random effects: Random Effect 95% CI = fixed effect ± 1.96* Random Variance Intercept 95% CI = γ ± 1.96* τ = ± 1.96* 2.26 = 7.32 to U Linear Time Slope 95% CI = γ 10 ± 1.96* τ U = 1.72 ± 1.96* 0.91 = 0.15 to 3.59

8 Psyc 944 Example 5 page 8 Last but not least: there may still be residual covariances after modeling individual differences in the linear effect of time (i.e., adding a random linear slope to the G matrix). We can test alternative R matrix assumptions besides VC (which assumes no residual covariance/correlation over time) to see if this is the case: TITLE1 "Random Linear Time Model + AR1 R Matrix"; PROC MIXED DATA=example5 COVTEST NOITPRINT NOCLPRINT METHOD=REML; MODEL outcome = time / SOLUTION DDFM=Satterthwaite; RANDOM INTERCEPT time / G GCORR V VCORR TYPE=UN SUBJECT=IDnum; REPEATED / R RCORR TYPE=AR(1) SUBJECT=IDnum; Estimated R Correlation Matrix for IDnum Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) The 2LL is not smaller than the random linear time model, so adding an AR1 correlation to the R matrix does not improve model fit. TITLE1 "Random Linear Time Model + TOEP2 R Matrix"; PROC MIXED DATA=example5 COVTEST NOITPRINT NOCLPRINT METHOD=REML; MODEL outcome = time / SOLUTION DDFM=Satterthwaite; RANDOM INTERCEPT time / G GCORR V VCORR TYPE=UN SUBJECT=IDnum; REPEATED wave / R RCORR TYPE=TOEP(2) SUBJECT=IDnum; Estimated R Correlation Matrix for IDnum Res Log Likelihood AIC (smaller is better) AICC (smaller is better) BIC (smaller is better) The 2LL is not smaller than the random linear time model, so adding a lag-1 covariance to the R matrix does not improve model fit, either.

### 861 Example SPLH. 5 page 1. prefer to have. New data in. SPSS Syntax FILE HANDLE. VARSTOCASESS /MAKE rt. COMPUTE mean=2. COMPUTE sal=2. END IF.

SPLH 861 Example 5 page 1 Multivariate Models for Repeated Measures Response Times in Older and Younger Adults These data were collected as part of my masters thesis, and are unpublished in this form (to

### Random effects and nested models with SAS

Random effects and nested models with SAS /************* classical2.sas ********************* Three levels of factor A, four levels of B Both fixed Both random A fixed, B random B nested within A ***************************************************/

### Family economics data: total family income, expenditures, debt status for 50 families in two cohorts (A and B), annual records from 1990 1995.

Lecture 18 1. Random intercepts and slopes 2. Notation for mixed effects models 3. Comparing nested models 4. Multilevel/Hierarchical models 5. SAS versions of R models in Gelman and Hill, chapter 12 1

### This can dilute the significance of a departure from the null hypothesis. We can focus the test on departures of a particular form.

One-Degree-of-Freedom Tests Test for group occasion interactions has (number of groups 1) number of occasions 1) degrees of freedom. This can dilute the significance of a departure from the null hypothesis.

### Milk Data Analysis. 1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED

1. Objective Introduction to SAS PROC MIXED Analyzing protein milk data using STATA Refit protein milk data using PROC MIXED 2. Introduction to SAS PROC MIXED The MIXED procedure provides you with flexibility

### Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA

Paper P-702 Individual Growth Analysis Using PROC MIXED Maribeth Johnson, Medical College of Georgia, Augusta, GA ABSTRACT Individual growth models are designed for exploring longitudinal data on individuals

### Electronic Thesis and Dissertations UCLA

Electronic Thesis and Dissertations UCLA Peer Reviewed Title: A Multilevel Longitudinal Analysis of Teaching Effectiveness Across Five Years Author: Wang, Kairong Acceptance Date: 2013 Series: UCLA Electronic

### Generalized Mixed Models for Binomial Longitudinal Outcomes (% Correct) using PROC GLIMMIX

Hoffman Psyc 945 Example 6c Generalized Mixed Models for Binomial Longitudinal Outcomes (% Correct) using PROC GLIMMIX The data for this example are based on the publication below, which examined annual

### An Introduction to Modeling Longitudinal Data

An Introduction to Modeling Longitudinal Data Session I: Basic Concepts and Looking at Data Robert Weiss Department of Biostatistics UCLA School of Public Health robweiss@ucla.edu August 2010 Robert Weiss

### Introduction to Longitudinal Data Analysis

Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction

### Illustration (and the use of HLM)

Illustration (and the use of HLM) Chapter 4 1 Measurement Incorporated HLM Workshop The Illustration Data Now we cover the example. In doing so we does the use of the software HLM. In addition, we will

### I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

### E(y i ) = x T i β. yield of the refined product as a percentage of crude specific gravity vapour pressure ASTM 10% point ASTM end point in degrees F

Random and Mixed Effects Models (Ch. 10) Random effects models are very useful when the observations are sampled in a highly structured way. The basic idea is that the error associated with any linear,

### Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

### Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts, Procedures and Illustrations

Research Article TheScientificWorldJOURNAL (2011) 11, 42 76 TSW Child Health & Human Development ISSN 1537-744X; DOI 10.1100/tsw.2011.2 Longitudinal Data Analyses Using Linear Mixed Models in SPSS: Concepts,

### Using PROC MIXED in Hierarchical Linear Models: Examples from two- and three-level school-effect analysis, and meta-analysis research

Using PROC MIXED in Hierarchical Linear Models: Examples from two- and three-level school-effect analysis, and meta-analysis research Sawako Suzuki, DePaul University, Chicago Ching-Fan Sheu, DePaul University,

### Interactions involving Categorical Predictors

Interactions involving Categorical Predictors Today s Class: To CLASS or not to CLASS: Manual vs. program-created differences among groups Interactions of continuous and categorical predictors Interactions

### Introducing the Multilevel Model for Change

Department of Psychology and Human Development Vanderbilt University GCM, 2010 1 Multilevel Modeling - A Brief Introduction 2 3 4 5 Introduction In this lecture, we introduce the multilevel model for change.

### MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 1 DAVID C. HOWELL 4/26/2010

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 1 DAVID C. HOWELL 4/26/2010 FOR THE SECOND PART OF THIS DOCUMENT GO TO www.uvm.edu/~dhowell/methods/supplements/mixed Models Repeated/Mixed Models for

### Chapter 15. Mixed Models. 15.1 Overview. A flexible approach to correlated data.

Chapter 15 Mixed Models A flexible approach to correlated data. 15.1 Overview Correlated data arise frequently in statistical analyses. This may be due to grouping of subjects, e.g., students within classrooms,

### Statistics and Pharmacokinetics in Clinical Pharmacology Studies

Paper ST03 Statistics and Pharmacokinetics in Clinical Pharmacology Studies ABSTRACT Amy Newlands, GlaxoSmithKline, Greenford UK The aim of this presentation is to show how we use statistics and pharmacokinetics

### Assignments Analysis of Longitudinal data: a multilevel approach

Assignments Analysis of Longitudinal data: a multilevel approach Frans E.S. Tan Department of Methodology and Statistics University of Maastricht The Netherlands Maastricht, Jan 2007 Correspondence: Frans

### VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Overview of Methods for Analyzing Cluster-Correlated Data. Garrett M. Fitzmaurice

Overview of Methods for Analyzing Cluster-Correlated Data Garrett M. Fitzmaurice Laboratory for Psychiatric Biostatistics, McLean Hospital Department of Biostatistics, Harvard School of Public Health Outline

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Final Exam Practice Problem Answers

Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

### Rob J Hyndman. Forecasting using. 11. Dynamic regression OTexts.com/fpp/9/1/ Forecasting using R 1

Rob J Hyndman Forecasting using 11. Dynamic regression OTexts.com/fpp/9/1/ Forecasting using R 1 Outline 1 Regression with ARIMA errors 2 Example: Japanese cars 3 Using Fourier terms for seasonality 4

### 9.2 User s Guide SAS/STAT. The MIXED Procedure. (Book Excerpt) SAS Documentation

SAS/STAT 9.2 User s Guide The MIXED Procedure (Book Excerpt) SAS Documentation This document is an individual chapter from SAS/STAT 9.2 User s Guide. The correct bibliographic citation for the complete

### DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

### The Relationship Between Rodent Offspring Blood Lead Levels and Maternal Diet

The Relationship Between Rodent Offspring Blood Lead Levels and Maternal Diet Allison Crawford, Xiahong Li, Mira Shapiro 1, Ruitao Zhang Introduction A study was undertaken to understand the effect of

### Multilevel Modeling Tutorial. Using SAS, Stata, HLM, R, SPSS, and Mplus

Using SAS, Stata, HLM, R, SPSS, and Mplus Updated: March 2015 Table of Contents Introduction... 3 Model Considerations... 3 Intraclass Correlation Coefficient... 4 Example Dataset... 4 Intercept-only Model

### Mixed Models. Jing Cheng Gayla Olbricht Nilupa Gunaratna Rebecca Kendall Alex Lipka Sudeshna Paul Benjamin Tyner. May 19, 2005

Mixed Models Jing Cheng Gayla Olbricht Nilupa Gunaratna Rebecca Kendall Alex Lipka Sudeshna Paul Benjamin Tyner May 19, 2005 1 Contents 1 Introduction 3 2 Two-Way Mixed Effects Models 3 2.1 Pearl Data

### Outline. Topic 4 - Analysis of Variance Approach to Regression. Partitioning Sums of Squares. Total Sum of Squares. Partitioning sums of squares

Topic 4 - Analysis of Variance Approach to Regression Outline Partitioning sums of squares Degrees of freedom Expected mean squares General linear test - Fall 2013 R 2 and the coefficient of correlation

### 5. Multiple regression

5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

### Multivariate Logistic Regression

1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

### Applied Longitudinal Data Analysis: An Introductory Course

Applied Longitudinal Data Analysis: An Introductory Course Emilio Ferrer UC Davis The Risk and Prevention in Education Sciences (RPES) Curry School of Education - UVA August 2005 Acknowledgments Materials

### Analyzing Intervention Effects: Multilevel & Other Approaches. Simplest Intervention Design. Better Design: Have Pretest

Analyzing Intervention Effects: Multilevel & Other Approaches Joop Hox Methodology & Statistics, Utrecht Simplest Intervention Design R X Y E Random assignment Experimental + Control group Analysis: t

### Using Excel for Statistical Analysis

Using Excel for Statistical Analysis You don t have to have a fancy pants statistics package to do many statistical functions. Excel can perform several statistical tests and analyses. First, make sure

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

### Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 libname in1 >c:\=; Data first; Set in1.extract; A=1; PROC LOGIST OUTEST=DD MAXITER=100 ORDER=DATA; OUTPUT OUT=CC XBETA=XB P=PROB; MODEL

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### SPSS Guide: Regression Analysis

SPSS Guide: Regression Analysis I put this together to give you a step-by-step guide for replicating what we did in the computer lab. It should help you run the tests we covered. The best way to get familiar

### data visualization and regression

data visualization and regression Sepal.Length 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 I. setosa I. versicolor I. virginica I. setosa I. versicolor I. virginica Species Species

### HLM software has been one of the leading statistical packages for hierarchical

Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

### 1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ

STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### Module 5: Introduction to Multilevel Modelling SPSS Practicals Chris Charlton 1 Centre for Multilevel Modelling

Module 5: Introduction to Multilevel Modelling SPSS Practicals Chris Charlton 1 Centre for Multilevel Modelling Pre-requisites Modules 1-4 Contents P5.1 Comparing Groups using Multilevel Modelling... 4

### Simple Linear Regression, Scatterplots, and Bivariate Correlation

1 Simple Linear Regression, Scatterplots, and Bivariate Correlation This section covers procedures for testing the association between two continuous variables using the SPSS Regression and Correlate analyses.

### Directions for using SPSS

Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

### Estimation of σ 2, the variance of ɛ

Estimation of σ 2, the variance of ɛ The variance of the errors σ 2 indicates how much observations deviate from the fitted surface. If σ 2 is small, parameters β 0, β 1,..., β k will be reliably estimated

### ANOVA. February 12, 2015

ANOVA February 12, 2015 1 ANOVA models Last time, we discussed the use of categorical variables in multivariate regression. Often, these are encoded as indicator columns in the design matrix. In [1]: %%R

### 6 Variables: PD MF MA K IAH SBS

options pageno=min nodate formdlim='-'; title 'Canonical Correlation, Journal of Interpersonal Violence, 10: 354-366.'; data SunitaPatel; infile 'C:\Users\Vati\Documents\StatData\Sunita.dat'; input Group

### ADVANCED FORECASTING MODELS USING SAS SOFTWARE

ADVANCED FORECASTING MODELS USING SAS SOFTWARE Girish Kumar Jha IARI, Pusa, New Delhi 110 012 gjha_eco@iari.res.in 1. Transfer Function Model Univariate ARIMA models are useful for analysis and forecasting

### Two Related Samples t Test

Two Related Samples t Test In this example 1 students saw five pictures of attractive people and five pictures of unattractive people. For each picture, the students rated the friendliness of the person

### Week 5: Multiple Linear Regression

BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

### How To Model A Series With Sas

Chapter 7 Chapter Table of Contents OVERVIEW...193 GETTING STARTED...194 TheThreeStagesofARIMAModeling...194 IdentificationStage...194 Estimation and Diagnostic Checking Stage...... 200 Forecasting Stage...205

### A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Ordinal Regression. Chapter

Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

### SPSS Resources. 1. See website (readings) for SPSS tutorial & Stats handout

Analyzing Data SPSS Resources 1. See website (readings) for SPSS tutorial & Stats handout Don t have your own copy of SPSS? 1. Use the libraries to analyze your data 2. Download a trial version of SPSS

### Mihaela Ene, Elizabeth A. Leighton, Genine L. Blue, Bethany A. Bell University of South Carolina

Paper 134-2014 Multilevel Models for Categorical Data using SAS PROC GLIMMIX: The Basics Mihaela Ene, Elizabeth A. Leighton, Genine L. Blue, Bethany A. Bell University of South Carolina ABSTRACT Multilevel

### ln(p/(1-p)) = α +β*age35plus, where p is the probability or odds of drinking

Dummy Coding for Dummies Kathryn Martin, Maternal, Child and Adolescent Health Program, California Department of Public Health ABSTRACT There are a number of ways to incorporate categorical variables into

### Introduction to Data Analysis in Hierarchical Linear Models

Introduction to Data Analysis in Hierarchical Linear Models April 20, 2007 Noah Shamosh & Frank Farach Social Sciences StatLab Yale University Scope & Prerequisites Strong applied emphasis Focus on HLM

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Use of deviance statistics for comparing models

A likelihood-ratio test can be used under full ML. The use of such a test is a quite general principle for statistical testing. In hierarchical linear models, the deviance test is mostly used for multiparameter

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### Statistics, Data Analysis & Econometrics

Using the LOGISTIC Procedure to Model Responses to Financial Services Direct Marketing David Marsh, Senior Credit Risk Modeler, Canadian Tire Financial Services, Welland, Ontario ABSTRACT It is more important

### The 3-Level HLM Model

James H. Steiger Department of Psychology and Human Development Vanderbilt University Regression Modeling, 2009 1 2 Basic Characteristics of the 3-level Model Level-1 Model Level-2 Model Level-3 Model

### Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1

Bill Burton Albert Einstein College of Medicine william.burton@einstein.yu.edu April 28, 2014 EERS: Managing the Tension Between Rigor and Resources 1 Calculate counts, means, and standard deviations Produce

### This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the

### Multiple Regression. Page 24

Multiple Regression Multiple regression is an extension of simple (bi-variate) regression. The goal of multiple regression is to enable a researcher to assess the relationship between a dependent (predicted)

### Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@

Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### SP10 From GLM to GLIMMIX-Which Model to Choose? Patricia B. Cerrito, University of Louisville, Louisville, KY

SP10 From GLM to GLIMMIX-Which Model to Choose? Patricia B. Cerrito, University of Louisville, Louisville, KY ABSTRACT The purpose of this paper is to investigate several SAS procedures that are used in

### New SAS Procedures for Analysis of Sample Survey Data

New SAS Procedures for Analysis of Sample Survey Data Anthony An and Donna Watts, SAS Institute Inc, Cary, NC Abstract Researchers use sample surveys to obtain information on a wide variety of issues Many

### Package dsmodellingclient

Package dsmodellingclient Maintainer Author Version 4.1.0 License GPL-3 August 20, 2015 Title DataSHIELD client site functions for statistical modelling DataSHIELD

### Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear.

Multiple Regression in SPSS This example shows you how to perform multiple regression. The basic command is regression : linear. In the main dialog box, input the dependent variable and several predictors.

### 10. Analysis of Longitudinal Studies Repeat-measures analysis

Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.

### Introduction to mixed model and missing data issues in longitudinal studies

Introduction to mixed model and missing data issues in longitudinal studies Hélène Jacqmin-Gadda INSERM, U897, Bordeaux, France Inserm workshop, St Raphael Outline of the talk I Introduction Mixed models

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### Simple Predictive Analytics Curtis Seare

Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

### Moderation. Moderation

Stats - Moderation Moderation A moderator is a variable that specifies conditions under which a given predictor is related to an outcome. The moderator explains when a DV and IV are related. Moderation

### Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure

Technical report Linear Mixed-Effects Modeling in SPSS: An Introduction to the MIXED Procedure Table of contents Introduction................................................................ 1 Data preparation

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### 7 Time series analysis

7 Time series analysis In Chapters 16, 17, 33 36 in Zuur, Ieno and Smith (2007), various time series techniques are discussed. Applying these methods in Brodgar is straightforward, and most choices are

### Regression Analysis: A Complete Example

Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### Statistics and Data Analysis

Statistics and Data Analysis In this guide I will make use of Microsoft Excel in the examples and explanations. This should not be taken as an endorsement of Microsoft or its products. In fact, there are

### Psychology 205: Research Methods in Psychology

Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready