# Generalized Linear Mixed Modeling and PROC GLIMMIX


Richard Charnigo, Professor of Statistics and Biostatistics; Director of Statistics and Psychometrics Core, CDART

## Objectives

First ~80 minutes:

1. Be able to formulate a generalized linear mixed model for longitudinal data involving a categorical and a continuous covariate.
2. Understand how generalized linear mixed modeling differs from logistic regression and linear mixed modeling.

Last ~40 minutes:

3. Be able to use PROC GLIMMIX to fit a generalized linear mixed model for longitudinal data involving a categorical and a continuous covariate.

## Motivating example

The Excel file at { contains a simulated data set:

- Five hundred college freshmen (ID) are asked to indicate whether they have consumed marijuana during the past three months (MJ).
- The students are also assessed on negative urgency; the results are expressed as Z scores (NegUrg).
- One and two years later (Time), most of the students supply updated information on marijuana consumption; however, some students drop out.

Two possible research questions are:

i. Is there an association between negative urgency and marijuana use at baseline?

ii. Does marijuana use tend to change over time and, if so, is that change predicted by negative urgency at baseline?

We can envisage more complicated and realistic scenarios (e.g., with additional personality variables and/or interventions), but this simple scenario will help us get a hold of generalized linear mixed modeling and PROC GLIMMIX.

## Exploratory data analysis

Before pursuing generalized linear mixed (or other statistical) modeling, we are well advised to engage in exploratory data analysis. This can alert us to any gross mistakes in the data set, heretofore undetected, which may compromise our work. It can also suggest a structure for the generalized linear mixed model and help us to anticipate what the results should be.

[Table: summary statistics for Time, NegUrg, and MJ, with columns N, Minimum, Lower Quartile, Median, Upper Quartile, Maximum, Mean, Std Dev, and Skewness; the values are not preserved.]

In this example, no gross mistakes are apparent. Having Z scores between [value not preserved] and [value not preserved] seems reasonable for a sample of size 500. The minimum and maximum values of time and marijuana use are correct, given that the latter is being treated dichotomously.

[Table: MJ by negurgstratum for freshman year (Frequency, Percent, Row Pct, Col Pct); the cell values are not preserved.]

There is a clear association between negative urgency stratum and marijuana use during freshman year, with 8.7% of those low on negative urgency (bottom 25%) using marijuana versus 19.5% for average (middle 50%) and 37.5% for high (top 25%).

[Table: MJ by negurgstratum for sophomore year (Frequency, Percent, Row Pct, Col Pct); the cell values are not preserved.]

A similar phenomenon is observed in sophomore year, but overall marijuana use has increased from 21.6% to 30.4%.

[Table: MJ by negurgstratum for junior year (Frequency, Percent, Row Pct, Col Pct); the cell values are not preserved.]

By junior year, marijuana use has increased to 33.5%.

## First generalized linear mixed model

Let $Y_{jk}$ denote subject $j$'s marijuana use at time $k$. Because $Y_{jk}$ is dichotomous, we cannot employ a linear mixed model, which assumes a continuous (in fact, normally distributed) outcome. However, consider these three equations:

$$
\begin{aligned}
\operatorname{logit}\{P(Y_{jk} = 1)\} &= a_0 + a_1 k, && \text{if subject } j \text{ is low,} \\
\operatorname{logit}\{P(Y_{jk} = 1)\} &= b_0 + b_1 k, && \text{if subject } j \text{ is average,} \\
\operatorname{logit}\{P(Y_{jk} = 1)\} &= c_0 + c_1 k, && \text{if subject } j \text{ is high}
\end{aligned}
$$

on negative urgency, where $\operatorname{logit}\{x\}$ is defined as $\log\{x / (1 - x)\}$.
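To make the logit scale concrete, here is a minimal Python sketch of the logit, its inverse, and the probability implied by one stratum's intercept and slope. The function names are illustrative and are not part of the original presentation.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability p, where 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Probability corresponding to a log-odds value x."""
    return 1.0 / (1.0 + math.exp(-x))


def stratum_probability(intercept: float, slope: float, k: int) -> float:
    """P(Y_jk = 1) at time k for a stratum with the given intercept and slope."""
    return inv_logit(intercept + slope * k)
```

For instance, `stratum_probability(a0, a1, 2)` would give the modeled probability of marijuana use at time 2 in the low stratum, once values for `a0` and `a1` are available.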

Three comments are in order.

First, the generalized linear mixed model defined by the three equations can be expressed as a logistic regression model. Let $X_1$ and $X_2$ respectively be dummy variables for low and high negative urgency. Then we may write

$$
\operatorname{logit}\{P(Y_{jk} = 1)\} = b_0 + (a_0 - b_0) X_{1j} + (c_0 - b_0) X_{2j} + \bigl( b_1 + (a_1 - b_1) X_{1j} + (c_1 - b_1) X_{2j} \bigr) k.
$$

Indeed, just as linear regression is a special case of linear mixed modeling, logistic regression is a special case of generalized linear mixed modeling.
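The equivalence between the dummy-variable form and the three stratum-specific equations can be checked numerically. The sketch below uses hypothetical coefficient values chosen only for illustration; they are not estimates from the presentation.

```python
def linear_predictor(a0, a1, b0, b1, c0, c1, x1, x2, k):
    """Single-equation form: x1 = 1 for low, x2 = 1 for high negative urgency."""
    intercept = b0 + (a0 - b0) * x1 + (c0 - b0) * x2
    slope = b1 + (a1 - b1) * x1 + (c1 - b1) * x2
    return intercept + slope * k


# Hypothetical coefficients, for illustration only
coef = dict(a0=-2.0, a1=0.1, b0=-1.0, b1=0.2, c0=-0.5, c1=0.3)

for k in (0, 1, 2):
    low = linear_predictor(**coef, x1=1, x2=0, k=k)      # reduces to a0 + a1*k
    average = linear_predictor(**coef, x1=0, x2=0, k=k)  # reduces to b0 + b1*k
    high = linear_predictor(**coef, x1=0, x2=1, k=k)     # reduces to c0 + c1*k
    print(k, round(low, 3), round(average, 3), round(high, 3))
```

Setting both dummies to zero selects the average stratum, which is why $b_0$ and $b_1$ act as the reference intercept and slope.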

Second, we are in essence logistic-regressing marijuana use on time but allowing each subject to have one of three intercepts and one of three slopes, according to his/her negative urgency.

Third, our research questions amount to asking whether $a_0, b_0, c_0$ differ from each other, whether $a_1, b_1, c_1$ differ from zero, and whether $a_1, b_1, c_1$ differ from each other.

Now let us examine the results from fitting the generalized linear mixed model using PROC GLIMMIX. We see that PROC GLIMMIX used all available observations (1350), including observations from the 100 subjects who dropped out early.

- Number of Observations Read: 1350
- Number of Observations Used: 1350

The estimates of the intercepts $a_0, b_0, c_0$ are -2.48, -1.33, and [value not preserved]. The estimates of the slopes $a_1, b_1, c_1$ are 0.18, 0.38, and 0.31. Exponentiating the latter gives us estimates of the factors by which the odds of marijuana use get multiplied each year, within each of the negative urgency strata. For example, $\exp(0.3143) \approx 1.37$ in the high stratum.

[Table: Parameter Estimates for the three negurgstratum intercepts and the three Time*negurgstratum slopes, with columns Estimate, Standard Error, DF, t Value, and Pr > |t|; only the p-values of the first two intercepts (both <.0001) are preserved.]
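The exponentiation step can be sketched in a few lines of Python. The slopes below are the values quoted in the text (0.18 and 0.38 as rounded there, 0.3143 for the high stratum); the dictionary keys are illustrative labels.

```python
import math

# Slope estimates quoted in the text, on the log-odds-per-year scale
slopes = {"low": 0.18, "average": 0.38, "high": 0.3143}

# Each exponentiated slope is the factor by which the odds change per year
odds_multipliers = {stratum: math.exp(b) for stratum, b in slopes.items()}

for stratum, factor in odds_multipliers.items():
    print(f"{stratum}: odds multiplied by about {factor:.2f} per year")
```

A multiplier above 1 means the odds of marijuana use grow from one year to the next within that stratum.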

We can also use PROC GLIMMIX to estimate any linear combinations of $a_0, b_0, c_0, a_1, b_1, c_1$. For example, below are estimates of $c_0 - a_0$ (high vs. low negative urgency freshmen), $(c_0 + c_1) - (a_0 + a_1)$ (high vs. low negative urgency sophomores), and $(c_0 + 2c_1) - (a_0 + 2a_1)$ (high vs. low negative urgency juniors). Again, exponentiation will yield estimated odds ratios.

[Table: Estimates labeled "High vs low freshman", "High vs low sophomore", and "High vs low junior"; each has Pr > |t| <.0001, and the Estimate, Standard Error, DF, and t Value columns are not preserved.]
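Each of these comparisons is a weighted sum of the six parameters. The sketch below writes out the weights, assuming the parameters are ordered as $(a_0, b_0, c_0, a_1, b_1, c_1)$; that ordering and the numeric estimates are hypothetical, and the ordering used by any particular software run may differ.

```python
import math


def high_vs_low_contrast(k):
    """Weights on (a0, b0, c0, a1, b1, c1) for (c0 + k*c1) - (a0 + k*a1)."""
    return [-1, 0, 1, -k, 0, k]


def apply_contrast(weights, estimates):
    """Linear combination of the parameter estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical estimates, for illustration only
beta = [-2.5, -1.3, -0.6, 0.2, 0.4, 0.3]

for label, k in [("freshman", 0), ("sophomore", 1), ("junior", 2)]:
    log_or = apply_contrast(high_vs_low_contrast(k), beta)
    print(f"High vs low {label}: log odds ratio {log_or:.2f}, odds ratio {math.exp(log_or):.2f}")
```

The average-stratum parameters receive zero weight because they do not enter a high-versus-low comparison.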

## Second generalized linear mixed model

As noted earlier, our first generalized linear mixed model can be expressed as a logistic regression model. How, then, does generalized linear mixed modeling go beyond logistic regression?

The answer is that we may also allow each subject to have his/her own personal intercept and slope, not merely choose from among three intercepts and three slopes. This can capture correlations among repeated measurements on that subject. The personal intercept and slope may be related to negative urgency and to unmeasured factors. For simplicity in what follows, however, we will confine attention to a personal intercept.

More specifically, we propose the following:

$$
\operatorname{logit}\{P(Y_{jk} = 1)\} = b_0 + (a_0 - b_0) X_{1j} + (c_0 - b_0) X_{2j} + P_{1j} + \bigl( b_1 + (a_1 - b_1) X_{1j} + (c_1 - b_1) X_{2j} \bigr) k.
$$

Above, $P_{1j}$ is an unobserved zero-mean variable that adjusts the intercept for subject $j$. Thus, the interpretations of $a_0, b_0, c_0$ are subtly altered. They are now the average intercepts for subjects who are low, average, and high on negative urgency. Even so, our research questions are still addressed by estimating $a_0, b_0, c_0, a_1, b_1, c_1$.
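A small simulation can illustrate how a personal intercept induces correlation among a subject's repeated responses. This is a sketch under stated assumptions: the personal intercept is drawn from a normal distribution (the text only says it has mean zero), and the intercept, slope, and standard deviation are hypothetical values.

```python
import math
import random


def inv_logit(x):
    """Probability corresponding to a log-odds value x."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_subject(intercept, slope, sd_personal, rng, times=(0, 1, 2)):
    """Draw one subject's 0/1 responses under a random-intercept logistic model."""
    p1j = rng.gauss(0.0, sd_personal)  # personal intercept adjustment, mean zero
    return [int(rng.random() < inv_logit(intercept + p1j + slope * k)) for k in times]


def share_constant(responses):
    """Fraction of subjects whose responses are identical at every time point."""
    return sum(len(set(y)) == 1 for y in responses) / len(responses)


rng = random.Random(2024)
# Hypothetical intercept (-1.3), slope (0.4), and personal-intercept SD (1.5 vs 0)
with_personal = [simulate_subject(-1.3, 0.4, 1.5, rng) for _ in range(2000)]
without_personal = [simulate_subject(-1.3, 0.4, 0.0, rng) for _ in range(2000)]

print("share with constant responses, SD 1.5:", share_constant(with_personal))
print("share with constant responses, SD 0.0:", share_constant(without_personal))
```

With a nonzero standard deviation, subjects tend to answer the same way at every time point more often than independence would predict, which is the within-subject correlation that the personal intercept is meant to capture.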

While we can predict $P_{1j}$ from the data, in practice this is rarely done. However, its variance is routinely estimated.

[Table: Covariance Parameter Estimates, with one row for Cov Parm UN(1,1) and Subject ID; the Estimate and Standard Error are not preserved.]

[Table: Solutions for Fixed Effects for the three negurgstratum intercepts and the three Time*negurgstratum slopes, with columns Estimate, Standard Error, DF, t Value, and Pr > |t|; only the p-values of the first two intercepts (both <.0001) are preserved.]

Some care is now required in interpreting odds ratio estimates. For example, $\exp(2.3808) \approx 10.81$ says that a freshman high on negative urgency is estimated to have about 10.81 times the odds of using marijuana versus a freshman low on negative urgency, controlling for whatever unmeasured factors contribute to the personal intercepts.

[Table: Estimates labeled "High vs low freshman", "High vs low sophomore", and "High vs low junior"; each has Pr > |t| <.0001, and the Estimate, Standard Error, DF, and t Value columns are not preserved.]
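To see why the phrase "controlling for whatever unmeasured factors contribute to the personal intercepts" matters, the sketch below compares the odds ratio at a fixed personal intercept with the odds ratio obtained after averaging probabilities over personal intercepts. It assumes normally distributed personal intercepts; the low-stratum intercept and the standard deviation are hypothetical, and only the log odds ratio 2.3808 comes from the text.

```python
import math


def inv_logit(x):
    """Probability corresponding to a log-odds value x."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_probability(eta, sd, n_grid=2001, width=8.0):
    """Average inv_logit(eta + u) over u ~ Normal(0, sd^2), sd > 0, by a weighted grid sum."""
    step = 2.0 * width * sd / (n_grid - 1)
    weighted_total = 0.0
    weight_total = 0.0
    for i in range(n_grid):
        u = -width * sd + i * step
        weight = math.exp(-0.5 * (u / sd) ** 2)  # unnormalized normal density
        weighted_total += weight * inv_logit(eta + u)
        weight_total += weight
    return weighted_total / weight_total


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


LOG_OR = 2.3808    # log odds ratio quoted in the text (high vs low freshmen)
ETA_LOW = -3.0     # hypothetical average intercept for the low stratum
SD_PERSONAL = 1.5  # hypothetical standard deviation of the personal intercept

conditional_or = math.exp(LOG_OR)
marginal_or = odds(marginal_probability(ETA_LOW + LOG_OR, SD_PERSONAL)) / odds(
    marginal_probability(ETA_LOW, SD_PERSONAL)
)
print(f"odds ratio at a fixed personal intercept: {conditional_or:.2f}")
print(f"odds ratio after averaging over personal intercepts: {marginal_or:.2f}")
```

Under these assumptions the averaged odds ratio is smaller than the fixed-intercept one, so the two should not be read interchangeably.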

Which model is better: the first or the second?

Conceptually, the second model is appealing because $P_{1j}$ captures correlations among the repeated observations on subject $j$. Thus, we avoid the unrealistic assumption, present in logistic regression, that observations are independent.

Empirically, we may examine a model selection criterion such as the BIC; a smaller value is better. Here are results for the first and second models.

[Two Fit Statistics tables, one per model, each listing -2 Log Likelihood, AIC (smaller is better), AICC (smaller is better), and BIC (smaller is better); the values are not preserved.]
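For reference, the textbook forms of AIC and BIC can be computed directly from the -2 log likelihood. The sketch below is illustrative only: the -2 log likelihood values are hypothetical, and the sample size `n` entering the BIC penalty (for example, the number of subjects versus the number of observations) depends on the software's convention, which is treated here as an assumption.

```python
import math


def aic(neg2_loglik, n_params):
    """Akaike information criterion from -2 log likelihood."""
    return neg2_loglik + 2 * n_params


def bic(neg2_loglik, n_params, n):
    """Bayesian information criterion; n is the sample size used in the penalty."""
    return neg2_loglik + n_params * math.log(n)


# Hypothetical -2 log likelihoods; 6 fixed effects, plus 1 variance in the second model
first_model = bic(neg2_loglik=1400.0, n_params=6, n=500)
second_model = bic(neg2_loglik=1300.0, n_params=7, n=500)
print(f"BIC, first model: {first_model:.1f}; BIC, second model: {second_model:.1f}")
```

The extra variance parameter costs the second model one additional penalty term, so it is preferred only if its -2 log likelihood drops by more than that penalty.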

## Third generalized linear mixed model

So far we have treated negative urgency as categorical, but this is not necessary and perhaps not optimal. Let us now consider the following:

$$
\operatorname{logit}\{P(Y_{jk} = 1)\} = ( d_0 + e_0 N_j + P_{1j} ) + ( d_1 + e_1 N_j ) k.
$$

Above, $N_j$ denotes the continuous negative urgency variable, while $P_{1j}$ is, as before, an adjustment to the intercept.

Since negative urgency was expressed as a Z score, $d_0$ is the average intercept and $d_1$ is the slope among those average on negative urgency. Likewise, $d_0 + e_0$ is the average intercept and $d_1 + e_1$ is the slope among those one standard deviation above average on negative urgency. And $d_0 - e_0$ is the average intercept and $d_1 - e_1$ is the slope among those one standard deviation below average on negative urgency.
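The three interpretations above follow from plugging Z scores of 0, +1, and -1 into the model. A minimal sketch, with hypothetical coefficient values chosen only for illustration:

```python
def intercept_and_slope(d0, e0, d1, e1, n_j):
    """Average intercept and slope for a subject whose negative urgency Z score is n_j."""
    return d0 + e0 * n_j, d1 + e1 * n_j


# Hypothetical coefficients, for illustration only
d0, e0, d1, e1 = -1.5, 1.0, 0.5, -0.25

for z in (-1, 0, 1):
    intercept, slope = intercept_and_slope(d0, e0, d1, e1, z)
    print(f"Z = {z:+d}: average intercept {intercept}, slope {slope}")
```

Any other Z score can be substituted in the same way, which is the practical advantage of treating negative urgency as continuous.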

We estimate the variance of $P_{1j}$ as well as estimating $d_0, e_0, d_1, e_1$.

[Table: Covariance Parameter Estimates, with one row for Cov Parm UN(1,1) and Subject ID; the Estimate and Standard Error are not preserved.]

[Table: Solutions for Fixed Effects for Intercept, NegUrg, Time, and NegUrg*Time, with columns Estimate, Standard Error, DF, t Value, and Pr > |t|; only the p-values for Intercept, NegUrg, and Time (all <.0001) are preserved.]

In addition, we may estimate linear combinations of $d_0, e_0, d_1, e_1$. For example, $2e_0$ compares freshmen one standard deviation above to freshmen one standard deviation below, $2e_0 + 2e_1$ compares such sophomores, and $2e_0 + 4e_1$ compares such juniors. Moreover, the BIC prefers this model over either of the first two.

[Table: Estimates labeled "High vs low freshman", "High vs low sophomore", and "High vs low junior"; each has Pr > |t| <.0001, and the Estimate, Standard Error, DF, and t Value columns are not preserved.]

[Table: Fit Statistics listing -2 Log Likelihood, AIC (smaller is better), AICC (smaller is better), and BIC (smaller is better); the values are not preserved.]
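These combinations can be derived by subtracting linear predictors. Writing $\eta(N, k)$ for the right-hand side of the third model (this symbol is introduced here for convenience and does not appear in the original slides), and holding the personal intercept $P_{1j}$ at a common value:

$$
\begin{aligned}
\eta(N, k) &= (d_0 + e_0 N + P_{1j}) + (d_1 + e_1 N)\, k, \\
\eta(+1, k) - \eta(-1, k) &= \bigl[(d_0 + e_0) - (d_0 - e_0)\bigr] + \bigl[(d_1 + e_1) - (d_1 - e_1)\bigr] k \\
&= 2 e_0 + 2 e_1 k.
\end{aligned}
$$

Setting $k = 0, 1, 2$ gives $2e_0$, $2e_0 + 2e_1$, and $2e_0 + 4e_1$ for freshmen, sophomores, and juniors respectively.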

## What's next?

Now we will launch SAS and examine the PROC GLIMMIX implementations of the three generalized linear mixed models.

In addition, although beyond the scope of today's presentation, I mention that there are versions of generalized linear mixed models that accommodate responses which are neither normally distributed nor dichotomous. The most common is a version that accommodates responses which are Poisson distributed; this is useful when the outcome of interest is a count (e.g., how many times a person engages in a particular behavior).

### SAS Syntax and Output for Data Manipulation:

Psyc 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (in preparation) chapter 5. We will be examining

### I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of

### ln(p/(1-p)) = α +β*age35plus, where p is the probability or odds of drinking

Dummy Coding for Dummies Kathryn Martin, Maternal, Child and Adolescent Health Program, California Department of Public Health ABSTRACT There are a number of ways to incorporate categorical variables into

### Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA

Age to Age Factor Selection under Changing Development Chris G. Gross, ACAS, MAAA Introduction A common question faced by many actuaries when selecting loss development factors is whether to base the selected

### VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

### Electronic Thesis and Dissertations UCLA

Electronic Thesis and Dissertations UCLA Peer Reviewed Title: A Multilevel Longitudinal Analysis of Teaching Effectiveness Across Five Years Author: Wang, Kairong Acceptance Date: 2013 Series: UCLA Electronic

### Lecture #2 Overview. Basic IRT Concepts, Models, and Assumptions. Lecture #2 ICPSR Item Response Theory Workshop

Basic IRT Concepts, Models, and Assumptions Lecture #2 ICPSR Item Response Theory Workshop Lecture #2: 1of 64 Lecture #2 Overview Background of IRT and how it differs from CFA Creating a scale An introduction

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

### Lecture 19: Conditional Logistic Regression

Lecture 19: Conditional Logistic Regression Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina

### Module 3: Correlation and Covariance

Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

### Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

### Overview of Methods for Analyzing Cluster-Correlated Data. Garrett M. Fitzmaurice

Overview of Methods for Analyzing Cluster-Correlated Data Garrett M. Fitzmaurice Laboratory for Psychiatric Biostatistics, McLean Hospital Department of Biostatistics, Harvard School of Public Health Outline

### Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 libname in1 >c:\=; Data first; Set in1.extract; A=1; PROC LOGIST OUTEST=DD MAXITER=100 ORDER=DATA; OUTPUT OUT=CC XBETA=XB P=PROB; MODEL

### Raul Cruz-Cano, HLTH653 Spring 2013

Multilevel Modeling-Logistic Schedule 3/18/2013 = Spring Break 3/25/2013 = Longitudinal Analysis 4/1/2013 = Midterm (Exercises 1-5, not Longitudinal) Introduction Just as with linear regression, logistic

### Biostatistics Short Course Introduction to Longitudinal Studies

Biostatistics Short Course Introduction to Longitudinal Studies Zhangsheng Yu Division of Biostatistics Department of Medicine Indiana University School of Medicine Zhangsheng Yu (Indiana University) Longitudinal

### Family economics data: total family income, expenditures, debt status for 50 families in two cohorts (A and B), annual records from 1990 1995.

Lecture 18 1. Random intercepts and slopes 2. Notation for mixed effects models 3. Comparing nested models 4. Multilevel/Hierarchical models 5. SAS versions of R models in Gelman and Hill, chapter 12 1

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

### Random effects and nested models with SAS

Random effects and nested models with SAS /************* classical2.sas ********************* Three levels of factor A, four levels of B Both fixed Both random A fixed, B random B nested within A ***************************************************/

### Chapter 6: The Information Function 129. CHAPTER 7 Test Calibration

Chapter 6: The Information Function 129 CHAPTER 7 Test Calibration 130 Chapter 7: Test Calibration CHAPTER 7 Test Calibration For didactic purposes, all of the preceding chapters have assumed that the

### Yiming Peng, Department of Statistics. February 12, 2013

Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop

### Two Correlated Proportions (McNemar Test)

Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

### Introduction to Analysis Methods for Longitudinal/Clustered Data, Part 3: Generalized Estimating Equations

Introduction to Analysis Methods for Longitudinal/Clustered Data, Part 3: Generalized Estimating Equations Mark A. Weaver, PhD Family Health International Office of AIDS Research, NIH ICSSC, FHI Goa, India,

### Multivariate Logistic Regression

1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation

### STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

### MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

### Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis

Economics of Strategy (ECON 4550) Maymester 015 Applications of Regression Analysis Reading: ACME Clinic (ECON 4550 Coursepak, Page 47) and Big Suzy s Snack Cakes (ECON 4550 Coursepak, Page 51) Definitions

### Name: Date: Use the following to answer questions 2-3:

Name: Date: 1. A study is conducted on students taking a statistics class. Several variables are recorded in the survey. Identify each variable as categorical or quantitative. A) Type of car the student

### 171:290 Model Selection Lecture II: The Akaike Information Criterion

171:290 Model Selection Lecture II: The Akaike Information Criterion Department of Biostatistics Department of Statistics and Actuarial Science August 28, 2012 Introduction AIC, the Akaike Information

### Hierarchical Insurance Claims Modeling

Hierarchical Insurance Claims Modeling Edward W. (Jed) Frees, University of Wisconsin - Madison Emiliano A. Valdez, University of Connecticut 2009 Joint Statistical Meetings Session 587 - Thu 8/6/09-10:30

### 5. Ordinal regression: cumulative categories proportional odds. 6. Ordinal regression: comparison to single reference generalized logits

Lecture 23 1. Logistic regression with binary response 2. Proc Logistic and its surprises 3. quadratic model 4. Hosmer-Lemeshow test for lack of fit 5. Ordinal regression: cumulative categories proportional

### Factors affecting online sales

Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

### Introduction to Longitudinal Data Analysis

Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### 11. Analysis of Case-control Studies Logistic Regression

Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

### SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

### Models for Count Data With Overdispersion

Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extra-poisson variation and the negative binomial model, with brief appearances

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### HLM software has been one of the leading statistical packages for hierarchical

Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

### Geostatistics Exploratory Analysis

Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

### Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, Last revised March 28, 2015

Using Stata 11 & higher for Logistic Regression Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised March 28, 2015 NOTE: The routines spost13, lrdrop1, and extremes are

### BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS

BRIEF OVERVIEW ON INTERPRETING COUNT MODEL RISK RATIOS An Addendum to Negative Binomial Regression Cambridge University Press (2007) Joseph M. Hilbe 2008, All Rights Reserved This short monograph is intended

### Binary Logistic Regression

Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

### An Introduction to Generalized Linear Mixed Models Using SAS PROC GLIMMIX

An Introduction to Generalized Linear Mixed Models Using SAS PROC GLIMMIX Phil Gibbs Advanced Analytics Manager SAS Technical Support November 22, 2008 UC Riverside What We Will Cover Today What is PROC

### Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

### Chapter 3 Descriptive Statistics: Numerical Measures. Learning objectives

Chapter 3 Descriptive Statistics: Numerical Measures Slide 1 Learning objectives 1. Single variable Part I (Basic) 1.1. How to calculate and use the measures of location 1.. How to calculate and use the

### The aspect of the data that we want to describe/measure is the degree of linear relationship between and The statistic r describes/measures the degree

PS 511: Advanced Statistics for Psychological and Behavioral Research 1 Both examine linear (straight line) relationships Correlation works with a pair of scores One score on each of two variables ( and

### Ordinal Regression. Chapter

Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### A C T R esearcli R e p o rt S eries 2 0 0 5. Using ACT Assessment Scores to Set Benchmarks for College Readiness. IJeff Allen.

A C T R esearcli R e p o rt S eries 2 0 0 5 Using ACT Assessment Scores to Set Benchmarks for College Readiness IJeff Allen Jim Sconing ACT August 2005 For additional copies write: ACT Research Report

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Statistics, Data Analysis & Econometrics

Using the LOGISTIC Procedure to Model Responses to Financial Services Direct Marketing David Marsh, Senior Credit Risk Modeler, Canadian Tire Financial Services, Welland, Ontario ABSTRACT It is more important

### This chapter will demonstrate how to perform multiple linear regression with IBM SPSS

CHAPTER 7B Multiple Regression: Statistical Methods Using IBM SPSS This chapter will demonstrate how to perform multiple linear regression with IBM SPSS first using the standard method and then using the

### Statistical Foundations: Measures of Location and Central Tendency and Summation and Expectation

Statistical Foundations: and Central Tendency and and Lecture 4 September 5, 2006 Psychology 790 Lecture #4-9/05/2006 Slide 1 of 26 Today s Lecture Today s Lecture Where this Fits central tendency/location

### A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### Lecture 14: GLM Estimation and Logistic Regression

Lecture 14: GLM Estimation and Logistic Regression Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South

### Chapter 15 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 15 Multiple Choice Questions (The answers are provided after the last question.) 1. What is the median of the following set of scores? 18, 6, 12, 10, 14? a. 10 b. 14 c. 18 d. 12 2. Approximately

### 5. Multiple regression

5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén

### HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

### SUGI 29 Statistics and Data Analysis

Paper 194-29 Head of the CLASS: Impress your colleagues with a superior understanding of the CLASS statement in PROC LOGISTIC Michelle L. Pritchard and David J. Pasta Ovation Research Group, San Francisco,

### OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS

OBJECTIVE ASSESSMENT OF FORECASTING ASSIGNMENTS USING SOME FUNCTION OF PREDICTION ERRORS CLARKE, Stephen R. Swinburne University of Technology Australia One way of examining forecasting methods via assignments

### An introduction to Value-at-Risk Learning Curve September 2003

An introduction to Value-at-Risk Learning Curve September 2003 Value-at-Risk The introduction of Value-at-Risk (VaR) as an accepted methodology for quantifying market risk is part of the evolution of risk

### This unit will lay the groundwork for later units where the students will extend this knowledge to quadratic and exponential functions.

Algebra I Overview View unit yearlong overview here Many of the concepts presented in Algebra I are progressions of concepts that were introduced in grades 6 through 8. The content presented in this course

### 861 Example SPLH. 5 page 1. prefer to have. New data in. SPSS Syntax FILE HANDLE. VARSTOCASESS /MAKE rt. COMPUTE mean=2. COMPUTE sal=2. END IF.

SPLH 861 Example 5 page 1 Multivariate Models for Repeated Measures Response Times in Older and Younger Adults These data were collected as part of my masters thesis, and are unpublished in this form (to
