Understanding Variability in Multilevel Models for Categorical Responses. Sophia Rabe-Hesketh. Outline
|
|
- Erick Moore
- 7 years ago
- Views:
Transcription
1 Understanding Variability in Multilevel Models for Categorical Responses Outline Two-level logistic random-intercept model (1: students, 2: schools) Measures of residual variability and dependence Sophia Rabe-Hesketh University of California, Berkeley Median odds ratio Intraclass correlation of latent responses Intraclass correlation of observed responses & Institute of Education, University of London Standard deviation of probabilities Variance partition coefficient Graphical displays of variability Joint work with Anders Skrondal Densities of probabilities Percentiles of probabilities Probabilities for schools in the data Hierarchical Linear Modeling SIG AERA Annual Meeting, Vancouver, April 2012 p.1 Ordinal responses and three-level data (1: students, 2: teachers, 3: schools) p.2 Example: PISA (Programme for International Student Assessment) PISA: Distribution of SES PISA study assesses reading, math and science achievement among 15-year-old students in tens of countries every 3 years Here consider U.S. data from 2000 Schools j were randomly sampled and then students i were randomly sampled from the selected schools Variables: Reading proficiency y ij (1=yes, 0=no) Student SES x ij mn: School mean SES x j dv: Deviation of student SES from school mean x ij x j Student SES (mn+dv) mn+11 mn School mean SES (mn) Min to Max P25 to P75 For schools, mn: 25th & 75th %tiles: 39 & 51 Mean & Median: 45 For students, dv: 25th & 75th %tiles: -11 & 11 Mean & Median: 0 Three covariate patterns: x ij mn dv ses lo me hi Sample size (listwise) 2069 students in 148 schools Number of students per school: 1 to 28, mean/median 14 p.3 p.4
2 Two-level logistic random-intercept model Two-level logistic random-intercept model for unit i (level 1) nested in cluster j (level 2) logit[pr(y ij =1 x ij,ζ j )] = [ ] Pr(yij =1 x ij,ζ j ) log Pr(y ij =0 x ij,ζ j ) }{{} Odds = β 1 + β 2 dv ij + β 3 mn j + ζ j Maximum likelihood estimates of model parameters Null model Full model Param. Covariate Est (SE) Est (SE) OR (95% CI) β (0.10) 4.79 (0.43) 10β 2 dv/ (0.03) 1.2 (1.1,1.3) 10β 3 mn/ (0.09) 2.4 (2.1,2.9) ψ log[odds(y ij =1 x ij,ζ j )] = β 1 + β 2 dv ij + β 3 mn j + ζ j Odds(y ij =1 x ij,ζ j ) = exp(β 1 )exp(β 2 ) dv ij exp(β 3 ) mn j exp(ζ j ) x ij is vector of covariates (dv ij, mn j ) β 1 is the fixed intercept β 2 and β 3 are fixed regression coefficients ζ j is a random intercept ζ j N(0,ψ) Sometimes write x ij β β 1 + β 2 dv ij + β 3 mn j p.5 Conditional odds ratio associated with 10-point increase in dv ij for given mn j and ζ j : exp(β 1 )exp(β 2 ) a+10 exp(β 3 ) mn j exp(ζ j ) exp(β 1 )exp(β 2 ) a exp(β 3 ) mn =exp(β 2 ) 10 =exp(10β 2 ) j exp(ζ j ) Comparing two students from same school (given mn j and ζ j ) whose SES differs by 10 points, student with higher SES has 1.2 times the odds of being proficient as other student odds are 20% greater Conditional odds ratio associated with 10-point increase in mn j for given dv ij and ζ j : Comparing two schools that differ in their mean SES by 10 points and have the same random intercept, odds of a student (with SES at the school mean) being proficient are 2.4 times as great for higher-ses school p.6 Predicted conditional probabilities p(x ij,ζ j ) Conditional probability with estimates β 1, β 2, β 3 plugged in p(x ij,ζ j ) Pr(yij =1 x ij,ζ j )= exp( β 1 + β 2 dv ij + β 3 mn j + ζ j ) 1+exp( β 1 + β 2 dv ij + β 3 mn j + ζ j ) Plug in interesting values of x ij and ζ j, e.g., ζ j =0(mean & median) p(lo, 0) = 0.18, p(me, 0) = 0.32, p(hi, 0) = School mean SES (mn) dv = 11 dv = 0 dv = Deviation of SES from school mean (dv) mn = 39 mn = 45 mn = 51 p.7 Model for conditional odds Median odds ratio Odds(y ij =1 x ij,ζ j )=exp(β 1 )exp(β 2 ) dv ij exp(β 3 ) mn j exp(ζ j ) Randomly sample schools, then students (for given x) Odds ratio, for two students i and i from different schools j and j : Odds(y ij =1 x,ζ j ) Odds(y i j =1 x,ζ j ) =exp(ζ j ζ j ) Odds ratio, comparing student with greater odds to other student: OR =exp( ζ j ζ j ), (ζ j ζ j ) N(0, 2ψ) Odds ratio exp( ζ j ζ j ) varies randomly and has estimated median = exp{ 2 ψφ 1 (3/4)} [Larsen et al., 2000] OR median p.8
3 Median odds ratio (cont d) Latent response formulation For two randomly drawn students from two randomly drawn schools, estimated median odds ratio, comparing student from school with larger random intercept to other student, is OR median = 2.4 for null model OR median = 1.7 for full model In null model, odds ratio due to random intercept exceeds 2.4 half the time In full model, estimated odds ratio for 10-point increase in school-mean SES is 2.4 p.9 Imagine latent (unobserved) continuous response y ij (e.g., reading ability) Observed response is 1 if latent response is greater than 0 (e.g., proficient if ability greater than 0): 1 if yij y ij = > 0 0 otherwise Model for latent response: y ij = β 1 + β 2 dv ij + β 3 mn j +ζ j + ɛ ij, ζ j N(0,ψ), ɛ ij logistic }{{} x ijβ Logistic distribution has mean 0 and variance π 2 /3 and is similar to normal distribution Model for observed response: logit[pr(y ij =1 x ij,ζ j )] = β 1 + β 2 dv ij + β 3 mn j + ζ j, ζ j N(0,ψ) p.10 Equivalence of formulations Intraclass correlation of latent responses Model for yij is standard multilevel/hierarchical linear model, therefore use standard ICC Var(ζ j ) ICC lat = Var(ζ j )+Var(ɛ ij ) = ψ ψ + π 2 /3 Pr(yij =1 xij,ζj) 0.5 Correlation between students i and i in same school j Cor(y ij,y i j x ij, x i j)=cor(ζ j + ɛ ij,ζ j + ɛ i j)= ψ ψ + π 2 /3 y ij 0.0 x ij β + ζ j Proportion of variance that is between schools Var(yij x ij )=Var(ζ j + ɛ ij )=Var(ζ j )+Var(ɛ ij )=ψ + π 2 /3 For PISA data, plug in ψ ICC lat = 0.82 =0.20 for null model ICC lat = =0.08 for full model p.12 p.11
4 Intraclass correlation of observed responses Correlation Cor(y ij,y i j x ij, x i j) of observed responses for students i and i in same school j depends on covariates x ij, x i j For simplicity, assume x ij = x i j = x ICC obs = Ĉor(y ij,y i j x) = p 11(x) p(x) 2 p(x)[1 p(x)] Intraclass correlation of observed responses (cont d) PISA: Null model Full model lo me hi ICC obs ICC lat Among schools and students with covariates x, randomly choose a school and then randomly choose students from the school p(x) is probability that a student is proficient p 11 (x) is probability that two students are both proficient Population averaged or marginal probability p(x) = Pr(yij =1 x) = p(x,ζ j )g(ζ j ;0, ψ) dζ j Average over random effects with Gaussian density g(ζ j ;0, ψ) ICC Random-intercept standard deviation ψ Observed Latent logit[pr(y ij =1 ζ j )] = β 1 +ζ j ζ j N(0,ψ) ICC obs for β 1 =0to β 1 =3.7 ICC obs < ICC lat ICC obs largest for β 1 =0 [Rodríguez & Elo, 2003] [Rodríguez & Elo, 2003] p.13 p.14 Kappa coefficient of observed responses Standard deviation of probabilities Recall ICC obs = Ĉor(y ij,y i j x) = p 11(x) p(x) 2 p(x)[1 p(x)] Probabilities of agreement for students in same school: Pr(y ij = y i j =1 x) p 11 (x) = ICC obs p(x)[1 p(x)] + p(x) 2 Pr(y ij = y i j =0 x) p 00 (x) = ICC obs p(x)[1 p(x)] + [1 p(x)] 2 Probabilities of agreement for students in different schools: Pr(y ij = y i j =1 x) = p(x)2 Pr(y ij = y i j =0 x) = [1 p(x)]2 =1 2p(x)+p(x) 2 =1 2p(x)+p(x) 2 Pr(y ij = y i j x) = 1 2p(x)[1 p(x)] κ = Pr(y ij = y i j x) Pr(y ij = y i j x) 1 Pr(y ij = y i j x) = ICC obs 2p(x)[1 p(x)] = ICC obs 2p(x)[1 p(x)] Already considered median of p(x,ζ j ) Median[ p(x,ζ j )] = p(x, 0) Already considered mean of p(x,ζ j ) p(x) = Pr(yij =1 x) = p(x,ζ j )g(ζ j ;0, ψ) dζ j Get standard deviation of p(x,ζ j ) by integration or by simulation sd[ p(x,ζ j )] = Null model Var[ p(x,ζ j )] Full model lo me hi [Eldridge et al., 2009] p.15 p.16
5 General result Var(responses) }{{} Total variance Variance partition coefficient = Var(School-mean response) + Mean(Within-school variance) }{{}}{{} Between-school variance v 2 Within-school variance v 1 For school j: School-mean response: Mean(y ij x,ζ j )= p(x,ζ j ) Find variance over distribution of ζ j Within-school variance: Var(yij x,ζ j )= p(x,ζ j )[1 p(x,ζ j )] Find mean over distribution of ζ j Total variance: Var(y ij x ij )=v 2 + v 1 = p(x ij )[1 p(x ij )] All results for PISA x mn dv p(x, 0) p(x) sd[ p(x,ζ j )] ICC lat ICC obs VPC Null model Full model lo me hi Variance partition coefficient [Goldstein et al., 2002, Simulation, Method B] VPC = v 2 v 2 + v 1 = ICC obs ICC lat p.17 p.18 Density of probabilities Conditional probabilities, conditional on ζ j, depend on ζ j p(x,ζ j ) Pr(y ij =1 x ij = x,ζ j ) When ζ j varies, corresponding probability p p(x,ζ j ) varies with density f( p) =g(log[ p/(1 p)]; x β,ψ) 1 p(1 p) Percentiles of probabilities A given percentile of p(x,ζ j ), for given x, can be found by plugging in corresponding percentile of ζ j N(0, ψ) Left figure: p(x, ±1.96 ψ) and p(x, 0) versus mn with dv= 0 Randomly sample schools with a given mn: Interval is 95% range of probabilities for students with dv= 0 Density Null model Conditional probability Median Mean Density Full model Conditional probability lo me hi School mean SES (mn) Median 2.5th %tile to 97.5th %tile Deviation of SES from school mean (dv) mn = 39 mn = 51 25th %tile to 75th %tile 25th %tile to 75th %tile [Duchateau & Janssen, 2005] p.19 p.20
6 Probabilities for schools in the data Probabilities for schools in the data (cont d) Probability p(x) of proficiency for a randomly sampled student with covariates x in an existing school j Data on students in school j: y j =(y 1j,...,y njj) and X j = Information about ζ j (Bayes theorem): x 1j. x n jj Posterior(ζ j y j, X j )= g(ζ j;0,ψ)pr(y j X j,ζ j ) Pr(y j X j ) Best prediction of probability (in terms of mean-squared error) is p(x) = Posterior mean[ p(x,ζ j )] = p(x,ζ j )Posterior(ζ j y j, X j ) dζ j All schools, x =(0, mn j ) School mean SES (mn) Median th %tile to 97.5th %tile randomly chosen schools, x = x ij Student SES (mn+dv) Not equal to p(x, ζ j ), where ζ j is posterior mean of ζ j [Skrondal & Rabe-Hesketh, 2009] p.21 p.22 Probabilities for schools in the data (cont d) In previous graphs, p(x) varies between schools for given value of dv because mn j and ζ j vary Isolate variability due to ζ j by setting mn j =45and dv ij =0 (boxplot on right) Three-level ordinal cumulative logit model Tennessee class-size experiment In 4th grade, teachers assessed student participation 2217 students i of 262 teachers j in 75 schools k Ordinal response variable: Teacher s rating of how frequently student pays attention y ijk Rated as: 1 (never), 2, 3 (sometimes), 4, 5 (always) No covariates for simplicity dv=0 mn=45, dv=0 p.23 p.24
7 Three-level ordinal cumulative logit model: OR median Model probability of exceeding a category s =1, 2, 3, 4 logit[pr(y ijk >s ζ (2) jk,ζ(3) k )] = κ s + ζ (2) jk + ζ(3) k Teacher random intercept ζ (2) jk N(0,ψ(2) ), ψ(2) =0.44 School random intercept ζ (3) k N(0,ψ (3) ), ψ(3) =0.12 Odds ratio, comparing students of teachers j and j in school k Pr(y ijk >s ζ (2) jk,ζ(3) k ) Pr(y ij k >s ζ (2) ) =exp(ζ(2) jk j k,ζ(3) k ζ(2) j k ) Median of odds ratio, comparing student with larger odds to other student: OR median Different teacher, same school 1.88 Different teacher, different school 2.04 p.25 Three-level ordinal cumulative logit model: ICC lat Latent response y ijk y ijk = ζ (2) jk Observed response y ijk + ζ(3) k + ɛ ijk, ɛ ijk logistic results from cutting up y ijk at cut-points or thresholds κ 1 to κ 4 : yijk κ 1 κ 2 κ 3 κ 4 Intraclass correlations of latent responses Same school, different teacher: ψ (2) ICC lat (school) = ψ (2) + ψ (3) + π 2 /3 =0.031 Same school, same teacher: ICC lat (teacher, school) = ψ (2) + ψ (3) ψ (2) + ψ (3) + π 2 /3 =0.145 p.26 Three-level ordinal cumulative logit model: ICC obs Probabilities for teachers and schools in the data Two randomly chosen students i, i from same school (either different or same teachers) Pearson correlation, Cor(y ijk,y i j k), Cor(y ijk,y i jk), or Kendall s τ b : ICC lat Cor ICC obs Same school, different teacher Same school, same teacher Method: 5 5 table of response for student i (rows) versus response for student i (columns) π st (school) = Pr(y ijk = s, y i j k = t) π st (teacher, school) = Pr(y ijk = s, y i jk = t) τ b Probability Probability that student pays attention more than sometimes (y >3) Randomly chosen student, for: Teachers in data Schools in data (new teacher) Integrate over posterior of ζ (2) jk, ζ(3) k Integrate over prior of ζ (2) jk and posterior of ζ(3) k Probability 5 15 Use these probabilities as frequency weights for Cor and τ b p.27 p.28
8 Software Stata was used for the results and graphs Datasets and Stata commands available from Models estimated by ML with adaptive quadrature [Rabe-Hesketh et al., 2005] Commands for binary logit models: xtlogit, xtmelogit xtrho [Rodríguez & Elo, 2003] for ICC obs = VPC Command for binary, ordinal, or multinomial logit models (and other response types): gllamm [Rabe-Hesketh et al., 2005] gllapred used to obtain p(x), p(x), p st (x), etc. [Rabe-Hesketh & Skrondal, 2009] Discussion Have discussed measures and graphs for interpreting variability implied by model with estimated parameters Ignored parameter uncertainty (no SEs or CIs) Did not consider variance explained by covariates Extensions Include random coefficient of x ij for p(x), p(x), and ICC obs, integrate over all random effects ICC lat and OR median now depend on x ij More levels: Straightforward Relax normality assumption for random effects: non-parametric maximum likelihood estimation (NPMLE) Other response types Nominal responses: Similar to binary responses Counts: No ICC lat, but can obtain IRR median and ICC obs p.29 p.30 References Duchateau, L. & Janssen, P. (2005). Understanding heterogeneity in generalized linear mixed and frailty models. The American Statistician 59, Eldridge, S. M, et al. (2009). The intra-cluster correlation coefficient in cluster randomized trials: A review of definitions. International Statistical Review 77, Goldstein, H., et al. (2002). Partitioning variation in multilevel models. Understanding Statistics 1, Larsen, K., et al. (2000). Interpreting parameters in the logistic regression model with random effects. Biometrics 56, Rabe-Hesketh, S. & Skrondal, A. (2012). Multilevel and Longitudinal Modeling Using Stata (Third Edition). College Station, TX: Stata Press. Rabe-Hesketh, S. & Skrondal, A. (in prep). Understanding variability in multilevel models for categorical responses. Rabe-Hesketh, S., et al. (2005). Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics 128, Skrondal, A. & Rabe-Hesketh, S. (2009). Prediction in multilevel generalized linear models. Journal of the Royal Statistical Society, Series A 172, Rodríguez, G. & Elo, I. (2003). Intra-class correlation in random-effects models for binary data. The Stata Journal 3, p.31
Prediction for Multilevel Models
Prediction for Multilevel Models Sophia Rabe-Hesketh Graduate School of Education & Graduate Group in Biostatistics University of California, Berkeley Institute of Education, University of London Joint
More informationMultilevel Modeling of Complex Survey Data
Multilevel Modeling of Complex Survey Data Sophia Rabe-Hesketh, University of California, Berkeley and Institute of Education, University of London Joint work with Anders Skrondal, London School of Economics
More informationModels for Longitudinal and Clustered Data
Models for Longitudinal and Clustered Data Germán Rodríguez December 9, 2008, revised December 6, 2012 1 Introduction The most important assumption we have made in this course is that the observations
More information10 Dichotomous or binary responses
10 Dichotomous or binary responses 10.1 Introduction Dichotomous or binary responses are widespread. Examples include being dead or alive, agreeing or disagreeing with a statement, and succeeding or failing
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationInterpretation of Somers D under four simple models
Interpretation of Somers D under four simple models Roger B. Newson 03 September, 04 Introduction Somers D is an ordinal measure of association introduced by Somers (96)[9]. It can be defined in terms
More informationEfficient and Practical Econometric Methods for the SLID, NLSCY, NPHS
Efficient and Practical Econometric Methods for the SLID, NLSCY, NPHS Philip Merrigan ESG-UQAM, CIRPÉE Using Big Data to Study Development and Social Change, Concordia University, November 2103 Intro Longitudinal
More informationCHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES
Examples: Monte Carlo Simulation Studies CHAPTER 12 EXAMPLES: MONTE CARLO SIMULATION STUDIES Monte Carlo simulation studies are often used for methodological investigations of the performance of statistical
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Beckman HLM Reading Group: Questions, Answers and Examples Carolyn J. Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Linear Algebra Slide 1 of
More informationAuxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
More informationPower and sample size in multilevel modeling
Snijders, Tom A.B. Power and Sample Size in Multilevel Linear Models. In: B.S. Everitt and D.C. Howell (eds.), Encyclopedia of Statistics in Behavioral Science. Volume 3, 1570 1573. Chicester (etc.): Wiley,
More informationIntroduction to Longitudinal Data Analysis
Introduction to Longitudinal Data Analysis Longitudinal Data Analysis Workshop Section 1 University of Georgia: Institute for Interdisciplinary Research in Education and Human Development Section 1: Introduction
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationLinda K. Muthén Bengt Muthén. Copyright 2008 Muthén & Muthén www.statmodel.com. Table Of Contents
Mplus Short Courses Topic 2 Regression Analysis, Eploratory Factor Analysis, Confirmatory Factor Analysis, And Structural Equation Modeling For Categorical, Censored, And Count Outcomes Linda K. Muthén
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationStatistical Models in R
Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova
More informationMultinomial and Ordinal Logistic Regression
Multinomial and Ordinal Logistic Regression ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,
More informationMS 2007-0081-RR Booil Jo. Supplemental Materials (to be posted on the Web)
MS 2007-0081-RR Booil Jo Supplemental Materials (to be posted on the Web) Table 2 Mplus Input title: Table 2 Monte Carlo simulation using externally generated data. One level CACE analysis based on eqs.
More informationHandling attrition and non-response in longitudinal data
Longitudinal and Life Course Studies 2009 Volume 1 Issue 1 Pp 63-72 Handling attrition and non-response in longitudinal data Harvey Goldstein University of Bristol Correspondence. Professor H. Goldstein
More informationSAS Software to Fit the Generalized Linear Model
SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling
More informationProblem of Missing Data
VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;
More informationMISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
More informationIt is important to bear in mind that one of the first three subscripts is redundant since k = i -j +3.
IDENTIFICATION AND ESTIMATION OF AGE, PERIOD AND COHORT EFFECTS IN THE ANALYSIS OF DISCRETE ARCHIVAL DATA Stephen E. Fienberg, University of Minnesota William M. Mason, University of Michigan 1. INTRODUCTION
More informationMultilevel modelling of complex survey data
J. R. Statist. Soc. A (2006) 169, Part 4, pp. 805 827 Multilevel modelling of complex survey data Sophia Rabe-Hesketh University of California, Berkeley, USA, and Institute of Education, London, UK and
More informationAnalysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk
Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:
More informationSTATISTICA Formula Guide: Logistic Regression. Table of Contents
: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary
More informationLogistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationCurriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different
More informationPOS 6933 Section 016D Spring 2015 Multilevel Models Department of Political Science, University of Florida
POS 6933 Section 016D Spring 2015 Multilevel Models Department of Political Science, University of Florida Thursday: Periods 2-3; Room: MAT 251 POLS Computer Lab: Day, Time: TBA Instructor: Prof. Badredine
More informationCHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA
Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or
More informationLinear Discrimination. Linear Discrimination. Linear Discrimination. Linearly Separable Systems Pairwise Separation. Steven J Zeil.
Steven J Zeil Old Dominion Univ. Fall 200 Discriminant-Based Classification Linearly Separable Systems Pairwise Separation 2 Posteriors 3 Logistic Discrimination 2 Discriminant-Based Classification Likelihood-based:
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationSTA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance
Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis
More informationLecture 5 Three level variance component models
Lecture 5 Three level variance component models Three levels models In three levels models the clusters themselves are nested in superclusters, forming a hierarchical structure. For example, we might have
More information10. Analysis of Longitudinal Studies Repeat-measures analysis
Research Methods II 99 10. Analysis of Longitudinal Studies Repeat-measures analysis This chapter builds on the concepts and methods described in Chapters 7 and 8 of Mother and Child Health: Research methods.
More informationHLM software has been one of the leading statistical packages for hierarchical
Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
More informationIntroduction to Multilevel Modeling Using HLM 6. By ATS Statistical Consulting Group
Introduction to Multilevel Modeling Using HLM 6 By ATS Statistical Consulting Group Multilevel data structure Students nested within schools Children nested within families Respondents nested within interviewers
More informationFrom the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
More informationScatter Plots with Error Bars
Chapter 165 Scatter Plots with Error Bars Introduction The procedure extends the capability of the basic scatter plot by allowing you to plot the variability in Y and X corresponding to each point. Each
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationBayesX - Software for Bayesian Inference in Structured Additive Regression
BayesX - Software for Bayesian Inference in Structured Additive Regression Thomas Kneib Faculty of Mathematics and Economics, University of Ulm Department of Statistics, Ludwig-Maximilians-University Munich
More informationGeneralized Linear Models
Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the
More informationIntroduction to Quantitative Methods
Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................
More informationAdditional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm
Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm
More informationCHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA
Examples: Mixture Modeling With Longitudinal Data CHAPTER 8 EXAMPLES: MIXTURE MODELING WITH LONGITUDINAL DATA Mixture modeling refers to modeling with categorical latent variables that represent subpopulations
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationData analysis process
Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis
More informationStatistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013
Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives
More informationGeostatistics Exploratory Analysis
Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt
More informationLAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE
LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 119 STATISTICS AND ELEMENTARY ALGEBRA 5 Lecture Hours, 2 Lab Hours, 3 Credits Pre-
More informationMultiple Choice Models II
Multiple Choice Models II Laura Magazzini University of Verona laura.magazzini@univr.it http://dse.univr.it/magazzini Laura Magazzini (@univr.it) Multiple Choice Models II 1 / 28 Categorical data Categorical
More informationThe Basic Two-Level Regression Model
2 The Basic Two-Level Regression Model The multilevel regression model has become known in the research literature under a variety of names, such as random coefficient model (de Leeuw & Kreft, 1986; Longford,
More informationGoodness of fit assessment of item response theory models
Goodness of fit assessment of item response theory models Alberto Maydeu Olivares University of Barcelona Madrid November 1, 014 Outline Introduction Overall goodness of fit testing Two examples Assessing
More informationSPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg
SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationSome Essential Statistics The Lure of Statistics
Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived
More informationNominal and ordinal logistic regression
Nominal and ordinal logistic regression April 26 Nominal and ordinal logistic regression Our goal for today is to briefly go over ways to extend the logistic regression model to the case where the outcome
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationBayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
More informationLogistic Regression (a type of Generalized Linear Model)
Logistic Regression (a type of Generalized Linear Model) 1/36 Today Review of GLMs Logistic Regression 2/36 How do we find patterns in data? We begin with a model of how the world works We use our knowledge
More informationImputing Missing Data using SAS
ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are
More informationA Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under Two-Level Models
A Composite Likelihood Approach to Analysis of Survey Data with Sampling Weights Incorporated under Two-Level Models Grace Y. Yi 13, JNK Rao 2 and Haocheng Li 1 1. University of Waterloo, Waterloo, Canada
More informationDATA INTERPRETATION AND STATISTICS
PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationLecture 2: Descriptive Statistics and Exploratory Data Analysis
Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals
More informationOverview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS
Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS About Omega Statistics Private practice consultancy based in Southern California, Medical and Clinical
More informationLogit and Probit. Brad Jones 1. April 21, 2009. University of California, Davis. Bradford S. Jones, UC-Davis, Dept. of Political Science
Logit and Probit Brad 1 1 Department of Political Science University of California, Davis April 21, 2009 Logit, redux Logit resolves the functional form problem (in terms of the response function in the
More informationUNIVERSITY OF NAIROBI
UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER
More informationThe Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities
The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities Elizabeth Garrett-Mayer, PhD Assistant Professor Sidney Kimmel Comprehensive Cancer Center Johns Hopkins University 1
More information11. Analysis of Case-control Studies Logistic Regression
Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More informationHandling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
More informationBasic Statistical and Modeling Procedures Using SAS
Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom
More informationStudents' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)
Cairo University Faculty of Economics and Political Science Statistics Department English Section Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Prepared
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationStatistics 104: Section 6!
Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationFactors affecting online sales
Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4
More informationIntroduction to Data Analysis in Hierarchical Linear Models
Introduction to Data Analysis in Hierarchical Linear Models April 20, 2007 Noah Shamosh & Frank Farach Social Sciences StatLab Yale University Scope & Prerequisites Strong applied emphasis Focus on HLM
More informationImplications of Big Data for Statistics Instruction 17 Nov 2013
Implications of Big Data for Statistics Instruction 17 Nov 2013 Implications of Big Data for Statistics Instruction Mark L. Berenson Montclair State University MSMESB Mini Conference DSI Baltimore November
More informationStudy Guide for the Final Exam
Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make
More informationStandard errors of marginal effects in the heteroskedastic probit model
Standard errors of marginal effects in the heteroskedastic probit model Thomas Cornelißen Discussion Paper No. 320 August 2005 ISSN: 0949 9962 Abstract In non-linear regression models, such as the heteroskedastic
More informationgllamm companion for Contents
gllamm companion for Rabe-Hesketh, S. and Skrondal, A. (2012). Multilevel and Longitudinal Modeling Using Stata (3rd Edition). Volume I: Continuous Responses. College Station, TX: Stata Press. Contents
More informationChapter 5 Analysis of variance SPSS Analysis of variance
Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,
More informationCollege Readiness LINKING STUDY
College Readiness LINKING STUDY A Study of the Alignment of the RIT Scales of NWEA s MAP Assessments with the College Readiness Benchmarks of EXPLORE, PLAN, and ACT December 2011 (updated January 17, 2012)
More information430 Statistics and Financial Mathematics for Business
Prescription: 430 Statistics and Financial Mathematics for Business Elective prescription Level 4 Credit 20 Version 2 Aim Students will be able to summarise, analyse, interpret and present data, make predictions
More informationWeb-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni
1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed
More informationSTAT 360 Probability and Statistics. Fall 2012
STAT 360 Probability and Statistics Fall 2012 1) General information: Crosslisted course offered as STAT 360, MATH 360 Semester: Fall 2012, Aug 20--Dec 07 Course name: Probability and Statistics Number
More informationQualitative vs Quantitative research & Multilevel methods
Qualitative vs Quantitative research & Multilevel methods How to include context in your research April 2005 Marjolein Deunk Content What is qualitative analysis and how does it differ from quantitative
More informationThe Statistics Tutor s Quick Guide to
statstutor community project encouraging academics to share statistics support resources All stcp resources are released under a Creative Commons licence The Statistics Tutor s Quick Guide to Stcp-marshallowen-7
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationEstablishing and Closing Down Plants - Assessing the Effects of Firms Financial Status *
Establishing and Closing Down Plants - Assessing the Effects of Firms Financial Status * Laura Vartia ** European University Institute First version: September 2004 This version: March 2005 Abstract This
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationThe Probit Link Function in Generalized Linear Models for Data Mining Applications
Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationAN ILLUSTRATION OF MULTILEVEL MODELS FOR ORDINAL RESPONSE DATA
AN ILLUSTRATION OF MULTILEVEL MODELS FOR ORDINAL RESPONSE DATA Ann A. The Ohio State University, United States of America aoconnell@ehe.osu.edu Variables measured on an ordinal scale may be meaningful
More information