Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne
1 Applied Statistics. J. Blanchet and J. Wadsworth, Institute of Mathematics, Analysis, and Applications, EPF Lausanne. An MSc Course for Applied Mathematicians, Fall 2012
2 Outline
1. Model Comparison
2. Model Diagnostics in Proportional Hazards
3 Part I Model Comparison
4 Comparing Survival Curves: Two groups
Suppose that interest lies in whether two groups have different survival curves. Plotting the curves gives an impression of whether they are the same. What can we say more formally?
5 Comparing Survival Curves: Log-rank test
The log-rank test allows us to compare survival curves for two groups. It works by comparing observed failures with expected failures under the null hypothesis of no difference between groups.
6 Comparing Survival Curves: Log-rank test
Recall the notation for ordered times $0 \le t_{(1)} \le \dots \le t_{(n)}$, with $n_{(i)}$ the number at risk just prior to time $t_{(i)}$ and $d_{(i)}$ the number of failures at time $t_{(i)}$. Let $k = 1, 2$ index the two groups and define
$n_{k,(i)}$: the number at risk just prior to time $t_{(i)}$ in group $k$, so that $n_{(i)} = n_{1,(i)} + n_{2,(i)}$;
$d_{k,(i)}$: the number of failures at time $t_{(i)}$ in group $k$, so that $d_{(i)} = d_{1,(i)} + d_{2,(i)}$;
$E_k = \sum_i d_{(i)} \, n_{k,(i)} / n_{(i)}$ and $O_k = \sum_i d_{k,(i)}$: the expected and observed numbers of failures in group $k$.
7 Comparing Survival Curves: Log-rank test
Under the null hypothesis of no difference between groups, the statistic
$X^2 = (O_1 - E_1)^2 / V \sim \chi^2_1$
approximately as the sample size becomes large, where $V = \mathrm{var}(O_1 - E_1)$:
$V = \sum_i \dfrac{d_{(i)} \, n_{1,(i)} \, n_{2,(i)} \, (n_{(i)} - d_{(i)})}{n_{(i)}^2 \, (n_{(i)} - 1)}$
This comes from the hypergeometric distribution. (NB $O_1 - E_1 = -(O_2 - E_2)$, so either group can be used in the definition.)
8 Comparing Survival Curves: Log-rank test
An alternative definition of the log-rank test statistic is
$\sum_{k=1}^{2} \dfrac{(E_k - O_k)^2}{E_k}$
which is also asymptotically $\chi^2_1$ under $H_0$. Either version can be used, but the second definition may be a bit more conservative (it rejects the null less often).
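The two definitions above can be checked numerically. The following is a minimal sketch in base R on a small hypothetical data set (the vectors `time`, `status` and `group` are made up for illustration and are not the Recidivism data); the `survival` package is used only to cross-check the hand computation.

```r
library(survival)  # used only for the cross-check at the end

# Hypothetical toy data: time, event indicator (1 = failure, 0 = censored), group
time   <- c(6, 7, 10, 15, 19, 25, 3, 5, 8, 12, 14, 20)
status <- c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1)
group  <- rep(c(1, 2), each = 6)

O <- c(0, 0); E <- c(0, 0); V <- 0
for (t in sort(unique(time[status == 1]))) {      # ordered failure times t_(i)
  n  <- sum(time >= t)                            # n_(i): at risk just prior to t
  nk <- c(sum(time >= t & group == 1),
          sum(time >= t & group == 2))            # n_{k,(i)}
  d  <- sum(time == t & status == 1)              # d_(i)
  dk <- c(sum(time == t & status == 1 & group == 1),
          sum(time == t & status == 1 & group == 2))  # d_{k,(i)}
  O <- O + dk
  E <- E + d * nk / n
  # hypergeometric variance term; it is zero when only one subject is at risk
  if (n > 1) V <- V + d * nk[1] * nk[2] * (n - d) / (n^2 * (n - 1))
}

X2_first  <- (O[1] - E[1])^2 / V                  # first definition
X2_second <- sum((E - O)^2 / E)                   # second definition
p_first   <- pchisq(X2_first, df = 1, lower.tail = FALSE)

fit <- survdiff(Surv(time, status) ~ group)       # cross-check
c(by_hand = X2_first, survdiff = fit$chisq, second = X2_second)
```

With the default `rho = 0`, `survdiff` should reproduce the first definition; the second can be read off by summing its `(O-E)^2/E` column.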
9 Log-rank test in R: survdiff{survival}
Recall the Recidivism data; assume that fin (indicator of financial aid) is the only covariate.
> survdiff(Surv(week, arrest) ~ fin, data=Rossi)
Call: survdiff(formula = Surv(week, arrest) ~ fin, data = Rossi)
N Observed Expected (O-E)^2/E (O-E)^2/V
fin=
fin=
Chisq= 3.8 on 1 degrees of freedom, p=
[numeric entries not preserved]
The final column and the final line relate to the first test definition. The penultimate column gives the statistics for the second test definition.
> [1] 3.82
> 1-pchisq(3.82,df=1)
[1]
The difference between groups is marginally non-significant at the 5% level.
10 Comparing Survival Curves: More than two groups
The log-rank test can also be extended to compare more than two groups. If there are $G$ groups then, asymptotically,
$(O - E)^T V^{-1} (O - E) \sim \chi^2_{G-1}$
where $O - E = (O_1 - E_1, \dots, O_{G-1} - E_{G-1})$ and $V$ is its variance-covariance matrix. The formula
$\sum_{k=1}^{G} \dfrac{(E_k - O_k)^2}{E_k}$
is also approximately $\chi^2_{G-1}$ distributed.
11 Comparing Survival Curves
Care needs to be taken when comparing groups with the log-rank test, as the distribution of other covariates may not be the same within the two groups. E.g. what if all the financial aid went to those over 40 years of age? This could lead us to infer a difference between groups that is actually due to the effects of these other covariates. More generally, we usually have a more complicated model than two groups, and so we want to know how to compare models in the presence of multiple covariates.
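12 Comparing Survival Curves: Parametric models
Let $T$ denote the survival time and $z$ a vector of covariates. Suppose that we model $T \mid z \sim$ Weibull, so that
$f(t \mid z) = a \lambda^a t^{a-1} e^{z\beta} \exp\{-(\lambda t)^a e^{z\beta}\}, \quad t \ge 0.$
The log-likelihood for $X_i = \min\{T_i, C_i\}$, $\delta_i = I(T_i < C_i)$ is
$\ell(\beta, \lambda, a) = \sum_i \delta_i \log f(x_i \mid z_i, \theta) + \sum_i (1 - \delta_i) \log S(x_i \mid z_i, \theta),$
where $\theta = (\beta, \lambda, a)$ and $S$ is the survival function. If we want to test any hypotheses about $(\beta, \lambda, a)$, then we can use likelihood ratio tests.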
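As a sketch of how this censored log-likelihood can be evaluated, the function below codes the parameterization written above directly in base R. The function name `weibull_loglik` and the small data vectors are hypothetical. It uses the fact that $\delta \log f + (1 - \delta) \log S = \delta \log h - H$, where $h$ is the hazard and $H = (\lambda t)^a e^{z\beta}$ is the cumulative hazard.

```r
# Censored Weibull log-likelihood in the parameterization of the slide:
# hazard h(t|z) = a * lambda^a * t^(a-1) * exp(z beta),  S(t|z) = exp(-(lambda t)^a * exp(z beta))
weibull_loglik <- function(beta, lambda, a, x, delta, z) {
  eta   <- as.vector(z %*% beta)                               # linear predictor z_i beta
  log_h <- log(a) + a * log(lambda) + (a - 1) * log(x) + eta   # log hazard at x_i
  H     <- (lambda * x)^a * exp(eta)                           # cumulative hazard, log S = -H
  sum(delta * log_h - H)                                       # delta*log f + (1 - delta)*log S
}

# Hypothetical data: observed times, event indicators, one binary covariate
x     <- c(2.1, 0.7, 3.5, 1.2, 4.0)
delta <- c(1, 1, 0, 1, 0)
z     <- cbind(c(0, 1, 0, 1, 1))

weibull_loglik(beta = 0.5, lambda = 0.3, a = 1.4, x = x, delta = delta, z = z)
```

With $\beta = 0$ this reduces to a Weibull distribution with shape $a$ and scale $1/\lambda$, which gives a simple check against `dweibull` and `pweibull`. Note that `survreg` uses a different (accelerated failure time) parameterization, so its coefficients are not directly comparable with $\beta$ here.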
13 Likelihood ratio: Parametric models
> wei <- survreg(Surv(week,arrest) ~ fin + age + race + wexp + mar + paro + prio, data=Rossi)
> summary(wei)
Call: survreg(formula = Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio, data = Rossi)
Value Std. Error z p
(Intercept) e-21
fin e-02
age e-02
race e-01
wexp e-01
mar e-01
paro e-01
prio e-03
Log(scale) e-04
[numeric entries not preserved apart from the trailing exponents]
Scale= 0.712
14 Likelihood ratio: Parametric models
paro does not look significant; fit the model without it:
> wei2 <- survreg(Surv(week,arrest) ~ fin + age + race + wexp + mar + prio, data=Rossi)
> 2*(wei$loglik[2] - wei2$loglik[2])
[1]
This likelihood ratio test has 1 df, so the critical value (at the 5% level) is 3.84. We would not reject the hypothesis that paro has no effect.
15 Likelihood ratio: Parametric models
Recall that the exponential model is a special case of the Weibull model. Since it is nested, we can also do a likelihood ratio test for this.
> wei <- survreg(Surv(week,arrest) ~ fin + age + race + wexp + mar + paro + prio, data=Rossi)
> expn <- survreg(Surv(week,arrest) ~ fin + age + race + wexp + mar + paro + prio, dist="exponential", data=Rossi)
> 2*(wei$loglik[2]-expn$loglik[2])
[1]
This is highly significant at the 5% level on 1 df.
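The same likelihood ratio recipe can be sketched end to end, including the critical value and p-value. Because the Rossi data are not part of the `survival` package, this sketch uses the `lung` data set that ships with `survival` as a stand-in; the choice of covariates (`age`, `sex`) is arbitrary and only for illustration.

```r
library(survival)

# Weibull fit (the survreg default) and the nested exponential fit
wei  <- survreg(Surv(time, status) ~ age + sex, data = lung)
expn <- survreg(Surv(time, status) ~ age + sex, data = lung, dist = "exponential")

# loglik[2] is the maximized log-likelihood of the fitted model
lr   <- 2 * (wei$loglik[2] - expn$loglik[2])      # likelihood ratio statistic
crit <- qchisq(0.95, df = 1)                      # 5% critical value on 1 df
pval <- pchisq(lr, df = 1, lower.tail = FALSE)    # asymptotic p-value

c(statistic = lr, critical = crit, p = pval)
```

The exponential model fixes one parameter of the Weibull model (the scale reported by `survreg` equal to 1), which is why the reference distribution has 1 degree of freedom.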
16 Likelihood ratio: Cox PH model
Likelihood ratio tests are also applicable to the Cox Proportional Hazards partial likelihood. The asymptotic distributions are the same, i.e. the likelihood ratio statistic for a test with $q$ constraints asymptotically follows a $\chi^2_q$ distribution.
## test for significance of paro under the Cox PH model
> mod0 <- coxph(Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio, data=Rossi)
> mod1 <- coxph(Surv(week, arrest) ~ fin + age + race + wexp + mar + prio, data=Rossi)
> 2*(mod0$loglik[2]-mod1$loglik[2])
[1]
Not significant at the 5% level on 1 df.
17 Cox PH model: Other test statistics
The output for a fitted Cox PH model gives three test statistics which compare the fitted model to a null model. These are the likelihood ratio, Wald, and score tests, all asymptotically $\chi^2$ under $H_0$.
> mod5 <- coxph(Surv(week, arrest) ~ age + prio, data=Rossi)
> summary(mod5)
Call: coxph(formula = Surv(week, arrest) ~ age + prio, data = Rossi)
n= 432, number of events= 114
coef exp(coef) se(coef) z Pr(>|z|)
age ***
prio ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
age
prio
Rsquare= (max possible= )
Likelihood ratio test= on 2 df, p=2.657e-06
Wald test = on 2 df, p=4.766e-06
Score (logrank) test = on 2 df, p=2.723e-06
[coefficient estimates and test statistics not preserved]
The score test coincides with the log-rank test when the only covariate is a two-group indicator.
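The link between the score test and the log-rank test can be illustrated on a small example. This is a sketch on hypothetical toy data with no tied failure times (with ties, the default Efron handling in `coxph` can make the two statistics differ slightly). The components `logtest`, `waldtest` and `sctest` of the `summary.coxph` object hold the three tests.

```r
library(survival)

# Hypothetical toy data with distinct times and a two-group indicator
time   <- c(6, 7, 10, 15, 19, 25, 3, 5, 8, 12, 14, 20)
status <- c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1)
group  <- rep(c(1, 2), each = 6)

cox <- coxph(Surv(time, status) ~ group)
s   <- summary(cox)

# Each component is a named vector: test statistic, df, p-value
rbind(LR = s$logtest, Wald = s$waldtest, Score = s$sctest)

# Score test from the Cox fit versus the log-rank statistic from survdiff
lr <- survdiff(Surv(time, status) ~ group)
c(score = unname(s$sctest["test"]), logrank = lr$chisq)
```

The three statistics are asymptotically equivalent but generally differ in finite samples, which is why the output reports all three.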
18 Part II Model Diagnostics in Proportional Hazards
19 Checking Proportional Hazards
The Schoenfeld residual for the $i$th subject on the $k$th covariate is
$\hat r_{ik} = \delta_i (z_{ik} - \hat z_{x_i k}),$
where $\hat z_{x_i k}$ is given by
$\hat z_{x_i k} = \dfrac{\sum_{j \in R(x_i)} z_{jk} \, e^{z_j \hat\beta}}{\sum_{j \in R(x_i)} e^{z_j \hat\beta}}$
and $R(x_i)$ is the risk set at time $x_i$.
Scaled Schoenfeld residuals:
$\hat r^{*}_{ik} = \dfrac{\hat r_{ik} - \mathrm{mean}(\hat r_{ik}, \, i = 1, \dots, n)}{\mathrm{sd}(\hat r_{ik}, \, i = 1, \dots, n)}$
For the PH model, the (scaled) residuals should exhibit a random (i.e. unsystematic) pattern at each failure time. Otherwise it suggests that the covariate effect is changing as time passes.
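To make the definition concrete, the sketch below computes the (unscaled) Schoenfeld residuals by hand for a single covariate and compares them with `residuals(fit, type = "schoenfeld")`. The data are hypothetical, with distinct times so that tie handling plays no role.

```r
library(survival)

# Hypothetical toy data: one continuous covariate z, no tied times
time   <- c(6, 7, 10, 15, 19, 25, 3, 5, 8, 12, 14, 20)
status <- c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1)
z      <- c(0.5, 1.2, -0.3, 0.8, -1.1, 0.1, 1.5, -0.6, 0.3, -0.9, 0.7, -0.2)

fit <- coxph(Surv(time, status) ~ z)
b   <- coef(fit)

# Indices of the subjects who fail, in increasing order of failure time
ord <- order(time)
ev  <- ord[status[ord] == 1]

by_hand <- sapply(ev, function(i) {
  R <- time >= time[i]                                     # risk set R(x_i)
  z[i] - sum(z[R] * exp(z[R] * b)) / sum(exp(z[R] * b))    # z_i minus weighted mean over R(x_i)
})

pkg <- residuals(fit, type = "schoenfeld")                 # one value per failure, in time order
cbind(by_hand, pkg = as.vector(pkg))
```

Only subjects with $\delta_i = 1$ contribute a residual, so the result has one entry per observed failure rather than one per subject.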
20 Checking Proportional Hazards: cox.zph{survival}
mod2 <- coxph(Surv(week, arrest) ~ fin + age + prio, data=Rossi)
cox.zph(mod2)
rho chisq p
fin
age
prio
GLOBAL NA
[numeric entries not preserved]
The function tests proportionality of all the predictors by looking at their interactions with time. The column rho is the Pearson correlation between the scaled Schoenfeld residuals and time for each covariate. The last row contains the global test for all the interactions tested at once. A p-value less than 0.05 is evidence of a violation of the proportionality assumption.
21 Checking Proportional Hazards: plot, cox.zph{survival}
Graphs of the scaled Schoenfeld residuals against time:
par(mfrow=c(2,2))
plot(cox.zph(mod2))
[Figure: three panels showing Beta(t) for fin, Beta(t) for age and Beta(t) for prio, each plotted against Time]
The curve is a smoothing spline with ±2 standard-error envelopes around the fit. Systematic departures from a horizontal line are indicative of non-proportional hazards. Here, there appears to be a trend in the plot for age, with the age effect declining with time.
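A self-contained version of this check can be sketched with the `veteran` data set that ships with `survival` (used here as a stand-in, since the Rossi data are not part of that package; the covariates are chosen only for illustration).

```r
library(survival)

# Cox model on a data set bundled with survival
fit <- coxph(Surv(time, status) ~ karno + age + trt, data = veteran)

zp <- cox.zph(fit)   # one test per covariate plus a GLOBAL row
print(zp)

# The results are stored as a matrix; the column "p" holds the p-values
pvals <- zp$table[, "p"]
names(pvals)[pvals < 0.05]   # terms for which proportionality is in doubt
```

Calling `plot(zp)` then draws one smoothed residual panel per covariate, as on the slide. The columns of `zp$table` depend on the installed version of `survival`: recent versions may report `chisq`, `df` and `p` rather than `rho`.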
22 Checking linearity
We assume in PH models that
$\lambda(t; z) = \lambda_0(t) e^{z\beta}.$
This means that $\log \lambda(t; z)$ is linear in the covariates $z$. Is this true?
The martingale residual for subject $i$ is
$\hat M_i = \delta_i - e^{z_i \hat\beta} \int_0^{x_i} \hat\lambda_0(u) \, du$
For each $k$, the plot of $\hat M_i$ against $z_{ik}$, $i = 1, \dots, n$, should exhibit a random pattern with mean 0.
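The martingale residual formula can be checked by hand. In this sketch the integral of the baseline hazard is taken from `basehaz(fit, centered = FALSE)`, which returns the estimated cumulative baseline hazard at $z = 0$; the data are hypothetical and have no tied times.

```r
library(survival)

# Hypothetical toy data: one continuous covariate z, no tied times
time   <- c(6, 7, 10, 15, 19, 25, 3, 5, 8, 12, 14, 20)
status <- c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1)
z      <- c(0.5, 1.2, -0.3, 0.8, -1.1, 0.1, 1.5, -0.6, 0.3, -0.9, 0.7, -0.2)

fit <- coxph(Surv(time, status) ~ z)

# Estimated cumulative baseline hazard as a right-continuous step function
bh <- basehaz(fit, centered = FALSE)
H0 <- stepfun(bh$time, c(0, bh$hazard))

# M_i = delta_i - exp(z_i * beta_hat) * integral of the baseline hazard up to x_i
by_hand <- status - exp(z * coef(fit)) * H0(time)
pkg     <- residuals(fit, type = "martingale")

cbind(by_hand, pkg = as.vector(pkg))
sum(by_hand)   # numerically zero
```

Each residual is at most 1 and can be arbitrarily negative, so the plots against covariates are usually read through a smoother rather than point by point.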
23 Checking linearity: residuals.coxph{survival}
res <- residuals(mod2, type="martingale")
X <- as.matrix(Rossi[,c("age", "prio")]) # matrix of covariates
for (j in 1:2) { # residual plots
  plot(X[,j], res, xlab=c("age", "prio")[j], ylab="residuals")
  abline(h=0, lty=2)
  lines(lowess(X[,j], res, iter=0))
}
[Figure: martingale residuals plotted against age and against prio, with lowess smooths]
Nonlinearity is slight here.
24 Influential observations
We want to check the influence of each observation on the estimate $\hat\beta$. Let $\hat\beta_{-i}$ denote the estimated vector of coefficients computed on the sample with the $i$th subject deleted. The idea is to check, for each $i$, which components of the vector $\hat\beta - \hat\beta_{-i}$ have large absolute values. This involves fitting $n + 1$ Cox regression models, which can be computationally expensive.
There is an approximation based on the fit obtained from the whole data: $\hat\beta - \hat\beta_{-i}$ can be approximated by
$\mathrm{dfbeta}_i = I(\hat\beta)^{-1} (\tilde r_{i1}, \dots, \tilde r_{iK})^T,$
where $I(\hat\beta)$ is the observed Fisher information matrix and, for $k = 1, \dots, K$, $\tilde r_{ik}$ is a function of $\hat\beta$ and of the Schoenfeld residual $\hat r_{ik}$. Plots of these quantities against $i$ are used to gauge the influence of the $i$th subject on the coefficient of the $k$th covariate.
25
dfbeta <- residuals(mod2, type="dfbeta")
for (j in 1:3) {
  plot(dfbeta[,j], ylab=names(coef(mod2))[j])
  abline(h=0, lty=2)
}
[Figure: index plots of the dfbeta values for fin, age and prio]
Comparing the magnitudes of the largest dfbeta values to those of the regression coefficients (approximately 0.35, 0.07 and 0.1 in magnitude) suggests that none of the observations is terribly influential individually.
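As a rough illustration of how the approximation relates to an exact leave-one-out refit, the sketch below compares the dfbeta row of one subject with the actual change in the coefficients when that subject is deleted. It uses the `veteran` data bundled with `survival` as a stand-in for the Rossi data, and the model is chosen only for illustration.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ karno + age, data = veteran)

# n x p matrix: row i approximates beta_hat - beta_hat_{-i}
dfb <- residuals(fit, type = "dfbeta")

# Subject with the largest approximate influence on the first coefficient
i <- which.max(abs(dfb[, 1]))

# Exact change: refit the same model without subject i
fit_minus_i <- coxph(Surv(time, status) ~ karno + age, data = veteran[-i, ])
exact <- coef(fit) - coef(fit_minus_i)

rbind(approx = dfb[i, ], exact = exact)
```

The approximation needs a single fit, whereas the exact version would need one refit per subject; the two rows are expected to be of similar size but not identical.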
Statistics 104 Final Project A Culture of Debt: A Study of Credit Card Spending in America TF: Kevin Rader Anonymous Students: LD, MH, IW, MY ABSTRACT: This project attempted to determine the relationship
More informationOrdinal Regression. Chapter
Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe
More informationUsing Excel for inferential statistics
FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied
More informationLecture 18: Logistic Regression Continued
Lecture 18: Logistic Regression Continued Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina
More informationOutline. Dispersion Bush lupine survival Quasi-Binomial family
Outline 1 Three-way interactions 2 Overdispersion in logistic regression Dispersion Bush lupine survival Quasi-Binomial family 3 Simulation for inference Why simulations Testing model fit: simulating the
More informationKSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationMultivariate Logistic Regression
1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation
More informationUNDERSTANDING THE INDEPENDENT-SAMPLES t TEST
UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly
More informationSection 13, Part 1 ANOVA. Analysis Of Variance
Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability
More informationAugust 2012 EXAMINATIONS Solution Part I
August 01 EXAMINATIONS Solution Part I (1) In a random sample of 600 eligible voters, the probability that less than 38% will be in favour of this policy is closest to (B) () In a large random sample,
More informationData Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
More informationPart 2: Analysis of Relationship Between Two Variables
Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable
More informationWeek 5: Multiple Linear Regression
BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School
More informationLogistic regression (with R)
Logistic regression (with R) Christopher Manning 4 November 2007 1 Theory We can transform the output of a linear regression to be suitable for probabilities by using a logit link function on the lhs as
More information