Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne


1 Applied Statistics
J. Blanchet and J. Wadsworth
Institute of Mathematics, Analysis, and Applications, EPF Lausanne
An MSc Course for Applied Mathematicians, Fall 2012
2 Outline
1. Model Comparison
2. Model Diagnostics in Proportional Hazards
3 Part I: Model Comparison
4 Comparing Survival Curves: Two groups
Suppose that interest lies in whether two groups have different survival curves.
Plotting the curves gives an impression of whether they are the same.
What can we say more formally?
5 Comparing Survival Curves: Log rank test
The log rank test allows us to compare survival curves for two groups.
It works by comparing the observed failures with the failures expected under the null hypothesis of no difference between groups.
6 Comparing Survival Curves: Log rank test
Recall the notation for ordered times $0 \le t_{(1)} \le \dots \le t_{(n)}$, with $n_{(i)}$ the number at risk just prior to time $t_{(i)}$ and $d_{(i)}$ the number of failures at time $t_{(i)}$.
Let $k = 1, 2$ index the two groups and define
$n_{k,(i)}$, the number at risk just prior to time $t_{(i)}$ in group $k$, so that $n_{(i)} = n_{1,(i)} + n_{2,(i)}$;
$d_{k,(i)}$, the number of failures at time $t_{(i)}$ in group $k$, so that $d_{(i)} = d_{1,(i)} + d_{2,(i)}$;
$$E_k = \sum_i d_{(i)} \frac{n_{k,(i)}}{n_{(i)}}, \qquad O_k = \sum_i d_{k,(i)},$$
the expected and observed numbers of failures in group $k$.
7 Comparing Survival Curves: Log rank test
Under the null hypothesis of no difference between groups, the statistic
$$X^2 = \frac{(O_1 - E_1)^2}{V} \sim \chi^2_1$$
as the sample size becomes large, where $V = \mathrm{var}(O_1 - E_1)$:
$$V = \sum_i \frac{d_{(i)}\, n_{1,(i)}\, n_{2,(i)}\, (n_{(i)} - d_{(i)})}{n_{(i)}^2\, (n_{(i)} - 1)}.$$
This comes from the hypergeometric distribution.
(NB $O_1 - E_1 = -(O_2 - E_2)$, so either group can be used in the definition.)
8 Comparing Survival Curves: Log rank test
An alternative definition of the log rank test statistic is
$$\sum_{k=1}^{2} \frac{(E_k - O_k)^2}{E_k},$$
which is also asymptotically $\chi^2_1$ under $H_0$.
Either version can be used, but the second definition may be a bit more conservative (it rejects the null less often).
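The two definitions of the statistic can be computed by hand. The following is a minimal base-R sketch, not the implementation used by the survival package; the function name `logrank` and the toy data are made up for illustration.

```r
# Toy data (hypothetical): follow-up time, event indicator
# (1 = failure, 0 = censored) and group label (1 or 2).
time  <- c(6, 7, 10, 15, 19, 25, 5, 8, 12, 14, 20, 30)
event <- c(1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1)
group <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2)

logrank <- function(time, event, group) {
  t_fail <- sort(unique(time[event == 1]))  # distinct failure times t_(i)
  n  <- sapply(t_fail, function(t) sum(time >= t))               # n_(i)
  n1 <- sapply(t_fail, function(t) sum(time >= t & group == 1))  # n_1,(i)
  d  <- sapply(t_fail, function(t) sum(time == t & event == 1))  # d_(i)
  d1 <- sapply(t_fail, function(t) sum(time == t & event == 1 & group == 1))
  n2 <- n - n1
  O <- c(sum(d1), sum(d - d1))              # observed failures O_1, O_2
  E <- c(sum(d * n1 / n), sum(d * n2 / n))  # expected failures E_1, E_2
  # Hypergeometric variance; a term with n_(i) = 1 contributes zero.
  V <- sum(ifelse(n > 1, d * n1 * n2 * (n - d) / (n^2 * (n - 1)), 0))
  chisq_V <- (O[1] - E[1])^2 / V            # first definition
  chisq_E <- sum((E - O)^2 / E)             # second definition
  list(O = O, E = E, V = V,
       chisq_V = chisq_V, p_V = 1 - pchisq(chisq_V, df = 1),
       chisq_E = chisq_E, p_E = 1 - pchisq(chisq_E, df = 1))
}

res <- logrank(time, event, group)
```

Since $E_1 + E_2 = \sum_i d_{(i)} = O_1 + O_2$, checking that `sum(res$E)` equals `sum(res$O)` is a quick sanity test of the bookkeeping.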
9 Log rank test in R: survdiff {survival}
Recall the Recidivism data; assume that fin (indicator of financial aid) is the only covariate.
  > survdiff(Surv(week, arrest) ~ fin, data=Rossi)
  Call:
  survdiff(formula = Surv(week, arrest) ~ fin, data = Rossi)
        N Observed Expected (O-E)^2/E (O-E)^2/V
  fin=
  fin=
   Chisq= 3.8  on 1 degrees of freedom, p=
The final column and the final line relate to the first test definition.
The penultimate column gives the statistics for the second test definition.
  >
  [1] 3.82
  > 1 - pchisq(3.82, df=1)
  [1]
The difference between groups is marginally non-significant at the 5% level.
10 Comparing Survival Curves: More than two groups
The log rank test can also be extended to compare more than two groups.
If there are $G$ groups, then asymptotically the statistic
$$(O - E)^T V^{-1} (O - E) \sim \chi^2_{G-1},$$
where $O - E = (O_1 - E_1, \dots, O_{G-1} - E_{G-1})$ and $V$ is its variance-covariance matrix.
The formula
$$\sum_{k=1}^{G} \frac{(E_k - O_k)^2}{E_k}$$
is also approximately $\chi^2_{G-1}$ distributed.
11 Comparing Survival Curves
Care needs to be taken when comparing groups with the log rank test, as the distribution of other covariates may not be the same within the two groups.
E.g. what if all the financial aid went to those over 40 years of age?
This could cause us to infer a difference between groups which is actually related to the effects of these other covariates.
More generally, we usually have a more complicated model than two groups, and so we want to know how to compare models in the presence of multiple covariates.
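12 Comparing Survival Curves: Parametric models
Let $T$ denote the survival time, and $z$ a vector of covariates.
Suppose that we model $T \mid z \sim$ Weibull, so that
$$f(t \mid z) = \alpha \lambda^{\alpha} t^{\alpha - 1} e^{z\beta} \exp\{-(\lambda t)^{\alpha} e^{z\beta}\}, \quad t \ge 0,$$
with survival function $S(t \mid z) = \exp\{-(\lambda t)^{\alpha} e^{z\beta}\}$.
The log-likelihood for $X_i = \min\{T_i, C_i\}$, $\delta_i = I(T_i < C_i)$ is, with $\theta = (\beta, \lambda, \alpha)$,
$$l(\beta, \lambda, \alpha) = \sum_i \delta_i \log f(x_i \mid z_i, \theta) + \sum_i (1 - \delta_i) \log S(x_i \mid z_i, \theta).$$
If we want to test any hypotheses about $(\beta, \lambda, \alpha)$, then we can use likelihood ratio tests.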
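The censored log-likelihood can be written as a short function. This is a base-R sketch under the Weibull parameterisation on this slide (which differs from the one `survreg` reports); the function name `weibull_loglik` and the toy data are hypothetical.

```r
# Censored-data log-likelihood for the Weibull model above, using
# log f = log h + log S with hazard
# h(t | z) = alpha * lambda^alpha * t^(alpha - 1) * exp(z beta).
weibull_loglik <- function(beta, lambda, alpha, x, delta, z) {
  eta  <- as.vector(z %*% beta)            # linear predictor z_i beta
  logS <- -(lambda * x)^alpha * exp(eta)   # log S(x_i | z_i)
  logh <- log(alpha) + alpha * log(lambda) + (alpha - 1) * log(x) + eta
  sum(delta * (logh + logS) + (1 - delta) * logS)
}

# Toy data (hypothetical): observed times, failure indicators, one covariate.
x     <- c(2, 5, 3, 8)
delta <- c(1, 0, 1, 1)
z     <- matrix(c(0, 1, 0, 1), ncol = 1)

# With alpha = 1 and beta = 0 the model reduces to the exponential model,
# whose log-likelihood is sum(delta) * log(lambda) - lambda * sum(x).
ll <- weibull_loglik(beta = 0, lambda = 0.2, alpha = 1, x, delta, z)
```

A function of this form could be passed to a general-purpose optimiser such as `optim` (after negating it and mapping `lambda` and `alpha` to the log scale), although in practice `survreg` does the fitting.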
13 Likelihood ratio: Parametric models
  > wei <- survreg(Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio, data=Rossi)
  > summary(wei)
  Call:
  survreg(formula = Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio, data = Rossi)
              Value Std. Error z p
  (Intercept)                    e-21
  fin                            e-02
  age                            e-02
  race                           e-01
  wexp                           e-01
  mar                            e-01
  paro                           e-01
  prio                           e-03
  Log(scale)                     e-04
  Scale= 0.712
14 Likelihood ratio: Parametric models
paro does not look significant; fit the model without it:
  > wei2 <- survreg(Surv(week, arrest) ~ fin + age + race + wexp + mar + prio, data=Rossi)
  > 2*(wei$loglik[2] - wei2$loglik[2])
  [1]
This likelihood ratio test has 1 df, so the critical value (at the 5% level) is 3.84, the 0.95 quantile of the $\chi^2_1$ distribution.
We would not reject the hypothesis that paro has no effect.
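The comparison with the critical value can be wrapped in a small helper. This is a sketch only: the function name `lrt` is hypothetical, and the two log-likelihood values below are made-up numbers, not output from the Rossi fits.

```r
# Likelihood ratio test for nested models: twice the difference in
# maximised log-likelihoods, referred to a chi-squared distribution
# with df equal to the number of constraints.
lrt <- function(loglik_full, loglik_reduced, df, level = 0.05) {
  stat     <- 2 * (loglik_full - loglik_reduced)
  critical <- qchisq(1 - level, df = df)
  list(statistic = stat,
       critical  = critical,
       p_value   = pchisq(stat, df = df, lower.tail = FALSE),
       reject    = stat > critical)
}

# Made-up log-likelihoods, for illustration only.
out <- lrt(loglik_full = -100.0, loglik_reduced = -100.6, df = 1)
```

With fitted `survreg` or `coxph` objects, the inputs would be the second element of each model's `loglik` component, as in the calls on this slide.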
15 Likelihood ratio: Parametric models
Recall that the exponential model is a special case of the Weibull model.
Since it is nested, we can also do a likelihood ratio test for this.
  > wei <- survreg(Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio, data=Rossi)
  > expn <- survreg(Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio, dist="exponential", data=Rossi)
  > 2*(wei$loglik[2] - expn$loglik[2])
  [1]
This is highly significant at the 5% level on 1 df.
16 Likelihood ratio: Cox PH model
Likelihood ratio tests are also applicable to the Cox Proportional Hazards partial likelihood.
The asymptotic distributions are the same, i.e. the likelihood ratio statistic for a test with $q$ constraints asymptotically follows a $\chi^2_q$ distribution.
  ## test for significance of paro under the Cox PH model
  > mod0 <- coxph(Surv(week, arrest) ~ fin + age + race + wexp + mar + paro + prio, data=Rossi)
  > mod1 <- coxph(Surv(week, arrest) ~ fin + age + race + wexp + mar + prio, data=Rossi)
  > 2*(mod0$loglik[2] - mod1$loglik[2])
  [1]
Not significant at the 5% level on 1 df.
17 Cox PH model: Other test statistics
The output for a fitted Cox PH model gives three test statistics which compare the fitted model to a null model.
These are the likelihood ratio, Wald, and score tests, all asymptotically $\chi^2$ under $H_0$.
  > mod5 <- coxph(Surv(week, arrest) ~ age + prio, data=Rossi)
  > summary(mod5)
  Call:
  coxph(formula = Surv(week, arrest) ~ age + prio, data = Rossi)
    n= 432, number of events= 114
       coef exp(coef) se(coef) z Pr(>|z|)
  age                                      ***
  prio                                     ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
       exp(coef) exp(-coef) lower .95 upper .95
  age
  prio
  Rsquare=    (max possible=  )
  Likelihood ratio test=   on 2 df,   p=2.657e-06
  Wald test            =   on 2 df,   p=4.766e-06
  Score (logrank) test =   on 2 df,   p=2.723e-06
The score test is the same as the log rank test when the only covariate is an indicator for two groups.
18 Part II: Model Diagnostics in Proportional Hazards
19 Checking Proportional Hazards
The Schoenfeld residual for the ith subject on the kth covariate is
$$\hat r_{ik} = \delta_i \,(z_{ik} - \hat z_{x_i k}),$$
where $\hat z_{x_i k}$ is given by
$$\hat z_{x_i k} = \frac{\sum_{j \in R(x_i)} z_{jk}\, e^{z_j \hat\beta}}{\sum_{j \in R(x_i)} e^{z_j \hat\beta}}.$$
Scaled Schoenfeld residuals:
$$\hat r^{*}_{ik} = \frac{\hat r_{ik} - \mathrm{mean}(\hat r_{ik},\ i = 1, \dots, n)}{\mathrm{sd}(\hat r_{ik},\ i = 1, \dots, n)}.$$
For the PH model, the (scaled) residuals should exhibit a random (i.e. unsystematic) pattern at each failure time.
Otherwise, it suggests that the covariate effect is changing as time passes.
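To make the definition concrete, the unscaled residuals can be computed directly for a given coefficient vector. This is a base-R sketch assuming that $R(x_i)$ is the set of subjects with $x_j \ge x_i$; the function name `schoenfeld` and the toy data are hypothetical, and in practice `residuals(fit, type="schoenfeld")` from the survival package would be used.

```r
# Schoenfeld residuals at a given beta: for each failure, the subject's
# covariate vector minus the risk-weighted average over the risk set.
schoenfeld <- function(beta, x, delta, z) {
  w   <- exp(as.vector(z %*% beta))        # exp(z_j beta) for every subject
  res <- matrix(0, nrow(z), ncol(z))       # censored subjects keep residual 0
  for (i in which(delta == 1)) {
    risk <- x >= x[i]                      # risk set R(x_i)
    zbar <- colSums(z[risk, , drop = FALSE] * w[risk]) / sum(w[risk])
    res[i, ] <- z[i, ] - zbar
  }
  res
}

# Toy data (hypothetical): four subjects, one binary covariate.
x     <- c(1, 2, 3, 4)
delta <- c(1, 1, 0, 1)
z     <- matrix(c(1, 0, 1, 0), ncol = 1)

# At beta = 0 the weighted average is the plain mean over the risk set.
r0 <- schoenfeld(beta = 0, x, delta, z)
```

For the first subject the risk set is everyone, so the average covariate is 0.5 and the residual is 1 - 0.5 = 0.5; for the second it is 0 - 1/3.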
20 Checking Proportional Hazards: cox.zph {survival}
  mod2 <- coxph(Surv(week, arrest) ~ fin + age + prio, data=Rossi)
  cox.zph(mod2)
         rho chisq p
  fin
  age
  prio
  GLOBAL  NA
The function tests proportionality of all the predictors by looking at their interactions with time.
The column rho is the Pearson correlation between the scaled Schoenfeld residuals and time for each covariate.
The last row contains the global test for all the interactions tested at once.
A p-value less than 0.05 indicates a violation of the proportionality assumption.
21 Checking Proportional Hazards: plot, cox.zph {survival}
Graphs of the scaled Schoenfeld residuals against time:
  par(mfrow=c(2,2))
  plot(cox.zph(mod2))
[Figure: three panels showing Beta(t) for fin, Beta(t) for age and Beta(t) for prio, each plotted against Time]
The curve is a smoothing spline with ±2 standard-error envelopes around the fit.
Systematic departures from a horizontal line are indicative of non-proportional hazards.
Here, there appears to be a trend in the plot for age, with the age effect declining with time.
22 Checking linearity
We assume in PH models that
$$\lambda(t; z) = \lambda_0(t)\, e^{z\beta}.$$
This means that $\log \lambda(t; z)$ is linearly dependent on the covariates $z$. Is this true?
The martingale residual for subject $i$ is
$$\hat M_i = \delta_i - e^{z_i \hat\beta} \int_0^{x_i} \hat\lambda_0(u)\, du.$$
For each $k$, the plot of $\hat M_i$ against $z_{ik}$, $i = 1, \dots, n$, should exhibit a random pattern with mean 0.
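The martingale residual can be computed by hand once an estimate of the cumulative baseline hazard is chosen. The sketch below assumes the Breslow estimator for $\int_0^{x_i} \hat\lambda_0(u)\,du$, which the slide does not specify; the function name `martingale` and the toy data are hypothetical.

```r
# Martingale residuals at a given beta, with the cumulative baseline
# hazard estimated by the Breslow estimator (an assumption here):
# the sum over failures at or before t of 1 / sum_{j in risk set} exp(z_j beta).
martingale <- function(beta, x, delta, z) {
  w  <- exp(as.vector(z %*% beta))
  H0 <- sapply(x, function(t) {
    fail_times <- x[delta == 1 & x <= t]   # one entry per failing subject
    sum(vapply(fail_times, function(s) 1 / sum(w[x >= s]), numeric(1)))
  })
  delta - w * H0
}

# Toy data (hypothetical): four subjects, one binary covariate.
x     <- c(1, 2, 3, 4)
delta <- c(1, 1, 0, 1)
z     <- matrix(c(1, 0, 1, 0), ncol = 1)

m0 <- martingale(beta = 0,   x, delta, z)
m1 <- martingale(beta = 0.3, x, delta, z)
```

With this choice of baseline estimator the residuals sum to zero for any value of beta, which gives a convenient check on the computation.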
23 Checking linearity: residuals {survival}
  res <- residuals(mod2, type="martingale")
  X <- as.matrix(Rossi[, c("age", "prio")])  # matrix of covariates
  for (j in 1:2) {  # residual plots
    plot(X[, j], res, xlab=c("age", "prio")[j], ylab="residuals")
    abline(h=0, lty=2)
    lines(lowess(X[, j], res, iter=0))
  }
[Figure: two panels showing the martingale residuals plotted against age and against prio]
Nonlinearity is slight here.
24 Influential observations
We want to check the influence of each observation on the estimate $\hat\beta$.
Let $\hat\beta_{(i)}$ denote the estimated vector of coefficients computed on the sample with the ith subject deleted.
The idea is to check, for each $i$, which components of the vector $\hat\beta - \hat\beta_{(i)}$ have large absolute values.
This involves fitting $n + 1$ Cox regression models, which can be computationally expensive.
There is an approximation based on the fit obtained from the whole data: $\hat\beta - \hat\beta_{(i)}$ can be approximated by
$$\mathrm{dfbeta}_i = I(\hat\beta)^{-1} (\tilde r_{i1}, \dots, \tilde r_{iK}),$$
where $I(\hat\beta)$ is the observed Fisher information matrix and, for $k = 1, \dots, K$, $\tilde r_{ik}$ is a function of $\hat\beta$ and of the Schoenfeld residual $\hat r_{ik}$.
Plots of these quantities against $i$ are used to gauge the influence of the ith subject on the coefficient of the kth covariate.
25
  dfbeta <- residuals(mod2, type="dfbeta")
  for (j in 1:3) {
    plot(dfbeta[, j], ylab=names(coef(mod2))[j])
    abline(h=0, lty=2)
  }
[Figure: three panels showing the dfbeta values for fin, age and prio, each plotted against Index]
Comparing the magnitudes of the largest dfbeta values to the regression coefficients (-0.35, -0.07, 0.10) suggests that none of the observations is terribly influential individually.
More informationSydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1. 1. Introduction p. 2. 2. Statistical Methods Used p. 5. 3. 10 and under Males p.
Sydney Roberts Predicting Age Group Swimmers 50 Freestyle Time 1 Table of Contents 1. Introduction p. 2 2. Statistical Methods Used p. 5 3. 10 and under Males p. 8 4. 11 and up Males p. 10 5. 10 and under
More informationFactor analysis. Angela Montanari
Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number
More informationSimple Linear Regression in SPSS STAT 314
Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year s sales records in Virginia Beach, Virginia. The following data were obtained,
More informationKSTAT MINIMANUAL. Decision Sciences 434 Kellogg Graduate School of Management
KSTAT MINIMANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To
More informationMultivariate Logistic Regression
1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation
More informationLecture 8: Gamma regression
Lecture 8: Gamma regression Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Models with constant coefficient of variation Gamma regression: estimation and testing
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationBiostatistics: Types of Data Analysis
Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
More informationOverview Classes. 123 Logistic regression (5) 193 Building and applying logistic regression (6) 263 Generalizations of logistic regression (7)
Overview Classes 123 Logistic regression (5) 193 Building and applying logistic regression (6) 263 Generalizations of logistic regression (7) 24 Loglinear models (8) 54 1517 hrs; 5B02 Building and
More informationDeveloping Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@
Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,
More informationWeek 5: Multiple Linear Regression
BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School
More informationANOVA Analysis of Variance
ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,
More informationMultivariate Normal Distribution
Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #47/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues
More information