Survival Analysis. René Böheim November 2013
1 Survival Analysis Based on Cleves, Gould, and Gutierrez (2004), An Introduction to Survival Analysis using Stata, Revised Edition, Stata Press, Texas. René Böheim November 2013
2 Introduction The key question is: how long does it take until a certain event occurs? Examples: Medicine: how long does a patient survive after surgery? Engineering: how long does it take before a machine breaks down? Economics: how long does it take before an unemployed person finds a job? Also known as: time to failure, survival time, duration data. Ökonometrie 1 / 66
3 Example [Table: point in time of the event (t) and characteristic (x)]
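4 OLS? $time_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $\varepsilon_i \sim N(0, \sigma^2)$, so $time_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)$. Not always suitable, e.g. if the risk is constant over time (durations are then exponentially, not normally, distributed). Assumptions about the distribution of the $\varepsilon_i$ lead to parametric models.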
5 Probit or similar? Probability of an event after exactly one time unit: $\Pr[\text{event}_i = 1] = F(\beta' x_i)$. After $t$ time units? In general yes, but inefficient. Moreover, we want to hold, e.g., $\beta$ constant over time. These considerations lead to semiparametric models. (Semiparametric, because no specific assumption is made about the distribution of time, but the effect of the $x$ is parametrised.)
6 Sampling
- Stock sampling: random sample of those who were in the state of interest at a given point in time t, e.g. the unemployment durations of all who were unemployed on 22 December. Problem: long durations are systematically over-represented!
- Inflow sampling: random sample of those who start an episode at a specific point in time, e.g. the unemployment durations of those who became unemployed between 1 January and 31 March, as a random sample of all who were ever unemployed. (Problem: seasonal variation.)
- Outflow sampling: random sample of those who end an episode, e.g. how long did people who became unemployed hold their job?
- Population sampling: evaluation of episodes obtained, e.g., from a population survey.
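The length bias of stock sampling can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: spells start uniformly over a hypothetical calendar window, completed durations are exponential with mean 1, and the sampling date `t_star` is arbitrary.

```python
import random

random.seed(0)

n = 50_000        # number of simulated spells (hypothetical)
window = 100.0    # spells start uniformly on [0, window] (hypothetical)
t_star = 50.0     # date of the stock sample (hypothetical)

# each spell: (start date, completed duration); durations are exponential with mean 1
spells = [(random.uniform(0.0, window), random.expovariate(1.0)) for _ in range(n)]

# inflow sample: every spell that starts, whatever its length
inflow_mean = sum(d for _, d in spells) / n

# stock sample: only spells that are in progress at t_star
stock = [d for s, d in spells if s <= t_star < s + d]
stock_mean = sum(stock) / len(stock)

print(f"mean duration, inflow sample: {inflow_mean:.2f}")
print(f"mean duration, stock sample:  {stock_mean:.2f}")
```

Long spells are more likely to cover the sampling date, so the stock-sample mean comes out above the inflow mean; for exponential durations the expected ratio is 2.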
7 Problems with data
Censoring: starting or ending points are not observed.
- Left-censored: the starting point cannot be observed. One knows that someone has already been in hospital for 10 days, but not when she fell ill.
- Right-censored: the transition has not taken place by the time of observation, e.g. an unemployed person is still unemployed.
Truncation: systematic exclusion of specific episodes.
- Left-truncated: only observations that have reached a minimum duration are included in the sample (delayed entry, stock sampling with follow-up). Short durations are systematically excluded, e.g. only unemployed persons who were without employment for at least 4 weeks.
- Right-truncated: only observations that make a transition by a specific point in time are included in the sample. Long durations are systematically excluded, e.g. only unemployed people who find a job during the year.
8 Continuous or discrete Discrete: time is a sequence of fixed intervals. Data are often organized in discrete units, e.g. weeks or quarters. Continuous: time is a continuum and the duration of an episode can be described by a positive real number.
9 A simple example In a simple search model, an unemployed person's job search is described by a reservation wage, $r$, the arrival rate of job offers, $\xi(t)$, and the distribution of wage offers, $W$: $\theta(t) = \xi(t)[1 - W(t)]$, with $1 - W(t)$ read as the probability that an offer is acceptable. The hazard of re-employment, $\theta(t)$, and hence the duration of the job search, is a function of the reservation wage.
10 Explanatory Variables Fixed within the episode: e.g. sex, location. Time-dependent variables: varying with calendar time, e.g. age; varying with the duration of the episode, e.g. unemployment compensation.
11 Search model The duration could be modeled in the following way: $\theta(t) = \theta(x(t, s), t)$, where $x$ is a vector of characteristics which vary in the course of unemployment ($t$) or over calendar time ($s$).
12 Time-varying covariates This permits, e.g.:
- Unemployment compensation varies with the duration of unemployment and/or over calendar time (e.g. changed laws).
- Employers screen job seekers and do not hire the long-term unemployed, $\partial\theta/\partial t < 0$.
- The reservation wage decreases with ongoing unemployment, $\partial r/\partial t < 0 \Rightarrow \partial\theta/\partial t > 0$!
- Discouraged seekers search less intensively the longer they are unemployed, $\partial\theta/\partial t < 0$.
- Benefit-exhaustion effect: search intensity increases as the end of unemployment compensation approaches, $\partial\theta(t)/\partial t > 0$.
- ...
13 Overview of some models
1. Non-parametric: Kaplan-Meier estimator
2. Parametric: proportional hazard models (Exponential, Weibull, Gompertz); accelerated failure time models (Exponential (Weibull))
3. Semi-parametric: Cox regression
14 Stata
- -st-: survival time commands
- -stset-: declare data (analogous to -tsset-)
- -stdes-: describe survival time data
- -stsum-: summarise survival time data
- -sts-: graphs, lists, and tests (including the Kaplan-Meier estimate)
- -ltable-: life tables
- -streg-: Weibull and other parametric models
- -stcox-: Cox's model
15 Non-parametric methods No assumptions about: the functional form of the survival function (or of the hazard); the impact of covariates.
16 The Kaplan-Meier estimator The KM estimator is an estimator of the survival function $S(t)$, the probability of surviving beyond $t$: $\hat S(t) = \prod_{j \mid t_j \le t} \left( \frac{n_j - d_j}{n_j} \right)$, where $n_j$ is the number of observations in the risk set at $t_j$, i.e. those who have neither failed nor been censored before $t_j$, and $d_j$ is the number of observations for which the event occurred at $t_j$. Censored observations do not count as failures; they only leave the risk set. (It is also called the product-limit estimate.)
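The product formula can be traced by hand or with a few lines of code. The following is a minimal sketch, not the Stata implementation; the function name `kaplan_meier` and the six-person data set are made up for illustration.

```python
from collections import Counter

def kaplan_meier(times, died):
    """Return [(t_j, n_j, d_j, S(t_j))] for each distinct failure time t_j."""
    deaths = Counter(t for t, d in zip(times, died) if d == 1)
    rows, s = [], 1.0
    for t_j in sorted(deaths):
        n_j = sum(1 for t in times if t >= t_j)  # still at risk just before t_j
        d_j = deaths[t_j]                        # failures at t_j
        s *= (n_j - d_j) / n_j                   # product-limit update
        rows.append((t_j, n_j, d_j, s))
    return rows

# hypothetical data: six persons, the person with t = 4 is censored
times = [1, 3, 4, 5, 7, 9]
died = [1, 1, 0, 1, 1, 1]

table = kaplan_meier(times, died)
for t_j, n_j, d_j, s in table:
    print(f"t={t_j}  n_j={n_j}  d_j={d_j}  S(t)={s:.4f}")
```

The censored person lowers the risk set from 4 to 3 at t = 5 without producing a step in the estimated survival function at t = 4.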
17 Example The observations are ordered by the duration until the event occurs (or until the end of the observation period, if censored). [Table columns: person; t; died? (1 = yes)]
18 [Table columns: t; number at risk (n_j); number failed (d_j)]
19 [Table columns: t; number at risk (n_j); number failed (d_j); p]
20 [Table columns: t; number at risk (n_j); number failed (d_j); p; S(t)]
21 Graphically [figure]
22 Censored data [Table columns: person; t; died? (1 = yes, 0 = not observed)]
23 [Table columns: t; number at risk (n_j); number failed (d_j); censored]
24 [Table columns: t; number at risk (n_j); number failed (d_j); censored; p]
25 [Table columns: t; number at risk (n_j); number failed (d_j); censored; p; S(t)]
26 Graphically per type [figure]
27 Stata example [Stata output omitted; columns: Time, Beg. Total, Fail, Net Lost, Survivor Function, Std. Error, 95% Conf. Int.] Net Lost: censored (1 = yes)
28 The survival function [$T$ is a non-negative random variable that describes the time until an event, with density $f(t)$ and distribution function $F(t)$.] The survival function is $S(t) \equiv \Pr[T > t] = 1 - F(t)$, the probability of living longer than $t$. $S(0) = 1$; $S$ is a monotonically non-increasing function of time.
29 The hazard rate (also: conditional failure rate) is the instantaneous rate at which the event occurs in the next (small) interval, given that the event has not occurred yet: $h(t) = \lim_{\Delta t \to 0} \frac{\Pr[t + \Delta t > T > t \mid T > t]}{\Delta t} = \frac{f(t)}{S(t)}$
30 The hazard rate is a rate, not a probability, with units 1/t; takes values in $[0, \infty)$, i.e. from no risk to certain failure; can increase with increasing exposure, decrease, or stay constant.
31 Simple Relationships Knowledge of one of the four functions (hazard rate, survival, density, distribution) determines the other three. The cumulative hazard is
$H(t) = \int_0^t h(u)\,du = \int_0^t \frac{f(u)}{S(u)}\,du = -\int_0^t \frac{1}{S(u)} \left[ \frac{d}{du} S(u) \right] du = -\ln S(t)$,
and therefore
$S(t) = \exp(-H(t))$, $F(t) = 1 - \exp(-H(t))$, $f(t) = h(t) \exp(-H(t))$, $h(t) = f(t)/S(t)$.
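These identities can be checked numerically. The sketch below assumes a Weibull-type hazard with made-up parameters `alpha` and `lam` (neither is taken from the slides), integrates the hazard with a midpoint rule, and compares the result with $-\ln S(t)$.

```python
import math

alpha, lam = 1.5, 0.2  # hypothetical parameters, for illustration only

def h(t):
    """Hazard rate."""
    return lam * alpha * t ** (alpha - 1)

def S(t):
    """Survival function."""
    return math.exp(-lam * t ** alpha)

def H_numeric(t, steps=20_000):
    """Cumulative hazard: midpoint-rule integral of h(u) over [0, t]."""
    du = t / steps
    return sum(h((k + 0.5) * du) for k in range(steps)) * du

t = 2.0
H = H_numeric(t)
F = 1.0 - math.exp(-H)      # distribution function recovered from H
f = h(t) * math.exp(-H)     # density recovered from h and H

print(f"H(t) numeric = {H:.6f},  -ln S(t) = {-math.log(S(t)):.6f}")
print(f"F(t) = {F:.6f},  1 - S(t) = {1.0 - S(t):.6f}")
```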
32 Parametric models Parametric models are divided into proportional hazard models (PH) and accelerated failure-time models (AFT). Economics typically uses PH models; engineering typically uses AFT models.
33 Parametric Proportional Hazard Models $h(t \mid x_i) = h_0(t) \exp(\beta' x_i)$, where $h_0(t)$, the baseline hazard, can take different functional forms (parametrizations). These depend on time $t$, but not on the $x_i$. For example:
- $h_0(t) = \exp(c)$, exponential model (constant hazard)
- $h_0(t) = \alpha t^{\alpha - 1} \exp(c)$, Weibull model
- $h_0(t) = \exp(\gamma t) \exp(c)$, Gompertz model
34 Exponential model (PH) This is the simplest model, because the hazard is constant: $h(t \mid x_i) = h_0(t) \exp(\beta' x_i) = \exp(c) \exp(\beta' x_i) = \exp(c + \beta' x_i)$, $H(t \mid x_i) = \exp(c + \beta' x_i)\, t$, $S(t \mid x_i) = \exp(-\exp(c + \beta' x_i)\, t)$.
35 Weibull model (PH) $h_0(t) = \alpha t^{\alpha - 1} \exp(c)$, $h(t \mid x_i) = h_0(t) \exp(\beta' x_i) = \alpha t^{\alpha - 1} \exp(c) \exp(\beta' x_i) = \alpha t^{\alpha - 1} \exp(c + \beta' x_i)$, $H(t \mid x_i) = \exp(c + \beta' x_i)\, t^{\alpha}$, $S(t \mid x_i) = \exp(-\exp(c + \beta' x_i)\, t^{\alpha})$.
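The three Weibull formulas translate directly into code. The function below is an illustrative sketch; its name, the argument order, and the example values of `beta`, `c`, and `alpha` are assumptions, not estimates from the lecture. Setting `alpha = 1` gives back the exponential model with its constant hazard.

```python
import math

def weibull_ph(t, x, beta, c, alpha):
    """Hazard, cumulative hazard and survival of the Weibull PH model at time t."""
    xb = c + sum(b * xi for b, xi in zip(beta, x))   # c + beta'x
    hazard = alpha * t ** (alpha - 1) * math.exp(xb)
    cum_hazard = math.exp(xb) * t ** alpha
    survival = math.exp(-cum_hazard)
    return hazard, cum_hazard, survival

beta, c = [0.5], -2.0   # hypothetical coefficient and constant

# alpha = 1: exponential model, the hazard does not depend on t
h_early = weibull_ph(1.0, [1], beta, c, alpha=1.0)[0]
h_late = weibull_ph(10.0, [1], beta, c, alpha=1.0)[0]

# alpha = 1.7: the hazard rises with t, but the ratio between x = 1 and x = 0 stays exp(0.5)
ratio = weibull_ph(3.0, [1], beta, c, 1.7)[0] / weibull_ph(3.0, [0], beta, c, 1.7)[0]

print(h_early, h_late, ratio)
```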
36 Weibull hazard rates [figure]
37 Weibull hazard rates [figure]
38 Weibull hazard rates [figure]
39 Weibull survivor functions [figure]
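40 Example Weibull regression [Stata output omitted; dependent variable: t; rows: treatment, age, _cons, /ln_p; columns: Coef., Std. Err., z, P>|z|] $h_0(t) = \alpha t^{\alpha - 1} \exp(\beta_0) \approx 1.68\, t^{0.68} \exp(-11.67) \approx 1.4 \times 10^{-5}\, t^{0.68}$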
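The arithmetic behind the estimated baseline hazard can be redone from the two numbers quoted on the slide. A short sketch follows; the variable names are chosen here, and only 1.68 and 11.67 come from the slide.

```python
import math

alpha = 1.68      # Weibull shape parameter quoted on the slide
beta_0 = -11.67   # estimated constant quoted on the slide

scale = alpha * math.exp(beta_0)   # coefficient in front of t^(alpha - 1)

def baseline_hazard(t):
    """Estimated baseline hazard h0(t) = alpha * t^(alpha - 1) * exp(beta_0)."""
    return scale * t ** (alpha - 1)

print(f"h0(t) = {scale:.3g} * t^{alpha - 1:.2f}")
print(f"h0(10) = {baseline_hazard(10.0):.3g}")
```

Because the shape parameter exceeds 1, the baseline hazard increases with t.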
41 Example [figure]
42 Semi-parametric model: Cox $h(t \mid x_i) = h_0(t) \exp(\beta' x_i)$, where $h_0(t)$ is the baseline hazard and $\exp(\beta' x_i)$ the relative risk (relative hazard). ($\beta' x$ is the log relative hazard, also called the risk score.) The baseline hazard $h_0(t)$ is not parameterized and not estimated. It can take any shape (decreasing, increasing, or oscillating); the only assumption is that it is the same for all observations.
43 Cox model Assumption: proportional hazards. $\frac{h(t \mid x_i)}{h(t \mid x_n)} = \frac{\exp(\beta' x_i)}{\exp(\beta' x_n)}$, i.e., the ratio of two hazards depends only on the relative values of the $x$, not on $t$.
44 Example [Stata output omitted; dependent variable: _t; rows: treatment, age; columns: Coef., Std. Err., z, P>|z|]
45 Interpretation Estimated coefficient, e.g., for age, 0.105: an increase in age of 1 year raises the hazard by 11% ($\exp(0.105) = 1.11$). E.g., for treatment (1 = yes, 0 = no), $-2.256$: treated persons have a roughly 90% lower hazard ($\exp(-2.256) = 0.105$).
46 Changes $i$-th observation with $k$ covariates: $h(t \mid x_1, x_2, \ldots, x_k) = h_0(t) \exp(\beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k)$, $h(t \mid x_1, (x_2 + 1), \ldots, x_k) = h_0(t) \exp(\beta_1 x_1 + \beta_2 (x_2 + 1) + \ldots + \beta_k x_k)$. $\frac{h(t \mid x_1, (x_2 + 1), \ldots, x_k)}{h(t \mid x_1, x_2, \ldots, x_k)} = \exp(\beta_2)$
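The hazard ratios follow from the coefficients by exponentiation, and the ratio for a one-unit change does not depend on the baseline hazard or on the other covariates. A small sketch using the coefficients 0.105 (age) and -2.256 (treatment) quoted in the interpretation; the baseline value 0.3 and the covariate values are arbitrary placeholders.

```python
import math

beta = {"treatment": -2.256, "age": 0.105}   # coefficients quoted in the interpretation

hazard_ratio = {name: math.exp(b) for name, b in beta.items()}
pct_change = {name: 100.0 * (hr - 1.0) for name, hr in hazard_ratio.items()}

def hazard(h0_t, x):
    """h(t|x) = h0(t) * exp(beta'x), with h0_t the baseline hazard at some fixed t."""
    return h0_t * math.exp(sum(beta[k] * v for k, v in x.items()))

# one more year of age; the baseline value 0.3 is arbitrary and cancels in the ratio
ratio = hazard(0.3, {"treatment": 1, "age": 51}) / hazard(0.3, {"treatment": 1, "age": 50})

for name in beta:
    print(f"{name}: hazard ratio {hazard_ratio[name]:.3f}, change {pct_change[name]:+.1f}%")
```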
47 Cumulative baseline hazard ($H_0(t)$) [figure]
48 Estimated survival function ($S_0(t)$) [figure]
49 Baseline??? A Cox regression without covariates reproduces the Kaplan-Meier estimator. $h_0(t)$ is the first derivative of $H_0(t)$. It is not defined at the event times, because the estimated $H_0(t)$ is a step function. Estimating the baseline hazard therefore requires estimating the hazard contributions, which are the increments of the cumulative hazard at the event times.
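For the covariate-free case the hazard contributions can be written down directly. The sketch below assumes that the contribution at each event time is $d_j / n_j$ (failures over the number at risk); summing the contributions gives a step-function cumulative hazard, and multiplying $1 - d_j/n_j$ gives the Kaplan-Meier survival estimate. The data are made up.

```python
def hazard_contributions(times, died):
    """Return [(t_j, h_j, H(t_j), S(t_j))] with h_j = d_j / n_j at each event time."""
    rows, H, S = [], 0.0, 1.0
    for t_j in sorted({t for t, d in zip(times, died) if d == 1}):
        n_j = sum(1 for t in times if t >= t_j)                           # at risk
        d_j = sum(1 for t, d in zip(times, died) if t == t_j and d == 1)  # failures
        h_j = d_j / n_j     # hazard contribution: jump of the cumulative hazard
        H += h_j            # step-function cumulative hazard
        S *= 1.0 - h_j      # product-limit (Kaplan-Meier) survival
        rows.append((t_j, h_j, H, S))
    return rows

# hypothetical data: the person with t = 4 is censored
times = [1, 3, 4, 5, 7, 9]
died = [1, 1, 0, 1, 1, 1]

rows = hazard_contributions(times, died)
for t_j, h_j, H, S in rows:
    print(f"t={t_j}  h_j={h_j:.3f}  H={H:.3f}  S={S:.3f}")
```

Between event times the cumulative hazard is flat, which is why its derivative carries no usable information without smoothing.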
50 Estimated baseline hazard ($h_0(t)$) [figure]
51 Stratified Analysis Instead of one $h_0(t)$ for all observations, $h(t \mid x_i) = h_0(t) \exp(\beta' x_i)$, the baseline hazard differs by group: $h(t \mid x_i) = h_{01}(t) \exp(\beta' x_i)$ if $i$ is in group 1, $h(t \mid x_i) = h_{02}(t) \exp(\beta' x_i)$ if $i$ is in group 2, etc. The $h_0$ differ between groups, but the $\hat\beta$ are the same for all groups.
52 Generalization: Shared Frailty Model Problem: random effects for group $j$ (within-group correlation): $h_{i,j}(t_i \mid x_i, \alpha_j) = \alpha_j h_0(t) \exp(\beta' x_i)$, or equivalently $h_{i,j}(t_i \mid x_i, \alpha_j) = h_0(t) \exp(\beta' x_i + \nu_j)$ with $\nu_j = \log \alpha_j$; $\alpha_j$ is unobserved. For $\alpha$, a Gamma distribution with mean 1 and variance $\theta$ is often assumed. This is the Cox random-effects model.
53 Time-varying Covariates E.g., the search behavior of an unemployed person is constant while unemployment benefits are paid, but changes after the benefits end. [Table columns: id; beginning; end; unemployment benefit per week]
54 Example A drug is taken and absorbed at the exponential rate $\exp(-0.35t)$ (a half-life of about 2 days). Other variables are constant: $\log h(t \mid x) = \log h_0(t) + \beta' X = \log h_0(t) + \beta_1 x_1 + \beta_{tv}\,[\text{initial drug dose} \times \exp(-0.35t)]$. [Cox regression output omitted; dependent variable: _t; columns: Haz. Ratio, Std. Err., z, P>|z|; equation rh: treatment; equation t: drug level] rh: hazard ratios that are constant over time. t: hazard ratios that vary over time. Interpretation: a higher drug concentration reduces the hazard ($\exp(-0.12) \approx 0.89$, i.e. by roughly 11% per unit of drug level).
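The two numbers in this example can be verified in a few lines: a decay rate of 0.35 per day corresponds to a half-life of $\ln 2 / 0.35$ days, and a coefficient of $-0.12$ to a hazard ratio of $\exp(-0.12)$. The function name and the initial dose of 10 are placeholders chosen for this sketch.

```python
import math

rate = 0.35                      # decay rate per day, from the example
half_life = math.log(2) / rate   # time after which half of the dose is left

def drug_level(initial_dose, t):
    """Remaining drug level after t days: initial dose times exp(-0.35 t)."""
    return initial_dose * math.exp(-rate * t)

hazard_ratio = math.exp(-0.12)   # coefficient quoted in the interpretation

print(f"half-life: {half_life:.2f} days")
print(f"level after one half-life (dose 10): {drug_level(10.0, half_life):.2f}")
print(f"hazard ratio per unit of drug level: {hazard_ratio:.3f}")
```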
55 Diagnostics The Schoenfeld residuals $r_{u,i}$, $r_{u,i} = x_{u,i} - \frac{\sum_{n \in R} x_{u,n} \exp(\hat\beta' x_n)}{\sum_{n \in R} \exp(\hat\beta' x_n)}$, are the difference between the explanatory variable $x_{u,i}$ of the person who fails and the mean over the persons in the risk set $R$ at that time, weighted by their estimated relative hazard. Under the $H_0$ of the PH assumption, the slope of the residuals over time is zero: $r_{u,i} = \delta_0 + \delta_t t$ with $\delta_t = 0$.
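For a single covariate the residual formula can be computed as follows. This is a sketch that assumes no tied failure times and takes the coefficient `beta` as given rather than estimating it; the four observations are made up.

```python
import math

def schoenfeld_residuals(times, died, x, beta):
    """Schoenfeld residuals of one covariate x, one per failure (no ties assumed)."""
    residuals = []
    for i, (t_i, d_i) in enumerate(zip(times, died)):
        if d_i != 1:
            continue                                    # defined at failures only
        risk_set = [n for n, t_n in enumerate(times) if t_n >= t_i]
        weights = [math.exp(beta * x[n]) for n in risk_set]
        weighted_mean = sum(w * x[n] for w, n in zip(weights, risk_set)) / sum(weights)
        residuals.append((t_i, x[i] - weighted_mean))
    return residuals

# hypothetical data: third observation censored, x is a 0/1 covariate
times = [2, 4, 5, 8]
died = [1, 1, 0, 1]
x = [1, 0, 1, 0]

res = schoenfeld_residuals(times, died, x, beta=0.0)
for t_i, r in res:
    print(f"t={t_i}  residual={r:+.4f}")
```

With `beta=0.0` all weights equal one, so each residual is the covariate value minus the plain mean over the risk set. Regressing the residuals on time then gives the slope $\delta_t$ examined in the test.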
56 Schoenfeld residuals [figure]
57 Diagnostics graphically Plot $-\ln[-\ln\{\hat S(t)\}]$ against $\ln(t)$ ($\hat S(t)$ is the Kaplan-Meier estimator): $h(t \mid x) = h_0(t) \exp(\beta' x)$ implies $S(t \mid x) = S_0(t)^{\exp(\beta' x)}$ and hence $-\ln[-\ln\{S(t \mid x)\}] = -\ln[-\ln\{S_0(t)\}] - \beta' x$. Under the $H_0$, the curves for different values of $x_i$ are parallel.
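The step from the hazard to the log-log survival curve runs through the cumulative hazard. Written out, using $H_0(t) = \int_0^t h_0(u)\,du$ and $S = \exp(-H)$ from the relationships given earlier:

```latex
\begin{align*}
H(t \mid x) &= \int_0^t h_0(u)\exp(\beta' x)\,du = H_0(t)\exp(\beta' x),\\
S(t \mid x) &= \exp\{-H(t \mid x)\}
             = \left[\exp\{-H_0(t)\}\right]^{\exp(\beta' x)}
             = S_0(t)^{\exp(\beta' x)},\\
-\ln S(t \mid x) &= \exp(\beta' x)\,\{-\ln S_0(t)\},\\
-\ln[-\ln S(t \mid x)] &= -\ln[-\ln S_0(t)] - \beta' x .
\end{align*}
```

The covariates enter only through the additive term $-\beta' x$, which does not depend on $t$; this is why the curves are vertical shifts of one another.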
58 Test for proportionality (1) [figure]
59 Test for proportionality (2) [figure]
60 Diagnostics: residuals Cox-Snell residuals: $CSr_i = \hat H_0(t_i) \exp(\hat\beta' x_i)$, with $\hat H_0(t_i)$ and $\hat\beta$ from the Cox model. Under the $H_0$ of a correctly specified model, the $CSr_i$ follow an exponential distribution with hazard rate 1 for all $t$. (Their cumulative hazard is a 45° line.)
61 Goodness of fit [figure]
62 Independent Competing Risks Until now: one single destination, e.g. unemployed people find jobs. Now: unemployed persons find jobs, leave the labor market, emigrate, etc. Necessary: mutually exclusive destinations (i.e., the probabilities of leaving to each destination sum to 1). Definition: $h_a(t)$ is the latent hazard rate for destination $a$, with density $f_a(t)$ and latent event time $T_a$; analogously $h_b(t)$, $f_b(t)$, $T_b$ for destination $b$. Observed event time: $T = \min\{T_a, T_b\}$.
63 Likelihood With independent destinations, $h(t) = h_a(t) + h_b(t)$ and $S(t) = S_a(t) S_b(t)$. The likelihood function for the hazard rate model with independent destinations has the following components: $L = L^a L^b$, where $L^a = \prod_{i \in \{a\}} f_a(t_i)$ is the contribution of all observations that exit to $a$, and $L^b = \prod_{i \in \{b\}} f_b(t_i)$ is the contribution of all observations that exit to $b$.
64 Likelihood $\delta_i^a$ and $\delta_i^b$ are the following indicators: $\delta_i^a = 1$ if $i$ exits to $a$, and $0$ if $i$ exits to $b$; $\delta_i^b = 1$ if $i$ exits to $b$, and $0$ if $i$ exits to $a$.
65 Log-likelihood $L = L^a L^b = \prod_{\text{all } i} [f_a(t_i)]^{\delta_i^a} [f_b(t_i)]^{\delta_i^b}$, $\ln L = \sum_{\text{all } i} \delta_i^a \ln[f_a(t_i)] + \sum_{\text{all } i} \delta_i^b \ln[f_b(t_i)]$. The log-likelihood function for independent competing risks separates into parts that are independent of each other. Each of these parts depends only on the parameters that affect the particular destination.
66 Simplifies the estimation
1. Define indicator variables for every destination.
2. Observations which leave to another destination are treated as censored.
3. Estimate the hazard rate separately for each destination.
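The three steps can be sketched as follows. To keep the example short, step 3 assumes a constant (exponential) hazard per destination, whose maximum-likelihood estimate with censoring is the number of exits divided by the total time at risk; the slides do not prescribe this choice, and the data and names are made up.

```python
def cause_specific_data(times, destination, cause):
    """Steps 1 and 2: indicator for one destination; all other exits count as censored."""
    return [(t, 1 if dest == cause else 0) for t, dest in zip(times, destination)]

def exponential_hazard(data):
    """Step 3 under a constant hazard: exits divided by total time at risk."""
    return sum(d for _, d in data) / sum(t for t, _ in data)

# hypothetical spells: "a" = finds a job, "b" = leaves the labor market, None = still unemployed
times = [3, 5, 6, 8, 10]
destination = ["a", "b", "a", None, "b"]

data_a = cause_specific_data(times, destination, "a")
data_b = cause_specific_data(times, destination, "b")

rate_a = exponential_hazard(data_a)
rate_b = exponential_hazard(data_b)

print(f"h_a = {rate_a:.4f}, h_b = {rate_b:.4f}, overall exit hazard = {rate_a + rate_b:.4f}")
```

Treating exits to the other destination as censored keeps those spells in the time at risk for the destination under study, and the two estimated hazards add up to the overall exit hazard.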
67 Further Issues Unobserved heterogeneity; dependent competing risks; initial conditions; ...
LOGISTIC REGRESSION ANALYSIS C. Mitchell Dayton Department of Measurement, Statistics & Evaluation Room 1230D Benjamin Building University of Maryland September 1992 1. Introduction and Model Logistic
More informationNonlinear Regression Functions. SW Ch 8 1/54/
Nonlinear Regression Functions SW Ch 8 1/54/ The TestScore STR relation looks linear (maybe) SW Ch 8 2/54/ But the TestScore Income relation looks nonlinear... SW Ch 8 3/54/ Nonlinear Regression General
More informationRockefeller College University at Albany
Rockefeller College University at Albany PAD 705 Handout: Hypothesis Testing on Multiple Parameters In many cases we may wish to know whether two or more variables are jointly significant in a regression.
More informationLatent and Behavioral Responses to Extensions in. Unemployment Insurance Benefits
Latent and Behavioral Responses to Extensions in Unemployment Insurance Benefits Gordon B. Dahl UC San Diego gdahl@ucsd.edu June 2011 Abstract An important question facing economists and policymakers is
More informationStatistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
More informationComparison of resampling method applied to censored data
International Journal of Advanced Statistics and Probability, 2 (2) (2014) 48-55 c Science Publishing Corporation www.sciencepubco.com/index.php/ijasp doi: 10.14419/ijasp.v2i2.2291 Research Paper Comparison
More informationRegression with a Binary Dependent Variable
Regression with a Binary Dependent Variable Chapter 9 Michael Ash CPPA Lecture 22 Course Notes Endgame Take-home final Distributed Friday 19 May Due Tuesday 23 May (Paper or emailed PDF ok; no Word, Excel,
More informationSTATISTICAL ANALYSIS OF SAFETY DATA IN LONG-TERM CLINICAL TRIALS
STATISTICAL ANALYSIS OF SAFETY DATA IN LONG-TERM CLINICAL TRIALS Tailiang Xie, Ping Zhao and Joel Waksman, Wyeth Consumer Healthcare Five Giralda Farms, Madison, NJ 794 KEY WORDS: Safety Data, Adverse
More informationEstimating the latent effect of unemployment benefits on unemployment duration
Estimating the latent effect of unemployment benefits on unemployment duration Simon M.S. Lo, Gesine Stephan ; Ralf A. Wilke August 2011 Keywords: dependent censoring, partial identification, difference-in-differences
More informationMarginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015
Marginal Effects for Continuous Variables Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/ Last revised February 21, 2015 References: Long 1997, Long and Freese 2003 & 2006 & 2014,
More informationUSING ANALYTICS TO MEASURE THE VALUE OF EMPLOYEE REFERRAL PROGRAMS
USING ANALYTICS TO MEASURE THE VALUE OF EMPLOYEE REFERRAL PROGRAMS Evolv Study: The benefits of an employee referral program significantly outweigh the costs Using Analytics To Measure The Value Of Employee
More informationMultivariable survival analysis S10. Michael Hauptmann Netherlands Cancer Institute Amsterdam, The Netherlands
Multivariable survival analysis S10 Michael Hauptmann Netherlands Cancer Institute Amsterdam, The Netherlands m.hauptmann@nki.nl 1 Confounding A variable correlated with the variable of interest and with
More informationSimple linear regression
Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between
More informationFinancial Vulnerability Index (IMPACT)
Household s Life Insurance Demand - a Multivariate Two Part Model Edward (Jed) W. Frees School of Business, University of Wisconsin-Madison July 30, 1 / 19 Outline 1 2 3 4 2 / 19 Objective To understand
More informationSurvival Distributions, Hazard Functions, Cumulative Hazards
Week 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution of a
More informationConfidence Intervals for Exponential Reliability
Chapter 408 Confidence Intervals for Exponential Reliability Introduction This routine calculates the number of events needed to obtain a specified width of a confidence interval for the reliability (proportion
More informationSAS and R calculations for cause specific hazard ratios in a competing risks analysis with time dependent covariates
SAS and R calculations for cause specific hazard ratios in a competing risks analysis with time dependent covariates Martin Wolkewitz, Ralf Peter Vonberg, Hajo Grundmann, Jan Beyersmann, Petra Gastmeier,
More informationMicroeconomic Theory: Basic Math Concepts
Microeconomic Theory: Basic Math Concepts Matt Van Essen University of Alabama Van Essen (U of A) Basic Math Concepts 1 / 66 Basic Math Concepts In this lecture we will review some basic mathematical concepts
More informationPlease follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software
STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More information10 Dichotomous or binary responses
10 Dichotomous or binary responses 10.1 Introduction Dichotomous or binary responses are widespread. Examples include being dead or alive, agreeing or disagreeing with a statement, and succeeding or failing
More informationGLM I An Introduction to Generalized Linear Models
GLM I An Introduction to Generalized Linear Models CAS Ratemaking and Product Management Seminar March 2009 Presented by: Tanya D. Havlicek, Actuarial Assistant 0 ANTITRUST Notice The Casualty Actuarial
More informationCHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
More informationA course on Longitudinal data analysis : what did we learn? COMPASS 11/03/2011
A course on Longitudinal data analysis : what did we learn? COMPASS 11/03/2011 1 Abstract We report back on methods used for the analysis of longitudinal data covered by the course We draw lessons for
More informationII. DISTRIBUTIONS distribution normal distribution. standard scores
Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,
More informationANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? 1. INTRODUCTION
ANNUITY LAPSE RATE MODELING: TOBIT OR NOT TOBIT? SAMUEL H. COX AND YIJIA LIN ABSTRACT. We devise an approach, using tobit models for modeling annuity lapse rates. The approach is based on data provided
More informationProblem of Missing Data
VASA Mission of VA Statisticians Association (VASA) Promote & disseminate statistical methodological research relevant to VA studies; Facilitate communication & collaboration among VA-affiliated statisticians;
More informationNote on growth and growth accounting
CHAPTER 0 Note on growth and growth accounting 1. Growth and the growth rate In this section aspects of the mathematical concept of the rate of growth used in growth models and in the empirical analysis
More informationMULTIPLE REGRESSION EXAMPLE
MULTIPLE REGRESSION EXAMPLE For a sample of n = 166 college students, the following variables were measured: Y = height X 1 = mother s height ( momheight ) X 2 = father s height ( dadheight ) X 3 = 1 if
More informationCorrelated Random Effects Panel Data Models
INTRODUCTION AND LINEAR MODELS Correlated Random Effects Panel Data Models IZA Summer School in Labor Economics May 13-19, 2013 Jeffrey M. Wooldridge Michigan State University 1. Introduction 2. The Linear
More informationHandling missing data in Stata a whirlwind tour
Handling missing data in Stata a whirlwind tour 2012 Italian Stata Users Group Meeting Jonathan Bartlett www.missingdata.org.uk 20th September 2012 1/55 Outline The problem of missing data and a principled
More informationCEIS Tor Vergata RESEARCH PAPER SERIES. Vol. 6, Issue 5, No. 119 March 2008. Dual Labour Markets and Matching Frictions
CEIS Tor Vergata RESEARCH PAPER SERIES Vol. 6, Issue 5, No. 119 March 2008 Dual Labour Markets and Matching Frictions Dario Sciulli, Antonio Gomes de Menezes and José Cabral Vieira This paper can be downloaded
More informationProbability and Statistics Vocabulary List (Definitions for Middle School Teachers)
Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence
More informationModeling and Analysis of Call Center Arrival Data: A Bayesian Approach
Modeling and Analysis of Call Center Arrival Data: A Bayesian Approach Refik Soyer * Department of Management Science The George Washington University M. Murat Tarimcilar Department of Management Science
More informationLife Settlement Pricing
Life Settlement Pricing Yinglu Deng Patrick Brockett Richard MacMinn Tsinghua University University of Texas Illinois State University Life Settlement Description A life settlement is a financial arrangement
More information