R 2 -type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models


Faculty of Health Sciences
R²-type Curves for Dynamic Predictions from Joint Longitudinal-Survival Models
Inference & application to prediction of kidney graft failure
Paul Blanche, joint work with M-C. Fournier & E. Dantan (Nantes, France)
July 2015. Slide 1/29, P Blanche et al.

Context & Motivation
Medical researchers hope to improve patient management using earlier diagnoses.
Statisticians can help by fitting prediction models.
The making of so-called "dynamic" predictions has recently received a lot of attention.
To be useful for medical practice, predictions should be "accurate".
How should we evaluate dynamic prediction accuracy?

Data & Clinical goal
Data: DIVAT cohort data of kidney transplant recipients (subsample, n = 4,119).
Clinical goal: dynamic prediction of the risk of kidney graft failure (death or return to dialysis), using repeated measurements of serum creatinine.

DIVAT data sample (n = 4,119)
- French cohort, 6 centers
- Adult recipients, transplanted after 2000
- Creatinine measured every year

Statistical challenges discussed
How to evaluate and/or compare dynamic predictions?
Using the concepts of discrimination and calibration, and accounting for the dynamic setting and the censoring issue.

Basic idea & concepts for evaluating predictions
Basic idea: comparing predictions and observations (simple!)
Concepts:
- Discrimination: a model has high discriminative power if the range of predicted risks is wide and subjects with low (high) predicted risk are less (more) likely to experience the event.
- Calibration: a model is calibrated if, among all subjects that receive a predicted risk of x%, we can expect x subjects out of 100 to experience the event (weak definition).

Dynamic prediction
- s: landmark time at which predictions are made (varies)
- t: prediction horizon (fixed)
- π_i(s, t) = Pr(graft failure in (s, s+t]), e.g. π_i(s, t) = 64% on a 0% to 100% risk scale
[Figure: creatinine (μmol/l) vs follow-up time (years), with landmark time s = 4 years and horizon s + t = 9 years]

Right censoring issue
D_i(s, t) = 1{event occurs in (s, s + t]}
Landmark time s: when predictions are made. Time s + t: end of the prediction window.
For a subject i censored within (s, s + t], the status D_i(s, t) is unknown.
[Figure: timeline from s to s + t showing uncensored and censored subjects]

Notations for population parameters
Indicator of event in (s, s + t]:
D_i(s, t) = 1{s < T_i ≤ s + t}, where T_i is the time-to-event.
Dynamic predictions:
π_i(s, t) = P_ξ̂( D_i(s, t) = 1 | T_i > s, Y_i(s), X_i )
- ξ̂: previously estimated parameters (from independent training data)
- Y_i(s): marker measurements observed before time s
- X_i: baseline covariates

Predictive accuracy
How close are the predicted risks π_i(s, t) to the true underlying risk P(event occurs in (s, s+t] | information at s)?
Prediction Error:
PE_π(s, t) = E[ {D(s, t) − π(s, t)}² | T > s ]
- the lower the better
- decomposes as Bias² + Variance, so it evaluates both calibration and discrimination
- depends on P(event occurs in (s, s+t] | at risk at s)
- often called the "Expected Brier Score"
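Setting censoring aside for a moment, the empirical version of this prediction error (the mean squared difference between event indicators and predicted risks among subjects at risk at s) can be sketched as follows; the function name and the toy numbers are illustrative only:

```python
def prediction_error(D, pi):
    """Empirical prediction error (expected Brier score):
    mean of (D_i - pi_i)^2 over subjects still at risk at s."""
    return sum((d - p) ** 2 for d, p in zip(D, pi)) / len(D)

# Toy example (illustrative numbers): event status over (s, s+t] and
# predicted risks for five subjects at risk at the landmark time s.
D_toy = [1, 0, 0, 1, 0]
pi_toy = [0.8, 0.1, 0.2, 0.6, 0.3]
pe = prediction_error(D_toy, pi_toy)  # (0.04+0.01+0.04+0.16+0.09)/5 = 0.068
```

A perfect oracle (π_i equal to 0 or 1, matching D_i) gives PE = 0; predicting the marginal event proportion for everyone gives the benchmark value discussed below.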

How does the PE relate to calibration and discrimination?
Adding and subtracting the true underlying risk E[D(s, t) | H_π(s)] inside the square (the cross term vanishes by the tower property) gives
PE_π(s, t) = E[ Var{D(s, t) | H_π(s)} | T > s ]  +  E[ {E[D(s, t) | H_π(s)] − π(s, t)}² | T > s ]
The first term measures discrimination (inseparability) and does NOT depend on π(s, t); the second measures calibration (bias) and depends on π(s, t).
H_π(s) = {X, Y(s), T > s} denotes the subject-specific history at time s.
- The more discriminating H_π(s), the smaller Var{D(s, t) | H_π(s)}.
- E[D(s, t) | H_π(s)] − π(s, t) ≡ 0 defines "strong" calibration.

R²_π-type criterion
Benchmark PE_0: the best "null" prediction tool gives the same (marginal) predicted risk
1 − S(s + t | s) = E[ D(s, t) | H_0(s) ],  with H_0(s) = {T > s},
to all subjects, which leads to
PE_0(s, t) = Var{D(s, t) | T > s} = S(s + t | s) {1 − S(s + t | s)}.
Simple idea:
R²_π(s, t) = 1 − PE_π(s, t) / PE_0(s, t)
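The benchmark and the criterion itself are one-liners. A minimal sketch, with purely illustrative numbers (a conditional survival S(s+t|s) = 0.8 and a model prediction error of 0.10):

```python
def pe_null(surv):
    """PE of the 'null' tool that predicts the marginal risk
    1 - S(s+t|s) for everyone: the Bernoulli variance
    S(s+t|s) * (1 - S(s+t|s))."""
    return surv * (1.0 - surv)

def r2_criterion(pe_model, pe_null_value):
    """R2-type criterion: relative improvement over the null tool."""
    return 1.0 - pe_model / pe_null_value

pe0 = pe_null(0.8)            # 0.8 * 0.2 = 0.16
r2 = r2_criterion(0.10, pe0)  # 1 - 0.10 / 0.16 = 0.375
```

A tool no better than the marginal risk gives R² = 0; a perfect tool gives R² = 1; a tool worse than the null benchmark can give a negative value.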

Why bother?
R²_π(s, t) aims to circumvent the difficult interpretation of:
- the scale on which PE(s, t) is measured
- the trend of PE(s, t) vs s
Because the meaning of the scale on which PE(s, t) is measured changes with s, an increasing or decreasing trend can be due to changes in the quality of the predictions and/or in the at-risk population.

Changes in the quality of the predictions
"Essentially, all models are wrong, but some are useful." (G. Box)
The prediction model from which we have obtained the predictions can be more wrong for some s than for others.
The calibration term of PE(s, t) changes with s. We can work on it!

Changes in the at-risk population
An example: patients with a cardiovascular history (CV) all die early, so only those without CV remain at risk for late s. Then:
- the later s, the more homogeneous the at-risk population
- CV is useful for prediction at early s but useless at late s.
The available information can be more informative for some s than for others.
The discrimination term of PE(s, t) changes with s. This is just how it is, and there is nothing we can do about it: we can only work with the data we have.

R²_π(s, t) interpretation
Always true: R²_π(s, t) measures how the prediction tool π(s, t) performs compared to the benchmark null prediction tool, which gives the same predicted risk (the marginal risk) to all subjects.
When predictions are strongly calibrated (after a little algebra):
R²_π(s, t) = Var{π(s, t) | T > s} / Var{D(s, t) | T > s}  (explained variation)
           = Corr²{ D(s, t), π(s, t) | T > s }  (correlation)
           = E{ π(s, t) | D(s, t) = 1, T > s } − E{ π(s, t) | D(s, t) = 0, T > s }  (mean risk difference)
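These three identities can be checked numerically with a quick Monte Carlo experiment (illustrative, not from the slides): strong calibration is imposed by drawing the outcome as D | π ~ Bernoulli(π), and the four quantities then agree up to sampling error:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
# Hypothetical predicted risks; calibration holds by construction
# because the outcome is drawn with success probability pi.
pi = rng.uniform(0.05, 0.60, size=n)
D = rng.binomial(1, pi).astype(float)

pe = np.mean((D - pi) ** 2)          # prediction error of pi
p = D.mean()
pe0 = p * (1.0 - p)                  # PE of the null tool
r2 = 1.0 - pe / pe0                  # R2-type criterion

explained = pi.var() / D.var()                      # explained variation
corr2 = np.corrcoef(D, pi)[0, 1] ** 2               # squared correlation
mean_diff = pi[D == 1].mean() - pi[D == 0].mean()   # mean risk difference
# Up to Monte Carlo error: r2 ~ explained ~ corr2 ~ mean_diff.
```

Under miscalibration the four quantities diverge, which is exactly why the slides flag strong calibration as the condition behind these interpretations.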

Observations & IPCW PE estimator
Observations (i.i.d.): { (T̃_i, Δ_i, π_i(·, ·)), i = 1, …, n }, where T̃_i = min(T_i, C_i) and Δ_i = 1{T_i ≤ C_i}.
Indicator of observed event occurrence in (s, s + t]:
D̃_i(s, t) = 1{s < T̃_i ≤ s + t, Δ_i = 1}  (1: event observed; 0: event not observed or censored)
Inverse Probability of Censoring Weighting (IPCW) estimator:
P̂E_π(s, t) = (1/n) Σ_{i=1}^{n} Ŵ_i(s, t) { D̃_i(s, t) − π_i(s, t) }²
and
R̂²_π(s, t) = 1 − P̂E_π(s, t) / P̂E_0(s, t)

Inverse Probability of Censoring Weights
Ŵ_i(s, t) = 1{s < T̃_i ≤ s + t} Δ_i / Ĝ(T̃_i⁻ | s)  +  1{T̃_i > s + t} / Ĝ(s + t | s),
with weight 0 for subjects censored within (s, s + t], and Ĝ(u | s) the Kaplan-Meier estimator of P(C > u | C > s).
[Figure: timeline from s to s + t, marking uncensored and censored subjects]
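A sketch of the weights and the weighted estimator. The Kaplan-Meier fit Ĝ of the censoring distribution is assumed to be supplied from elsewhere (here passed in as precomputed values); function names and toy data are illustrative:

```python
import numpy as np

def ipcw_weights(t_obs, delta, s, t, G_at_tobs, G_at_horizon):
    """IPCW weights for subjects at risk at s:
    event observed in (s, s+t] -> delta / G(T~_i- | s);
    still event-free at s+t    -> 1 / G(s+t | s);
    censored within (s, s+t]   -> 0."""
    w = np.zeros_like(t_obs, dtype=float)
    event_in = (t_obs > s) & (t_obs <= s + t) & (delta == 1)
    beyond = t_obs > s + t
    w[event_in] = 1.0 / G_at_tobs[event_in]
    w[beyond] = 1.0 / G_at_horizon
    return w

def ipcw_pe(t_obs, delta, pi, s, t, G_at_tobs, G_at_horizon):
    """Weighted prediction-error estimator over subjects at risk at s."""
    at_risk = t_obs > s
    w = ipcw_weights(t_obs[at_risk], delta[at_risk], s, t,
                     G_at_tobs[at_risk], G_at_horizon)
    D = ((t_obs[at_risk] <= s + t) & (delta[at_risk] == 1)).astype(float)
    return np.mean(w * (D - pi[at_risk]) ** 2)

# Toy data without censoring (G = 1 everywhere): the estimator must
# reduce to the plain empirical Brier score.
t_obs = np.array([2.0, 6.0, 10.0, 3.0])
delta = np.array([1, 1, 1, 1])
pi = np.array([0.9, 0.7, 0.2, 0.8])
pe = ipcw_pe(t_obs, delta, pi, s=1.0, t=5.0,
             G_at_tobs=np.ones(4), G_at_horizon=1.0)  # 0.045
```

With censoring present, subjects censored inside the window get weight 0, and the observed events and event-free subjects are up-weighted by 1/Ĝ so that, on average, they also account for the censored subjects.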

Asymptotic i.i.d. representation
Lemma: assume that the censoring time C is independent of (T, η, π(·, ·)), and let θ denote either PE_π, R²_π, or a difference in PE or R²_π. Then
√n ( θ̂(s, t) − θ(s, t) ) = (1/√n) Σ_{i=1}^{n} IF_θ(T̃_i, Δ_i, π_i(s, t), s, t) + o_p(1)
where the IF_θ(T̃_i, Δ_i, π_i(s, t), s, t) are:
- zero-mean i.i.d. terms
- easy to estimate (using Nelson-Aalen & Kaplan-Meier)

Pointwise confidence interval (fixed s)
Asymptotic normality:
√n ( θ̂(s, t) − θ(s, t) ) converges in distribution to N(0, σ²_{s,t})
95% confidence interval:
θ̂(s, t) ± z_{1−α/2} σ̂_{s,t} / √n,
where z_{1−α/2} is the 1 − α/2 quantile of N(0, 1).
Variance estimator:
σ̂²_{s,t} = (1/n) Σ_{i=1}^{n} { ÎF_θ(T̃_i, Δ_i, π_i(s, t), s, t) }²
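Given estimated influence-function values, the Wald interval is a few lines; a sketch with made-up influence-function numbers:

```python
import math

def pointwise_ci(theta_hat, if_values, z=1.959963984540054):
    """95% Wald CI: theta_hat +/- z * sigma_hat / sqrt(n), where
    sigma_hat^2 is the mean of the squared influence-function
    estimates (they are zero-mean, so no centering is needed)."""
    n = len(if_values)
    sigma = math.sqrt(sum(v * v for v in if_values) / n)
    half = z * sigma / math.sqrt(n)
    return theta_hat - half, theta_hat + half

# Hypothetical estimate and influence-function values for n = 4 subjects.
lo, hi = pointwise_ci(0.25, [0.3, -0.1, 0.2, -0.4])
```

For an R² on a percentage scale one would typically also truncate the lower limit at a sensible bound before reporting.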

Simultaneous confidence band over s ∈ S
{ θ̂(s, t) ± q̂_{1−α}^{(S,t)} σ̂_{s,t} / √n,  s ∈ S }
Computation of q̂_{1−α}^{(S,t)} by a simulation algorithm (wild bootstrap, justified by the conditional multiplier central limit theorem):
1. For b = 1, …, B, say B = 4000:
   a. Generate {ω_1^b, …, ω_n^b} as n i.i.d. N(0, 1) variables.
   b. Using the plug-in estimator ÎF_θ(·), compute
      Υ_b = sup_{s ∈ S} | (1/√n) Σ_{i=1}^{n} ω_i^b ÎF_θ(T̃_i, Δ_i, π_i(s, t), s, t) / σ̂_{s,t} |
2. Compute q̂_{1−α}^{(S,t)} as the 100(1 − α)th percentile of {Υ_1, …, Υ_B}.
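The two steps above can be sketched with NumPy. The influence-function matrix here is random placeholder data, only meant to exercise the algorithm; in practice it would hold the plug-in estimates ÎF_θ, one column per landmark s in S:

```python
import numpy as np

def band_quantile(if_matrix, sigma, B=2000, alpha=0.05, seed=0):
    """Multiplier ('wild') bootstrap quantile for a simultaneous band:
    draw i.i.d. N(0,1) multipliers, form the standardized process over
    the landmark grid, take its sup in absolute value, repeat B times."""
    rng = np.random.default_rng(seed)
    n, _ = if_matrix.shape
    sups = np.empty(B)
    for b in range(B):
        omega = rng.standard_normal(n)
        z = (omega @ if_matrix) / (np.sqrt(n) * sigma)
        sups[b] = np.max(np.abs(z))
    return np.quantile(sups, 1.0 - alpha)

# Placeholder influence functions: n = 500 subjects, 3 landmark times.
rng = np.random.default_rng(1)
IF = rng.standard_normal((500, 3))
sig = np.sqrt(np.mean(IF ** 2, axis=0))
q = band_quantile(IF, sig)
# q exceeds the pointwise 1.96, so the band is wider than the
# pointwise intervals, as a simultaneous band must be.
```

The same multiplier draws also preserve the dependence between landmarks, which is what makes the resulting band valid simultaneously over S rather than pointwise.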

DIVAT sample
Population-based study of kidney recipients (n = 4,119).
Split the data into training (2/3) and validation (1/3) samples.
T: time from one year after transplantation to graft failure, defined as death OR return to dialysis.
Censoring due to delayed entries and end of follow-up (2014).
Baseline covariates: age, sex, cardiovascular history.
Longitudinal biomarker (yearly): serum creatinine.

Descriptive statistics & censoring issue
s ∈ S = {0, 0.5, …, 5}, t = 5 years.
[Figure: for each landmark time s = 0, 1, …, 5 years, the number of subjects censored in (s, s+5], known to be event-free at s+5, and with an observed failure in (s, s+5]]

Joint model (fitted using the R package JM)
Longitudinal:
log[ Y_i(t_ij) ] = (β_0 + b_0i) + β_{0,age} AGE_i + β_{0,sex} SEX_i + (β_1 + b_1i + β_{1,age} AGE_i) t_ij + ε_ij = m_i(t_ij) + ε_ij
Survival (hazard):
h_i(t) = h_0(t) exp{ γ_age AGE_i + γ_CV CV_i + α_1 m_i(t) + α_2 dm_i(t)/dt }
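The slides fit this model with the R package JM. As a language-neutral illustration of the two components, they can be written as plain functions; all coefficient values in the example call are placeholders, not estimates from the DIVAT data:

```python
import math

def m(t, beta0, beta1, age, sex, b0=0.0, b1=0.0,
      beta0_age=0.0, beta0_sex=0.0, beta1_age=0.0):
    """Subject-specific trajectory m_i(t) of log creatinine: random
    intercept/slope (b0, b1) plus age and sex effects on the
    intercept and an age effect on the slope."""
    intercept = beta0 + b0 + beta0_age * age + beta0_sex * sex
    slope = beta1 + b1 + beta1_age * age
    return intercept + slope * t

def hazard(h0_t, age, cv, m_t, dm_t, gamma_age=0.0, gamma_cv=0.0,
           alpha1=0.0, alpha2=0.0):
    """Hazard h_i(t): baseline hazard h_0(t) times the exponential of a
    linear predictor including the current value m_i(t) and its slope."""
    return h0_t * math.exp(gamma_age * age + gamma_cv * cv
                           + alpha1 * m_t + alpha2 * dm_t)

# With all association parameters at zero, the hazard equals the
# baseline hazard; the trajectory at t = 2 with intercept 4.0 and
# slope 0.1 (and zero age/sex effects) is 4.2.
m_val = m(2.0, beta0=4.0, beta1=0.1, age=60, sex=1)
h_val = hazard(h0_t=0.05, age=60, cv=1, m_t=m_val, dm_t=0.1)
```

The dynamic predictions π_i(s, t) then follow by integrating this hazard over (s, s + t] given the subject's estimated random effects, which is what JM's prediction machinery does internally.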

R²_π(s, t) vs s (t = 5 years)
[Figure: estimated R²_π(s, t) vs landmark time s, with 95% pointwise confidence intervals and a 95% simultaneous confidence band; values range between 0% and about 30%]

Comparing R²_π(s, t) vs s for different π(s, t)
[Figure: R²_π(s, t) vs landmark time s (scale 0% to 30%) for three prediction tools: T ~ Age + CV + m(t) + m'(t) (joint model), T ~ Age + CV + Y(t=0), and T ~ Age + CV; and the difference in R²_π(s, t) between tools vs landmark time s]

Calibration plot (example for s = 3 years)
[Figure: predicted vs observed (with standard errors) 5-year risks across ten quantile-based risk groups; e.g. in the highest-risk group the predicted risk is 77% and the observed risk 74 (8)%]

Area under the ROC(s, t) curve vs s
[Figure: AUC_π(s, t) vs landmark time s (scale 50% to 100%) for the three prediction tools: T ~ Age + CV + m(t) + m'(t) (JM), T ~ Age + CV + Y(t=0), and T ~ Age + CV]

Summing up
The R²-type curve:
- summarizes calibration and discrimination simultaneously
- has an understandable trend
Simple model-free inference:
- predictions can be obtained from any model
- we do not assume any model to hold
- this allows fair comparisons of different predictions
The method accounts for censoring and for the dynamic setting (the at-risk population changes).

Discussion
The strong calibration assumption allows different interesting interpretations: explained variation, correlation, and mean risk difference.
Unfortunately, strong calibration cannot be checked (curse of dimensionality). However, the weak and strong definitions are closely related:
- strong calibration implies weak calibration
- weak calibration can often be seen as a reasonable approximation of strong calibration in practice
- weak calibration can be assessed (with plots)
Thank you for your attention!


More information

Course 4 Examination Questions And Illustrative Solutions. November 2000

Course 4 Examination Questions And Illustrative Solutions. November 2000 Course 4 Examination Questions And Illustrative Solutions Novemer 000 1. You fit an invertile first-order moving average model to a time series. The lag-one sample autocorrelation coefficient is 0.35.

More information

Week 5: Multiple Linear Regression

Week 5: Multiple Linear Regression BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

More information

Sales forecasting # 2

Sales forecasting # 2 Sales forecasting # 2 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting

More information

Supplement to Call Centers with Delay Information: Models and Insights

Supplement to Call Centers with Delay Information: Models and Insights Supplement to Call Centers with Delay Information: Models and Insights Oualid Jouini 1 Zeynep Akşin 2 Yves Dallery 1 1 Laboratoire Genie Industriel, Ecole Centrale Paris, Grande Voie des Vignes, 92290

More information

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators... MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

More information

MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables

MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables MATH 10: Elementary Statistics and Probability Chapter 5: Continuous Random Variables Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of slides,

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

More information

Missing data and net survival analysis Bernard Rachet

Missing data and net survival analysis Bernard Rachet Workshop on Flexible Models for Longitudinal and Survival Data with Applications in Biostatistics Warwick, 27-29 July 2015 Missing data and net survival analysis Bernard Rachet General context Population-based,

More information

Determining distribution parameters from quantiles

Determining distribution parameters from quantiles Determining distribution parameters from quantiles John D. Cook Department of Biostatistics The University of Texas M. D. Anderson Cancer Center P. O. Box 301402 Unit 1409 Houston, TX 77230-1402 USA cook@mderson.org

More information

University of Maryland Fraternity & Sorority Life Spring 2015 Academic Report

University of Maryland Fraternity & Sorority Life Spring 2015 Academic Report University of Maryland Fraternity & Sorority Life Academic Report Academic and Population Statistics Population: # of Students: # of New Members: Avg. Size: Avg. GPA: % of the Undergraduate Population

More information

Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza

Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza Handling missing data in large data sets Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza The problem Often in official statistics we have large data sets with many variables and

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

More information

Fitting Subject-specific Curves to Grouped Longitudinal Data

Fitting Subject-specific Curves to Grouped Longitudinal Data Fitting Subject-specific Curves to Grouped Longitudinal Data Djeundje, Viani Heriot-Watt University, Department of Actuarial Mathematics & Statistics Edinburgh, EH14 4AS, UK E-mail: vad5@hw.ac.uk Currie,

More information

Is a Brownian motion skew?

Is a Brownian motion skew? Is a Brownian motion skew? Ernesto Mordecki Sesión en honor a Mario Wschebor Universidad de la República, Montevideo, Uruguay XI CLAPEM - November 2009 - Venezuela 1 1 Joint work with Antoine Lejay and

More information

Monte Carlo Simulation

Monte Carlo Simulation 1 Monte Carlo Simulation Stefan Weber Leibniz Universität Hannover email: sweber@stochastik.uni-hannover.de web: www.stochastik.uni-hannover.de/ sweber Monte Carlo Simulation 2 Quantifying and Hedging

More information

Checking proportionality for Cox s regression model

Checking proportionality for Cox s regression model Checking proportionality for Cox s regression model by Hui Hong Zhang Thesis for the degree of Master of Science (Master i Modellering og dataanalyse) Department of Mathematics Faculty of Mathematics and

More information

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables. SIMPLE LINEAR CORRELATION Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD

Tips for surviving the analysis of survival data. Philip Twumasi-Ankrah, PhD Tips for surviving the analysis of survival data Philip Twumasi-Ankrah, PhD Big picture In medical research and many other areas of research, we often confront continuous, ordinal or dichotomous outcomes

More information

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so: Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a

More information

Testing against a Change from Short to Long Memory

Testing against a Change from Short to Long Memory Testing against a Change from Short to Long Memory Uwe Hassler and Jan Scheithauer Goethe-University Frankfurt This version: January 2, 2008 Abstract This paper studies some well-known tests for the null

More information

Interpretation of Somers D under four simple models

Interpretation of Somers D under four simple models Interpretation of Somers D under four simple models Roger B. Newson 03 September, 04 Introduction Somers D is an ordinal measure of association introduced by Somers (96)[9]. It can be defined in terms

More information

6.2 Permutations continued

6.2 Permutations continued 6.2 Permutations continued Theorem A permutation on a finite set A is either a cycle or can be expressed as a product (composition of disjoint cycles. Proof is by (strong induction on the number, r, of

More information

Life Tables. Marie Diener-West, PhD Sukon Kanchanaraksa, PhD

Life Tables. Marie Diener-West, PhD Sukon Kanchanaraksa, PhD This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS

SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS Copyright 005 by the Society of Actuaries and the Casualty Actuarial Society

More information

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni 1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

More information

Nonlinear Regression:

Nonlinear Regression: Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day : Improved Inference

More information

A General Approach to Variance Estimation under Imputation for Missing Survey Data

A General Approach to Variance Estimation under Imputation for Missing Survey Data A General Approach to Variance Estimation under Imputation for Missing Survey Data J.N.K. Rao Carleton University Ottawa, Canada 1 2 1 Joint work with J.K. Kim at Iowa State University. 2 Workshop on Survey

More information

Monte Carlo-based statistical methods (MASM11/FMS091)

Monte Carlo-based statistical methods (MASM11/FMS091) Monte Carlo-based statistical methods (MASM11/FMS091) Jimmy Olsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February 5, 2013 J. Olsson Monte Carlo-based

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for

More information

Sample Size Calculation for Longitudinal Studies

Sample Size Calculation for Longitudinal Studies Sample Size Calculation for Longitudinal Studies Phil Schumm Department of Health Studies University of Chicago August 23, 2004 (Supported by National Institute on Aging grant P01 AG18911-01A1) Introduction

More information

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091)

Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Monte Carlo and Empirical Methods for Stochastic Inference (MASM11/FMS091) Magnus Wiktorsson Centre for Mathematical Sciences Lund University, Sweden Lecture 5 Sequential Monte Carlo methods I February

More information

Estimating the Degree of Activity of jumps in High Frequency Financial Data. joint with Yacine Aït-Sahalia

Estimating the Degree of Activity of jumps in High Frequency Financial Data. joint with Yacine Aït-Sahalia Estimating the Degree of Activity of jumps in High Frequency Financial Data joint with Yacine Aït-Sahalia Aim and setting An underlying process X = (X t ) t 0, observed at equally spaced discrete times

More information

Properties of Future Lifetime Distributions and Estimation

Properties of Future Lifetime Distributions and Estimation Properties of Future Lifetime Distributions and Estimation Harmanpreet Singh Kapoor and Kanchan Jain Abstract Distributional properties of continuous future lifetime of an individual aged x have been studied.

More information

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009

HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 HURDLE AND SELECTION MODELS Jeff Wooldridge Michigan State University BGSE/IZA Course in Microeconometrics July 2009 1. Introduction 2. A General Formulation 3. Truncated Normal Hurdle Model 4. Lognormal

More information

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg

SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg SPSS TRAINING SESSION 3 ADVANCED TOPICS (PASW STATISTICS 17.0) Sun Li Centre for Academic Computing lsun@smu.edu.sg IN SPSS SESSION 2, WE HAVE LEARNT: Elementary Data Analysis Group Comparison & One-way

More information

Imputation of missing data under missing not at random assumption & sensitivity analysis

Imputation of missing data under missing not at random assumption & sensitivity analysis Imputation of missing data under missing not at random assumption & sensitivity analysis S. Jolani Department of Methodology and Statistics, Utrecht University, the Netherlands Advanced Multiple Imputation,

More information

Life Table Analysis using Weighted Survey Data

Life Table Analysis using Weighted Survey Data Life Table Analysis using Weighted Survey Data James G. Booth and Thomas A. Hirschl June 2005 Abstract Formulas for constructing valid pointwise confidence bands for survival distributions, estimated using

More information

Modelling spousal mortality dependence: evidence of heterogeneities and implications

Modelling spousal mortality dependence: evidence of heterogeneities and implications 1/23 Modelling spousal mortality dependence: evidence of heterogeneities and implications Yang Lu Scor and Aix-Marseille School of Economics Lyon, September 2015 2/23 INTRODUCTION 3/23 Motivation It has

More information

Chapter 4: Statistical Hypothesis Testing

Chapter 4: Statistical Hypothesis Testing Chapter 4: Statistical Hypothesis Testing Christophe Hurlin November 20, 2015 Christophe Hurlin () Advanced Econometrics - Master ESA November 20, 2015 1 / 225 Section 1 Introduction Christophe Hurlin

More information

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling

Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data

More information

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

More information

17. SIMPLE LINEAR REGRESSION II

17. SIMPLE LINEAR REGRESSION II 17. SIMPLE LINEAR REGRESSION II The Model In linear regression analysis, we assume that the relationship between X and Y is linear. This does not mean, however, that Y can be perfectly predicted from X.

More information

Nonlinear Regression:

Nonlinear Regression: Zurich University of Applied Sciences School of Engineering IDP Institute of Data Analysis and Process Design Nonlinear Regression: A Powerful Tool With Considerable Complexity Half-Day 1: Estimation and

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails

A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails 12th International Congress on Insurance: Mathematics and Economics July 16-18, 2008 A Uniform Asymptotic Estimate for Discounted Aggregate Claims with Subexponential Tails XUEMIAO HAO (Based on a joint

More information

Generating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan. 15.450, Fall 2010

Generating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan. 15.450, Fall 2010 Simulation Methods Leonid Kogan MIT, Sloan 15.450, Fall 2010 c Leonid Kogan ( MIT, Sloan ) Simulation Methods 15.450, Fall 2010 1 / 35 Outline 1 Generating Random Numbers 2 Variance Reduction 3 Quasi-Monte

More information

Testing against a Change from Short to Long Memory

Testing against a Change from Short to Long Memory Testing against a Change from Short to Long Memory Uwe Hassler and Jan Scheithauer Goethe-University Frankfurt This version: December 9, 2007 Abstract This paper studies some well-known tests for the null

More information

Unified Lecture # 4 Vectors

Unified Lecture # 4 Vectors Fall 2005 Unified Lecture # 4 Vectors These notes were written by J. Peraire as a review of vectors for Dynamics 16.07. They have been adapted for Unified Engineering by R. Radovitzky. References [1] Feynmann,

More information

Principle of Data Reduction

Principle of Data Reduction Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

More information

Lesson19: Comparing Predictive Accuracy of two Forecasts: Th. Diebold-Mariano Test

Lesson19: Comparing Predictive Accuracy of two Forecasts: Th. Diebold-Mariano Test Lesson19: Comparing Predictive Accuracy of two Forecasts: The Diebold-Mariano Test Dipartimento di Ingegneria e Scienze dell Informazione e Matematica Università dell Aquila, umberto.triacca@univaq.it

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

M1 in Economics and Economics and Statistics Applied multivariate Analysis - Big data analytics Worksheet 1 - Bootstrap

M1 in Economics and Economics and Statistics Applied multivariate Analysis - Big data analytics Worksheet 1 - Bootstrap Nathalie Villa-Vialanei Année 2015/2016 M1 in Economics and Economics and Statistics Applied multivariate Analsis - Big data analtics Worksheet 1 - Bootstrap This worksheet illustrates the use of nonparametric

More information

Topic 5: Stochastic Growth and Real Business Cycles

Topic 5: Stochastic Growth and Real Business Cycles Topic 5: Stochastic Growth and Real Business Cycles Yulei Luo SEF of HKU October 1, 2015 Luo, Y. (SEF of HKU) Macro Theory October 1, 2015 1 / 45 Lag Operators The lag operator (L) is de ned as Similar

More information

Time Series Analysis

Time Series Analysis Time Series Analysis Autoregressive, MA and ARMA processes Andrés M. Alonso Carolina García-Martos Universidad Carlos III de Madrid Universidad Politécnica de Madrid June July, 212 Alonso and García-Martos

More information

Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

Statistical Analysis of Life Insurance Policy Termination and Survivorship

Statistical Analysis of Life Insurance Policy Termination and Survivorship Statistical Analysis of Life Insurance Policy Termination and Survivorship Emiliano A. Valdez, PhD, FSA Michigan State University joint work with J. Vadiveloo and U. Dias Session ES82 (Statistics in Actuarial

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards Week 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution of a

More information

UNTIED WORST-RANK SCORE ANALYSIS

UNTIED WORST-RANK SCORE ANALYSIS Paper PO16 UNTIED WORST-RANK SCORE ANALYSIS William F. McCarthy, Maryland Medical Research Institute, Baltime, MD Nan Guo, Maryland Medical Research Institute, Baltime, MD ABSTRACT When the non-fatal outcome

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

More information

Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page

Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page Errata for ASM Exam C/4 Study Manual (Sixteenth Edition) Sorted by Page 1 Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page Practice exam 1:9, 1:22, 1:29, 9:5, and 10:8

More information

Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 )

Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) Chapter 13 Introduction to Nonlinear Regression( 非 線 性 迴 歸 ) and Neural Networks( 類 神 經 網 路 ) 許 湘 伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 35 13 Examples

More information

A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA

A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA REVSTAT Statistical Journal Volume 4, Number 2, June 2006, 131 142 A LOGNORMAL MODEL FOR INSURANCE CLAIMS DATA Authors: Daiane Aparecida Zuanetti Departamento de Estatística, Universidade Federal de São

More information

How To Model The Fate Of An Animal

How To Model The Fate Of An Animal Models Where the Fate of Every Individual is Known This class of models is important because they provide a theory for estimation of survival probability and other parameters from radio-tagged animals.

More information

Parabolic Equations. Chapter 5. Contents. 5.1.2 Well-Posed Initial-Boundary Value Problem. 5.1.3 Time Irreversibility of the Heat Equation

Parabolic Equations. Chapter 5. Contents. 5.1.2 Well-Posed Initial-Boundary Value Problem. 5.1.3 Time Irreversibility of the Heat Equation 7 5.1 Definitions Properties Chapter 5 Parabolic Equations Note that we require the solution u(, t bounded in R n for all t. In particular we assume that the boundedness of the smooth function u at infinity

More information

Financial Risk Forecasting Chapter 8 Backtesting and stresstesting

Financial Risk Forecasting Chapter 8 Backtesting and stresstesting Financial Risk Forecasting Chapter 8 Backtesting and stresstesting Jon Danielsson London School of Economics 2015 To accompany Financial Risk Forecasting http://www.financialriskforecasting.com/ Published

More information

Tests for Two Survival Curves Using Cox s Proportional Hazards Model

Tests for Two Survival Curves Using Cox s Proportional Hazards Model Chapter 730 Tests for Two Survival Curves Using Cox s Proportional Hazards Model Introduction A clinical trial is often employed to test the equality of survival distributions of two treatment groups.

More information

A random point process model for the score in sport matches

A random point process model for the score in sport matches IMA Journal of Management Mathematics (2009) 20, 121 131 doi:10.1093/imaman/dpn027 Advance Access publication on October 30, 2008 A random point process model for the score in sport matches PETR VOLF Institute

More information