Modelling Time Series of Counts




Modelling Time Series of Counts
Richard A. Davis, Colorado State University
William Dunsmuir, University of New South Wales
Ying Wang, Colorado State University
/3/00 Modelling Time Series of Counts

Two Types of Models for Poisson Counts
Parameter-driven models
- Poisson regression under serial dependence
- Testing for a latent process
- Estimating serial dependence
- Fitting latent processes
Observation-driven models
- Fitting, distribution and standard errors
- Application to asthma data

Example: Daily Asthma Presentations (1990:1993)
[Figure: four panels of daily counts plotted by month (Jan to Dec), one panel for each of the years 1990, 1991, 1992 and 1993.]

Polio Incidence in the U.S.
The rate of polio infection dropped dramatically following the inactivated polio vaccine (IPV) introduction in 1955. The decline continued following the introduction of live oral polio vaccine (OPV) in 1961. In 1960 there were 2,525 cases of paralytic polio reported in the United States, and in 1965 there were only 61. Between 1980 and 1990 an average of 8 cases were reported per year, most of which were vaccine associated. Since 1979 there has not been a single case of polio caused by wild virus in the United States, and only an average of one imported case per year.
CENTER FOR DISEASE CONTROL AND PREVENTION. Date Last Revised: March 9 1995

Example: Polio Counts (Zeger 1988)
[Figure: Counts (0 to 14) against Year (1970 to 1984).]

Notation and Setup
Count data: $Y_1, \ldots, Y_n$
Regression variable: $x_t$
Model: Given $x_t$ and a stochastic process $\nu_t$, the $Y_t$ are independent and Poisson distributed with mean $\mu_t = \exp(x_t^T\beta + \nu_t)$.
The distribution of the stochastic process $\nu_t$ may depend on a vector of parameters $\gamma$.
Note: If $\nu_t = 0$, this is the standard Poisson regression model.
Objective: Inference about $\beta$.

Linear Regression Model - A Review
Suppose $\{Y_t\}$ follows the linear model with time series errors given by
$Y_t = x_t^T\beta + W_t$,
where $\{W_t\}$ is a stationary (ARMA) time series.
- Estimate $\beta$ by ordinary least squares (OLS).
- The OLS estimate has the same asymptotic efficiency as the MLE.
- The asymptotic covariance matrix of $\hat\beta_{OLS}$ depends on the ARMA parameters.
- Identify and estimate the ARMA parameters using the estimated residuals $W_t = Y_t - x_t^T\hat\beta_{OLS}$.
- Re-estimate $\beta$ and the ARMA parameters using full MLE.

Example: Polio (cont)
Regression function:
$x_t^T = (1,\ t'/1000,\ \cos(2\pi t'/12),\ \sin(2\pi t'/12),\ \cos(2\pi t'/6),\ \sin(2\pi t'/6))$, where $t' = (t - 73)$.
Summary of various model fits to the polio data:

Study                        Trend(β̂)   SE(β̂)   t-ratio
GLM Estimate                 -4.80       1.40     -3.43
Zeger (1988)                 -4.35       2.68     -1.62
Chan and Ledolter (1995)     -4.62       1.38     -3.35
Kuk & Cheng (1996) MCNR      -3.79       2.95     -1.28
Jorgensen et al (1995)       -1.64       .018     -91.1
Fahrmeir and Tutz (1994)     -3.33       2.00     -1.67

Desiderata for models - Zeger and Qaqish
Zeger & Qaqish (1988) offer 3 desiderata that should be met.
1. Ease of interpretation. The marginal mean of $Y_t$ should be approximately $E(Y_t) \approx \mu_t = \exp(x_t^T\beta)$ (the regression coefficient $\beta$ can be interpreted as the proportional change in the marginal expectation of $Y_t$ given a unit change in $x_t$).
2. Flexibility. Both positive and negative serial correlation should be possible in the model.
3. Orthogonality of the estimates of $\beta$ and $\gamma$. (Enables implementation of a 2-stage estimation procedure?)

Desiderata for models - continued
Condition 3 is met for linear regression models with time series errors. For count data this condition may be overly restrictive, since the mean and variance of $Y_t$ are linked.
4. Ease of producing forecasts. Often this is the primary goal of time series modelling.
5. Procedures for model fitting and inference.
6. Diagnostic tools. Required for assessing model adequacy.

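Latent Process or Parameter Driven Model
Count data: $Y_1, \ldots, Y_n$
The conditional distribution of $Y_t$ given $x_t$ and a non-negative stochastic process $\varepsilon_t$ is Poisson with mean $\varepsilon_t\exp(x_t^T\beta)$, i.e.
$Y_t \mid \varepsilon_t, x_t \sim P(\varepsilon_t\exp(x_t^T\beta))$.
Note: $E\,Y_t = \exp(x_t^T\beta)\,E\,\varepsilon_t$. We assume $E\,\varepsilon_t = 1$ for identification purposes.
Assumptions on the latent process: $\{\varepsilon_t\}$ is a non-negative stationary time series with mean 1 and ACVF $\gamma_\varepsilon(h) = E(\varepsilon_{t+h} - 1)(\varepsilon_t - 1)$.
Often assume $\varepsilon_t = \exp(\alpha_t)$, where $\{\alpha_t\}$ is a stationary Gaussian T.S. ($\alpha_t \sim N(-\sigma_\alpha^2/2,\ \sigma_\alpha^2)$).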
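For the log-normal case the ACVF of the latent process has a closed form: if $\alpha_t \sim N(-\sigma_\alpha^2/2, \sigma_\alpha^2)$ with autocorrelation $\rho_\alpha(h)$, then $E(\varepsilon_{t+h}\varepsilon_t) = \exp(\sigma_\alpha^2\rho_\alpha(h))$ and so $\gamma_\varepsilon(h) = \exp(\sigma_\alpha^2\rho_\alpha(h)) - 1$. The sketch below evaluates this for a Gaussian AR(1) $\alpha_t$. The function names and the example values ($\sigma_\alpha^2 = .693$, $\phi = .9$) are illustrative choices, not taken from any software used by the authors.

```python
import math

def lognormal_acvf(sigma2_alpha, rho_alpha_h):
    """ACVF of eps_t = exp(alpha_t) at a lag where Cor(alpha_{t+h}, alpha_t) = rho_alpha_h.

    Assumes alpha_t ~ N(-sigma2_alpha / 2, sigma2_alpha), so that E(eps_t) = 1.
    """
    return math.exp(sigma2_alpha * rho_alpha_h) - 1.0

def ar1_lognormal_acvf(sigma2_alpha, phi, max_lag):
    """gamma_eps(h) for h = 0..max_lag when alpha_t is a Gaussian AR(1) with coefficient phi."""
    return [lognormal_acvf(sigma2_alpha, phi ** h) for h in range(max_lag + 1)]

# Illustrative values: Var(alpha_t) = .693 (about log 2) and phi = .9
acvf = ar1_lognormal_acvf(sigma2_alpha=0.693, phi=0.9, max_lag=6)
print([round(g, 2) for g in acvf])
```

With $\sigma_\alpha^2 \approx \log 2$ the latent process has variance close to 1, and its autocovariances decay more slowly than geometrically in $h$.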

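Moment Properties of the Poisson Count Process
Mean of $Y_t$: $\mu_t = E(Y_t) = \exp(x_t^T\beta)$
Variance of $Y_t$: $\mathrm{Var}(Y_t) = \mu_t + \mu_t^2\sigma_\varepsilon^2$
Autocovariance function of $Y_t$: $\mathrm{Cov}(Y_{t+h}, Y_t) = \mu_t\mu_{t+h}\gamma_\varepsilon(h)$, $h \neq 0$.
Autocorrelation function of $Y_t$:
$\mathrm{Cor}(Y_{t+h}, Y_t) = \rho_\varepsilon(h) / \left((1 + \mu_t^{-1}\sigma_\varepsilon^{-2})(1 + \mu_{t+h}^{-1}\sigma_\varepsilon^{-2})\right)^{1/2}$
Special case $\varepsilon_t = \exp(\alpha_t)$: $0 \le |\mathrm{Cor}(Y_{t+h}, Y_t)| \le |\rho_\alpha(h)|$
Implication: it is difficult to detect correlation in the latent process from $Y_t$.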
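A small numerical sketch of the attenuation described by these moment formulas. The function names and the example values (mean 2, latent variance .05, latent correlation .8) are hypothetical and chosen only for illustration.

```python
import math

def count_variance(mu, sigma2_eps):
    """Var(Y_t) = mu_t + mu_t^2 * sigma_eps^2."""
    return mu + mu ** 2 * sigma2_eps

def count_correlation(mu_t, mu_th, sigma2_eps, rho_eps_h):
    """Cor(Y_{t+h}, Y_t) implied by the latent-process correlation rho_eps(h)."""
    attenuation = math.sqrt((1.0 + 1.0 / (mu_t * sigma2_eps)) *
                            (1.0 + 1.0 / (mu_th * sigma2_eps)))
    return rho_eps_h / attenuation

# Hypothetical values: small counts and a small latent variance
mu, sigma2_eps, rho_eps = 2.0, 0.05, 0.8
cor = count_correlation(mu, mu, sigma2_eps, rho_eps)

# Cross-check against Cov / sqrt(Var * Var), with Cov = mu_t * mu_{t+h} * gamma_eps(h)
cov = mu * mu * (rho_eps * sigma2_eps)
check = cov / count_variance(mu, sigma2_eps)
print(cor, check)
```

Here a latent correlation of .8 shrinks to roughly .07 in the counts, which illustrates why serial dependence in the latent process is hard to see in the observed series.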

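GLM Estimates
Model: $Y_t \mid \varepsilon_t, x_t \sim P(\varepsilon_t\exp(x_t^T\beta))$.
GLM log-likelihood:
$l(\beta) = -\sum_{t=1}^{n} e^{x_t^T\beta} + \sum_{t=1}^{n} Y_t x_t^T\beta - \log\left(\prod_{t=1}^{n} Y_t!\right)$
(The likelihood ignores the presence of the latent process.)
Assumptions on regressors:
$\Omega_{I,n} = n^{-1}\sum_{t=1}^{n} x_t x_t^T\mu_t \to \Omega_I(\beta)$,
$\Omega_{II,n} = n^{-1}\sum_{t=1}^{n}\sum_{s=1}^{n} x_t x_s^T\mu_t\mu_s\gamma_\varepsilon(s - t) \to \Omega_{II}(\beta)$.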

Theorem for GLM Estimates
Theorem. Let $\hat\beta$ be the GLM estimate of $\beta$ obtained by maximizing $l(\beta)$ for the Poisson regression model with a stationary lognormal latent process. Then
$n^{1/2}(\hat\beta - \beta) \xrightarrow{d} N(0,\ \Omega_I^{-1} + \Omega_I^{-1}\Omega_{II}\Omega_I^{-1})$.
Notes:
1. $n^{-1}\Omega_I^{-1}$ is the asymptotic covariance matrix from a standard GLM analysis.
2. $n^{-1}\Omega_I^{-1}\Omega_{II}\Omega_I^{-1}$ is the additional contribution due to the presence of the latent process.
3. The result is also valid for more general latent processes (mixing, etc.).
4. $x_t$ can depend on the sample size $n$.
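The two covariance pieces in the theorem can be evaluated directly once $\beta$ and the ACVF of the latent process are given. The following is a minimal sketch, assuming the sample versions $n^{-1}\sum_t x_t x_t^T\mu_t$ and $n^{-1}\sum_t\sum_s x_t x_s^T\mu_t\mu_s\gamma_\varepsilon(s-t)$ are used in place of the limits. The function name `glm_sandwich_cov`, the linear-trend design and all parameter values are hypothetical.

```python
import numpy as np

def glm_sandwich_cov(X, beta, gamma_eps):
    """Return (standard GLM covariance, covariance adjusted for the latent process).

    X         : (n, k) design matrix with rows x_t
    beta      : (k,) regression coefficients
    gamma_eps : callable h -> gamma_eps(h) for h >= 0 (gamma_eps(0) is the latent variance)
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mu = np.exp(X @ beta)
    Z = X * mu[:, None]                                  # rows mu_t * x_t
    omega_I = X.T @ Z / n
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    G = np.vectorize(gamma_eps, otypes=[float])(lags)    # gamma_eps(|s - t|)
    omega_II = Z.T @ G @ Z / n
    inv_I = np.linalg.inv(omega_I)
    return inv_I / n, (inv_I + inv_I @ omega_II @ inv_I) / n

# Hypothetical example: linear trend x_t = (1, t/n), log-normal AR(1) latent process
n = 50
t = np.arange(1, n + 1)
X = np.column_stack([np.ones(n), t / n])
beta = np.array([0.5, -1.0])
sigma2_alpha, phi = 0.57, 0.82
gamma_eps = lambda h: np.exp(sigma2_alpha * phi ** h) - 1.0

cov_glm, cov_full = glm_sandwich_cov(X, beta, gamma_eps)
print(np.sqrt(np.diag(cov_glm)), np.sqrt(np.diag(cov_full)))
```

Comparing the two sets of standard errors shows how much a standard GLM analysis can understate the uncertainty in $\hat\beta$ when a positively correlated latent process is present.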

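When does the CLT Apply?
Conditions on the regressors hold for:
1. Trend functions. $x_{nt} = f(t/n)$, where $f$ is a continuous function on $[0,1]$. In this case
$n^{-1}\sum_{t} x_{nt}x_{nt}^T\mu_t \to \int_0^1 f(s)f^T(s)\,e^{f^T(s)\beta}\,ds = \Omega_I(\beta)$,
$n^{-1}\sum_{t}\sum_{s} x_{nt}x_{ns}^T\mu_t\mu_s\gamma_\varepsilon(s - t) \to \sum_{h}\gamma_\varepsilon(h)\int_0^1 f(s)f^T(s)\,e^{2f^T(s)\beta}\,ds = \Omega_{II}(\beta)$.
Remark. $x_{nt} = (1, t/n)^T$ corresponds to linear regression and works. However, $x_t = (1, t)^T$ does not produce consistent estimates, say, if the true slope is negative.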

When does the CLT apply? (cont)
2. Harmonic functions to specify annual or weekly effects, e.g. $x_t = \cos(2\pi t/7)$.
3. Stationary processes (e.g. a seasonally adjusted temperature series).

Application to Polio Data
Use the same regression function as before.
Assume the latent process is a log-normal AR(1), i.e. $\ln\varepsilon_t = \alpha_t$, where
$(\alpha_t + \sigma^2/2) = \phi(\alpha_{t-1} + \sigma^2/2) + \eta_t$, $\{\eta_t\} \sim$ IID $N(0,\ \sigma^2(1 - \phi^2))$, with $\phi = .82$, $\sigma^2 = .57$.

Columns: Zeger ($\hat\beta_Z$, s.e.) | GLM Fit ($\hat\beta_{GLM}$, s.e.) | Asym (s.e.) | Simulation ($\hat\beta_{GLM}$, s.d.)
Intercept 0.7 0.3.07.075.05.50.3
Trend (× 10^-3) -4.35.68-4.80.40 4. -4.89 3.94
cos(2πt/12) -0. 0.6-0.5.097.57 -.45.44
sin(2πt/12) -.048 0.7-0.53.09.68 -.53.68
cos(2πt/6) 0.0 0.4.69.098..67.3
sin(2πt/6) -0.4 0.4 -.43.0.5 -.440.5

Polio Data With Estimated Regression Function
[Figure: Counts (0 to 14) against Year (1970 to 1984).]

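Testing for the Existence of a Latent Process
Under $H_0$: no latent process (i.e. $\varepsilon_t \equiv 1$), the Pearson residuals
$e_t = (Y_t - \hat\mu_t)/\hat\mu_t^{1/2}$
are approx IID N(0,1).
The test statistic
$Q = \left(\sum_{t=1}^{n} e_t^2 - n\right)/\hat\sigma_Q$, with $\hat\sigma_Q^2 = 2n + \sum_{t=1}^{n}\hat\mu_t^{-1}$,
has an approx N(0,1) distribution.
The test does not perform well.
$\alpha$: .100 .050 .025
$P(Q > z_{1-\alpha})$: .036.00.004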

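Adjustments to Test Statistic
Standardized Pearson residuals:
$\tilde e_t = (Y_t - \hat\mu_t)/\left(\hat\mu_t(1 - h_t)\right)^{1/2}$,
where $h_t$ is the $t$th diagonal value of the hat matrix.
Brannas and Johansson (1994) test statistic:
$S_a = \sum_{t=1}^{n}\left[(Y_t - \hat\mu_t)^2 - Y_t + h_t\hat\mu_t\right] / \left(2\sum_{t=1}^{n}\hat\mu_t^2\right)^{1/2}$,
based on a local alternative hypothesis against a negative binomial alternative. ($S_a$ is the version adapted by Dean and Lawless (1989) and generally worked best.)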
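A minimal sketch of the adjusted statistic $S_a$, assuming the usual Poisson GLM hat matrix $W^{1/2}X(X^TWX)^{-1}X^TW^{1/2}$ with $W = \mathrm{diag}(\hat\mu_t)$, so that $h_t = \hat\mu_t x_t^T(X^TWX)^{-1}x_t$. The function name and the four-observation data set are hypothetical. With an intercept-only model the fitted mean is the sample mean and $h_t = 1/n$, which makes the result easy to check by hand.

```python
import numpy as np

def dean_lawless_sa(y, X, mu_hat):
    """Adjusted score statistic S_a and the hat-matrix diagonal h_t for a Poisson GLM."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    xtwx_inv = np.linalg.inv(X.T @ (X * mu_hat[:, None]))
    # h_t = mu_t * x_t^T (X^T W X)^{-1} x_t, with W = diag(mu_hat)
    h = mu_hat * np.einsum("ij,jk,ik->i", X, xtwx_inv, X)
    numerator = np.sum((y - mu_hat) ** 2 - y + h * mu_hat)
    return numerator / np.sqrt(2.0 * np.sum(mu_hat ** 2)), h

# Hypothetical data, intercept-only fit (fitted mean = sample mean)
y = np.array([0.0, 1.0, 2.0, 5.0])
X = np.ones((4, 1))
mu_hat = np.full(4, y.mean())

s_a, h = dean_lawless_sa(y, X, mu_hat)
print(s_a, h)
```

Large positive values of $S_a$ point to overdispersion relative to the Poisson model, which is what a latent process would induce.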

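Zeger's estimates of autocovariances
Zeger (1988) proposed the following estimates of the ACVF of the latent process:
$\hat\sigma^2_{\varepsilon,Z} = \sum_{t=1}^{n}\left[(Y_t - \hat\mu_t)^2 - \hat\mu_t\right] / \sum_{t=1}^{n}\hat\mu_t^2$
$\hat\gamma_{\varepsilon,Z}(h) = \sum_{t=1}^{n-h}(Y_t - \hat\mu_t)(Y_{t+h} - \hat\mu_{t+h}) / \sum_{t=1}^{n-h}\hat\mu_t\hat\mu_{t+h}$
$\hat\rho_{\varepsilon,Z}(h) = \hat\gamma_{\varepsilon,Z}(h)/\hat\sigma^2_{\varepsilon,Z}$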
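These moment estimates translate directly into code. The sketch below is one possible implementation, taking the fitted means $\hat\mu_t$ as given. The function name `zeger_acvf` and the toy data are hypothetical.

```python
import numpy as np

def zeger_acvf(y, mu_hat, max_lag):
    """Zeger-type moment estimates of the latent-process ACVF and ACF.

    Returns (gamma, rho) with gamma[0] the variance estimate and
    gamma[h], rho[h] the lag-h estimates for h = 1..max_lag.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu_hat, dtype=float)
    r = y - mu
    gamma = [np.sum(r ** 2 - mu) / np.sum(mu ** 2)]
    for h in range(1, max_lag + 1):
        gamma.append(np.sum(r[:-h] * r[h:]) / np.sum(mu[:-h] * mu[h:]))
    gamma = np.array(gamma)
    return gamma, gamma / gamma[0]

# Toy example with a constant fitted mean
y = [0, 3, 1, 4]
mu_hat = [2.0, 2.0, 2.0, 2.0]
gamma, rho = zeger_acvf(y, mu_hat, max_lag=1)
print(gamma, rho)
```

Note that the variance estimate can be negative in small samples, in which case the correlation estimates are not meaningful.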

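Bias Adjustments to Zeger's estimates
Letting $\beta_0$ denote the true parameter value, write
$\hat\mu_t = \exp(x_t^T(\hat\beta - \beta_0))\,\mu_t$, where $\mu_t = \exp(x_t^T\beta_0)$.
Using the theorem, $\hat\beta - \beta_0$ is approximately distributed as $N(0, G_n)$, where
$G_n = n^{-1}\left(\Omega_I^{-1} + \Omega_I^{-1}\Omega_{II}\Omega_I^{-1}\right)$.
$\hat\mu_t$ has an approximate lognormal distribution with mean and second moment
$E(\hat\mu_t) = \mu_t\,E\left(\exp(x_t^T(\hat\beta - \beta_0))\right) \approx \mu_t\exp(x_t^T G_n x_t/2)$,
$E(\hat\mu_t^2) = \mu_t^2\,E\left(\exp(2x_t^T(\hat\beta - \beta_0))\right) \approx \mu_t^2\exp(2x_t^T G_n x_t)$.
Thus both first and second moments have positive bias. A nearly unbiased estimate of $\mu_t$ is then
$\hat\mu_t\exp(-x_t^T G_n x_t/2)$.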

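Bias Adjustments to Zeger's estimates (cont)
Using these results, a bias adjusted estimate of the variance of the latent process is
( ) ( ) [ ] / ε µ µ + + µ µ σ G G G G UB e e e e Y
where the limiting covariance matrix is estimated by
$\hat G_n = n^{-1}\left(\hat\Omega_I^{-1} + \hat\Omega_I^{-1}\hat\Omega_{II}\hat\Omega_I^{-1}\right)$,
$\hat\Omega_I = n^{-1}\sum_{t=1}^{n} x_t x_t^T\hat\mu_t$,
$\hat\Omega_{II} = n^{-1}\sum_{h=-L}^{L}\ \sum_{t=\max(1,\,1-h)}^{\min(n,\,n-h)} x_t x_{t+h}^T\hat\mu_t\hat\mu_{t+h}\hat\gamma_{\varepsilon,Z}(h)$.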

Simulation Results (linear regression function)
Autocovariance estimates of a log-normal AR(1) latent process with φ = .9, variance .693 and reg function+/ (00).
Columns: Lag | True | Means (Zeg, Z.UB) | SD (Zeg, Z.UB)
0.00.50.70.30.63
.87.40.58.7.56
.75.3.48.4.5
3.66.4.39..46
4.58.9.3.9.4
5.5.4.6.7.36
6.45.0..5.33

Simulation Results (cont)
Autocorrelation estimates of a log-normal AR(1) latent process with φ = .9, variance .693 and regression function +/ (00).
Columns: Lag | True | Means (Zeg, Z.UB) | SD (Zeg, Z.UB)
.87.79.8.7.6
.75.60.64.0.9
3.66.45.50.3.
4.58.38.40.4.3
5.5.33.30.5.4
6.45..3.5.5

Simulation Results (cosine regression function)
Autocovariance estimates of a log-normal AR(1) latent process with φ = .9, variance .693 and reg function+cos(π/) (00).
Columns: Lag | True | Means (Zeg, Z.UB) | SD (Zeg, Z.UB)
0.00.73.06.44.87
.87.6.90.39.79
.75.5.78.36.7
3.66.45.68.33.66
4.58.38.59.30.6
5.5.33.5.8.56
6.45..46.6.53

Simulation Results (cont)
Autocorrelation estimates of a log-normal AR(1) latent process with φ = .9, variance .693 and reg function+cos(π/) (00).
Columns: Lag | True | Means (Zeg, Z.UB) | SD (Zeg, Z.UB)
.87.8.84.5.4
.75.69.7.7.6
3.66.58.60.9.8
4.58.49.5.0.9
5.5.4.44..
6.45.35.38..

Optimally Weighted Estimates
Consider weighted estimates of the variance of the latent process of the form
$\hat\sigma^2_{\varepsilon,W} = \sum_t W_t E_t / \sum_t W_t\mu_t$, where $E_t = \left[(Y_t - \mu_t)^2/\mu_t - 1\right]$.
This estimate is approximately unbiased for any latent process.
Choose the weights to minimize the variance of the estimate when the latent process is IID.
Optimal weights: $W_t^* = \mu_t/\mathrm{Var}(E_t)$, given by a complicated formula!
Zeger estimates: $W_{Z,t} = \mu_t$

Variance Formulas for Optimal Estimates
Suppose $\mu_t = g(t/n) = \exp(x_t^T\beta)$. Then under an IID latent process assumption,
Var( γ ( h)) I : ) ε W Z Z g ( )( σ + εg( ) ) d / g ( d 0 0
Var ( γ * ( h)) I : / g ( )( σεg( ) + ) d ε W Op
Clearly $I_Z \ge I_{Op}$, and for the polio data regression function $f^T(x)\beta$:
Scenario sqr(i Z )sqr(i Op ).. 3. 4. µ ( ) e µ ( ) e µ ( ) e µ ( ) e f ( ) β f ( ) β f ( ) β f ( ) β σ ε σ ε σ σ ε.77. 3. 0.54.. 84 ε.77. 49.05.54. 75.73

Tests for Zero Autocorrelation in Latent Process
Use Box-Pierce or Ljung-Box portmanteau tests applied to correlation estimates of residuals.
Pearson residuals: $e_t = (Y_t - \hat\mu_t)/\hat\mu_t^{1/2}$, nearly IID if the latent process is IID.
ACF of Pearson residuals: $\hat\rho_P(h) = \sum_{t=1}^{n-h} e_t e_{t+h} / \sum_{t=1}^{n} e_t^2$
Ljung-Box statistic: $H_P = \sum_{h=1}^{L}\hat\rho_P^2(h)/\mathrm{Var}(\hat\rho_P(h))$ has a chi-square distribution with $L$ degrees of freedom under $H_0$: no serial correlation.
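A sketch of the residual ACF and the portmanteau statistic follows. It assumes the approximation $\mathrm{Var}(\hat\rho_P(h)) \approx 1/n$, which gives the Box-Pierce form $n\sum_h\hat\rho_P^2(h)$; the Ljung-Box variant would use a slightly different variance term. Function names and data are hypothetical, and the residuals are not mean-corrected, in line with the formula above.

```python
import numpy as np

def pearson_acf(y, mu_hat, max_lag):
    """ACF of the Pearson residuals e_t = (y_t - mu_t) / sqrt(mu_t) at lags 1..max_lag."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu_hat, dtype=float)
    e = (y - mu) / np.sqrt(mu)
    denom = np.sum(e ** 2)
    return np.array([np.sum(e[:-h] * e[h:]) / denom for h in range(1, max_lag + 1)])

def portmanteau(rho, n):
    """Sum of rho(h)^2 / Var(rho(h)) with the assumed approximation Var(rho(h)) = 1/n."""
    return n * np.sum(np.asarray(rho) ** 2)

# Toy example with a constant fitted mean
y = [0, 3, 1, 4]
mu_hat = [2.0, 2.0, 2.0, 2.0]
rho_p = pearson_acf(y, mu_hat, max_lag=2)
h_p = portmanteau(rho_p, n=len(y))
print(rho_p, h_p)
```

The statistic would then be referred to a chi-square distribution with as many degrees of freedom as lags used.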

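Tests for Zero Autocorrelation in Latent Process (cont)
Lack of power of $H_P$ for some alternatives: to see this, note
$E\,\hat\rho_P(h) \approx \rho_\varepsilon(h)\int_0^1 e^{f^T(x)\beta}\,dx \Big/ \left(\sigma_\varepsilon^{-2} + \int_0^1 e^{f^T(x)\beta}\,dx\right) \to 0$ as $\sigma_\varepsilon \to 0$,
while $\mathrm{Var}(\hat\rho_P(h)) \approx 1/n$ for $\sigma_\varepsilon$ small. This problem arises in the analysis of the asthma data (see later).
Alternative LB estimate:
$H_{Z,UB} = \sum_{h=1}^{L}\hat\rho_{Z,UB}^2(h)/\mathrm{Var}(\hat\rho_{Z,UB}(h))$
The relative performance of the test statistics depends on the regression function.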

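A Simulation Illustration
Model: $Y_t \mid \varepsilon_t, x_t \sim P(\varepsilon_t\exp(x_t^T\beta))$, where $x_t^T\beta$ is the estimated regression function from the polio data,
$\ln\varepsilon_t = \alpha_t$, where $(\alpha_t + \sigma^2/2) = \phi(\alpha_{t-1} + \sigma^2/2) + \eta_t$, $\{\eta_t\} \sim$ IID $N(0,\ \sigma^2(1 - \phi^2))$, with $\phi = .82$, $\sigma^2 = .57$.
Sample size is 168, 000 reps.
Results:
- $H_0$ was rejected 97.7% using the test based on $S_a$ (α = .05).
- 88% of these cases rejected 0 correlation in the latent process using $H_{ZUB}$ (78% using $H_P$).
Columns: True | Mean | SD | Min | Max | %<
$\hat\rho_{\varepsilon,UB}(1)$: .78.79.4.05.9 84%
$\hat\rho_{\alpha,UB}(1)$: .8.8..06.0 84%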

Application to Sydney Asthma Count Data
Data: $Y_1, \ldots, Y_{1461}$, daily asthma presentations in a Campbelltown hospital.
Preliminary analysis identified:
- no upward or downward trend
- a triple peaked annual cycle modelled by pairs of the form cos(2πkt/365), sin(2πkt/365), k3458
- a day of the week effect modelled by separate indicator variables for Sundays and Mondays (increase in admittance on these days compared to Tues-Sat)
- of the meteorological variables (max/min temp, humidity) and pollution variables (ozone, NO, NO2), only humidity at lags of -0 days appears to have an association.

Application to Sydney Asthma Count Data (cont)
Humidity variable:
H 7 6 i 0 h - -i
where $h_t$ is the residual from an annual cycle harmonic model fit to the daily average of humidity at 0900 and 1500 hours.
GLM analysis:
Columns: Effect | $\hat\beta$ | GLM s.e. | Theorem s.e.
Sunday.30.05.055
Monday.36.05.055
H.0.048.066
t-ratios for humidity are 4.4 and 3.9

Application to Sydney Asthma Count Data (cont)
Test for presence of latent process: $S_a$ was 3.30 (highly significant)
Tests of correlation in latent process (test statistic with p-value in parentheses):
Degrees of freedom: 5, 10, 15
$H_{ZUB}$: 44.63(e-08) 74.86(5e-) 8.3(4e-)
$H_P$: 10.78 (.056), 25.60 (.004), 26.83 (.030)

Application to Sydney Asthma Count Data (cont)
ACVF and ACF estimates.
Columns: lag h | $\hat\gamma_Z$ | $\hat\gamma_{ZUB}$ | s.e. | $\hat\rho_{ZUB}$ | $\hat\rho_P$
0.054.067.00.000.04.053.09.79.047.030.04.4.6.0
3.038.050.4.74.055
4.03.033.4.50.033
5.05.036.4.54.06
6.00.030.4.45.05
Note: $(1461)^{-.5} = .026$ implies the ACF values for Pearson residuals are barely significant at lags and 3? The small values of the ACF can be partially explained by
$E(\hat\rho_P(1)) \approx (.76)\,\dfrac{1.934}{.054^{-1} + 1.934} = .0718$

Asthma Counts With Estimated Trend Function
[Figure: Counts (0 to 14) against Year (1990 to 1994).]

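Observation Driven Models
Count data: $Y_1, \ldots, Y_n$
Let $H_t = (Y^{(t-1)}, X^{(t)})$ be the information contained in the past of the observed count process and the past and present of the regressor variables.
Zeger & Qaqish (1988) models: Assume $Y_t \mid H_t$ is Poisson with mean $\mu_t$, where
Model 1: $\mu_t = \exp(x_t^T\beta)\prod_{i=1}^{p}\left(\dfrac{\max(Y_{t-i}, c)}{\exp(x_{t-i}^T\beta)}\right)^{\gamma_i}$, $c > 0$
Model 2: $\mu_t = \exp(x_t^T\beta)\prod_{i=1}^{p}\left(\dfrac{Y_{t-i} + c}{\exp(x_{t-i}^T\beta) + c}\right)^{\gamma_i}$, $c > 0$
Model 3: $\mu_t = \exp\left(x_t^T\beta + \sum_{i=1}^{p}\gamma_i Y_{t-i}\right)$.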

Observation Driven Models (cont)
Remarks:
- Z&Q argue that model is preferred on their three desiderata.
- Model 3 cannot be stationary (if p = 1 and $\gamma_1 < 0$).
- In Model 2, in the case p = 1, c is interpreted as an immigration rate adding to counts at every time point.
- Estimation of c in both Models 1 & 2 is problematic.

New Observation Driven Model
For $\lambda > 0$ define
$e_t = (Y_t - \mu_t)/\mu_t^{\lambda}$
and assume that
$\log\mu_t = W_t = x_t^T\beta + \sum_{i=1}^{p}\theta_i e_{t-i}$.
Since the conditional mean $\mu_t$ is based on the whole past, the model is no longer Markov. Nevertheless, this specification could lead to stationary solutions, although the stability theory appears difficult.
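The recursion defining $W_t$, $\mu_t$ and $e_t$ can be run forward through an observed series once $\beta$ and $\theta$ are given. The sketch below is one way to do that, assuming the pre-sample residuals $e_0, e_{-1}, \ldots$ are set to zero (an initialization chosen here for illustration, not stated on the slide). The function name `od_filter` and all input values are hypothetical.

```python
import numpy as np

def od_filter(y, X, beta, theta, lam=0.5):
    """Run log mu_t = x_t^T beta + sum_i theta_i e_{t-i}, e_t = (y_t - mu_t) / mu_t**lam.

    Pre-sample residuals are taken to be zero (an assumption of this sketch).
    Returns the arrays (W, mu, e).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = len(y), len(theta)
    W, mu, e = np.zeros(n), np.zeros(n), np.zeros(n)
    for t in range(n):
        W[t] = X[t] @ beta
        for i in range(1, p + 1):
            if t - i >= 0:
                W[t] += theta[i - 1] * e[t - i]
        mu[t] = np.exp(W[t])
        e[t] = (y[t] - mu[t]) / mu[t] ** lam
    return W, mu, e

# Hypothetical inputs: intercept-only regression with beta = 0 and one lag, theta_1 = 0.5
y = [2, 0, 1]
X = np.ones((3, 1))
W, mu, e = od_filter(y, X, beta=np.array([0.0]), theta=[0.5], lam=0.5)
print(W, mu, e)
```

Because each $\mu_t$ depends only on past observations, the Poisson log-likelihood $\sum_t (Y_t W_t - \mu_t)$ can be accumulated in the same loop and maximized over $(\beta, \theta)$ numerically.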

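Properties of the New Model
Assuming that $\lambda = .5$, we have
$E(e_t) = 0$, $\mathrm{Var}(e_t) = 1$,
so that
$E(W_t) = x_t^T\beta$, $\mathrm{Var}(W_t) = \sum_{i=1}^{p}\theta_i^2$,
and
$E(\mu_t) = E(e^{W_t}) \approx e^{x_t^T\beta + \mathrm{Var}(W_t)/2} = e^{x_t^T\beta + \sum_{i=1}^{p}\theta_i^2/2}$,
which holds approximately if $W_t$ is nearly Gaussian. It follows that the intercept term can be adjusted in order for $E(\mu_t)$ to be interpretable as $\exp(x_t^T\beta)$.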

Properties of the New Model (cont)
The model proposed here is
1. Easily interpretable on the linear predictor scale and on the scale of the mean $\mu_t$, with the regression parameters directly interpretable as the amount by which the mean of the count process at time $t$ will change for a unit change in the regressor variable.
2. An approximately unbiased plot of the $\mu_t$ can be generated by $\hat\mu_t = \exp\left(\hat W_t - .5\sum_{i=1}^{p}\hat\theta_i^2\right)$.
3. Is easy to predict with.
4. Provides a mechanism for adjusting the inference about the regression parameter $\beta$ for a form of serial dependence.
5. Generalizable to ARMA type lag structure.
6. Estimation (approx MLE) is easy to carry out.

Asthma Data w/ Deterministic Part of Mean Fcn
[Figure: Counts (0 to 14) against Year (1990 to 1994).]

Asthma Data: Deterministic Part + AR in Pearson Resid
[Figure: Counts against Year (1990 to 1994), with two curves labelled "det. part" and "det. part + AR".]