TIME SERIES ANALYSIS: AN OVERVIEW
Hernando Ombao
University of California at Irvine
November 26, 2012
OUTLINE OF TALK
1 TIME SERIES DATA
2 OVERVIEW OF TIME DOMAIN ANALYSIS
3 OVERVIEW OF SPECTRAL ANALYSIS
TIME SERIES DATA
Visual-motor electroencephalogram (hand EEG)
Global temperature series
Seismic recordings
LA county environmental data (mortality, pollution, temperature)
NEUROSCIENCE DATA AND STATISTICAL GOALS
Electrophysiologic data: multi-channel EEG, local field potentials
Hemodynamic data: fMRI time series at several ROIs
NEUROSCIENCE DATA AND STATISTICAL GOALS Multi-channel (multivariate) Two movement conditions: leftward vs. rightward
NEUROSCIENCE DATA AND STATISTICAL GOALS
External stimulus: visual, auditory, somatosensory, stress
Personality traits, genes, socio-environmental factors
Unobserved: brain network/cell assemblies
Brain signals (indirect measures of neuronal activity)
Functional: fMRI, EEG, MEG, PET
Anatomical: DTI
Acute outcomes: emotion, skin conductance, motor response
NEUROSCIENCE DATA AND STATISTICAL GOALS
Pathway: Stimulus → Neuronal Response → Brain Signals → Behavior
Moderators/Modifiers: Genes, Trait, Socio-Environment
NEUROSCIENCE DATA AND STATISTICAL GOALS Changes in the mean Changes in variance Changes in Cross-Dependence
NEUROSCIENCE DATA AND STATISTICAL GOALS
Characterize dependence in a brain network
Temporal: $Y_1(t)$ given $[Y_1(t-1), Y_2(t-1), \ldots]$
Spectral: interactions between oscillatory activities at $Y_1$, $Y_2$
Develop estimation and inference methods for connectivity
Investigate the potential for connectivity as a biomarker
Predicting behavior: motor intent (left vs. right movement) [Brain-Computer Interface]; state of learning; level of mental fatigue
Differentiating patient groups (bipolar vs. healthy children): connectivity between left DLPFC and right STG is greater for bipolar than for healthy
NEUROSCIENCE DATA AND STATISTICAL GOALS
New dependence measures must be easily interpretable
Models must incorporate information across trials and across subjects
Models must account for differences in brain networks between conditions
Take advantage of multi-modal data (EEG, fMRI, DTI)
Models should be informed by physiology and physics
Dimension reduction: extract the information from massive data that is most relevant for estimating dependence
Develop formal statistical inference procedures
NEUROSCIENCE DATA AND STATISTICAL GOALS
Selected references
Automatic methods: SLEX Transform (Smooth Localized Complex EXponentials)
Ombao et al. (2001, JASA)
Ombao et al. (2001, Biometrika)
Ombao et al. (2002, Ann Inst Stat Math)
Huang, Ombao and Stoffer (2004, JASA)
Ombao et al. (2005, JASA)
Böhm, Ombao et al. (2010, JSPI)
Massive data; complex dependence; mixed effects
Bunea, Ombao and Auguste (2006, IEEE Trans Sig Proc)
Ombao and Van Bellegem (2008, IEEE Trans Sig Proc)
Freyermuth, Ombao, von Sachs (2009, JASA)
Fiecas and Ombao (2011, Annals of Applied Statistics)
Gorrostieta, Ombao et al. (2012, NeuroImage)
Kang, Ombao et al. (2012, JASA)
GLOBAL TEMPERATURE SERIES
[Figure: global temperature deviation, 1880–2000]
GLOBAL TEMPERATURE SERIES
Model: $Y(t) = \mu(t) + \epsilon(t)$
Deterministic part: $\mu(t) = \beta_0 + \beta_1 t$
Stochastic part: $\epsilon(t)$ colored noise with $E\,\epsilon(t) = 0$ and $\mathrm{Cov}(\epsilon_r, \epsilon_s) = \gamma(r, s)$
Inference on the trend $\beta_1$
Is this simply a "local" trend?
What is the impact of man's activities on temperature fluctuations?
What is the impact of temperature increases on natural calamities?
Statistical challenge: develop a model that captures (a) the complexities in the spatio-temporal covariance structure and (b) causal relationships
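A minimal sketch of fitting the deterministic trend above by ordinary least squares; the data below are simulated (the coefficient values and AR(1) noise model are illustrative, not the actual temperature series). With colored noise, the OLS slope is still unbiased, but its usual iid standard errors are not valid.

```python
# Fit mu(t) = beta_0 + beta_1 * t by OLS on a simulated series with colored noise.
# All numerical settings here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(5)
T = 120                                  # e.g. 120 annual observations
t = np.arange(T)
eps = np.zeros(T)
for i in range(1, T):                    # AR(1) "colored" noise for epsilon(t)
    eps[i] = 0.5 * eps[i - 1] + 0.1 * rng.standard_normal()
y = -0.3 + 0.005 * t + eps               # slow upward trend plus colored noise

beta1, beta0 = np.polyfit(t, y, 1)       # OLS line fit (slope first)
print(round(beta1, 3), round(beta0, 2))  # estimates near the true (0.005, -0.3)
```

Inference on $\beta_1$ then requires a standard error that accounts for the autocovariance $\gamma(r,s)$ of the noise, which is the statistical challenge the slide points to.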
SEISMIC RECORDINGS
SEISMIC RECORDINGS
Two classes: $\Pi_1$ (earthquake) and $\Pi_2$ (explosion)
Discrimination: what "features" separate $\Pi_1$ and $\Pi_2$?
Classification: given a new time series x, classify into $\Pi_1$ or $\Pi_2$ by comparing $D(x, f_1)$ vs. $D(x, f_2)$
Background: ban on nuclear testing (classify the Novaya Zemlya event of unknown origin)
Seminal work: Shumway (1982, 1998, 2003)
Statistical challenges: feature extraction and feature selection from massive data
LA COUNTY MORTALITY AND ENVIRONMENTAL DATA
[Figure: weekly cardiovascular mortality, temperature, and particulates, 1970–1980]
LA COUNTY MORTALITY AND ENVIRONMENTAL DATA
Shumway, Azari and Pawitan (1988); Shumway and Stoffer (2010)
LA County weekly data on mortality, temperature and pollution levels
Model: mortality as a function of temperature and pollution
Granger causality: does past knowledge of temperature and pollution help improve prediction of mortality?
Causation vs. association
Practical issues: hospitalization (rather than mortality); the effect of pollution might be long term (rather than short term)
COVARIANCE AND CORRELATION
Bivariate time series $Y(t) = [Y_1(t), Y_2(t)]$
Auto-covariance: $\gamma_{ll}(s,t) = \mathrm{Cov}[Y_l(s), Y_l(t)]$, $l = 1, 2$
Variance: $\gamma_{ll}(t,t) = \mathrm{Cov}[Y_l(t), Y_l(t)]$, $l = 1, 2$
Auto-correlation: $\rho_{ll}(s,t) = \dfrac{\gamma_{ll}(s,t)}{\sqrt{\gamma_{ll}(s,s)\,\gamma_{ll}(t,t)}}$
COVARIANCE AND CORRELATION
Cross-covariance: $\gamma_{pq}(s,t) = \mathrm{Cov}[Y_p(s), Y_q(t)]$
Cross-correlation: $\rho_{pq}(s,t) = \dfrac{\gamma_{pq}(s,t)}{\sqrt{\gamma_{pp}(s,s)\,\gamma_{qq}(t,t)}}$
COVARIANCE AND CORRELATION
Trivariate time series: X, Y, Z
Cross-correlation: $\rho(X,Y) = \dfrac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}\,X\,\mathrm{Var}\,Y}}$
Partial cross-correlation between X and Y given Z:
Remove Z from X: $\epsilon_X = X - \beta_X Z$
Remove Z from Y: $\epsilon_Y = Y - \beta_Y Z$
$\rho(X, Y \mid Z) = \dfrac{\mathrm{Cov}(\epsilon_X, \epsilon_Y)}{\sqrt{\mathrm{Var}\,\epsilon_X\,\mathrm{Var}\,\epsilon_Y}}$
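The two-step construction above (regress Z out of each series, then correlate the residuals) can be sketched directly; the simulated data and the function name `partial_corr` are illustrative.

```python
# Partial cross-correlation of X and Y given Z, computed by removing the
# linear effect of Z from each series and correlating the residuals.
import numpy as np

def partial_corr(x, y, z):
    """Correlation between X and Y after removing the linear effect of Z."""
    x, y, z = (v - v.mean() for v in (x, y, z))      # center all series
    beta_x = np.dot(z, x) / np.dot(z, z)             # OLS slope of X on Z
    beta_y = np.dot(z, y) / np.dot(z, z)             # OLS slope of Y on Z
    eps_x = x - beta_x * z                           # residual of X given Z
    eps_y = y - beta_y * z                           # residual of Y given Z
    return np.dot(eps_x, eps_y) / np.sqrt(np.dot(eps_x, eps_x) * np.dot(eps_y, eps_y))

rng = np.random.default_rng(0)
z = rng.standard_normal(5000)
x = z + 0.1 * rng.standard_normal(5000)              # X driven by Z
y = z + 0.1 * rng.standard_normal(5000)              # Y driven by Z
print(round(np.corrcoef(x, y)[0, 1], 2))             # large marginal cross-correlation
print(round(partial_corr(x, y, z), 2))               # near zero once Z is removed
```

This illustrates the distinction the next slide tabulates: X and Y here are strongly cross-correlated, but only through Z, so their partial cross-correlation is essentially zero.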
COVARIANCE AND CORRELATION

             Model A   Model B
Cross-Corr   Yes       Yes
Partial CC   No        Yes
WEAK STATIONARITY
Bivariate time series $Y(t) = [Y_1(t), Y_2(t)]$
Under weak stationarity, the following quantities do not change with time t:
$E\,Y(t) = [\mu_1, \mu_2]$
$\mathrm{Cov}[Y_p(s), Y_q(t)] = \lambda_{pq}(|s - t|)$
$\mathrm{Var}\,Y_p(t) = \lambda_{pp}(0)$
TIME DOMAIN MODELS
Univariate time series Y(t): Moving Average (MA), Auto-Regressive (AR), Auto-Regressive Moving Average (ARMA)
Multivariate time series Y(t): Vector Moving Average (VMA), Vector Auto-Regressive (VAR), Vector ARMA (VARMA), VARMA with exogenous series (VARMAX)
WHITE NOISE
{W(t)} is a white-noise time series if
$E\,W(t) = 0$ for all t
{W(t)} is uncorrelated:
$\mathrm{Cov}[W(t), W(s)] = \sigma_W^2$ if $s = t$, and $0$ if $s \neq t$
MOVING AVERAGE (MA) TIME SERIES
Let {W(t)} be a white noise time series
Y(t) is a first-order MA [MA(1)] if it can be expressed as $Y(t) = W(t) + \theta_1 W(t-1)$
MA(Q) time series: $Y(t) = \sum_{q=0}^{Q} \theta_q W(t-q)$, where $\theta_0 = 1$
Linear system where the input is white noise; the output is the moving average time series
MA gives a one-sided linear combination of present and past white noise
Consider the case $\theta_0 = \ldots = \theta_Q = 1$; then Y(t) is a "summed" or "smoothed" version of the white noise
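Simulating an MA(Q) series amounts to convolving white noise with the coefficient vector; a minimal sketch with illustrative coefficients $\theta = (1, 0.6, 0.3)$ follows. The theoretical variance $\gamma(0) = \sigma_W^2 \sum_q \theta_q^2$ is used as a sanity check.

```python
# Simulate an MA(2) series Y(t) = W(t) + 0.6 W(t-1) + 0.3 W(t-2)
# by convolving white noise with the theta coefficients (illustrative values).
import numpy as np

rng = np.random.default_rng(1)
T = 10_000
theta = np.array([1.0, 0.6, 0.3])            # theta_0 = 1 by convention
w = rng.standard_normal(T + len(theta) - 1)  # white noise, extra start-up values
y = np.convolve(w, theta, mode="valid")      # moving average of the white noise

# Theoretical variance of an MA(Q): gamma(0) = sigma_W^2 * sum(theta_q^2) = 1.45 here
gamma0 = float(np.sum(theta**2))
print(round(y.var(), 2), round(gamma0, 2))   # sample variance should be close to 1.45
```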
AUTO-REGRESSIVE TIME SERIES (AR)
Backshift operator B: $B\,Y(t) = Y(t-1)$
Let k be a positive integer. Then $B^k Y(t) = Y(t-k)$
Let $\Phi(B) = 1 - \phi_1 B - \ldots - \phi_P B^P$. Then $\Phi(B) Y(t) = Y(t) - \phi_1 Y(t-1) - \ldots - \phi_P Y(t-P)$
Let $\Phi(B) = 1 - \phi_1 B$. Then $\Phi(B) Y(t) = Y(t) - \phi_1 Y(t-1)$ and $\Phi^{-1}(B) = \sum_{l=0}^{\infty} \phi_1^l B^l$ when $|\phi_1| < 1$
AUTO-REGRESSIVE TIME SERIES (AR)
Let {W(t)} be a white noise series and let $\phi_1 \in (-1, 1)$
{Y(t)} is a stationary first-order auto-regressive [AR(1)] series if $Y(t) = \phi_1 Y(t-1) + W(t)$
One can also express Y(t) as an infinite-order MA process [MA($\infty$)]:
$W(t) = Y(t) - \phi_1 Y(t-1) = [1 - \phi_1 B]\, Y(t)$
$Y(t) = \left[\sum_{l=0}^{\infty} \phi_1^l B^l\right] W(t) = \sum_{l=0}^{\infty} \phi_1^l W(t-l)$
AUTO-REGRESSIVE TIME SERIES (AR)
{Y(t)} is an AR(P) time series if it can be expressed as $Y(t) = \sum_{p=1}^{P} \phi_p Y(t-p) + W(t)$
$\Phi(B) = 1 - \phi_1 B - \ldots - \phi_P B^P$ is called the AR polynomial
Y(t) is said to be causal if the roots of $\Phi(z)$ lie outside of the unit circle
Example: AR(1) with $\Phi(B) = 1 - \phi_1 B$. The solution to $\Phi(z) = 0$ is $z = 1/\phi_1$, where $|z| > 1$ when $|\phi_1| < 1$
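The root condition above can be checked numerically; a sketch, where the helper `is_causal` and the coefficient values are illustrative:

```python
# Check causality of an AR(P) model by locating the roots of
# Phi(z) = 1 - phi_1 z - ... - phi_P z^P.
import numpy as np

def is_causal(phi):
    """True if all roots of the AR polynomial lie outside the unit circle."""
    # np.roots expects coefficients from the highest power down:
    # Phi(z) = -phi_P z^P - ... - phi_1 z + 1
    coeffs = np.concatenate([-np.asarray(phi, float)[::-1], [1.0]])
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1))

print(is_causal([0.9]))        # True: AR(1) root z = 1/0.9 lies outside the circle
print(is_causal([1.1]))        # False: root z = 1/1.1 lies inside the circle
print(is_causal([0.5, 0.3]))   # True: a causal AR(2)
```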
AUTO-REGRESSIVE TIME SERIES (AR)
On estimation and inference
$Y(t) = \phi_1 Y(t-1) + W(t)$, $\phi_1 \in (-1, 1)$
Impose $W(t) \sim N(0, \sigma_W^2)$
Then $Y(t) \sim N\!\left(0, \dfrac{\sigma_W^2}{1 - \phi_1^2}\right)$
and $Y(t) \mid Y(t-1), \ldots, Y(1) \sim N(\phi_1 Y(t-1), \sigma_W^2)$
Data: {Y(1), ..., Y(T)}
AUTO-REGRESSIVE TIME SERIES (AR)
Conditional likelihood
$L_C(\phi_1, \sigma_W^2) = f(y(2) \mid y(1)) \cdots f(y(T) \mid y(1), \ldots, y(T-1))$
$= \left(\dfrac{1}{2\pi\sigma_W^2}\right)^{(T-1)/2} \exp\!\left(-\dfrac{1}{2\sigma_W^2} \sum_{t=2}^{T} (y(t) - \phi_1 y(t-1))^2\right)$
AUTO-REGRESSIVE TIME SERIES (AR)
Full likelihood
$L(\phi_1, \sigma_W^2) = f(y(1))\, L_C(\phi_1, \sigma_W^2)$
$= \dfrac{\sqrt{1 - \phi_1^2}}{(2\pi\sigma_W^2)^{T/2}} \exp\!\left(-\dfrac{1}{2\sigma_W^2}\left[(1 - \phi_1^2)\, y(1)^2 + \sum_{t=2}^{T} (y(t) - \phi_1 y(t-1))^2\right]\right)$
The conditional likelihood is asymptotically equivalent to the full likelihood
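Maximizing $L_C$ over $\phi_1$ is a least-squares problem with the closed form $\hat\phi_1 = \sum_t y(t)y(t-1) / \sum_t y(t-1)^2$; a sketch on simulated data (true values $\phi_1 = 0.7$, $\sigma_W^2 = 1$ are illustrative):

```python
# Conditional MLE for an AR(1): closed-form least squares on lagged values.
import numpy as np

rng = np.random.default_rng(2)
T, phi = 20_000, 0.7
w = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):                      # simulate Y(t) = phi Y(t-1) + W(t)
    y[t] = phi * y[t - 1] + w[t]

phi_hat = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)   # maximizes L_C over phi_1
sigma2_hat = np.mean((y[1:] - phi_hat * y[:-1]) ** 2)    # conditional MLE of sigma_W^2
print(round(phi_hat, 2), round(sigma2_hat, 2))           # near the true (0.7, 1.0)
```

The full likelihood adds the stationary density of $y(1)$, which changes the estimates only by $O(1/T)$, consistent with the asymptotic equivalence stated above.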
THE SPECTRUM
X(t) stationary temporal process
Cramér representation: $X(t) = \int \exp(i 2\pi\omega t)\, dZ(\omega)$, $t = 0, \pm 1, \pm 2, \ldots$
Basis: Fourier waveforms $\exp(i 2\pi\omega t)$, $\omega \in (-0.5, 0.5)$
Random coefficients: $dZ(\omega)$ is an increment random process with $E\,dZ(\omega) = 0$ and $\mathrm{Cov}[dZ(\omega), dZ(\lambda)] = \delta(\omega - \lambda)\, f(\omega)\, d\omega\, d\lambda$, so $\mathrm{Var}\,dZ(\omega) = f(\omega)\, d\omega$
Spectrum f(ω): decomposition of the variance of X(t): $\mathrm{Var}\,X(t) = \int f(\omega)\, d\omega$
THE SPECTRUM
Stochastic regression model
[Figure: waves with 2 and 10 oscillations and the distributions of their random coefficients]
THE SPECTRUM
Spectrum: decomposition of variance
$X = [X(1), \ldots, X(T)]$: zero-mean stationary time series
$\Phi$: columns are the orthonormal Fourier waveforms
$d = [d(\omega_0), \ldots, d(\omega_{T-1})]$: Fourier coefficients
$X = \Phi d$
$X^* X = d^* d$ (Parseval's identity)
$\frac{1}{T} E\, X^* X = \frac{1}{T} E\, d^* d$, so $\mathrm{Var}\,X(t) \approx \int f(\omega)\, d\omega$
THE SPECTRUM
A more formal derivation:
$X(t) = \int \exp(i 2\pi\omega t)\, dZ(\omega)$
$\gamma(h) = \mathrm{Cov}[X(t+h), X(t)]$
$f(\omega) = \sum_{h=-\infty}^{\infty} \gamma(h) \exp(-i 2\pi\omega h)$
$\gamma(h) = \int_{-0.5}^{0.5} f(\omega) \exp(i 2\pi\omega h)\, d\omega$
$\gamma(0) = \int f(\omega)\, d\omega$
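For an AR(1) the spectrum has the known closed form $f(\omega) = \sigma_W^2 / |1 - \phi_1 e^{-i2\pi\omega}|^2$, and the variance decomposition $\gamma(0) = \int_{-0.5}^{0.5} f(\omega)\,d\omega$ can be verified numerically; a sketch with the illustrative choice $\phi_1 = 0.9$:

```python
# Numerically verify gamma(0) = integral of f(omega) for an AR(1) spectrum.
import numpy as np

phi, sigma2 = 0.9, 1.0
omega = np.linspace(-0.5, 0.5, 100_001)
# AR(1) spectrum: f(omega) = sigma^2 / |1 - phi * exp(-i 2 pi omega)|^2
f = sigma2 / np.abs(1 - phi * np.exp(-2j * np.pi * omega)) ** 2

gamma0 = sigma2 / (1 - phi**2)                  # known AR(1) variance, about 5.26
integral = f.mean() * (omega[-1] - omega[0])    # Riemann approximation of the integral
print(round(integral, 2), round(gamma0, 2))     # the two numbers should agree
```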
SPECTRUM = VARIANCE DECOMPOSITION
AR(1): $X_t = 0.9 X_{t-1} + \epsilon_t$: low-frequency oscillations
[Figure: simulated series, T = 1000]
SPECTRUM = VARIANCE DECOMPOSITION
[Figure: spectrum of AR(1) with $\phi = 0.9$]
SPECTRUM = VARIANCE DECOMPOSITION
AR(1): $X_t = -0.9 X_{t-1} + \epsilon_t$: high-frequency oscillations
[Figure: simulated series, T = 1000]
SPECTRUM = VARIANCE DECOMPOSITION
[Figure: spectrum of AR(1) with $\phi = -0.9$]
SPECTRUM = VARIANCE DECOMPOSITION
Mixture: low + high frequency signal
[Figure: simulated mixed series, T = 1000]
SPECTRUM = VARIANCE DECOMPOSITION
[Figure: spectrum of the mixed signal]
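The mixture example above can be reproduced with a crude smoothed periodogram: summing a $\phi = 0.9$ AR(1) (power near $\omega = 0$) and a $\phi = -0.9$ AR(1) (power near $\omega = 0.5$) yields a spectrum with peaks at both ends. The block-averaging scheme and thresholds below are illustrative, not the slide's actual estimator.

```python
# Mixed low- and high-frequency AR(1) signals: the estimated spectrum should
# show large power near omega = 0 and omega = 0.5, and little in between.
import numpy as np

rng = np.random.default_rng(3)
T = 100_000

def ar1(phi):
    w = rng.standard_normal(T)
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = phi * x[t - 1] + w[t]
    return x

mix = ar1(0.9) + ar1(-0.9)

# Crude smoothed periodogram: average raw periodograms over 100 blocks of 1000
blocks = mix.reshape(100, 1000)
pgram = np.mean(np.abs(np.fft.rfft(blocks, axis=1)) ** 2, axis=0) / 1000
freqs = np.fft.rfftfreq(1000)                        # omega in [0, 0.5]

low = pgram[freqs < 0.05].mean()                     # power near omega = 0
mid = pgram[(freqs > 0.2) & (freqs < 0.3)].mean()    # power in the middle band
high = pgram[freqs > 0.45].mean()                    # power near omega = 0.5
print(low > 5 * mid, high > 5 * mid)                 # both ends dominate the middle
```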
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
An illustration: interactions between oscillatory components
Latent signals: $U_1(t)$ low-frequency signal; $U_2(t)$ high-frequency signal
Observed signals: $X(t) = U_1(t) + U_2(t) + Z_1(t)$ and $Y(t) = U_1(t+\ell) + Z_2(t)$
X and Y are linearly related through $U_1$.
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
[Figure: signals X1(t) and X2(t) decomposed into Delta, Theta, Alpha, Beta, and Gamma bands]
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
Time series at 3 channels: X, Y, Z
Cross-correlation: $\rho(X,Y) = \dfrac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}\,X\,\mathrm{Var}\,Y}}$
Partial cross-correlation between X and Y given Z:
Remove Z from X: $\epsilon_X = X - \beta_X Z$
Remove Z from Y: $\epsilon_Y = Y - \beta_Y Z$
$\rho(X, Y \mid Z) = \dfrac{\mathrm{Cov}(\epsilon_X, \epsilon_Y)}{\sqrt{\mathrm{Var}\,\epsilon_X\,\mathrm{Var}\,\epsilon_Y}}$
CROSS-COHERENCE - A MEASURE OF DEPENDENCE

             Model A   Model B
Cross-Corr   Yes       Yes
Partial CC   No        Yes
CROSS-COHERENCE - A MEASURE OF DEPENDENCE
When $\rho(X, Y \mid Z) \neq 0$, we want to identify the frequency bands that drive the direct linear association.
When $\rho(X, Y) \neq 0$, we want to identify the frequency bands that drive the linear association.
Notation: $U(t) = [X(t), Y(t), Z(t)]$ and $dZ(\omega) = [dZ_X(\omega), dZ_Y(\omega), dZ_Z(\omega)]$
Spectral representation of a stationary process: $U(t) = \int_{-0.5}^{0.5} \exp(i 2\pi\omega t)\, dZ(\omega)$
Spectral matrix: $\mathrm{Cov}\,dZ(\omega) = f(\omega)\, d\omega$
Formal definition of coherence: $\rho_{X,Y}(\omega) = |\mathrm{Corr}(dZ_X(\omega), dZ_Y(\omega))|^2$
CROSS-COHERENCE: AN INTUITIVE INTERPRETATION
Ombao and Van Bellegem (2008, IEEE Trans Signal Processing)
Filtered signals: $X_\omega(t) = F_\omega X(t)$, $Y_\omega(t) = F_\omega Y(t)$, $Z_\omega(t) = F_\omega Z(t)$
Coherence at the frequency band around ω: $\rho_{X,Y}(\omega) \approx |\mathrm{Corr}(X_\omega(t), Y_\omega(t))|^2$
Partial coherence:
Remove $Z_\omega(t)$ from $X_\omega(t)$: $\xi^X_\omega(t) = X_\omega(t) - \beta_X Z_\omega(t)$
Remove $Z_\omega(t)$ from $Y_\omega(t)$: $\xi^Y_\omega(t) = Y_\omega(t) - \beta_Y Z_\omega(t)$
$\rho_{X,Y \mid Z}(\omega) = \dfrac{|\mathrm{Cov}(\xi^X_\omega(t), \xi^Y_\omega(t))|^2}{\mathrm{Var}\,\xi^X_\omega(t)\,\mathrm{Var}\,\xi^Y_\omega(t)}$
Relevant work: Pupin (1898)
An estimator for $f_X(\omega)$ is $\mathrm{Var}\,X_\omega(t)$.
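The filtering interpretation above can be sketched with a simple FFT band-pass in place of $F_\omega$: filter X and Y to a band and take the squared correlation of the filtered signals as a rough coherence estimate. The filter construction, band edges, and simulated signals are illustrative assumptions, not the estimator of Ombao and Van Bellegem (2008).

```python
# Band-filtered coherence sketch: two signals share a low-frequency latent
# component, so coherence is high in the shared band and low elsewhere.
import numpy as np

def bandpass(x, lo, hi):
    """Keep only Fourier components with frequency in [lo, hi] (cycles/sample)."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x))
    X[(freqs < lo) | (freqs > hi)] = 0
    return np.fft.irfft(X, n=len(x))

rng = np.random.default_rng(4)
T = 50_000
u1 = bandpass(rng.standard_normal(T), 0.02, 0.08)   # shared low-frequency latent signal
x = u1 + 0.2 * rng.standard_normal(T)               # observed X = U1 + noise
y = u1 + 0.2 * rng.standard_normal(T)               # observed Y = U1 + noise

def band_coh(a, b, lo, hi):
    """Squared correlation of the band-filtered signals: a crude coherence."""
    fa, fb = bandpass(a, lo, hi), bandpass(b, lo, hi)
    return np.corrcoef(fa, fb)[0, 1] ** 2

c_shared = band_coh(x, y, 0.02, 0.08)               # band containing U1: high
c_other = band_coh(x, y, 0.30, 0.40)                # band with only noise: low
print(round(c_shared, 2), round(c_other, 2))
```

This mirrors the latent-signal illustration earlier: X and Y are coherent only in the frequency band occupied by the shared component $U_1$.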