Granger Causality Analysis in Irregular Time Series




Mohammad Taha Bahadori, Yan Liu
Computer Science Department, Viterbi School of Engineering, University of Southern California, USA
Corresponding author. Email: yanliu.cs@usc.edu

Abstract

Learning temporal causal structures between time series is one of the key tools for analyzing time series data. In many real-world applications, we are confronted with irregular time series, whose observations are not sampled at equally spaced time stamps. The irregularity in sampling intervals violates the basic assumptions behind many models for structure learning. In this paper, we propose a nonparametric generalization of the Granger graphical models, called Generalized Lasso Granger (GLG), to uncover the temporal dependencies from irregular time series. Via theoretical analysis and extensive experiments, we verify the effectiveness of our model. Furthermore, we apply GLG to a dataset of δ18O oxygen isotope records in Asia and achieve promising results in discovering the moisture transportation patterns over a period of 800 years.

1 Introduction

In the era of data tsunami, we are confronted with large-scale time series data, i.e., sequences of observations of the variables of interest over a period of time. For example, terabytes of time series microarray data are produced to record gene expression levels under different treatments over time; petabytes of climate and meteorological data, such as temperature, solar radiation, and carbon-dioxide concentration, are collected over the years; and exabytes of social media content are generated over time on the Internet. Developing effective and scalable machine learning algorithms that uncover temporal dependency structures between time series and reveal insights from data has therefore become a key problem in machine learning and data mining.

Up to now, many different approaches have been developed to solve the problem, such as autocorrelation, cross-correlation [8], transfer entropy [4], randomization test [7], phase slope index [0], Granger causality [3], and so on, and they have achieved success in many applications. However, all the existing approaches assume that the time series observations are obtained at equally spaced time stamps, and they fail when analyzing irregular time series.

Irregularly sampled time series are those with samples missing at blocks of sampling points or collected at non-uniformly spaced time points. This is a common challenge in practice due to natural constraints or human factors. For example, in biology, it is very difficult to obtain blood samples of human beings at regular time intervals for a long period of time; in climate datasets, many climate parameters (e.g., temperature and CO2 concentration) are measured by different equipment with varying frequencies.

Existing methods for analyzing irregular time series can be categorized into three directions: (i) the repair approach, which recovers the missing observations via smoothing or interpolation [5, 4, 6, 0, ]; (ii) generalization of spectral analysis tools, such as the Lomb-Scargle Periodogram (LSP) [4] or wavelets [, 8, 9]; and (iii) kernel methods []. While the first two approaches provide sufficient flexibility in analyzing irregular time series, they have been shown to magnify the smoothing effects and result in huge errors for time series with large gaps [5].

In this paper, we propose the Generalized Lasso-Granger (GLG) framework for causality analysis of irregular time series. It defines a generalization of the inner product for irregular time series based on nonparametric kernel functions. As a result, the GLG optimization problem takes the form of a Lasso problem and enjoys computational scalability for large-scale problems. We also investigate a weighted version of GLG (W-GLG), which aims to improve the performance of GLG by giving higher weights to important observations. As a theoretical contribution, we propose a sufficient condition for the asymptotic consistency of our GLG method. In particular, we show that, compared with the popular locally weighted regression (LWR), GLG has the same asymptotic consistency behavior but achieves lower absolute error.

In order to demonstrate the effectiveness of GLG, we conduct experiments on four synthetic datasets with different sampling patterns and objective functions so that we can test the asymptotic convergence behavior, the effects of different kernels, and the impact of different irregularity patterns, respectively. GLG outperforms all state-of-the-art algorithms and improves the accuracy of the inferred temporal causal graph by up to 5%. In addition, we apply GLG to uncover the patterns of moisture transfer across Asia during 800 years by analyzing the temporal causal relationship between the density of δ18O (an isotope of oxygen) recorded in four caves in India and China. While the results identify the moisture transfer patterns, they also confirm that during the warm periods the air masses had more energy to travel across the continent.

The rest of the paper is organized as follows: we first formally define the problem of learning temporal structures in irregular time series in Section 2 and review related work in Section 3. Then we describe the proposed GLG framework and provide theoretical insights. In Section 5, we demonstrate the superior performance of GLG through extensive experiments. After the summary of the paper and hints on future work, we provide the proof of the theorem in the Appendix.

2 Problem Definitions and Notation

Irregular time series are time series whose observations are not sampled at equally spaced time stamps. They appear in many applications due to various factors. In summary, there are three types of irregular time series: Gappy Time Series, Nonuniformly Sampled Time Series, and Time Series with Missing Data.

Gappy Time Series are those with a regular sampling rate but with finitely many blocks of data missing. For example, in astronomy, a telescope's sight could be blocked by a cloud or celestial obstacles for a certain period of time, which makes the recorded samples unavailable during that time [0].

Nonuniformly Sampled Time Series are those with observations at non-uniform time points. For example, in health care applications, patients usually have difficulties in rigorously recording daily (or weekly) health conditions or pill-taking logs for a long period of time [6].
Time Series with Missing Data appear commonly in applications such as sensor networks, where some data points are missing or corrupted due to sensor failure.

In this paper, unless otherwise stated, the term irregular time series refers to all three groups mentioned above or any combination of them. While a regular time series needs only one vector of values at uniformly spaced time stamps, an irregular time series needs a second vector specifying the time stamps at which the data are collected. Following the notation of [], we represent an irregular time series with a tuple of (time stamp, value) pairs. Formally, we have the following definition:

Definition 2.1. An irregular time series $x$ of length $N$ is denoted by $x = \{(t_n, x_n)\}_{n=1}^{N}$, where the time-stamp sequence $\{t_n\}$ is strictly increasing, i.e., $t_1 < t_2 < \dots < t_N$, and $x_n$ is the value of the time series at the corresponding time stamp.

The central task of this paper is as follows: given $P$ irregular time series $\{x^{(1)}, \dots, x^{(P)}\}$, we are interested in developing efficient learning algorithms to uncover the temporal causal networks that reveal the temporal dependence between these time series.

3 Related Work

In this section, we review existing work on learning temporal causal networks and on irregular time series analysis.

3.1 Temporal Causal Networks

Learning temporal causal networks for regular time series has become one of the most important problems in time series analysis. It has broad applications in biology, climate science, social science, and so on. Many approaches have been developed to solve the problem, such as autocorrelation, cross-correlation [8], randomization test [7], phase slope index [0], and so on. A regression-based method called Granger causality was introduced in the area of econometrics for time series analysis [3]. It states that a variable $x$ is the cause of another variable $y$ if the past values of $x$ are helpful in predicting the future values of $y$. In other words, consider the following two regressions:

(3.1)  $x_t = \sum_{l=1}^{L} a_l\, x_{t-l},$

(3.2)  $x_t = \sum_{l=1}^{L} a_l\, x_{t-l} + \sum_{l=1}^{L} b_l\, y_{t-l},$

where $L$ is the maximal time lag. If Equation (3.2) is a significantly better model than Equation (3.1), we determine that time series $y$ Granger causes time series $x$.
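The comparison between the two regressions above can be illustrated with a small, self-contained sketch. It is not the authors' code: the helper names (`solve`, `rss`, `granger_f`), the toy data, and the choice of an F statistic as the "significantly better" criterion are assumptions made here for illustration only.

```python
import random

def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def rss(rows, targets):
    """Residual sum of squares of the least-squares fit targets ~ rows."""
    k = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    coef = solve(AtA, Atb)
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y, L):
    """F statistic comparing own-lag regression (3.1) with own-plus-y-lag regression (3.2)."""
    T = len(x)
    targets = [x[t] for t in range(L, T)]
    own = [[x[t - l] for l in range(1, L + 1)] for t in range(L, T)]
    both = [o + [y[t - l] for l in range(1, L + 1)]
            for o, t in zip(own, range(L, T))]
    rss1, rss2 = rss(own, targets), rss(both, targets)
    dof = len(targets) - 2 * L          # residual degrees of freedom of (3.2)
    return ((rss1 - rss2) / L) / (rss2 / dof)

# Toy data (hypothetical): y drives x with a one-step delay, not the reverse.
random.seed(0)
T = 400
y = [random.gauss(0, 1) for _ in range(T)]
x = [0.0]
for t in range(1, T):
    x.append(0.5 * x[t - 1] + 0.8 * y[t - 1] + 0.1 * random.gauss(0, 1))

f_y_to_x = granger_f(x, y, L=2)   # does y help predict x?
f_x_to_y = granger_f(y, x, L=2)   # does x help predict y?
```

A large `f_y_to_x` together with a small `f_x_to_y` is the pattern one would read as "y Granger causes x" in this toy setting; turning the statistic into a decision still requires choosing a significance level.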
Granger causality has gained tremendous success across many domains due to its simplicity, robustness, and extendability [, 9]. Most existing algorithms for detecting Granger causality are based on a statistical significance test, which is extremely time-consuming and very sensitive to the number of observations in the autoregression. In [], the Lasso-Granger method is proposed to solve the problem and is shown to have superior performance in terms of accuracy and scalability.

Suppose we have $P$ time series $x^{(1)}, \dots, x^{(P)}$ with observations at equally spaced time stamps $t = 1, \dots, T$. The basic idea of the Lasso-Granger method is to use an $L_1$ penalty to perform variable selection, which corresponds to neighborhood selection in learning the temporal causal network []. Specifically, for each time series $x^{(i)}$, we can obtain a sparse solution for the coefficients $a_i$ by solving the following Lasso problem:

(3.3)  $\min_{a_i} \sum_{t=L+1}^{T} \Big\| x^{(i)}_t - \sum_{j=1}^{P} a_{i,j}^{\top}\, x^{(j),\mathrm{Lagged}}_t \Big\|_2^2 + \lambda \|a_i\|_1,$

where $x^{(j),\mathrm{Lagged}}_t$ is the concatenated vector of lagged observations, i.e., $x^{(j),\mathrm{Lagged}}_t = [x^{(j)}_{t-L}, \dots, x^{(j)}_{t-1}]$, $a_{i,j}$ is the $j$-th vector of coefficients in $a_i$, modeling the effect of time series $j$ on time series $i$, and $\lambda$ is the penalty parameter, which determines the sparseness of $a_i$. The resulting optimization problem can be solved efficiently by the sub-gradient method [7], LARS [9], and so on. We then determine that there is an edge from time series $j$ to time series $i$ if and only if $a_{i,j}$ is a nonzero vector. The Lasso-Granger method not only significantly reduces the computational complexity compared with pairwise significance tests, but also has nice theoretical properties regarding its consistency []. It has been widely applied to biology applications [30], climate analysis [], fMRI data analysis [3], and so on, with strong empirical results.

3.2 Irregular Time Series Analysis

Existing methods for analyzing irregular time series can be summarized into three categories: (i) repairing the irregularities of the time series by either filling the gaps or resampling the series at uniform intervals; (ii) generalization of spectral analysis to irregular time series; and (iii) applications of kernel techniques to time series data.

Repair Methods. The basic idea of repair methods is to interpolate the given time series at regularly spaced time stamps. The produced time series can then be used in temporal causal analysis for regular time series. For Gappy Time Series or Time Series with Missing Data, we can use a regression algorithm to recover the time series by filling the blank time stamps, which is also known as Surrogate Data Modeling [6]. In the case of non-uniformly sampled time series, the common practice is to find the value of the time series at regularly spaced time stamps via a regression algorithm [5, 6, ]. In some applications, a transient time model that describes the behavior of the system (e.g., by differential equations) is available and can be used to recover the missing data [4, 0]. However, this approach strongly depends on the model accuracy, which prevents it from being applicable to datasets with complex natural processes (e.g., the climate system) []. One major issue with all repair methods is that the interpolation error propagates through all steps after data processing. As a result, quantifying the effect of the interpolation on any resulting statistical inference becomes a challenging task [9].

Generalization of the Spectral Analysis Tools. The basic idea of this approach is to find the spectrum of the irregular time series by generalization of the Fourier transform. The Lomb-Scargle Periodogram (LSP) [4] is one classical example of this approach. In LSP, we first fit the time series to a sine curve in order to obtain its spectrum, which can be used to calculate the power spectral density (e.g., by Fourier transform). Then the auto-correlation and cross-correlation functions can be found by taking the inverse Fourier transform (using ifft) of the corresponding power spectral density and cross power spectral density functions [3, , 6]. The versatility of wavelet transforms and their numerous applications in the analysis of time series have motivated researchers to extend the transform to irregular time series [8, , 9]. LSP-based algorithms are specially designed for spectral analysis of periodic signals. They do not perform well for non-periodic signals [7] and are not robust to outliers in the data []. Moreover, computing the entire cross-correlation function is a much more challenging task than our task of learning temporal dependence.
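To make the spectral route concrete, the following is a minimal sketch of the classical Lomb-Scargle periodogram evaluated directly on irregular time stamps. The function name `lomb_scargle`, the test signal, and the frequency grid are hypothetical choices made here; the sketch uses the textbook formula without variance normalization and is not the implementation used in the paper.

```python
import math
import random

def lomb_scargle(times, values, freqs):
    """Classical Lomb-Scargle power at ordinary (not angular) frequencies."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    power = []
    for f in freqs:
        w = 2.0 * math.pi * f
        # The offset tau makes the sine and cosine terms orthogonal on the
        # given (irregular) time stamps.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        pc = sum(yi * ci for yi, ci in zip(y, c)) ** 2 / sum(ci * ci for ci in c)
        ps = sum(yi * si for yi, si in zip(y, s)) ** 2 / sum(si * si for si in s)
        power.append(0.5 * (pc + ps))
    return power

# Hypothetical example: a noisy sinusoid at frequency 0.2 observed at
# 150 uniformly random time stamps in [0, 100].
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(150))
values = [math.sin(2.0 * math.pi * 0.2 * t) + 0.1 * rng.gauss(0.0, 1.0)
          for t in times]
freqs = [0.01 * k for k in range(1, 51)]
power = lomb_scargle(times, values, freqs)
peak = freqs[power.index(max(power))]
```

For a periodic signal like this one, the periodogram is expected to peak close to the true frequency; the limitations discussed in the text concern non-periodic signals and outliers, which this toy example does not exercise.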
The correlation function produced by spectral methods can also be a complex function, which is difficult to interpret and to analyze theoretically.

Kernel-based Methods. The basic idea of kernel-based methods is to generalize the regular correlation operator via kernels without the complex computation of the correlation function. One classical example of this approach is the Slotting Technique [], which computes a weighted (kernelized) correlation of two irregularly sampled time series. For two irregular time series $\{(t^x_n, x_n)\}_{n=1}^{N_x}$ and $\{(t^y_m, y_m)\}_{m=1}^{N_y}$, the correlation is computed as follows:

(3.4)  $\hat{\rho}(l\Delta t) = \frac{\sum_{n=1}^{N_x} \sum_{m=1}^{N_y} x_n\, y_m\, w_{l,\Delta t}(t^x_n, t^y_m)}{\sum_{n=1}^{N_x} \sum_{m=1}^{N_y} w_{l,\Delta t}(t^x_n, t^y_m)},$

where $w$ is the kernel function, $l$ is the lag number, and $\Delta t$ is the average sampling interval length. For example, $w$ can be the Gaussian kernel:

(3.5)  $w_{l,\Delta t}(t_1, t_2) = \exp\left( -\frac{(t_2 - t_1 - l\Delta t)^2}{\sigma^2} \right),$

where $\sigma$ is the kernel bandwidth. The main advantage of kernel-based methods is that they are nonparametric and can be easily extended to any regression-based method. Furthermore, they do not make specific assumptions about the data, such as the model assumption in the repair methods or the periodicity assumption in LSP-based methods.

4 Methodology

The major challenge in uncovering temporal causal networks for irregular time series is how to effectively capture the temporal dependence without directly estimating the values of missing data or making restrictive assumptions about the generation process of the time series. In this paper, we propose a simple yet effective approach: generalizing the inner product operator via kernels so that regression-based temporal causal models become applicable to irregular time series. In this section, we describe our proposed model in detail, analyze its theoretical properties, and discuss one extension that takes into account the irregular patterns of the input time series for accurate prediction.

4.1 Generalized Lasso Granger (GLG)

The key idea of our model is as follows: if we treat $a_{i,j}$ in Equation (3.3) as a time series, $a_{i,j}^{\top} x^{(j),\mathrm{Lagged}}_t$ can be considered as its inner product with another time series $x^{(j)}$. If we are able to generalize the inner product operator to irregular time series, the temporal causal models for regular time series can be easily extended to handle irregular ones.

Let us denote the generalization of the dot product between two irregular time series $x$ and $y$ by $x \odot y$, which can be interpreted as a (possibly non-linear) function that measures the unnormalized similarity between them. Depending on the target application, one can define different similarity measures, and thus different inner product definitions.
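The kernel-weighted correlation used by the Slotting Technique, which also motivates the kernel-based similarity discussed here, can be sketched in a few lines. The function names, the test signals, and the parameter values (`dt=1.0`, `sigma=0.3`, a delay of 2 time units) are assumptions chosen for illustration, and the Gaussian weight follows the form $\exp(-(t_2 - t_1 - l\Delta t)^2/\sigma^2)$.

```python
import math
import random

def gaussian_weight(t1, t2, lag, dt, sigma):
    """Weight that is largest when t2 - t1 is close to lag * dt."""
    return math.exp(-((t2 - t1 - lag * dt) ** 2) / sigma ** 2)

def slotting_correlation(tx, x, ty, y, lag, dt, sigma):
    """Kernel-weighted correlation of two irregular series at lag * dt."""
    num = den = 0.0
    for t1, xv in zip(tx, x):
        for t2, yv in zip(ty, y):
            w = gaussian_weight(t1, t2, lag, dt, sigma)
            num += xv * yv * w
            den += w
    return num / den

# Hypothetical example: two series sampled at unrelated random time stamps,
# where y is a copy of x delayed by 2 time units.
rng = random.Random(0)
tx = sorted(rng.uniform(0.0, 200.0) for _ in range(300))
ty = sorted(rng.uniform(0.0, 200.0) for _ in range(300))
x = [math.sin(t) for t in tx]
y = [math.sin(t - 2.0) for t in ty]

corr = [slotting_correlation(tx, x, ty, y, lag, dt=1.0, sigma=0.3)
        for lag in range(6)]
best_lag = corr.index(max(corr))
```

No resampling or interpolation is performed: every pair of observations contributes, weighted by how well its time difference matches the requested lag. The double loop costs $O(N_x N_y)$ per lag, which is acceptable for a sketch but would need truncation of the kernel for long series.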
For example, we can define the inner product as a function that is linear with respect to the components of the first time series:

(4.6)  $x \odot y = \sum_{n=1}^{N_x} \frac{\sum_{m=1}^{N_y} x_n\, y_m\, w(t^x_n, t^y_m)}{\sum_{m=1}^{N_y} w(t^x_n, t^y_m)},$

where $w$ is the kernel function. For example, $w$ can be the Gaussian kernel defined in Equation (3.5). Many other similarity measures for time series have been developed for classification and clustering tasks [3, 34] and can be used in our dot product definition.

Given the generalization of the inner product operator, we can now extend the regression in Equation (3.3) to obtain the desired optimization problem for irregular time series. Formally, suppose $P$ irregular time series $x^{(1)}, \dots, x^{(P)}$ are given. Let $\Delta t$ denote the average length of the sampling intervals of the target time series (e.g., $x^{(i)}$) and let $a_{i,j}(t)$ be a pseudo time series, i.e.,

$a_{i,j}(t) = \{(t_l, a_{i,j,l}) \mid l = 1, \dots, L,\ t_l = t - l\Delta t\},$

which means that for different values of $t$, the pseudo time series $a_{i,j}(t)$ share the same observation vector (i.e., $\{a_{i,j}\}$), but the time-stamp vectors vary with $t$. We can perform the causality analysis by the Generalized Lasso Granger (GLG) method, which solves the following optimization problem:

(4.7)  $\min_{\{a_{i,j}\}} \sum_{n=l_0}^{N_i} \Big\| x^{(i)}_n - \sum_{j=1}^{P} a_{i,j}(t^{(i)}_n) \odot x^{(j)} \Big\|_2^2 + \lambda \|a_i\|_1,$

where $l_0$ is the smallest value of $n$ that satisfies $t^{(i)}_n \ge L\Delta t$.

The above optimization problem is not convex in general, and convex optimization algorithms can only find a local minimum. However, if the generalized inner product is defined so that Problem (4.7) is convex, there are efficient algorithms such as FISTA [5] for solving optimization problems of the form $f(\theta) + \|\theta\|_1$ where $f(\theta)$ is convex. In this paper, we use the linear generalization of the inner product given by Equation (4.6), with which Problem (4.7) can be reformulated as linear prediction of $x^{(i)}_n$ using the parameters $a_{i,j}(t)$, subject to a norm-1 constraint on the values of the parameters. Thus, the problem is a Lasso problem and can be solved more efficiently by optimized Lasso solvers such as [9].

4.2 Extension of GLG Method

Notice that every data point $x^{(i)}_n$ in the GLG method has equal impact in the regression. We make the following two observations, which could help improve the performance.

The first observation, depicted in Figure 1, states that the samples that have more data points available for prediction will be predicted more accurately. Thus, they should have a higher contribution in learning the causality graph. We therefore define the following weight for the subproblem of prediction of $x^{(i)}_n$:

$v(x^{(i)}_n) = \sum_{j=1}^{P} \sum_{l=1}^{L} \sum_{m=1}^{N_j} w\big(t^{(i)}_n - l\Delta t,\ t^{(j)}_m\big).$

Figure 1: Time Series #1 is the target time series in this figure. Prediction of one sample of the target series should receive a higher weight than another in the depicted scenario because it can be predicted more accurately.

The second observation, depicted in Figure 2, states that the learning should be uniformly distributed in time. In other words, the samples that are in a denser region should contribute less than those in sparse regions. Thus we define the following weights:

$v(x^{(i)}_n) = \frac{w\big(t^{(i)}_n, t^{(i)}_n\big)}{\sum_{m=1}^{N_i} w\big(t^{(i)}_n, t^{(i)}_m\big)}.$

Figure 2: Time Series #1 is the target time series in this figure, while the other time series are not shown. Prediction of one sample should have a higher weight in the causality inference than another because the latter is in a denser region of the time series.

Using the introduced weights, we define the Weighted Generalized Lasso Granger (W-GLG) as follows:

(4.8)  $\min_{\{a_{i,j}\}} \sum_{n=l_0}^{N_i} v(x^{(i)}_n)\, \Big\| x^{(i)}_n - \sum_{j=1}^{P} a_{i,j}(t^{(i)}_n) \odot x^{(j)} \Big\|_2^2 + \lambda \|a_i\|_1.$

4.3 Asymptotic Consistency of GLG

We follow the procedure in [8] to study the consistency of our Generalized Lasso Granger method. Suppose there are two instantiations of the set of random processes $x^{(i)}$, one with regular sampling frequency and the other with irregular sampling frequency.

Figure 3 shows the source of the errors when $x^{(1)}_n$ is being predicted using GLG. It can be seen in the figure that GLG interpolates the value of the time series at the regular steps $[t_n - L\Delta t, \dots, t_n - \Delta t]$ and uses them for prediction of $x^{(1)}_n$. We can model the errors induced at time $t_n - l\Delta t$ in the $j$-th time series by $z^{(j)}_{t_n - l\Delta t}$ and define the regular time series $\tilde{x}^{(i)}_t = x^{(i)}_t + z^{(i)}_t$ for $i = 1, \dots, P$ and $t = t_n - L\Delta t, \dots, t_n - \Delta t$. Now, it suffices to show that the graph inferred using the regular time series $\tilde{x}$ is the same as the graph we would obtain if we had the original regular time series $x$.

Figure 3: The sources of repair errors in GLG when $x^{(1)}_n$ is being predicted. In order to predict data point $x^{(1)}_n$, GLG repairs the time series at $L$ points before the time $t_n$. At each point of time a repair error $z$ is produced. (The figure contrasts GLG with repair methods on Time Series #1 to #3.)

In order to proceed with the proof, we need to assume that the $\tilde{x}$ are Gaussian random variables with zero mean. Similar to the regular case, let $\tilde{X}_i$ be the vector of length $n$ of samples of $\tilde{x}^{(i)}$. We can write GLG as

(4.9)  $\tilde{a}^{\lambda}_{i} = \arg\min_{a_{i,j}} \Big\| \tilde{X}_i - \sum_{j} \tilde{X}_j\, a_{i,j} \Big\|_2^2 + \lambda \|a_i\|_1.$

Let $\Gamma$ denote the set of all the time series. Suppose we have the regular time series. The neighborhood $ne_i$ of a time series $x^{(i)}$ is the smallest subset of $\Gamma \setminus \{i\}$ such that $x^{(i)}$ is conditionally independent of all the variables in the remaining time series. Now, define the estimated neighborhood of $x^{(i)}$ using the irregular time series by $\widetilde{ne}^{\lambda}_i = \{x^{(j)} : \tilde{a}^{\lambda}_{i,j} \neq 0\}$ in the solution of the above problem.

Theorem 4.1. Let assumptions 1-6 in [8] hold for the solution of Problem (4.9).

(a) Suppose $E[z_p \tilde{x}_q] = 0$ for all $i = 1, \dots, N$ and $p, q = 1, \dots, L$, $p \neq q$. Let the penalty parameter satisfy $\lambda_n \sim d\, n^{-(1-\epsilon)/2}$ with $\kappa < \epsilon < \xi$ and $d > 0$. There exists some $c > 0$ so that for all $x^{(i)}$ in $\Gamma$,

$P\big(\widetilde{ne}^{\lambda}_i \subseteq ne_i\big) = 1 - O\big(\exp(-c\, n^{\epsilon})\big)$ for $n \to \infty$,

$P\big(ne_i \subseteq \widetilde{ne}^{\lambda}_i\big) = 1 - O\big(\exp(-c\, n^{\epsilon})\big)$ for $n \to \infty$.

(b) The above results can be violated if $E[z_p \tilde{x}_q] \neq 0$ for all $i = 1, \dots, N$ and $p, q = 1, \dots, L$, $p \neq q$.

Proof. A proof is given in the Appendix.

The theorem specifies a sufficient condition under which the temporal causal graph inferred from the irregular time series is asymptotically equal to the graph that could be inferred if the regular time series were available. The condition states that if the repair error during GLG's operation is orthogonal to the repaired data points, the inferred temporal causal graph will be the same as the actual one with probability approaching one as the length of the time series grows.

Choice of Kernel in GLG Method. Consider non-uniformly sampled time series. Suppose the non-uniformity is due to clock jitter, which is modeled by i.i.d. zero-mean Gaussian variables at the different sampling times. If the kernel function $w$ satisfies the following equation, the neighborhood learned by our algorithm is guaranteed to be the same as in the regular time series case:

(4.10)  $\sum_{l=1}^{N} \big[ K(l + a_l, q) - K(p, q) \big]\, w(l + a_l, p) = 0,$

for all $p, q = 1, \dots, N$. In this equation, $K(\cdot, \cdot)$ is the covariance function of the time series and $a_j$ is the random variable modeling the clock jitter at the $j$-th time stamp.

Proof. By Theorem 4.1, it is sufficient to show that the following equation holds for all $p, q = 1, \dots, N$, which means that the error should be orthogonal to all the observations used in Lasso-Granger:

(4.11)  $E\left[ \frac{\sum_{l=1}^{N} \big(x_{l+a_l} - x_p\big)\, w(l + a_l, p)}{\sum_{l=1}^{N} w(l + a_l, p)}\; x_q \right] = 0.$

The expectation is with respect to $x$, thus:

(4.12)  $E_x\left[ \left( \sum_{l=1}^{N} \big(x_{l+a_l} - x_p\big)\, w(l + a_l, p) \right) x_q \right] = 0.$
Taking the expectation and using the definition of the covariance function, $K(t_1, t_2) = E_x[x_{t_1} x_{t_2}]$, yields the result.

A quick inspection of Equation (4.10) shows that the Gaussian kernel does not satisfy it. This is clear by setting $q = p$ and noting that the covariance function always satisfies $K(l + a_l, p) \le K(p, p)$: no nonnegative kernel, such as the Gaussian kernel, can satisfy Equation (4.10). However, if the time series is smooth enough so that $K(t, p) \approx K(p, p)$ for $t$ in a neighborhood of $p$, the Gaussian kernel can attenuate the effect of $K(t, p) - K(p, p)$ for values of $t$ outside the neighborhood, and the left side of Equation (4.10) can become close to zero. In conclusion, while the Gaussian kernel is a suboptimal kernel for causality analysis purposes, for smooth time series it approximately satisfies Equation (4.10) and is expected to perform well.

Comparison of GLG Method with Time Decay Kernels and Locally Weighted Regression. The consistency of repair methods such as the Kernel Weighted Regression algorithm can be analyzed similarly to the GLG method by introducing the repair error variables. However, GLG is expected to have lower absolute error because it tries to predict the actual observations $x$ without additional repair error. In contrast, repair methods such as LWR first interpolate the time series at regular time stamps and then try to predict the repaired samples, which carry repair errors with them. As noted in Section 3, the repair methods introduce a huge amount of error during reconstruction of the irregular time series in the large gaps; see Figure 4. In contrast, since there is no sample in the gaps, GLG does not attempt to predict the value of the time series in the gaps and avoids these types of errors.

Figure 4: The black circles show the time stamps of the given irregular time series. The crosses are the time stamps used for repair of the time series. The red time stamp shows the moment at which the repair methods produce large errors and propagate the error by predicting the erroneous repaired sample. GLG skips these intervals because it predicts only the observed samples.

5 Experiment Results

In order to examine the performance of GLG, we perform two series of experiments, one on the synthetic datasets and the other on the Paleo dataset for identifying the monsoon climate patterns in Asia.

5.1 Synthetic Datasets

The design of synthetic datasets with irregular sampling times has been discussed extensively in the literature. We follow the methodology in [] to create four different synthetic datasets. These datasets are constructed to emulate different sampling patterns and functional behavior.

Mixture of Sinusoids with Missing Data (MS-M): In order to create the MS-M dataset, we generate $P - 1$ time series according to a mixture of several sinusoidal functions:

(5.13)  $x^{(i)}_t = \sum_{j=1}^{3} A^{(i)}_j \cos\big(2\pi f^{(i)}_j t + \varphi^{(i)}_j\big) + \epsilon^{(i)}, \quad \text{for } i = 2, \dots, P,$

where $f^{(i)}_j \sim \mathrm{Unif}(f_{\min}, f_{\max})$, $\varphi^{(i)}_j \sim \mathrm{Unif}(0, 2\pi)$, and the vector of amplitudes $A^{(i)}_j$ is distributed as $\mathrm{Dirichlet}(\mathbf{1}_3)$ so that all the time series have at least a certain amount of power. The range of the frequencies is selected in such a way that only a few periods of the sines are repeated in the dataset. The noise term $\epsilon$ is selected to be zero-mean Gaussian noise. We create the target time series according to a linear model,

$x^{(1)}_t = \sum_{i=2}^{P} \sum_{l=1}^{L} \alpha^{(i)}_l\, x^{(i)}_{t-l} + \epsilon.$

We set $\alpha^{(i)}_l$ to be sparse to model sparse graphs. Finally, we drop samples from all the time series independently with probability $p_m$. We create 0 instances of MS-M random datasets and report the average performance on them.

Mixture of Sinusoids with Jittery Clock (MS-J): For creating this dataset, we first create the sampling time stamps for the target time series: $t = [1, \dots, N] + e$, where the random vector $e = [e_1, \dots, e_N]$ is a zero-mean Gaussian random vector with covariance matrix $\gamma I$, and $\gamma$ is called the jitter variance.
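The two sampling schemes introduced so far, independent dropping of samples (MS-M) and a jittery clock (MS-J), can be sketched with the standard library alone. Everything concrete below is an assumption for illustration: the function names, the frequency range, the noise level, the missing probability, the jitter variance, and the decision to sort the jittered time stamps so that they stay strictly increasing as Definition 2.1 requires.

```python
import math
import random

def mixture_of_sinusoids(times, rng, f_min=0.01, f_max=0.05, noise_std=0.05):
    """Evaluate a random three-component sinusoid mixture at `times`."""
    freqs = [rng.uniform(f_min, f_max) for _ in range(3)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(3)]
    # Dirichlet(1, 1, 1) amplitudes drawn as normalized unit-rate exponentials.
    raw = [rng.expovariate(1.0) for _ in range(3)]
    amps = [r / sum(raw) for r in raw]
    return [sum(a * math.cos(2.0 * math.pi * f * t + p)
                for a, f, p in zip(amps, freqs, phases))
            + rng.gauss(0.0, noise_std)
            for t in times]

def drop_samples(times, values, p_missing, rng):
    """MS-M style irregularity: drop each sample independently."""
    kept = [(t, v) for t, v in zip(times, values) if rng.random() >= p_missing]
    return [t for t, _ in kept], [v for _, v in kept]

def jittered_times(n, jitter_var, rng):
    """MS-J style irregularity: t = [1..n] + e with e ~ N(0, jitter_var * I).

    Sorting is an assumption of this sketch, used to keep time stamps increasing.
    """
    std = math.sqrt(jitter_var)
    return sorted(k + rng.gauss(0.0, std) for k in range(1, n + 1))

rng = random.Random(0)
regular = list(range(1, 201))
series = mixture_of_sinusoids(regular, rng)
t_missing, x_missing = drop_samples(regular, series, p_missing=0.3, rng=rng)

t_jitter = jittered_times(200, jitter_var=0.04, rng=rng)
x_jitter = mixture_of_sinusoids(t_jitter, rng)   # a second, independent mixture
```

Each call to `mixture_of_sinusoids` draws fresh amplitudes, frequencies, and phases, so `series` and `x_jitter` are two different random series; a full dataset would additionally build the target series from lagged values of the others.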
We selec he parameers of he oher ime series ad use hem o calculae he value of he arge ime series a he give ime samps: x () = P L i= l= α l 3 j= A j cos(πf j ( l)+ϕ j )+ɛ, The we produce he samplig imes for he oher ime series similarly as a Gaussia radom vecor wih mea [,..., N] ad covariace marix γi ad use Equaio (5.3) o produce he ime series. We creae 50 isaces of MS-J radom daases ad repor he average performace o hem. Auo-Regressive Time Series wih Missig Daa (AR- M): The procedure of creaig he AR-M daase is similar o he oe described for MS-M wih a sigle differece i producig ime series o P accordig o AR processes: x = 3 l= β i,l x l + ɛ, for i =,..., P where β i,l are chose radomly while keepig he AR ime series sable. We creae 50 isaces of AR-M radom daases ad repor he average performace o hem. Mixure of Siusoids wih Poisso Process Samplig imes (MS-P): The procedure of creaig he MS- P daase is similar o he oe described for MS-J wih he differece of producig he samplig imes accordig o a Poisso Poi process; i.e. iersamplig imes are disribued accordig o Exp(). We creae 50 isaces of MS-P radom daases ad repor he average performace o hem. Baselies We compare he performace of our algorihm wih four sae-of-he-ar algorihms. We use Locally Weighed Regressio (LWR) ad Gaussia Process (GP) regressio o repair he ime series ad perform he regular Lasso-Grager. The Sloig Techique [] ad he LSP mehod are he wo algorihms used for fidig muual correlaio. Sice performace of W-GLG is close o he performace of GLG, performace of W-GLG will be compared i oly oe figure o avoid cluered plos. Performace Measures I order o repor he accuracy of he iferred graph, we use he Area Uder he Curve (AUC) score. The value of AUC is he probabiliy ha he algorihm will assig a higher value o a radomly chose posiive (exisig) edge

than to a randomly chosen negative (non-existing) edge in the graph.

Figure 5: Study of convergence behavior of the algorithms (GLG, LWR, GP, Slotting, LSP) as a function of the length of the time series in the Mixture of Sinusoids with (left) Missing data points dataset, (middle) Jittery clock dataset and (right) Autoregressive Time Series with Missing data points dataset.

Parameter Tuning Unless otherwise stated, in all of the kernel-based methods (GLG, LWR, and the Slotting technique) we use the Gaussian kernel with a fixed bandwidth $\sigma$. We select the value of $\lambda$ in GLG by 5-fold cross-validation.

5.2 Results on the Synthetic Datasets

Experiment #1: Convergence Behavior of the Algorithms In Figure 5, we increase the length of the time series in datasets MS-M and MS-J to study the convergence behavior of GLG. Both panels demonstrate excellent convergence behavior of GLG and the fact that GLG consistently outperforms the other algorithms by a large margin. Note that the convergence behavior of LWR is very similar to that of GLG; but, as alluded to in the theoretical analysis, GLG has lower absolute error. The Slotting technique and the LSP method are designed for analysis of the cross-correlation of time series and, as expected, do not perform well in our settings. The poor performance of LSP can be linked to the assumption of periodic signals in the LSP analysis method, which is violated in our dataset.

Experiment #2: Comparison of Different Kernels In order to study the effect of different kernels on the performance of GLG, we test the convergence behavior of GLG with three different kernels. The kernels are the Gaussian kernel, $w(t_1, t_2) = \exp(-(t_1 - t_2)^2/\sigma)$, the Sinc kernel, $w(t_1, t_2) = \sigma \sin((t_1 - t_2)/\sigma)/(t_1 - t_2)$, and the Inverse Distance kernel, which decays as an inverse power of $(t_1 - t_2)$. As shown in Figure 6, and as pointed out in the analysis section, the Gaussian kernel shows acceptable convergence behavior. The performance of the Inverse Distance kernel suggests an asymptotic convergence behavior, but with much higher absolute error. The Sinc kernel does not converge properly.
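The three kernels compared in Experiment #2 can be written down directly as weight functions of two time stamps. The sketch below is only an illustration: the function names are hypothetical, and the exact form of the Inverse Distance kernel (here `1 / (|t1 - t2| ** power + eps)` with an assumed exponent and a small assumed `eps` to avoid division by zero) is an assumption made for the example.

```python
import numpy as np

def gaussian_kernel(t1, t2, sigma):
    """Gaussian kernel exp(-(t1 - t2)^2 / sigma)."""
    return np.exp(-((t1 - t2) ** 2) / sigma)

def sinc_kernel(t1, t2, sigma):
    """Sinc kernel sigma * sin((t1 - t2) / sigma) / (t1 - t2), equal to 1 at t1 == t2.
    np.sinc(x) computes sin(pi x) / (pi x), hence the rescaling by pi."""
    return np.sinc((t1 - t2) / (np.pi * sigma))

def inverse_distance_kernel(t1, t2, power=1.0, eps=1e-6):
    """Assumed form of an inverse distance kernel; power and eps are illustrative."""
    return 1.0 / (np.abs(t1 - t2) ** power + eps)

# Weights that samples observed at irregular times receive for a query time of 5.0.
observed_times = np.array([1.2, 3.9, 4.8, 5.1, 9.7])
weights = gaussian_kernel(observed_times, 5.0, sigma=1.0)
```

Unlike the Gaussian kernel, the Sinc kernel takes negative values and decays slowly, so distant samples can receive non-negligible weights.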
Figure 6: Convergence of different kernels (Gaussian, Sinc, Inverse Distance) in the Mixture of Sinusoids with missing data dataset, as a function of the length of the time series.

Figure 7: The effect of missing data in the Mixture of Sinusoids with missing data points dataset (GLG, LWR, GP, Slotting and LSP versus the missing probability).

Figure 8: The effect of clock jitter ($\gamma$) in the Mixture of Sinusoids with Jittery Clock dataset (GLG, LWR, GP, Slotting and LSP versus the jitter).
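The AUC score used as the performance measure in these experiments can be computed directly from its probabilistic definition. The sketch below assumes that each candidate edge comes with a real-valued score (for example the magnitude of an estimated coefficient) and a ground-truth label; the function name and the convention of counting ties as one half are assumptions of this example.

```python
import numpy as np

def edge_auc(scores, is_edge):
    """Probability that a randomly chosen true edge scores higher than a randomly
    chosen non-edge, with ties counted as one half."""
    scores = np.asarray(scores, dtype=float)
    is_edge = np.asarray(is_edge, dtype=bool)
    pos, neg = scores[is_edge], scores[~is_edge]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one existing and one non-existing edge")
    diff = pos[:, None] - neg[None, :]  # all (positive, negative) pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

auc_example = edge_auc([0.9, 0.3, 0.5, 0.1], [1, 1, 0, 0])
```

The pairwise form costs a number of operations proportional to the number of positive edges times the number of negative edges, which is unproblematic for graphs with only a handful of nodes.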

Figure 9: Performance comparison of the algorithms (GLG, LWR, GP, Slotting, LSP) in the Mixture of Sinusoids with Poisson points sampling times.

Experiment #3: The Impact of Missing Rate The effect of missing data is examined in Figure 7. It is clear that as the probability of missing a data point decreases, GLG becomes more accurate. Note (i) the superior performance of GLG compared to the others, and (ii) that when only a small fraction of the data points is missing, GLG perfectly uncovers the correct causality relationship between the variables.

Experiment #4: Clock Jitter Impact The effect of clock jitter ($\gamma$) is examined in Figure 8. GLG is robust with respect to the amount of clock jitter. This is because (i) the time series are smooth, and (ii) we are testing in a reasonable range of jitter.

Experiment #5: Performance of the Algorithms in Extremely Irregular Datasets Due to the limited space we only report the result of one experiment on the Mixture of Sinusoids with Poisson points sampling times, see Figure 9. In this extremely irregularly sampled dataset, our algorithm outperforms the other algorithms by a large margin. This is due to the high possibility of large gaps in this dataset and the fact that GLG avoids the huge errors caused by interpolation in the gaps.

Experiment #6: Comparison of W-GLG and GLG Figure 10 compares the performance of the weighted version of our algorithm (W-GLG) with the non-weighted version (GLG) in all four synthetic datasets. W-GLG performs only marginally better than GLG in all the datasets except the dataset with Poisson sampling points. This is because only this extremely irregular dataset provides the situation for the weights to show their advantages.

Figure 10: Comparison of performance of W-GLG vs. GLG in all four datasets (MS-J, MS-M, AR-M, MS-P).

5.3 Paleo Dataset

Now we apply our method to a climate dataset to discover the weather movement patterns. Climate scientists usually rely on models with an enormous number of parameters that need to be measured. The alternative approach is the data-centric approach, which attempts to find the patterns in the observations. The Paleo dataset studied in this paper is the collection of density of $\delta^{18}$O, a stable isotope of oxygen, in four caves across China and India.
The geologists were able to find the estimate of $\delta^{18}$O in ancient ages by drilling deeper into the walls of the caves. The data are collected from the Dandak [6], Dongge [31], Heshang [15] and Wanxiang [33] caves, see Figure 11, with the irregular sampling pattern described in Table 1. The inter-sampling time varies from high resolution 0.5 ± 0.35 to low resolution 7.79 ± 9.79; however, there is no large gap between the measurement times.

Figure 11: Map of the locations (Dandak, Wanxiang, Heshang, Dongge) and the monsoon systems in Asia.

The density of $\delta^{18}$O in all the datasets is linked to the amount of precipitation, which is affected by the Asian monsoon system during the measurement period. The Asian monsoon system, depicted in Figure 11, affects a large share of the world's population by transporting moisture across the continent. The movement of monsoonal air masses can be discovered by analysis of their $\delta^{18}$O trace. Since the datasets are collected from locations in Asia, we are able to

analyze the spatial variability of the Asian monsoon system.

Figure 12: Comparison of the results on the Paleo Dataset (nodes: Dandak, Wanxiang, Heshang, Dongge): (a) GLG in the period 850AD-1563AD. (b) GLG in the period 1250AD-1564AD. (c) GLG in the period 850AD-1250AD. (d) Slotting technique in the period 850AD-1563AD. (e) Slotting technique in the period 1250AD-1564AD. (f) Slotting technique in the period 850AD-1250AD.

In order to analyze the spatial transportation of the moisture, we normalize all the datasets by subtracting the mean and dividing them by their standard deviation. We use GLG with the Gaussian kernel with bandwidth equal to 0.5(y) and maximum lag of 5(y); i.e. L = 50. In order to compare our results with the results produced by the slotting method in [22], we analyze the spatial relationship among the locations in three age intervals: (i) the entire overlapping age interval 850AD-1563AD, (ii) the medieval warm period 850AD-1250AD, and (iii) the cold phase 1250AD-1563AD. Figure 12 compares the graphs produced by GLG with the ones reported by [22].

Table 1: Description of the Paleo Dataset.

Location    Measurement Period    Mean(Δt)    Std(Δt)
Dandak      64AD-56AD             0.50 (y)    0.35 (y)
Dongge      6930BC-000AD          4. (y)      .63 (y)
Heshang     750BC-00AD            7.79 (y)    9.78 (y)
Wanxiang    9AD-003AD             .5 (y)      .9 (y)

5.4 Results on the Paleo Dataset

Figure 12 parts (a) and (d) show the results of causality analysis with GLG and the slotting technique, respectively. Our results identify two main transportation patterns. First, the edges from Dongge to other locations, which can be interpreted as the effect of the movement of air masses from southern China to other regions via the East Asian Monsoon System (EAMS). Second, an edge from Dandak to Dongge, which shows that the Indian Monsoon System (IMS) significantly affects Dongge in southern China.
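The preprocessing described in Section 5.3, namely summarizing the inter-sampling times of a record and normalizing its values to zero mean and unit standard deviation, can be sketched as follows. The function names and the toy record are hypothetical and serve only to illustrate the two operations.

```python
import numpy as np

def sampling_summary(time_stamps):
    """Mean and standard deviation of the inter-sampling times of one record."""
    dt = np.diff(np.sort(np.asarray(time_stamps, dtype=float)))
    return dt.mean(), dt.std()

def zscore(values):
    """Subtract the mean and divide by the standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

# Toy record with irregular time stamps (in years) and made-up measurements.
toy_times = [850.0, 851.0, 853.0, 853.5]
toy_values = [-8.1, -7.9, -8.4, -8.0]
mean_dt, std_dt = sampling_summary(toy_times)
normalized = zscore(toy_values)
```

Normalizing each record separately puts the regression coefficients for different locations on a comparable scale before a single sparsity penalty is applied.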
The graph in the period 1250AD-1563AD is sparser than the graph in 850AD-1250AD, which can be due to the fact that the former age period is a cold period, in which air masses do not have enough energy to move from India to China, while in contrast the latter age period is a warm phase and the air masses initiated in India impact southern China regions. During the warm period we can see that other branches of EAMS are also more active, which results in a denser graph in the warm period.

The differences between our results and the results from the Slotting technique can arise because in the Slotting technique an edge is positively identified even if two time series have significant correlation only at zero lag. However, by the definition of Granger causality, only past values of one time series should help the prediction of the other one in order for it to be considered a cause. The significant correlation at zero lag can be due to either fast movement of the air masses or production of the $\delta^{18}$O by an external source, such as changes in the sun's radiation strength, that impacts all the places by the same amount. Thus, the correlation value at zero lag cannot be a reliable sign for inference about the movement of the air masses.

The sparsity of the identified causality graphs can be due to several reasons. While we cannot rule out the possibility of non-linear causality relationships between the locations, as [22] have noticed, the sparsity can happen because the relationships are either at large millennial scales or at short annual scales. In the former case the relationship cannot be captured through analysis of periods with a length of several centuries. In the latter case, the resolution of the dataset is in the order of 3-4 years, which does not capture annual or biennial links.

6 Conclusion and Future Work

In this paper, we propose a nonparametric generalization of the Granger graphical models (GLG) to uncover the temporal causal structures from irregular time series. We provide a theoretical proof of the consistency of the proposed model and demonstrate its effectiveness on four simulated datasets and one real-application dataset.
For future work, we are interested in investigating how to design effective kernels for the GLG model and in extending the algorithm to large-scale data analysis.
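The distinction drawn in Section 5.4 between zero-lag correlation and Granger causality can be illustrated on regularly sampled toy data. The sketch below uses plain least-squares regression rather than GLG, and all names, sizes and noise levels are assumptions made for the example: two series driven by a common white-noise source are strongly correlated at zero lag, yet the past of one does not help predict the other, whereas a truly lagged dependence yields a large predictive gain.

```python
import numpy as np

def lagged_design(series, max_lag):
    """Columns are the series delayed by 1..max_lag, aligned with series[max_lag:]."""
    n = len(series)
    return np.column_stack([series[max_lag - l : n - l] for l in range(1, max_lag + 1)])

def residual_variance(design, target):
    """Mean squared residual of an ordinary least-squares fit with intercept."""
    design = np.column_stack([np.ones(len(target)), design])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.mean((target - design @ coef) ** 2)

def granger_gain(x, y, max_lag=3):
    """Relative reduction of the error in predicting y when past x is added to past y."""
    target = y[max_lag:]
    own = lagged_design(y, max_lag)
    both = np.column_stack([own, lagged_design(x, max_lag)])
    return 1.0 - residual_variance(both, target) / residual_variance(own, target)

rng = np.random.default_rng(0)
n = 2000

# Case 1: common external driver, no lagged influence between the two series.
source = rng.standard_normal(n)
x_common = source + 0.1 * rng.standard_normal(n)
y_common = source + 0.1 * rng.standard_normal(n)
zero_lag_corr = np.corrcoef(x_common, y_common)[0, 1]
gain_common = granger_gain(x_common, y_common)

# Case 2: y depends on the previous value of x.
x_cause = rng.standard_normal(n)
y_effect = 0.1 * rng.standard_normal(n)
y_effect[1:] += 0.8 * x_cause[:-1]
gain_causal = granger_gain(x_cause, y_effect)
```

In the first case a method that scores edges by zero-lag correlation would report a link, while the regression-based gain stays close to zero; in the second case the gain is close to one.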

Acknowledgement

We thank Umaa Rebbapragada from JPL for discussing the problem with us, Kira Rehfeld from PIK for sharing the Paleo dataset with us and the anonymous reviewers for valuable comments. This research was supported by the NSF research grants IIS-1134990.

References

[1] H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716-723, Dec. 1974.
[2] A. Arnold, Y. Liu, and N. Abe. Temporal causal modeling with graphical Granger methods. In KDD '07, page 66, New York, New York, USA, Aug. 2007. ACM Press.
[3] P. Babu and P. Stoica. Spectral analysis of nonuniformly sampled data - a review. Digital Signal Processing, 20(2):359-378, Mar. 2010.
[4] L. Barnett, A. B. Barrett, and A. K. Seth. Granger causality and transfer entropy are equivalent for Gaussian variables. Oct. 2009.
[5] A. Beck and M. Teboulle. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences, 2(1):183, Jan. 2009.
[6] M. Berkelhammer, A. Sinha, M. Mudelsee, H. Cheng, R. L. Edwards, and K. Cannariato. Persistent multidecadal power of the Indian Summer Monsoon. Earth and Planetary Science Letters, 290(1-2):166-172, 2010.
[7] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, 2nd edition, Sep. 1999.
[8] G. E. P. Box and G. M. Jenkins. Time series analysis; forecasting and control. Holden-Day, San Francisco, 1970.
[9] A. Brovelli, M. Ding, A. Ledberg, Y. Chen, R. Nakamura, and S. L. Bressler. Beta oscillations in a large-scale sensorimotor cortical network: directional influences revealed by Granger causality. Proceedings of the National Academy of Sciences of the United States of America, 101(26):9849-54, June 2004.
[10] J. C. Cuevas-Tello, P. Tino, S. Raychaudhury, X. Yao, and M. Harva. Uncovering delayed patterns in noisy and irregularly sampled time series: an astronomy application. Pattern Recognition, 43(3):36, Aug. 2009.
[11] J. B. Elsner. Granger causality and Atlantic hurricanes. Tellus - Series A: Dynamic Meteorology and Oceanography, 59(4):476-485, 2007.
[12] G. Foster. Wavelets for period analysis of unevenly sampled time series.
The Astronomical Journal, 112:1709, Oct. 1996.
[13] C. W. J. Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37(3):424-438, 1969.
[14] W. K. Harteveld, R. F. Mudde, and H. E. A. Van Den Akker. Estimation of turbulence power spectra for bubbly flows from Laser Doppler Anemometry signals. Chemical Engineering Science, 60(22):6160-6168, 2005.
[15] C. Hu, G. M. Henderson, J. Huang, S. Xie, Y. Sun, and K. R. Johnson. Quantification of Holocene Asian monsoon rainfall from spatially separated cave records. Earth and Planetary Science Letters, 266(3-4):221-232, Feb. 2008.
[16] D. M. Kreindler and C. J. Lumsden. The effects of the irregular sample and missing data in time series analysis. Nonlinear dynamics, psychology, and life sciences, 10(2):187-214, Apr. 2006.
[17] T. La Fond and J. Neville. Randomization tests for distinguishing social influence and homophily effects. In WWW '10, pages 601-610, New York, NY, USA, 2010. ACM.
[18] N. Meinshausen and P. Bühlmann. High-Dimensional Graphs and Variable Selection with the Lasso. The Annals of Statistics, 34(3):1436-1462, June 2006.
[19] D. Mondal and D. B. Percival. Wavelet variance analysis for gappy time series. Annals of the Institute of Statistical Mathematics, 62(5):943-966, Sep. 2008.
[20] G. Nolte, A. Ziehe, V. V. Nikulin, A. Schlögl, N. Krämer, T. Brismar, and K.-R. Müller. Robustly estimating the flow direction of information in complex physical systems. Physical review letters, 100(23):234101, June 2008.
[21] C. Diks and V. Panchenko. Modified Hiemstra-Jones test for Granger non-causality. Computing in Economics and Finance 2004 192, Society for Computational Economics, 2004.
[22] K. Rehfeld, N. Marwan, J. Heitzig, and J. Kurths. Comparison of correlation analysis techniques for irregularly sampled time series. Nonlinear Processes in Geophysics, 18(3):389-404, 2011.
[23] S. Ryali, K. Supekar, T. Chen, and V. Menon. Multivariate dynamical systems models for estimating causal interactions in fMRI. NeuroImage, 54(2):807-23, Jan. 2011.
[24] J. D. Scargle. Studies in astronomical time series analysis. I - Modeling random processes in the time domain. The Astrophysical Journal Supplement Series, 45(Jan):1-71, 1981.
[25] M. Schulz and K. Stattegger.
Spectrum: spectral analysis of unevenly spaced paleoclimatic time series. Computers & Geosciences, 23(9):929-945, 1997.
[26] P. Stoica, P. Babu, and J. Li. New Method of Sparse Parameter Estimation in Separable Models and Its Use for Spectral Analysis of Irregularly Sampled Data. IEEE Transactions on Signal Processing,

59(1):35-47, Jan. 2011.
[27] P. Stoica, J. Li, and H. He. Spectral Analysis of Nonuniformly Sampled Data: A New Approach Versus the Periodogram. IEEE Transactions on Signal Processing, 57(3):843-858, Mar. 2009.
[28] W. Sweldens. The Lifting Scheme: A Construction of Second Generation Wavelets. SIAM Journal on Mathematical Analysis, 29(2):511, Mar. 1998.
[29] R. Tibshirani, I. Johnstone, T. Hastie, and B. Efron. Least angle regression. The Annals of Statistics, 32(2):407-499, Apr. 2004.
[30] P. A. Valdés-Sosa, J. M. Sánchez-Bornot, A. Lage-Castellanos, M. Vega-Hernández, J. Bosch-Bayard, L. Melie-García, and E. Canales-Rodríguez. Estimating brain functional connectivity with sparse multivariate autoregression. Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 360(1457):969-81, May 2005.
[31] Y. Wang, H. Cheng, R. L. Edwards, Y. He, X. Kong, Z. An, J. Wu, M. J. Kelly, C. A. Dykoski, and X. Li. The Holocene Asian monsoon: links to solar changes and North Atlantic climate. Science (New York, N.Y.), 308(5723):854-7, May 2005.
[32] L. Wu, C. Faloutsos, K. P. Sycara, and T. R. Payne. Falcon: Feedback adaptive loop for content-based retrieval. In VLDB '00, pages 297-306, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.
[33] P. Zhang, H. Cheng, R. L. Edwards, F. Chen, Y. Wang, X. Yang, J. Liu, M. Tan, X. Wang, J. Liu, C. An, Z. Dai, J. Zhou, D. Zhang, J. Jia, L. Jin, and K. R. Johnson. A test of climate, sun, and culture relationships from an 1810-year Chinese cave record. Science (New York, N.Y.), 322(5903):940, Nov. 2008.
[34] Y. Zhu and D. Shasha. Warping indexes with envelope transforms for query by humming. In SIGMOD '03, pages 181-192, New York, NY, USA, 2003. ACM.

7 Appendix

Proof of Theorem 4. The error variables $z_l$ are zero-mean Gaussian variables due to the zero-mean Gaussian assumption about the distribution of $X$ and $\tilde{X}$, where the tilde marks quantities computed from the repaired time series. Following the proof in [18], construct the following correlation functions:

(7.4) $G_{k,l}(a) = -2n^{-1} \langle X^{(i)} - \sum_j X^{(j)} a_{i,j},\; X^{(k)}_l \rangle$

and

(7.5) $\tilde{G}_{k,l}(\tilde{a}) = -2n^{-1} \langle \tilde{X}^{(i)} - \sum_j \tilde{X}^{(j)} \tilde{a}_{i,j},\; \tilde{X}^{(k)}_l \rangle$

(a) If $E[z^{(i)}_p x^{(i)}_q] = 0$ for all $i = 1, \ldots, N$ and $p, q = 1, \ldots, L$, $p \neq q$, then the inner products in Equations (7.4) and (7.5) are the same, and the sufficiency part of Lemma A.1 in [18] guarantees the same $\hat{a}_{k,l} = \tilde{a}_{k,l}$. This proves that the neighborhoods inferred by analysis of the repaired time series are equal to the ones obtained by analysis of the regular time series. Now using Assumptions 1-6 and an application of Theorems 1 and 2 in [18] concludes the proof.

(b) Let, w.l.o.g., $a_{k,l} > 0$ for some $(k, l)$. The necessity part of Lemma A.1 in [18] guarantees that $G_{k,l}(a) = -\lambda$. It is clear that by taking into account the $Z$ in Equation (7.5), we can have $\tilde{G}_{k,l}(\tilde{a}) > -\lambda$. (Note that we cannot show that $\tilde{G}_{k,l}(\tilde{a}) > -\lambda$ happens only with diminishing probability.) Now the sufficiency condition is not satisfied and we are no longer guaranteed to have $\tilde{a}_{k,l} = 0$. Moreover, if the solution for Problem (7.5) is not unique and $-\lambda < \tilde{G}_{k,l}(\tilde{a}) < \lambda$, then $\tilde{a}_{k,l} = 0$ and the solutions of Problems (7.4) and (7.5) are indeed different.