An Introduction to the Wavelet Analysis of Time Series



Similar documents
Admin stuff. 4 Image Pyramids. Spatial Domain. Projects. Fourier domain 2/26/2008. Fourier as a change of basis

Wavelet Analysis Based Estimation of Probability Density function of Wind Data

Wavelet analysis. Wavelet requirements. Example signals. Stationary signal 2 Hz + 10 Hz + 20Hz. Zero mean, oscillatory (wave) Fast decay (let)

Statistical Modeling by Wavelets

A Wavelet Based Prediction Method for Time Series

Chapter 8 - Power Density Spectrum

Probability and Random Variables. Generation of random variables (r.v.)

Point Lattices in Computer Graphics and Visualization how signal processing may help computer graphics

An Evaluation of Irregularities of Milled Surfaces by the Wavelet Analysis

MUSICAL INSTRUMENT FAMILY CLASSIFICATION

FFT Algorithms. Chapter 6. Contents 6.1

Part II Redundant Dictionaries and Pursuit Algorithms

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Introduction to Medical Image Compression Using Wavelet Transform

The continuous and discrete Fourier transforms

SPEECH SIGNAL CODING FOR VOIP APPLICATIONS USING WAVELET PACKET TRANSFORM A

Advanced Signal Processing and Digital Noise Reduction

Integrated Wavelet Denoising Method for High-Frequency Financial Data Forecasting

Lecture 8 ELE 301: Signals and Systems

Short-time FFT, Multi-taper analysis & Filtering in SPM12

Least Squares Estimation

CS Introduction to Data Mining Instructor: Abdullah Mueen

TCOM 370 NOTES 99-4 BANDWIDTH, FREQUENCY RESPONSE, AND CAPACITY OF COMMUNICATION LINKS

Introduction to time series analysis

Chapter 4 Lecture Notes

3. Regression & Exponential Smoothing

5 Signal Design for Bandlimited Channels

A simple and fast algorithm for computing exponentials of power series

Co-integration of Stock Markets using Wavelet Theory and Data Mining

Optimum Resolution Bandwidth for Spectral Analysis of Stationary Random Vibration Data

1 Short Introduction to Time Series

The Fourier Analysis Tool in Microsoft Excel

Introduction to Time Series Analysis. Lecture 1.

Notes on Factoring. MA 206 Kurt Bryan

Time series analysis Matlab tutorial. Joachim Gross

Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm

A Secure File Transfer based on Discrete Wavelet Transformation and Audio Watermarking Techniques

Time Series Analysis

Lecture 3: Linear methods for classification

MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform

Financial TIme Series Analysis: Part II

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

Non-Data Aided Carrier Offset Compensation for SDR Implementation

B3. Short Time Fourier Transform (STFT)

CCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York

EE 179 April 21, 2014 Digital and Analog Communication Systems Handout #16 Homework #2 Solutions

Univariate and Multivariate Methods PEARSON. Addison Wesley

Linear Threshold Units

NRZ Bandwidth - HF Cutoff vs. SNR

Lecture 18: The Time-Bandwidth Product

Analysis/resynthesis with the short time Fourier transform


Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Maximum likelihood estimation of mean reverting processes

The Heat Equation. Lectures INF2320 p. 1/88

Generating Random Numbers Variance Reduction Quasi-Monte Carlo. Simulation Methods. Leonid Kogan. MIT, Sloan , Fall 2010

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

ANALYZER BASICS WHAT IS AN FFT SPECTRUM ANALYZER? 2-1

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

State Space Time Series Analysis

STA 4273H: Statistical Machine Learning

Betting with the Kelly Criterion

SIMPLIFIED PERFORMANCE MODEL FOR HYBRID WIND DIESEL SYSTEMS. J. F. MANWELL, J. G. McGOWAN and U. ABDULWAHID

Adaptive Equalization of binary encoded signals Using LMS Algorithm

A Novel Method to Improve Resolution of Satellite Images Using DWT and Interpolation

Enhancement of scanned documents in Besov spaces using wavelet domain representations

Trend and Seasonal Components

Incorporating Internal Gradient and Restricted Diffusion Effects in Nuclear Magnetic Resonance Log Interpretation

The Algorithms of Speech Recognition, Programming and Simulating in MATLAB

Sachin Patel HOD I.T Department PCST, Indore, India. Parth Bhatt I.T Department, PCST, Indore, India. Ankit Shah CSE Department, KITE, Jaipur, India

Java Modules for Time Series Analysis

Summary Nonstationary Time Series Multitude of Representations Possibilities from Applied Computational Harmonic Analysis Tests of Stationarity

Monitoring of Internet traffic and applications

Matrices and Polynomials

Filtering method in wireless sensor network management based on EMD algorithm and multi scale wavelet analysis

L9: Cepstral analysis

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers

11. Time series and dynamic linear models

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

SAS Software to Fit the Generalized Linear Model

Parametric Curves. (Com S 477/577 Notes) Yan-Bin Jia. Oct 8, 2015

The finite field with 2 elements The simplest finite field is

PAPR and Bandwidth Analysis of SISO-OFDM/WOFDM and MIMO OFDM/WOFDM (Wimax) for Multi-Path Fading Channels

A Coefficient of Variation for Skewed and Heavy-Tailed Insurance Losses. Michael R. Powers[ 1 ] Temple University and Tsinghua University

FAST Fourier Transform (FFT) and Digital Filtering Using LabVIEW

Time Series Analysis in WinIDAMS

SOFTWARE FOR GENERATION OF SPECTRUM COMPATIBLE TIME HISTORY

SYSTEMS OF REGRESSION EQUATIONS

EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set

Treelets A Tool for Dimensionality Reduction and Multi-Scale Analysis of Unstructured Data

December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS

Transcription:

An Introduction to the Wavelet Analysis of Time Series Don Percival Applied Physics Lab, University of Washington, Seattle Dept. of Statistics, University of Washington, Seattle MathSoft, Inc., Seattle

Overview wavelets are analysis tools mainly for time series analysis (focus of this tutorial) image analysis (will not cover) as a subject, wavelets are relatively new (1983 to present) synthesis of many new/old ideas keyword in 10, 558+ articles & books since 1989 (000+ in the last year alone) broadly speaking, have been two waves of wavelets continuous wavelet transform(1983 and on) discrete wavelet transform(1988 and on) 1

Game Plan introduce subject via CWT describe DWT and its main products multiresolution analysis (additive decomposition) analysis of variance ( power decomposition) describe selected uses for DWT wavelet variance (related to Allan variance) decorrelation of fractionally differenced processes (closely related to power law processes) signal extraction (denoising)

What is a Wavelet? wavelet is a small wave (sinusoids are big waves ) real-valued ψ(t) is a wavelet if 1. integral of ψ(t) is zero: ψ(t) dt = 0. integral of ψ (t) is unity: ψ (t) dt = 1 (called unit energy property) wavelets so defined deserve their name because # says we have, for every small E >0, T ψ (t) dt < 1 E, T for some finite T (might be quite large!) length of [ T,T ] small compare to [, ] # says ψ(t) must be nonzero somewhere #1 says ψ(t) balances itself above/below 0 Fig. 1: three wavelets Fig. : examples of complex-valued wavelets 3

Basics of Wavelet Analysis: I wavelets tell us about variations in local averages to quantify this description, let x(t) be a signal real-valued function of t will refer to t as time (but can be, e.g., depth) consider average value of x(t) over[a, b]: 1 b x(u) du α(a, b) b a a reparameterize in terms of λ & t A(λ, t) α(t λ,t + λ )= λ b a is called scale 1 t+ λ λ t λ t = (a + b)/ is center time of interval x(u) du A(λ, t) is average value of x(t) over scale λ at t 4

Basics of Wavelet Analysis: II average values of signals are of wide-spread interest hourly rainfall rates monthly mean sea surface temperatures yearly average temperatures over central England etc., etc., etc. (Rogers & Hammerstein, 1951) Fig. 3: fractional frequency deviates in clock 571 can regard as averages of form[t 1,t+ 1 ] t is measured in days (one measurment per day) plot shows A(1,t) versus integer t A(1,t)=0 master clock & 571 agree perfectly A(1,t) < 0 clock 571 is losing time can easily correct if A(1,t) constant quality of clock related to changes in A(1,t) 5

Basics of Wavelet Analysis: III can quantify changes in A(1,t) via D(1,t 1 ) A(1,t) A(1,t 1) or, equivalently, = t+ 1 1 t 1 x(u) du t t 3 x(u) du, D(1,t) = A(1,t + 1 ) A(1,t 1 ) t+1 t = x(u) du x(u) du t t 1 generalizing to scales other than unity yields D(λ, t) A(λ, t + λ ) A(λ, t λ ) 1 t+λ 1 t = x(u) du x(u) du λ t λ λ t D(λ, t) often of more interest than A(λ, t) can connect to Haar wavelet: write with D(λ, t) = ψ λ,t (u)x(u) du ψ λ,t (u) 1/λ, t λ u <t; 1/λ, t u <t+ λ; 0, otherwise. 6

Basics of Wavelet Analysis: IV specialize to case λ = 1 and t =0: ψ 1,0 (u) comparison to ψ H(u) yields 1, 1 u <0; 1, 0 u <1; 0, otherwise. ψ H 1,0 (u) = ψ (u) Haar wavelet mines out info on difference between unit scale averages at t = 0 via ψ H (u)x(u) du W H (1, 0) to mine out info at other t s, just shift ψ H(u): 1, t 1 u <t; ψ H H H 1 1,t(u) ψ (u t); i.e., ψ 1,t (u) =, t u<t+1; 0, otherwise Fig. 4: top row of plots to mine out info about other λ s, form 1 λ, t λ u<t; H 1 u t ψ λ,t (u) ψ H = 1, t u <t+ λ; λ λ 0, λ otherwise. Fig. 4: bottomrow of plots 7

Basics of Wavelet Analysis: V H can check that ψ λ,t (u) is a wavelet for all λ & t H use ψ λ,t (u) to obtain W H (λ, t) ψ H λ,t(u)x(u) du D(λ, t) left-hand side is Haar CWT can do the same with other wavelets: 1 u t W (λ, t) ψλ,t (u)x(u) du, where ψ λ,t (u) ψ λ λ left-hand side is CWT based on ψ(u) interpretation for ψ fdg(u) and ψ Mh(u) (Fig. 1): differences of adjacent weighted averages 8

Basics of Wavelet Analysis: VI basic CWT result: if ψ(u) satisfies admissibility condition, can recover x(t) fromits CWT: 1 1 t u dλ x (t) = W (λ, t) ψ du, λ λ C 0 ψ λ where C ψ is constant depending just on ψ conclusion: W (λ, t) equivalent to x(t) can also show that 1 dλ x (t) dt = W (λ, t) dt C 0 λ LHS called energy in x(t) ψ RHS integrand is energy density over λ & t Fig. 3: Mexican hat CWT of clock 571 data 9

Beyond the CWT: the DWT critique: have transformed signal into an image can often get by with subsamples of W (λ, t) leads to notion of discrete wavelet transform(dwt) can regard as dyadic slices through CWT can further subsample slices at various t s DWT has appeal in its own right most time series are sampled as discrete values (can be tricky to implement CWT) can formulate as orthonormal transform (facilitates statistical analysis) approximately decorrelates certain time series (including power law processes) standardization to dyadic scales often adequate can be faster than the fast Fourier transform! will concentrate on DWT for remainder of tutorial 10

Overview of DWT let X = [X 0,X 1,...,X N 1 ] T be observed time series (for convenience, assume N integer multiple of J 0) let W be N N orthonormal DWT matrix W = WX is vector of DWT coefficients orthonormality says X = W T W,so X W can partition W as follows: W1 W =. W J 0 V J 0 j Wj contains N j = N/ wavelet coefficients j 1 related to changes of averages at scale τj = (τ j is jth dyadic scale) related to times spaced j units apart V contains N = N/ J 0 J 0 J 0 scaling coefficients related to averages at scale λ J0 = J 0 related to times spaced J 0 units apart 11

Example: Haar DWT Fig. 5: W for Haar DWT with N =16 first 8 rows yield W 1 changes on scale 1 next 4 rows yield W changes on scale next rows yield W 3 changes on scale 4 next to last row yields W 4 change on scale 8 last row yields V 4 average on scale 16 Fig. 6: Haar DWT coefficients for clock 571 1

DWT in Terms of Filters filter X 0,X 1,...,X N 1 to obtain Lj 1 j/ W j,t J h j,l X t l mod N, t =0, 1,...,N l=0 where h j,l is jth level wavelet filter note: circular filtering 1 subsample to obtain wavelet coefficients: W j,t = j/ W j, j (t+1) 1, t =0, 1,...,N j 1, where W j,t is tth element of W j Figs. 7 & 8: Haar, D(4), C(6) & LA(8) wavelet filters jth wavelet filter is band-pass with pass-band [ 1 1 j+1, j ] note: jth scale related to interval of frequencies similarly, scaling filters yield V J0 Figs. 9 & 10: Haar, D(4), C(6) & LA(8) scaling filters J 1 0th scaling filter is low-pass with pass-band [0, J 0 +1 ] 13

Pyramid Algorithm: I can formulate DWT via pyramid algorithm elegant iterative algorithmfor computing DWT implicitly defines W computes W = WX using O(N) multiplications brute force method uses O(N ) FFT algorithmuses O(N log N) algorithmmakes use of two basic filters wavelet filter h l of unit scale h l h 1,l associated scaling filter g l 14

The Wavelet Filter: I let h l,l =0,...,L 1, be a real-valued filter L is filter width so h 0 =0& h L 1 =0 L must be even assume h l = 0 for l <0& l L h l called a wavelet filter if it has these 3 properties 1. summation to zero: L J 1 h l =0 l=0. unit energy: L J 1 h l =1 l=0 3. orthogonality to even shifts: L J 1 h l h l+n = J h l h l+n =0 l=0 l= for all nonzero integers n & 3 together called orthonormality property 15

The Wavelet Filter: II transfer & squared gain functions for h l : L J 1 iπfl H(f) h l e & l=0 H(f) H(f) can argue that orthonormality property equivalent to H(f)+ H(f + 1 ) = for all f Fig. 11: H(f) for Daubechies wavelet filters L = case is Haar wavelet filter filter cascade with averaging & differencing filters high-pass filter with pass-band [ 1, 1 ] 4 can regard as half-band filter 16

The Scaling Filter: I scaling filter: g l ( 1) l+1 h L 1 l reverse h l & flip sign of every other coefficient e.g.: h 0 = 1 & h 1 = 1 1 g0 = g 1 = g l is quadrature mirror filter for h l properties of h l imply g l has these properties: 1. summation to ±, so will assume L J 1 l=0 g l =. unit energy: L J 1 g l =1 l=0 3. orthogonality to even shifts: LJ 1 J g l g l+n = g l g l+n =0 l=0 l= for all nonzero integers n 4. orthogonality to wavelet filter at even shifts: L J 1 g l h l+n = J g l h l+n =0 l=0 l= for all integers n 17

The Scaling Filter: II transfer & squared gain functions for g l : L 1 G(f) J gle iπfl l=0 & G(f) G(f) can argue that G(f) = H(f 1 ) have G(0) = H( 1 )= H( 1 )& G( 1 )= H(0) since h l is high-pass, g l must be low-pass low-pass filter with pass-band [0, 1 ] 4 can also regard as half-band filter orthonormality property equivalent to G(f)+ G(f + 1 )= or H(f)+ G(f) = for all f 18

Pyramid Algorithm: II define V 0 X and set j = 1 input to jth stage of pyramid algorithm is V j 1 V j 1 is full-band related to frequencies [0, 1 j ]in X filter with half-band filters and downsample: L 1 J W j,t h l V j 1,t+1 l mod Nj 1 l=0 L J 1 V j,t g l V j 1,t+1 l mod Nj 1, l=0 t =0,...,N j 1 place these in vectors W j & V j Wj are wavelet coefficients for scale τ j = j 1 V j are scaling coefficients for scale λ j = j increment j and repeat above until j = J 0 yields DWT coefficients W 1,...,W J0, V J0 19

Pyramid Algorithm: III can formulate inverse pyramid algorithm (recovers V j 1 from W j and V j ) algorithmimplicitly defines transformmatrix W partition W commensurate with W j : W = W1 W1 W W.. parallels W =.. W J0 W J0 V J0 V J0 rows of W j use jth level filter h j,l with DFT 1 j H( j f) G( l f) l=0 (h j,l has L j =( j 1)(L 1) + 1 nonzero elements) W j is N j N matrix such that Wj = W j X and W j W T j = I Nj 0

Two Consequences of Orthonormality multiresolution analysis (MRA) X = W T W = J W T J 0 T J 0 j W j + V J V 0 J 0 j=1 j=1 scale-based additive decomposition D j s & S J0 called details & smooth analysis of variance consider energy in time series: IXI = X T X = NJ 1 Xt t=0 J D j + S J0 energy preserved in DWT coefficients: IWI = IWXI = X T W T WX = X T X = IXI since W 1,...,W J0, V J0 partitions W, have J 0 IWI = IWj I + IV J0 I, j=1 leading to analysis of sample variance: 1 N J 1 1 J 0 1 σˆ (X I I I I t X) = W j + V J0 X N t=0 N j=1 N scale-based decomposition (cf. frequency-based) 1

Variation: Maximal Overlap DWT can eliminate downsampling and use 1 L J j 1 j/ l=0 W j,t h j,l X t l mod N, t =0, 1,...,N 1 to define MODWT coefficients W j (& also V j) unlike DWT, MODWT is not orthonormal (in fact MODWT is highly redundant) like DWT, can do MRA & analysis of variance: IXI = J 0 IW j I + IV J 0 I j=1 unlike DWT, MODWT works for all samples sizes N (i.e., power of assumption is not required) if N is power of, can compute MODWT using O(N log N) operations (i.e., same as FFT algorithm) contrast to DWT, which uses O(N) operations Fig. 1: Haar MODWT coefficients for clock 571 (cf. Fig. 6 with DWT coefficients)

Definition of Wavelet Variance let X t, t =..., 1, 0, 1,..., be a stochastic process run X t through jth level wavelet filter: L W j,t J j 1 hj,l X t l, t =..., 1, 0, 1,..., l=0 which should be contrasted with W Lj 1 j,t J hj,l X t l mod N, t =0, 1,...,N 1 l=0 definition of time dependent wavelet variance (also called wavelet spectrum): ν X,t(τ j ) var {W j,t }, assuming var {W j,t } exists and is finite ν X,t(τ j ) depends on τ j and t will consider time independent wavelet variance: ν X(τ j ) var {W j,t } (can be easily adapted to time varying situation) 3

Rationale for Wavelet Variance decomposes variance on scale by scale basis useful substitute/complement for spectrum useful substitute for process/sample variance 4

Variance Decomposition suppose X t has power spectrum S X (f ): 1/ SX (f ) df =var {X t }; 1/ i.e., decomposes var {X t } across frequencies f involves uncountably infinite number of f s S X (f ) f contribution to var {X t } due to f s in interval of length f centered at f wavelet variance analog to fundamental result: J ν j =1 X (τ j )=var{x t } i.e., decomposes var {X t } across scales τ j recall DWT/MODWT and sample variance involves countably infinite number of τ j s ν X (τ j ) contribution to var {X t } due to scale τ j ν X (τ j ) has same units as X t (easier to interpret) 5

Spectrum Substitute/Complement because h j,l bandpass over [1/ j+1, 1/ j ], ν X (τ j ) 1/ j +1 S X (f) df 1/ j if S X (f) featureless, info in ν X (τ j ) info in S X (f) ν X (τ j ) more succinct: only 1 value per octave band α example: SX (f) f, i.e., power law process can deduce α fromslope of log S X (f) vs. log f implies ν X (τ j ) τ j α 1 approximately can deduce α fromslope of log ν X (τ j ) vs. log τ j no loss of info using ν X (τ j ) rather than S X (f) with Haar wavelet, obtain pilot spectrumestimate proposed in Blackman & Tukey (1958) 6

Substitute for Variance: I can be difficult to estimate process variance ν X(τ j ) useful substitute: easy to estimate & finite let µ = E{X t} be known, σ =var {X t } unknown can estimate σ using 1 N J 1 σ (X t µ) N t=0 estimator above is unbiased: E{σ } = σ if µ is unknown, can estimate σ using σˆ 1 NJ 1 N t=0 (X t X) there is some (non-pathological!) X t such that E{σˆ} <E σ for any gvien E >0& N 1 σˆ can badly underestimate σ! example: power law process with 1 < α<0 7

Substitute for Variance: II Q: why is wavelet variance useful when σ is not? replaces global variability with variability over scales if X t stationary with mean µ, then L j J 1 L j 1 E{W j,t } = h j,l E{ X t l } = µ J h j,l =0 l=0 l=0 because l h j,l =0 E{W j,t } known, so can get unbiased estimator of var {W j,t } = ν X (τ j ) certain nonstationary X t have well-defined ν X (τ j ) example: power law processes with α 1 (example of process with stationary increments) 8

Estimation of Wavelet Variance: I can base estimator on MODWT of X 0,X 1,...,X N 1 : Lj 1 W j,t J h j,l X t l mod N, t =0, 1,...,N 1 l=0 (DWT-based estimator possible, but less efficient) recall that L j 1 W j,t J h j,l X t l, t =0, ±1, ±,... l=0 so W j,t = W j,t if mod not needed: L j 1 t <N if N L j 0, unbiased estimator of ν X (τ j )is 1 N J 1 1 N J 1 νˆ X (τ W j ) j,t = W j,t, N L j +1 t=lj 1 M where M j N L j +1 j t=l j 1 can also construct biased estimator of ν X (τ j ): N L J W J J j,t = Wj,t + W N j,t t=0 t=0 t=lj 1 1 1 1 j N 1 ν X (τ j ) N 1st sumin parentheses influenced by circularity 9

Estimation of Wavelet Variance: II biased estimator unbiased if {X t } white noise biased estimator offers exact analysis of σˆ; unbiased estimator need not biased estimator can have better mean square error (Greenhall et al., 1999; need to reflect X t ) 30

Statistical Properties of νˆx (τ j ) suppose {W j,t } Gaussian, mean 0&spectrum S j (f) suppose square integrability condition holds: A j 1/ S j (f) df < & S j (f) > 0 1/ (holds for power law processes if L large enough) can show νˆx (τ j ) asymptotically normal with mean ν X (τ j ) & large sample variance A j /M j can estimate A j and use with νˆ X (τ j ) to construct confidence interval for ν X (τ j ) example Fig. 13: clock errors X t X t along with (i) (i 1) (i 1) differences X t X t X t 1 for i = 1, Fig. 14: νˆx (τ j ) for clock errors Fig. 15: νˆ (τ j ) for Y t Y (1) X t Haar νˆ (τ Y j) related to Allan variance σ Y (,τ j ): (0) ν (τ j )= 1 σ Y ( j ) Y,τ 31

Decorrelation of FD Processes X t fractionally differenced if its spectrumis σ E S X (f) =, sin(πf) δ where σ E > 0 and 1 <δ< 1 note: for small f, have S X (f) C/ f δ ; i.e., power law with α = δ if δ = 0, FD process is white noise if 0 < δ< 1, FD stationary with long memory can extend definition to δ 1 nonstationary 1/f type process also called ARFIMA(0,δ,0) process Fig. 16: DWT of simulated FD process, δ = 0.4 (sample autocorrelation sequences (ACSs) on right) 3

DWT as Whitening Transform sample ACSs suggest W j uncorrelated since FD process is stationary, so are W j (ignoring terms influenced by circularity) Fig. 17: spectra for W j, j = 1,, 3, 4 W j & W j, j = j, approximately uncorrelated (approximation improves as L increases) DWT thus acts as a whitening transform lots of uses for whitening property, including: 1. testing for variance changes. bootstrapping time series statistics 3. estimating δ for stationary/nonstationary fractional difference processes with trend 33

Estimation for FD Processes: I extension of work by Wornell; McCoy & Walden problem: estimate δ fromtime series U t such that where U t = T t + X t T t r j=0 a j jt is polynomial trend X t is FD process, but can have δ 1 DWT wavelet filter of width L has embedded differencing operation of order L/ if L r + 1, reduces polynomial trend to 0 can partition DWT coefficients as where W = W s + W b + W w W s has scaling coefficients and 0s elsewhere W s has boundary-dependent wavelet coefficients W w has boundary-independent wavelet coefficients 34

Estimation for FD Processes: II since U = W T W, can write U = W T (Ws + W b )+ W W w T + X Fig. 18: example with fractional frequency deviates can use values in W w to formlikelihood: where L(δ, σ E ) T N J 0 j 1 W /(σ j ) j,t+l 1 e j 1/ j=1 t=1 πσj σ 1/ σ E j H 1/ j(f) df ; sin(πf) δ and H j (f) is squared gain for h j,l leads to maximum likelihood estimator δˆ for δ works well in Monte Carlo simulations get δ ˆ =0.39. ± 0.03 for fractional frequency deviates 35

DWT-based Signal Extraction: I DWT analysis of X yields W = WX DWT synthesis X = W T W yields multiresolution analysis (MRA) estimator of signal D hidden in X: modify W to get W use W to formsignal estimate: D W T W key ideas behind wavelet-based signal estimation DWT can isolate signals in small number of W n s can threshold or shrink W n s key ideas lead to waveshrink (Donoho and Johnstone, 1995) 36

DWT-based Signal Extraction: II thresholding schemes involve 1. computing W WX. defining W (t) as vector with nth element (t) 0, if W W n = n δ; some nonzero value, otherwise, where nonzero values are yet to be defined 3. estimating D viad (t) W T W (t) simplest scheme is hard thresholding: W (ht) n = 0, if Wn δ; W n, otherwise. Fig. 19: solid line ( kill/keep strategy) alterative scheme is soft thresholding: where W (st) n = sign {W n } ( W n δ) +, +1, if W n > 0; x, if x 0; sign {W n } 0, if W n =0; and (x) + 0, if x < 0. 1, if W n < 0. Fig. 19: dashed line 37

DWT-based Signal Extraction: III third scheme is mid thresholding: W (mt) n = sign {W n } ( W n δ) ++, where ( Wn δ) δ; ( +, if W n < W n δ) ++ W n, otherwise Fig. 19: dotted line Q: how should δ be set? A: universal threshold (Donoho & Johnstone, 1995) (lots of other answers have been proposed) specialize to model X = D + E, where E is Gaussian white noise with variance σe universal threshold: δ U [σe log(n)] rationale for δ U: suppose D = 0 & hence W is white noise also as N, have P max W n δ U 1 n so all W (ht) = 0 with high probability will estimate correct D with high probability 38

DWT-based Signal Extraction: IV can estimate σ E using median absolute deviation (MAD): median { W 1,0, W 1,1,..., W N 0.6745 1, } 1 σˆ(mad), where W 1,t s are elements of W 1 Fig. 0: application to NMR series has potential application in dejamming GPS signals (with roles of signal and noise swapped!) 39

Web Material and Books Wavelet Digest http://www.wavelet.org/ MathSoft s wavelet resource page books http://www.mathsoft.com/wavelets.html R. Carmona, W. L. Hwang & B. Torrésani (1998), Practical Time-Frequency Analysis, Academic Press S. G. Mallat (1999), A Wavelet Tour of Signal Processing (Second Edition), Academic Press R. T. Ogden (1997), Essential Wavelets for Statistical Applications and Data Analysis, Birkhäuser D. B. Percival & A. T. Walden (000), Wavelet Methods for Time Series Analysis, Cambridge University Press (will appear in July/August) http://www.staff.washington.edu/dbp/wmtsa.html B. Vidakovic (1999), Statistical Modeling by Wavelets, John Wiley & Sons. 40

Software Matlab Wavelab (free): http://www-stat.stanford.edu/~wavelab WaveBox (commercial): http://www.toolsmiths.com/ Mathcad Wavelets Extension Pack (commercial): http://www.mathsoft.com/mathcad/ebooks/wavelets.asp S-Plus software WaveThresh (free): http://lib.stat.cmu.edu/s/wavethresh S+Wavelets (commercial): http://www.mathsoft.com/splsprod/wavelets.html 41