Integrated Wavelet Denoising Method for High-Frequency Financial Data Forecasting



Edward W. Sun, KEDGE Business School, France
Yi-Ting Chen, School of Computer Science, National Chiao Tung University, Taiwan
Min-Teh Yu, National Chiao Tung University, Taiwan

Abstract

Intelligent pattern recognition poses new challenges in high-frequency financial data mining because of the data's irregularities and roughness. Building on the wavelet transform for decomposing systematic patterns and noise, in this paper we propose a new integrated wavelet denoising method, named the smoothness-oriented wavelet denoising algorithm (SOWDA), that optimally determines the wavelet function, the maximal level of decomposition, and the threshold rule by using a smoothness score function that simultaneously detects the global and local extrema. We discuss the properties of our method and propose a new evaluation procedure to show its robustness. In addition, we apply this method in both simulation and empirical investigation. Both the simulation results based on three typical stylized features of financial data and the empirical results in analyzing high-frequency financial data from the Frankfurt Stock Exchange confirm that SOWDA significantly (based on the RMSE comparison) improves the performance of classical econometric models after denoising the data with the discrete wavelet transform (DWT) and maximal overlap discrete wavelet transform (MODWT) methods.

Keywords: Data denoising, DWT, High-frequency data, MODWT, Wavelet

Corresponding author: Edward W. Sun, KEDGE Business School, 680 Cours de la Libération, 33405 Talence Cedex, France. Email: edward.sun@bem.edu.

1 Introduction

The financial service sector is typically expert at churning through enormous amounts of data, gathered at the tick-by-tick level from customer interactions and transactions. Such ultra-high-frequency data have a complex structure of irregularities and roughness that has been described as exhibiting multifractal phenomena, that is, different fragments of the data have different fractal properties (see Mandelbrot (1982)). The heterogeneity characterized by multifractal phenomena is caused by a large number of instantaneous changes in the markets and by trading noise (see Sun et al. (2007) and references therein). Mining high-frequency data therefore becomes fundamental for financial informatics and stimulates interest in intelligently extracting the information conveyed in such data. For example, McCulloch and Tsay (2004) discuss the non-linear behavior of high-frequency data and propose a price change duration model for analyzing price changes, Kelly and Steigerwald (2004) apply the stochastic volatility model to replicate features of high-frequency data in their market microstructure analysis, and Sun et al. (2009) present a high-frequency Value-at-Risk measure based on Lévy processes. The wavelet method is one of the multifractal spectrum computing methods and has proven to be a reliable tool in econometric analysis (see, for example, Fan and Gençay (2010), Fan and Wang (2007), and Hong and Kao (2004)). In particular, it is well suited to time series analysis, such as smoothing, denoising, and jump detection (see, for example, Gençay et al. (2010), In et al. (2011), Donoho and Johnstone (1998), and Sun and Meinl (2012), among others). Ramsey (2002) highlights some research areas where wavelet analysis might be applied in economics, and Crowley (2007) provides a survey of how wavelet methods have been used in the economics and finance literature.
The advantage of the wavelet method is that it performs a multiresolution analysis, that is, it allows us to analyze the data at different scales (each one associated with a particular frequency passband) at the same time. In this way, wavelets can identify single events localized in one frequency range as well as coherent structures across different scales. Several studies have applied wavelet methods in mining financial data. For example, Ramsey and Lampart (1998) and Kim and In (2008) apply wavelets to analyze relationships and dependencies among key macroeconomic and financial variables. Gençay et al. (2003) propose a method based on a wavelet multiscaling approach for decomposing time series data. Gençay et al. (2005) introduce a method to estimate the systematic risk of an asset based on a wavelet multiscaling approach of decomposing the underlying time series on a scale-by-scale basis. Lada and Wilson (2006) develop a wavelet-based spectral method for steady-state simulation analysis. Esteban-Bravo and Vidal-Sanz (2007) propose a wavelet-based method for solving boundary value problems in growth models. Jensen (2007) implements compactly supported wavelets to develop an estimator of the long memory process. Laukaitis (2008) applies wavelet transforms for high-frequency data denoising in the study of credit card intraday cash flow and intensity of transactions. Gençay and Gradojevic (2011) introduce a wavelet approach to estimate the parameters of a linear regression

model. Sun et al. (2011) propose a wavelet method for analyzing the currency market with high-frequency data. Aguiar-Conraria et al. (2012) apply wavelet tools in macroeconomic data analysis. Haven et al. (2012) show the efficiency of the wavelet method in denoising option price data. A classic assumption in data mining is that the data are generated by certain systematic patterns plus random noise. Denoising high-frequency data thus provides a fundamental tool for extracting the systematic patterns conveyed in the data. As Sun and Meinl (2012) point out, a specific problem arises when the trend component exhibits occasional jumps that contrast with the slowly evolving long-term trend. These occasional jumps are often caused by, for example, unexpected large transactions or extreme prices, and should not be attributed to the normal short-term variations (since jumps are often considered noise) but rather to the long-run trend. Traditional linear denoising methods (e.g., moving averages) usually fail to capture this information accurately, as these linear methods tend to blur out jumps, while non-linear filters are not appropriate for smoothing out high-frequency fluctuations sufficiently, since the trends extracted by these methods are not smooth enough (i.e., they usually exhibit kinks) to represent long-run dynamic information (see Sun and Meinl (2012) and references therein). Several attempts have been made to overcome the above-mentioned problem. For example, Connor and Rossiter (2005) estimate the wavelet variance by using non-decimated wavelet transforms. Studies applying wavelets to data denoising and coefficient construction can be found in Gençay et al. (2002), Keinert (2004), Mallat and Hwang (1992), and Percival and Walden (2006), among others. Among these methods, both DWT and MODWT require choosing the wavelet function, the level of decomposition, and the thresholding rule.
A common approach to choosing the wavelet function is to use the shortest wavelet filter that can provide reasonable results (see Percival and Walden (2006)). For the level of decomposition, the common choice is "the higher the better." The thresholding rule, which is a function identifying the wavelet coefficients to be deleted, has also been investigated in academic research (see, for example, Gençay et al. (2002)). Meinl and Sun (2012) propose the local linear scaling approximation (LLSA) method for denoising high-frequency data and show its robustness in empirical applications under statistical goodness-of-fit tests. However, LLSA focuses on the linear characteristics around jumps for the reconstructed wavelet coefficients, and only takes advantage of them when the wavelet function is pre-determined. The remaining challenge is how to determine the combination of the wavelet function, level of decomposition, and thresholding rule that reaches an optimal smoothness and generally improves the performance of classic models after denoising the data. The algorithm proposed in this paper (named SOWDA) can optimally determine the wavelet function, level of decomposition, and thresholding rule by using smoothness as a regularization variable. The goal of our method is to denoise the data and obtain a trend that: (1) contains as much information as possible, (2) exhibits a certain degree of smoothness that suits the classic model, and (3) preserves as few artifacts (i.e., undesired structures, such as an oscillating nature,

generated through the denoising process) as possible. In our method, we define measures for smoothness. Intuitively, these measures describe the characteristics of data denoised with the wavelet transform that make the output optimal for further analysis with a classical model. We show that the resulting difference sequence between the denoised data and the original signal must converge in probability at a predetermined confidence level. This requires that: (1) the structural changes (e.g., jumps) of the denoised data and the original signal are synchronous, (2) there are no outliers in the denoised data, and (3) the local extrema in the denoised data are bounded. We show the analytical properties of SOWDA that confirm the proposed method leads to an optimal solution satisfying these requirements. Therefore, SOWDA can be used to improve the performance of econometric models that are parametrically built on the i.i.d. white noise assumption by removing unexpected outliers. In addition, we propose a new performance evaluation method based on the jump detection test suggested by Xue et al. (2014). Through this procedure, we verify the robustness of SOWDA's performance. We investigate the performance of SOWDA with numerical simulations that consider some typical patterns often observed in high-frequency financial data, e.g., excessive volatility and regime switching. In comparison, our method yields better performance than the alternative methods; the numerical results confirm the analytical properties of SOWDA, showing that the proposed algorithm maintains the original wavelet transform's computational complexity and that its approximation errors are bounded. In order to confirm the computational reliability and consistency shown in the simulations, we further perform an empirical investigation by applying our algorithm to high-frequency data (5-minute data) of the DAX 30 stocks.
In the empirical study, we work on both in-sample model fitting and out-of-sample (one-step-ahead and two-step-ahead) forecasting. The results we obtain from such a large-sample investigation coincide with the previous simulation results. When using the denoised data generated by our algorithm for forecasting with high-frequency data, we find that the performance (i.e., forecasting accuracy) of classic models, e.g., AR, ARMA, and ARMA-GARCH, improves significantly (based on the RMSE comparison), confirming the efficiency of the proposed algorithm. We organize the paper as follows. We describe the methodology in detail and summarize it with an algorithm chart in Section 2. In Section 3, we show the implementation of SOWDA and its analytical properties with respect to jumps in high-frequency data, and propose a new performance measurement procedure based on the jump detection test. Section 4 investigates the performance of SOWDA by conducting simulations. The simulation results confirm the superior performance of our method. In Section 5 we conduct an empirical study by applying our method to high-frequency data collected from the Frankfurt Stock Exchange (i.e., the DAX 30 stocks), and illustrate both in-sample modeling and out-of-sample forecasting results. We summarize our conclusions in Section 6.

2 The Methodology

2.1 Wavelets and wavelet transforms for denoising

Wavelets are bases of $L^2(\mathbb{R})$ first developed to analyze geophysical signals (see Morlet (1983)). In contrast to the Fourier transformation, they enable a localized time-frequency analysis with wavelet functions that usually either have compact support or decay exponentially fast to zero. In addition, wavelets provide a multiresolution of a signal that allows for analyzing the signal simultaneously on different (usually dyadic) scales. To achieve these desirable properties, the wavelet function $\psi$ is subject to the following conditions, i.e., the admissibility condition and orthonormality:

$$\int \frac{|\hat{\psi}(\omega)|^2}{|\omega|}\, d\omega < \infty, \qquad \int \psi^2(u)\, du = 1 \quad \text{and} \quad \int \psi(u)\, du = 0.$$

The latter condition signifies that $\psi$ oscillates around zero. An analogous discrete formulation is developed by introducing discrete wavelet filter banks that correspond to a linear moving weighted average high-pass filter (see Mallat (1989)). For this, equivalent discrete oscillation and orthonormality conditions must hold:

$$\sum_{l=0}^{L-1} h_l = 0, \qquad \sum_{l=0}^{L-1} h_l^2 = 1 \quad \text{and} \quad \sum_{l=0}^{L-1} h_l h_{l+2n} = 0 \quad \text{for all } n \in \mathbb{N},$$

where $h = (h_0, h_1, \ldots, h_{L-1})$ is a finite-length discrete wavelet filter. We note that these conditions are only necessary, not sufficient, for constructing reasonable high-pass wavelet filters; additional regularity conditions are required. Daubechies (1992) provides a thorough treatment of this topic. In our paper we focus on the application of wavelets and only consider the wavelet functions most commonly used in financial data analysis, e.g., the Daubechies (D4) and least asymmetric (LA) wavelets, which differ in their constructional regularity conditions and are the members of their class with the smallest support (see Percival and Walden (2006) and Sun and Meinl (2012)).
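As a quick numerical check (not part of the paper), the discrete conditions above can be verified for the D4 filter. The sketch below builds the high-pass filter from the standard Daubechies scaling coefficients via the usual quadrature mirror relation:

```python
import numpy as np

# Daubechies D4 scaling (low-pass) filter; the wavelet (high-pass)
# filter h follows from the quadrature mirror relation
s3 = np.sqrt(3.0)
g = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
L = len(g)
h = np.array([(-1) ** l * g[L - 1 - l] for l in range(L)])

# check the discrete oscillation and orthonormality conditions
print(abs(h.sum()))                 # sum_l h_l         = 0
print(abs((h ** 2).sum() - 1.0))    # sum_l h_l^2       = 1
print(abs((h[:-2] * h[2:]).sum()))  # sum_l h_l h_{l+2} = 0
```

All three printed residuals are at machine precision, confirming that the D4 filter satisfies the stated conditions.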
A discrete wavelet transform (DWT) is an orthogonal transform of a vector (discrete signal or time series data) $X$ of length $N$ (which must be a multiple of $2^J$) into $J$ wavelet coefficient vectors $W_j \in \mathbb{R}^{N/2^j}$, $1 \leq j \leq J$, and one scaling coefficient vector $V_J \in \mathbb{R}^{N/2^J}$:

$$[W_1, \ldots, W_J, V_J] = \mathcal{W} X. \qquad (1)$$

Although the transformation matrix $\mathcal{W}$ is determined by the wavelet filter bank $h$, it is more convenient to use the pyramid algorithm developed by Mallat (1989), which has a computational complexity of $O(N)$. Applying the inverse transform to these vectors yields

$$[W_1, \ldots, W_J, V_J]\, \mathcal{W} = S_J + \sum_{j=1}^{J} D_j = X, \qquad (2)$$
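A minimal sketch of the pyramid algorithm, using the Haar filter for brevity (the paper itself works with D4 and LA wavelets); the function names and structure here are our own illustration:

```python
import numpy as np

def haar_dwt(x, J):
    """Pyramid algorithm (Mallat 1989) with the Haar filter.

    Returns the detail coefficient vectors W_1..W_J and the scaling
    coefficient vector V_J; len(x) must be a multiple of 2**J.
    """
    v = np.asarray(x, dtype=float)
    details = []
    for _ in range(J):
        # high-pass and low-pass moving weighted averages,
        # downsampled by two at each iteration
        w = (v[1::2] - v[0::2]) / np.sqrt(2.0)
        v = (v[1::2] + v[0::2]) / np.sqrt(2.0)
        details.append(w)
    return details, v

def haar_idwt(details, v):
    """Inverse transform: reconstruct X from [W_1, ..., W_J, V_J]."""
    for w in reversed(details):
        out = np.empty(2 * len(v))
        out[0::2] = (v - w) / np.sqrt(2.0)
        out[1::2] = (v + w) / np.sqrt(2.0)
        v = out
    return v

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0])
details, vJ = haar_dwt(x, J=3)
print(np.allclose(haar_idwt(details, vJ), x))  # perfect reconstruction
```

Note how each iteration halves the number of coefficients, so a length-8 input yields detail vectors of lengths 4, 2, 1 and a single scaling coefficient, matching the dimensions $N/2^j$ above.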

that is, an additive decomposition of the original signal. When

$$S_{j-1} = S_j + D_j \qquad (3)$$

holds, the $S_j$, approximations (i.e., moving weighted averages) of $X$ at different dyadic scales, and the $D_j$, the detail vectors containing the details lost at each approximation level, form a multiscale decomposition. For each scale $j$ we can separate the signal into high and low frequencies with a wavelet filter whose bandwidth is determined by $j$. The maximal overlap discrete wavelet transform (MODWT) is an extension of the traditional DWT; it differs from the DWT in that all coefficient vectors are in $\mathbb{R}^N$, and it has a higher computational complexity of $O(N \log_2 N)$ (see, for example, Percival and Walden (2006) and references therein). For all discrete wavelet transforms in this paper we use circular filtering extensions at the borders, that is, the data points $X_0, X_{-1}, \ldots$ and $X_{N+1}, X_{N+2}, \ldots$ required for the moving weighted average are substituted by $X_N, X_{N-1}, \ldots$ and $X_1, X_2, \ldots$, respectively. Percival and Walden (2006) provide a discussion of boundary extensions and distortions, as well as methods to deal with them. Wavelet denoising may produce different outcomes for different choices of input variables. As we have pointed out, wavelets are oscillating functions and transform the signal orthogonally in the Hilbert space. Consequently, reconstruction based on the wavelet transform poses no difficulty. However, many wavelet functions meet the requirements for the transform, and in practice different wavelets are used for different reasons. For example, the Haar wavelet has very small support, while wavelets of higher order such as the Daubechies (D4) and least asymmetric (LA) wavelets have larger support. Larger support ensures a smoother shape of $S_j$, $1 \leq j \leq J$, for each wavelet, and the scaling coefficients can carry more information due to the increased filter width.
The optimal band-pass filter, which determines the capacity to isolate features within specific frequency intervals, is governed by the length of the wavelet function used in the approximation. In addition, the wavelet function must be able to mimic the features contained in the signal of interest in order to optimally represent the conveyed information. Therefore, the choice of the wavelet basis function turns out to be important when analyzing a given signal. After choosing the wavelet, we then decompose the signal into several levels. In general, we use the pyramid algorithm to accomplish this (see Gençay et al. (2002)). This algorithm decomposes the signal into detail and approximation coefficients in its first iteration. Each following iteration applies the same procedure to the approximation coefficients from the preceding iteration (for DWT, we downsample the coefficients, i.e., in each iteration we halve the number of detail coefficients). We can do this at most $\log_2(N)$ times, where $N$ is the number of observations. The quality of the denoising process varies with the number of iterations. The resulting outcome also critically depends on the thresholding rule, which sets to zero all wavelet coefficients whose magnitude falls below a fixed constant. The coefficients obtained in each iteration are subject to the thresholding

rule before we reconstruct the denoised signal. This rule identifies the coefficients that represent noise. We thus have three factors that influence the quality of wavelet-based denoising: the wavelet function (or mother wavelet), the number of maximal iterations (or level of decomposition), and the thresholding rule. However, there is no straightforward method for determining these three factors simultaneously. In this paper we propose a new method that aims to optimally determine the choice of wavelet function, number of maximal iterations, and thresholding rule (we refer to this choice as a combination of denoising factors) simultaneously, based on a smoothness-oriented criterion.

2.2 Smoothness-oriented wavelet denoising algorithm (SOWDA)

In this section we introduce the method that helps us decide the combination of denoising factors. Assume that the observed data $X$ can be decomposed as $X_t = S_t + N_t$, where $S_t$ is the true signal, $\hat{S}_t$ is its estimate, and $N_t$ is the additive noise sampled at time $t$. In order to evaluate the denoising performance, i.e., to see how close $\hat{S}_t$ is to $S_t$, we define the smoothness properties as follows.

Definition 1. Let $x_n$ be a random variable with $x_n = \hat{S}_t - S_t$. We say the estimate is smooth if there exists a constant $c$ such that, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} \Pr\left( |x_n - c| > \varepsilon \right) = 0.$$

This definition of smoothness intuitively states that the sequence $x_n$ of differences between $\hat{S}_t$ and $S_t$ becomes close to a controllable constant $c$. In other words, the resulting difference sequence between $\hat{S}_t$ and $S_t$ converges in probability to $c$. This requires that (i) the structural changes (e.g., jumps) of $\hat{S}_t$ and $S_t$ are synchronous, (ii) there are no outliers in $x_n$, and (iii) the local extrema of $x_n$ are bounded, which leads to the following measures.

Definition 2.
Let $(Y_1, Y_2)^T$ be a vector of continuous random variables with marginal distribution functions $F_1, F_2$. The coefficient $\eta_H$ is

$$\eta_H = \lim_{u \to 1} P\left( Y_2 > F_2^{-1}(u) \,\middle|\, Y_1 > F_1^{-1}(u) \right), \qquad (4)$$

and the coefficient $\eta_L$ is

$$\eta_L = \lim_{u \to 0} P\left( Y_2 < F_2^{-1}(u) \,\middle|\, Y_1 < F_1^{-1}(u) \right). \qquad (5)$$

If for every $\varepsilon > 0$ there exists a $u_0$ such that for all $u > u_0$ we have $\left| \eta_H(u)/\eta_L(u) - 1 \right| < \varepsilon$, where $\eta_H(u)$ and $\eta_L(u)$ denote the conditional probabilities in (4) and (5) before taking the limits, then we say $Y_1$ and $Y_2$ are synchronous, that is, $\eta_H(u) \approx \eta_L(u)$.
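The conditional tail probabilities in Definition 2 can be estimated empirically by replacing the marginal distributions with empirical quantiles; the estimator below is a sketch under that assumption (the function name and demo parameters are ours):

```python
import numpy as np

def eta_hat(y1, y2, u, upper=True):
    """Empirical conditional tail probability from Definition 2.

    Estimates P(Y2 > F2^{-1}(u) | Y1 > F1^{-1}(u)) for upper=True,
    or the lower-tail analogue, using empirical quantiles.
    """
    q1, q2 = np.quantile(y1, u), np.quantile(y2, u)
    if upper:
        cond = y1 > q1
        joint = cond & (y2 > q2)
    else:
        cond = y1 < q1
        joint = cond & (y2 < q2)
    return joint.sum() / max(cond.sum(), 1)

rng = np.random.default_rng(0)
z = rng.normal(size=10_000)
y1 = z + 0.1 * rng.normal(size=10_000)   # strongly dependent pair
y2 = z + 0.1 * rng.normal(size=10_000)
print(eta_hat(y1, y2, 0.95), eta_hat(y1, y2, 0.05, upper=False))
```

For this strongly dependent pair, both estimates are well above one half, i.e., upper and lower tail synchronicity are both present.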

When $\eta_H > 0$, there exists upper tail synchronicity, and the positive extreme values in $Y_1$ and $Y_2$ can be observed simultaneously. When $\eta_L > 0$, there exists lower tail synchronicity, and the negative extreme values can be observed simultaneously. We further require the smoothness measure to be able to detect artifacts and jumps. We suggest two different measures here: one considers artifacts ($\tau_1$, based on an outlier test) and the other considers jumps ($\tau_2$, based on local extrema). In other words, we use $\tau_1$ to detect the global extrema and $\tau_2$ the local extrema. Both of them are able to detect boundary problems, that is, inefficient approximation at the beginning and end of the signal. We suggest applying Grubbs' test for identifying artifacts, which is an iterative test for outliers in an approximately normally distributed sample (see Grubbs (1969)). Let $\mu = \frac{1}{T} \sum_{t=1}^{T} X_t$ be the sample mean of the vector $X$ and $s^2 = \frac{1}{T-1} \sum_{t=1}^{T} (X_t - \mu)^2$ its sample variance. The test statistic is then given by

$$G = \frac{\max_i |X_i - \mu|}{s}.$$

Here, $G$ can be related to the $t$-distribution, and a test for outliers with significance level $\alpha$ (e.g., $\alpha = 0.05$) can easily be performed by rejecting the null hypothesis of no outliers if

$$G > z_\alpha = \frac{T-1}{\sqrt{T}} \sqrt{\frac{t^2_{\alpha/(2T),\,T-2}}{T - 2 + t^2_{\alpha/(2T),\,T-2}}},$$

where $t_{\alpha/(2T),\,T-2}$ denotes the upper $\alpha/(2T)$ critical value of the $t$-distribution with $T-2$ degrees of freedom. When an outlier (i.e., a global extremum) is detected, it is removed from the data and the test proceeds on the remaining observations. As a measure of the amount of artifacts (or jumps of high magnitude), we take the number of iterations needed until the test confirms there is no outlier. We apply the test until no further outlier is detected and count the number of outliers as a measure of structure.

Definition 3. Let $C(x)$ be a function determining whether there is an outlier in the vector $X$:

$$C(x) = \begin{cases} 1, & \text{if } G > z_\alpha \\ 0, & \text{otherwise.} \end{cases}$$

We define $\tau_1$ as

$$\tau_1 = \sum_{i=1}^{n-1} \mathbf{1}_{\{C(x)=1\}}, \qquad (6)$$

where $n$ is the sample size.

In order to ensure that all structural changes are bounded, our proposed method investigates the local extrema (maxima or minima, respectively) at a certain magnitude. To avoid redundant computation (since $\tau_1$ handles the outlier detection), we only run this test procedure on the output data after the wavelet transform. The local extrema referred to here are the largest and smallest values that a function takes at a point within a given neighborhood.
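A sketch of the iterative Grubbs procedure behind $\tau_1$; the function names and the injected-outlier demo are our illustration, while the critical value follows the formula above:

```python
import numpy as np
from scipy import stats

def grubbs_critical(T, alpha=0.05):
    """Grubbs' critical value z_alpha for sample size T."""
    t2 = stats.t.ppf(1 - alpha / (2 * T), T - 2) ** 2
    return (T - 1) / np.sqrt(T) * np.sqrt(t2 / (T - 2 + t2))

def tau1(x, alpha=0.05):
    """Iteratively apply Grubbs' test; return the outlier count (tau_1)."""
    x = np.asarray(x, dtype=float)
    count = 0
    while len(x) > 2:
        mu, s = x.mean(), x.std(ddof=1)
        i = np.argmax(np.abs(x - mu))
        G = np.abs(x[i] - mu) / s
        if G <= grubbs_critical(len(x), alpha):
            break                   # no further outlier detected
        x = np.delete(x, i)         # remove the global extremum, repeat
        count += 1
    return count

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, 200)
data[50] = 12.0                     # inject one artifact
print(tau1(data))
```

The injected 12-sigma point is flagged and removed in the first iteration, after which the test typically finds no further outliers in the Gaussian sample.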

Definition 4. If there exists $\Lambda \in \mathbb{R}$ such that $\limsup_{n \in \mathbb{N}} x_n = \Lambda$ and there exists a subsequence $x_{k_n}$ of $x_n$ for which $x_{k_n} < \Lambda$ for all $n$, then $\Lambda$ is the local maximum. If there exists $\lambda \in \mathbb{R}$ such that $\liminf_{n \in \mathbb{N}} x_n = \lambda$ and there exists a subsequence $x_{k_n}$ of $x_n$ for which $x_{k_n} > \lambda$ for all $n$, then $\lambda$ is the local minimum.

Definition 5. Let $D(x)$ be a function that detects local maxima,

$$D(x) = \begin{cases} 1, & \text{if } x_i \geq \Lambda \\ 0, & \text{otherwise,} \end{cases}$$

and $D'(x)$ one that detects local minima,

$$D'(x) = \begin{cases} 1, & \text{if } x_i \leq \lambda \\ 0, & \text{otherwise.} \end{cases}$$

We define $\tau_2$ as

$$\tau_2 = \sum_{i=1}^{n-1} \mathbf{1}_{\{D(x)=1\}} + \sum_{i=1}^{n-1} \mathbf{1}_{\{D'(x)=1\}}, \qquad (7)$$

where $n$ is the sample size.

The algorithm in this paper applies a linear score function $T(\cdot)$ of $\tau_1$ and $\tau_2$ (i.e., a linear combination of the measures for global and local extrema) to compute the overall score:

$$T(\tau_1, \tau_2) = \alpha\, \tau_1 + \beta\, \tau_2, \qquad (8)$$

where $\alpha$ and $\beta$ are the weights assigned to the score function, e.g., $\alpha + \beta = 1$, which can be determined based on the data characteristics. The combination of denoising factors with the lowest value of $T(\tau_1, \tau_2)$ is preferred and identified as the optimal set for the wavelet-based denoising process.

2.2.1 Summary of the SOWDA algorithm

Given a set of wavelet functions ($f$), maximal levels of decomposition ($l$), and thresholding rules ($s$) as denoising factors, and an input data vector $X_t$, SOWDA determines the output of the wavelet transform. We define the score function $T(\tau_1, \tau_2)$ as the criterion for choosing the combination of denoising factors. Minimizing $T(\tau_1, \tau_2)$ yields the optimal combination of denoising factors ($f^*$, $l^*$, and $s^*$). The pseudocode for SOWDA is given in Algorithm 1.
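The counting measure $\tau_2$ and the score in Equation (8) can be sketched as follows; taking the bounds $\Lambda$ and $\lambda$ as $k$-standard-deviation bands around the mean is our assumption for illustration, not a rule fixed by the paper:

```python
import numpy as np

def tau2(x, k=3.0):
    """Count local extrema beyond the bounds (a sketch of tau_2).

    Assumption: Lambda and lambda are taken as k-standard-deviation
    bands around the mean, which is our illustrative choice.
    """
    x = np.asarray(x, dtype=float)
    mu, s = x.mean(), x.std(ddof=1)
    Lam, lam = mu + k * s, mu - k * s
    # strict local maxima/minima among the interior points
    is_max = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:]) & (x[1:-1] >= Lam)
    is_min = (x[1:-1] < x[:-2]) & (x[1:-1] < x[2:]) & (x[1:-1] <= lam)
    return int(is_max.sum() + is_min.sum())

def score(t1, t2, alpha=0.5, beta=0.5):
    """Linear score T(tau_1, tau_2) = alpha * tau_1 + beta * tau_2."""
    return alpha * t1 + beta * t2

smooth = np.sin(np.linspace(0.0, 10.0, 100))
spiky = smooth.copy()
spiky[50] = 10.0                    # one extreme local maximum
print(tau2(smooth), tau2(spiky))
```

A smooth oscillation yields a zero count, while the spiked version is penalized, so a candidate denoised trend with the smaller score is preferred.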

Algorithm 1 Smoothness-oriented wavelet denoising algorithm (SOWDA)

Require: $X$: input data vector; $F$: a set of wavelet functions; $L$: a set of maximal levels of decomposition; $S$: a set of thresholding rules; $\mathcal{W}$: wavelet transform {DWT, MODWT}; $C$: $F \times L \times S \times \mathcal{W}$.
Ensure: optimal $f^*, l^*, s^*$; $\hat{X} = \mathcal{W}(f^*, l^*, s^*)$.

1: for $c \in C$ do
2:   Conduct the wavelet transform.
3:   Apply the thresholding rule to the wavelet coefficients.
4:   Conduct the inverse wavelet transform.
5:   Compute the evaluation function and return $T(c)$.
6: end for
7: $(\mathcal{W}^*, f^*, l^*, s^*) = c^* = \operatorname{argmin}_c T(c)$.
8: Apply $\mathcal{W}^*$ with $(f^*, l^*, s^*)$ to extract the trend $\hat{X}$ of $X$.

3 Implementation of SOWDA

3.1 Jumps in high-frequency data

A substantial number of significant discontinuities have been observed in financial data, and these discontinuities are commonly called jumps. Several empirical and theoretical studies have discussed the existence of jumps and their substantial impact on asset pricing, portfolio and risk management, and hedging (see Lee and Mykland (2008)). Jumps arrive in markets irregularly, and their arrival times and amplitudes depend on market information, which leads to the roughness of financial time series data, particularly high-frequency data (see Sun et al. (2007)). We need to detect jumps with robust tools, such as the wavelet method (see Sun and Meinl (2012) and Xue et al. (2014)), and then clarify which information is dynamically related to jumps in order to discover market phenomena and improve modeling, forecasting, pricing, and hedging. In finance, we consider a one-dimensional asset return process on a fixed complete probability space $(\Omega, \mathcal{F}_t, P)$, where $\{\mathcal{F}_t : t \in [0, T]\}$ is a right-continuous information filtration for market participants and $P$ is a data-generating measure.
Letting $S(t)$ be the continuously compounded return of the asset at time $t$ under $P$, $S(t)$ can be expressed as an Itô drift-diffusion process satisfying the stochastic differential equation

$$dS(t) = \mu(t)\, dt + \sigma(t)\, dB(t),$$

where $B(t)$, $\mu(t)$, and $\sigma(t)$ are the $\mathcal{F}_t$-adapted standard Brownian motion, the drift, and the spot volatility, respectively. When there are jumps,

$$dS(t) = \mu(t)\, dt + \sigma(t)\, dB(t) + \lambda(t)\, dJ(t),$$

where $J(t)$ is a counting process independent of $B(t)$, and $\lambda(t)$ is the jump size, which is independent and identically distributed.
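A discretized path of this jump-diffusion can be generated with an Euler scheme, using a Poisson counting process for $J$ and i.i.d. normal jump sizes; all parameter values below are illustrative and not taken from the paper:

```python
import numpy as np

def simulate_jump_diffusion(n=1000, dt=1 / 288, mu=0.05, sigma=0.2,
                            jump_rate=10.0, jump_sd=0.01, seed=0):
    """Euler sketch of dS = mu dt + sigma dB + lambda dJ.

    J is a Poisson counting process and the jump sizes lambda are
    i.i.d. normal; parameters are illustrative placeholders.
    """
    rng = np.random.default_rng(seed)
    dB = rng.normal(0.0, np.sqrt(dt), n)           # Brownian increments
    dJ = rng.poisson(jump_rate * dt, n)            # jump counts per step
    jumps = rng.normal(0.0, jump_sd, n) * dJ       # i.i.d. jump sizes
    dS = mu * dt + sigma * dB + jumps
    return np.concatenate(([0.0], np.cumsum(dS)))  # cumulative return path

path = simulate_jump_diffusion()
print(len(path))  # 1001
```

Paths simulated this way exhibit the occasional discontinuities described above and can serve as test inputs when evaluating a jump-robust denoiser.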

Lee and Mykland (2008) impose two necessary assumptions on the price process. For any $\epsilon > 0$ and $0 = t_0 < t_1 < \cdots < t_n = T$:
$$\sup_i \sup_{t_i \le u \le t_{i+1}} |\mu(u) - \mu(t_i)| = O_p\big(\Delta t^{\frac{1}{2} - \epsilon}\big),$$
and
$$\sup_i \sup_{t_i \le u \le t_{i+1}} |\sigma(u) - \sigma(t_i)| = O_p\big(\Delta t^{\frac{1}{2} - \epsilon}\big).$$
The notation $O_p$ is used for a random vector $\{X_n\}$ and non-negative random variables $\{x_n\}$: $X_n = O_p(x_n)$ if for any $\epsilon > 0$ there exists a finite constant $C_\epsilon$ such that $P(|X_n| > C_\epsilon x_n) < \epsilon$. These two assumptions state that the drift and diffusion coefficients do not change dramatically over a finer time interval, so that the maximum changes in mean and spot volatility within a given interval are bounded (see Lee and Mykland (2008)). Several jump detection methods based on the wavelet transform have been proposed (see, for example, Sun and Meinl (2012) and Xue et al. (2014)).

3.2 Property of SOWDA

The SOWDA algorithm we propose in this paper is designed to remove the jumps without jeopardizing the information contained in the original data. Let $(F, L)$ be fixed; we can then directly obtain $S^{SOWDA}_J = S^{DWT}_J$ and $S^{SOWDA}_J = S^{MODWT}_J$. For any other choice of $(F, L, S)$, more details will be added to the estimator $S^{SOWDA}_J$ according to $T(\tau_1, \tau_2)$. After the reconstruction (see Equation (3)), these additional details correspond to the estimators $S^{DWT}_J$ and $S^{MODWT}_J$. During the refinement process, several wavelet coefficients are set to zero by $S$, and no exact description of $S^{SOWDA}_J$ is available, as this procedure causes the information carried by the wavelet coefficients at different levels to be intermixed. However, noting that both the DWT and the MODWT approximation of any level are bounded by the signal itself, we can set up an $\varepsilon$-tube around the initially estimated trend by
$$\varepsilon := \max_t X_t - \min_t X_t. \qquad (9)$$
We then estimate an upper bound for the error by a given constant $c$ with $c = N\varepsilon$, which satisfies:
$$\sum_{t=1}^{N} \big|\vartheta_t(X) - S^{SOWDA}_{J,t}\big| < c, \qquad (10)$$
where $X$ is the signal of length $N$ and $\vartheta(X)$ is its denoised approximation. As we have
$$\min_t X_t \le S^{DWT}_{J,t} \le \max_t X_t \quad \text{and} \quad \min_t X_t \le S^{MODWT}_{J,t} \le \max_t X_t$$
for all $1 \le j \le J$, it follows that:
$$\big|\vartheta_t(X) - S^{DWT}_{J,t}\big| \le \varepsilon \quad \text{and} \quad \big|\vartheta_t(X) - S^{MODWT}_{J,t}\big| \le \varepsilon.$$

This must also hold for SOWDA, such that:
$$\min_t X_t \le S^{SOWDA}_{J,t} \le \max_t X_t \quad \text{and} \quad \big|\vartheta_t(X) - S^{SOWDA}_{J,t}\big| \le \varepsilon.$$
Assume that, for fixed $F$ and $L$, the wavelet coefficients $\Omega^W_{j,k}$ are ordered in time, that is, $\omega_{j,k+1} \ge \omega_{j,k}$ for all $1 \le j \le J$ and $1 \le k \le K-1$. With $t_K := \max_{t,j}\{t \in \Omega^S_{j,K}\}$, for all $t > t_K$ and the same $\varepsilon$ as in Equation (9), we then obtain:
$$\sum_{t=1}^{N} \big|\mu(\vartheta_t(X)) - \mu(S^{SOWDA}_{J,t})\big| = \sum_{t=1}^{t_K} \big|\mu(\vartheta_t(X)) - \mu(S^{SOWDA}_{J,t})\big| \le t_K\,\varepsilon,$$
and
$$\sum_{t=1}^{N} \big|\sigma(\vartheta_t(X)) - \sigma(S^{SOWDA}_{J,t})\big| = \sum_{t=1}^{t_K} \big|\sigma(\vartheta_t(X)) - \sigma(S^{SOWDA}_{J,t})\big| \le t_K\,\varepsilon.$$

3.3 Performance measure with jump detection

Wavelet transforms can decompose data into high-frequency and low-frequency components. Since jumps are unexpected and instantaneous, Xue et al. (2014) indicate that jump dynamics should be contained in the high-frequency components of prices. They propose a non-parametric jump detection method that estimates the high-frequency components of instantaneous volatility. Let the logarithm of an asset's market price be $P_t = \log S_t$, where $S_t$ is the asset price at time $t$, and let $\tilde{P}_{1,t}$ be the wavelet coefficients of $P_t$ at the first scale of the MODWT. The test statistic $J_w$, which detects a jump occurrence at time $t_i$ for $i = 3, \ldots, T$, is given by:
$$J_w(i) = \frac{\big|\tilde{P}_{1,t_i}\big|}{\hat{\sigma}_{t_{i-1}}}, \qquad \hat{\sigma}^2_{t_{i-1}} = \frac{1}{i-2}\sum_{k=2}^{i-1} \big|\tilde{P}_{1,t_k}\big|\,\big|\tilde{P}_{1,t_{k-1}}\big|.$$
When the test statistic exceeds the threshold, the null hypothesis of no jump in this interval is rejected. In this paper, we set the threshold at the 1% level of the null distribution, and a jump is detected whenever the test statistic exceeds this threshold. SOWDA is capable of identifying jumps and reconstructing them to preserve information, i.e., classifying spurious jumps and actual jumps. We propose a performance evaluation procedure for SOWDA based on the jump detection test (XGF test, in short) introduced by Xue et al. (2014). The idea behind this procedure is straightforward.
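The statistic can be computed directly from the first-scale coefficients. Note that the exact indexing of the volatility estimator is our reading of the (garbled) formula in the text, so treat this as an illustrative sketch rather than the authors' implementation.

```python
import numpy as np

def xgf_statistic(w1):
    """Jump test statistic in the spirit of Xue et al. (2014): each
    first-scale MODWT coefficient is scaled by a bipower-type volatility
    estimate built from the preceding coefficients."""
    w1 = np.abs(np.asarray(w1, dtype=float))
    stats = np.full(w1.size, np.nan)            # undefined for i < 2
    for i in range(2, w1.size):
        sigma2 = np.mean(w1[1:i] * w1[:i - 1])  # bipower variation proxy
        stats[i] = w1[i] / np.sqrt(sigma2)
    return stats
```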
The identified jumps must be marked at the same locations in the original data and the denoised data. Our procedure is given as follows.

Step 1. Apply SOWDA to the original time series, separating the original data into a trend series and a noise series.

Step 2. Since Xue et al. (2014) assume $J_w(i)$ follows a normal distribution, we set the threshold level at 1% for testing the null hypothesis that there is no jump.

Step 3. Run the XGF test over a non-overlapping moving window of length $n$ on the original, trend, and noise series. Mark 1 when the XGF test rejects the null hypothesis; otherwise mark 0. This yields three test series containing only 0 and 1 elements, each of length $N/n$, where $N$ is the total data length and $n \ll N$.

Step 4. Compute the pairwise difference series (e.g., the test series of the original minus the test series of the trend) of the three series obtained in Step 3, and take their absolute values.

Step 5. Run a t-test to check whether the mean of each absolute difference series obtained in Step 4 differs from 1. We reject the null hypothesis (i.e., mean equal to 1) with statistical significance when the jumps occur simultaneously in the two series investigated.

Following Xue et al. (2014), we conduct this procedure separately for positive jumps and negative jumps. If SOWDA efficiently decomposes the original time series, then we reject the null hypothesis for the difference series of the original data and the trend data, but cannot reject the null hypothesis for the difference series of the original data and the noise data.

4 Simulation Study

We conduct a simulation study to investigate the performance of the proposed algorithm. The purpose of this simulation study is two-fold. First, we show that for any arbitrary signal the proposed method yields better performance than non-optimized methods (i.e., arbitrarily choosing the combination of wavelet function, level of decomposition, and thresholding rule).
Second, we illustrate the properties of our algorithm, in particular its consistency - that is, the error generated by our method is bounded and smaller than those of the non-optimized methods.

4.1 The data

In this study we perform Monte Carlo simulations in which errors (jumps) are generated from three different patterns describing (1) moderate volatility, (2) excessive volatility, and (3) excessive volatility with mean level shifts. We create time series data of length $2^{13}$. The trend is based on a sine function whose amplitude and frequency are drawn from a uniform distribution. To generate pattern (1) signals, following the simulation introduced by Sun and Meinl (2012), we add jumps to this trend. Jump occurrences are uniformly distributed (coinciding

with a Poisson arrival rate observed in many systems), with the jump heights being random numbers drawn from a normal distribution with mean 0 and variance 1. The signal is constant between the jumps. White Gaussian noise is added to the signal afterwards. For pattern (2) signals, we repeat the method used for simulating pattern (1), but replace the Gaussian noise with skewed contaminated normal noise, which has heavy tails to capture the excessive volatility (see Chun et al. (2012)). For pattern (3) signals, we repeat the method used for pattern (2) signals, but shift the trend up and down once in order to generate a signal characterized by excessive volatility with mean level shifts. The amplitude of the shift is four times the previous trend. Figure 1 illustrates the Q-Q plots of these three different signals.

4.2 The methodology

In our simulation study we choose Haar, Daubechies (DB), Symlet (LA), and Coiflet (Coif) wavelet functions (see Percival and Walden (2006)). We apply the pyramid algorithm in our empirical study. With each iteration of the pyramid algorithm, we increase the scaling level; that is, given a signal of length $N = 2^J$, the $j$-th iteration computes detail coefficients associated with changes on a scale of length $\lambda_j = 2^{j-1}$ (see Gençay et al. (2002)). In this simulation we consider several thresholding rules. Donoho and Johnstone (1994) suggest the universal thresholding rule:
$$\vartheta_U = \hat{\sigma}_\epsilon \sqrt{2 \log n}.$$
The idea behind this selection rule is that, for a sequence of $n$ independent and identically distributed (i.i.d.) $N(0, \sigma^2_\epsilon)$ random variables, the probability that the largest absolute value is smaller than $\hat{\sigma}_\epsilon \sqrt{2 \log n}$ converges to one for large $n$. Donoho and Johnstone (1998) suggest the minimax rule. Donoho (1994) proposes a method based on minimizing Stein's unbiased estimate of risk (SURE). Suppose there is a sequence of $k$ i.i.d. random variables $z_i \sim N(\mu_i, 1)$.
Let $\hat{\mu}$ be an estimator of $\mu = (\mu_1, \ldots, \mu_k)$. Given a weakly differentiable function $g$, for the estimator $\hat{\mu} = z + g(z)$ an unbiased estimate of the risk is given by:
$$E\|\hat{\mu} - \mu\|^2 = k + E\big[\|g(z)\|^2 + 2\,\nabla \cdot g(z)\big], \qquad \nabla \cdot g(z) = \sum_i \frac{\partial g_i(z)}{\partial z_i}.$$
With soft thresholding we obtain:
$$\mathrm{SURE}(z, \vartheta) = k - 2\,\#\{i : |z_i| \le \vartheta\} + \sum_{i=1}^{k} \min(z_i^2, \vartheta^2).$$
Here, $\#S$ equals the cardinality of a given set $S$. The SURE threshold is the one minimizing the estimated risk:
$$\vartheta_S = \arg\min_{\vartheta \ge 0} \mathrm{SURE}(z, \vartheta).$$
Donoho and Johnstone (1995) suggest heuristic thresholding, which applies the SURE thresholding rule at some levels of decomposition and universal thresholding at others. The decision of which rule is used at which level is made heuristically. Birgé and Massart (1998) suggest a thresholding rule based on the Birgé-Massart strategy using a penalized projection estimator (PPE). For each level $i$, $q_i$ is calculated as:
$$q_i = \frac{m}{(j + 2 - i)^\alpha},$$
where $j$ is the maximal level of decomposition, $m$ is a constant proposed to equal the length of the data (i.e., the number of observations), and $\alpha$ is a controlling constant. At each level $i$ the $q_i$ largest coefficients are kept. The larger the value of $\alpha$, the more coefficients remain; a typical choice is $\alpha = 1.5$ for compression and $\alpha = 3$ for denoising.

Under the above thresholding rules, wavelet coefficients are thresholded term by term on the basis of their individual magnitudes; information on other coefficients has no influence on the treatment of a particular coefficient. Cai and Silverman (2001) propose a block thresholding method, a shrinkage method that captures information on neighboring coefficients: the coefficients are considered in overlapping blocks, and the treatment of coefficients in the middle of each block depends on the data in the whole block.

The candidate denoising factors of SOWDA in our simulation are:
F ∈ {Haar, DB(2), DB(4), DB(8), LA(2), LA(4), LA(8), Coif(4), Coif(6), Coif(8)};
L ∈ {i : i = 1, 2, 3};
S ∈ {Block, Universal, SURE, heuristic SURE, Minimax, Birgé-Massart}.

The linear score function $T(\cdot)$ of $\tau_1$ and $\tau_2$ has the form:
$$T(\tau_1, \tau_2) = 0.5\,\tau_1 + 0.5\,\tau_2.$$
When we compute $\tau_1$, we set $\alpha = 0.05$ for the Grubbs t-statistic. When we compute $\tau_2$, the one-sigma rule is applied to detect local extrema - that is, an observation is considered a local extremum of a given sequence if it lies at a distance of more than one standard deviation from its mathematical expectation.
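Given these ingredients, the candidate search of Algorithm 1 reduces to an exhaustive loop over F × L × S × W that keeps the candidate minimizing T. A minimal, library-agnostic sketch: the `denoise` and `score` callables stand in for the wavelet transform pipeline and the score function T, the threshold helpers implement the universal and soft-thresholding rules, and all names here are ours, not the paper's.

```python
import itertools
import numpy as np

def soft_threshold(w, t):
    # soft-thresholding: shrink each coefficient toward zero by t
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def universal_threshold(detail, n):
    # Donoho-Johnstone universal rule sigma_hat * sqrt(2 log n), with
    # sigma_hat estimated by the MAD of the finest-scale details
    sigma = np.median(np.abs(detail)) / 0.6745
    return sigma * np.sqrt(2.0 * np.log(n))

def sowda_search(x, wavelets, levels, rules, transforms, denoise, score):
    """Exhaustive search over C = F x L x S x W as in Algorithm 1."""
    best = None
    for w, f, l, s in itertools.product(transforms, wavelets, levels, rules):
        x_hat = denoise(x, w, f, l, s)   # transform -> threshold -> inverse
        t = score(x, x_hat)              # evaluation function T(c)
        if best is None or t < best[0]:
            best = (t, (w, f, l, s), x_hat)
    return best[1], best[2]              # optimal (W*, f*, l*, s*) and trend
```

Because the candidate set here has 10 wavelets, 3 levels, 6 rules, and 2 transforms, the loop evaluates 360 candidates per window, which is cheap relative to the transforms themselves.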
In this simulation the alternative methods we compare with SOWDA are five single wavelet functions, i.e., Haar, DB(4), DB(8), LA(8), and Coif(6), each working with both DWT and MODWT. We run the simulation for the three data patterns described in Section 4.1. We use our algorithm to identify the best denoising method, which optimally combines the wavelet function, level of decomposition, and thresholding rule. For each pattern, we conduct the simulation based on the moving window design illustrated in Figure 2. We investigate our

algorithm for both in-sample approximation and out-of-sample forecasting. For the out-of-sample forecasting, we work with one-step and two-step forecasts. Since the true trend (both in-sample and out-of-sample) of the simulated stylized data is known, we use SOWDA and the alternative methods to denoise the simulated data and compare the approximated and forecasted trends with their true counterparts. Obviously, the smaller the difference from the true trend, the better the performance of the underlying algorithm.

4.3 Simulation results

As mentioned in Section 4.1, the data length is 8,192 ($2^{13}$) for each pattern. For the moving window design, we set the in-sample size to 1,000 and the out-of-sample size to 10 for both one-step-ahead and two-step-ahead forecasting. The number of window moves is then 720, and we generate 100 data series for each pattern. Therefore, for each pattern we test our algorithm 72,000 times for in-sample approximation, one-step forecasting (validation), and two-step forecasting; in total, our simulation comprises 216,000 runs. For each run we compute the root mean squared error (RMSE). We report the mean value of the RMSE and its corresponding variance (in parentheses) for each method in Table 1. The smaller the RMSE mean value, the better the denoising performance. For the DWT, we find that the RMSE mean values of SOWDA are smaller than those of the alternatives for both in-sample and out-of-sample performance. We note that the proposed SOWDA performs better than the non-optimized alternative methods. To illustrate the results reported in Table 1, we show some of them (the best three) in Figures 3 and 4. We see that the mean values of the RMSE of SOWDA are smaller than those of the alternative methods, and the variance of the RMSE for SOWDA turns out to be smaller than that obtained with the alternative methods.
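The per-window RMSE used throughout the comparison can be computed as follows; a small sketch in which the function and parameter names are ours, with the window and step sizes following the simulation's moving-window design.

```python
import numpy as np

def moving_window_rmse(true_trend, estimate, window, step):
    """RMSE of a denoised estimate against the known true trend,
    computed over successive moving windows."""
    true_trend = np.asarray(true_trend, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    out = []
    for start in range(0, true_trend.size - window + 1, step):
        err = estimate[start:start + window] - true_trend[start:start + window]
        out.append(np.sqrt(np.mean(err ** 2)))
    return np.array(out)
```

The mean and variance of the returned array correspond to the entries reported in Table 1 for one method and one pattern.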
Additionally, we find that as the number of simulation runs increases, the variance of the RMSE decreases. The speed of this decrease (i.e., the speed of error convergence to its limit) is relatively faster for SOWDA than for the alternative methods. This result confirms the consistency property of SOWDA that we showed analytically in Section 2.3.2. The results of this simulation indicate that SOWDA performs better than the alternative methods.

5 Empirical Study

In this empirical study we investigate the performance of the proposed algorithm (SOWDA) by analyzing high-frequency data on German DAX 30 component stocks. We examine whether SOWDA can be used as a denoising method to improve the performance of some classic econometric models (i.e., AR, ARMA, and ARMA-GARCH) for in-sample estimation and out-of-sample forecasting.

5.1 The data

The analysis is performed on high-frequency German DAX 30 component stock prices from January to December 2005. We aggregate the tick-by-tick data to homogeneous (i.e., equally spaced) time series at the 5-minute level using linear interpolation - that is, the inhomogeneous series is given by $x(t_i)$ at times $t_i$, while the target homogeneous time series $x$ shall be defined at times $\tau_j := t_0 + j\,\Delta t$, $j \in \mathbb{N}$, with $\Delta t > 0$ fixed. Every regular time $\tau_j$ is bracketed by two times of the irregularly spaced series, i.e.:
$$t_{I_j} \le \tau_j < t_{I_j + 1}, \qquad I_j := \max\{i \mid t_i \le \tau_j\},$$
and the data point at $\tau_j$ is interpolated between $t_{I_j}$ and $t_{I_j + 1}$ by:
$$x(\tau_j) = x_{I_j} + \frac{\tau_j - t_{I_j}}{t_{I_j + 1} - t_{I_j}}\,\big(x_{I_j + 1} - x_{I_j}\big).$$
In our sample, the 5-minute data contain 26,686 data points for each DAX stock in 2005.

5.2 Methodology and results

In this empirical study we conduct two experiments: in-sample modeling and out-of-sample forecasting. For the in-sample training experiment, we first apply the smoothness-oriented wavelet denoising algorithm (SOWDA) proposed in this paper (see Algorithm 1 in Section 2.2.1) to denoise the data. We then use both the original data and the denoised data to estimate the AR(2), ARMA(2,1), and ARMA(2,1)-GARCH(1,1) models with the maximum likelihood approach. We compare the model fits using the root mean squared error (RMSE) of each model as the goodness-of-fit measure. For the out-of-sample forecasting, we use the estimated models to conduct one-step-ahead and two-step-ahead forecasting following the method suggested by Sun and Meinl (2012). We then evaluate the forecasting performance by computing the RMSE of the forecasted values against the observed values in the original data. We use the same SOWDA setting and moving window design as in the simulation.
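The interpolation step in Section 5.1 is exactly what `np.interp` computes, so the aggregation to a homogeneous grid can be sketched in a few lines (function and variable names are ours, not the paper's):

```python
import numpy as np

def to_homogeneous(t, x, t0, dt, m):
    """Linearly interpolate irregular ticks (t, x) onto the regular grid
    tau_j = t0 + j * dt, j = 0, ..., m-1; np.interp applies the same
    two-point formula as in the text for each tau_j."""
    tau = t0 + dt * np.arange(m)
    return tau, np.interp(tau, t, x)
```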
For the moving window design, we set the in-sample size to one month and the out-of-sample size to one day for both one-step-ahead and two-step-ahead forecasting. Table 2 reports the in-sample results for the AR(2), ARMA(2,1), and ARMA(2,1)-GARCH(1,1) models. We see that the mean, median, and variance of the RMSE are all reduced after applying SOWDA denoising, showing a significant improvement of the denoised data (for both DWT and MODWT SOWDA) over the original data for model fitting. Table 3 reports the out-of-sample forecasting results for AR(2), ARMA(2,1), and ARMA(2,1)-GARCH(1,1) based on

SOWDA DWT denoising, and Table 4 shows the forecasting results based on SOWDA MODWT denoising. For the out-of-sample forecasting results, we also identify a generally significant improvement of the SOWDA-denoised data over the original data. The empirical results coincide with our previous simulation results regarding the robustness of SOWDA's denoising performance. We next conduct the performance evaluation proposed in Section 3.3. If SOWDA efficiently decomposes the trend and noise from the original data, then all the jumps possessed by the original data, the trend, and the noise should coincide - that is, the null hypothesis that the mean of the absolute difference series equals one should be rejected by the t-test. In our assessment, we consider not only the overall jumps, but also the negative and positive jumps individually. We report our results in Table 5. From Table 5, we note that the null hypothesis that the mean of the absolute difference series equals one is rejected with significant t-test statistics, illustrating that the jumps occur simultaneously in the original data, the trend, and the noise obtained with SOWDA denoising. This result implies that SOWDA performs efficiently in decomposing the original data, as judged by the jump detection test.

6 Conclusion

When conducting econometric analysis of data with classic parametric models built on the i.i.d. white noise assumption, encountering unexpectedly extreme observations, e.g., outliers, distorts the performance of such parametric models. To overcome the disturbance of outliers, one can usually either find a more robust way to build the model or remove the outliers. The former might complicate the situation when the outlier pattern is unknown, while the latter could remove information if such outliers are closely related to the underlying true dynamics.
In order to optimally detect and remove the real noise from the underlying dynamics, wavelet methods have been suggested for denoising data not only in economics, but also in other areas, such as electronic signal processing. Applying a wavelet denoising method requires that the wavelet function, level of decomposition, and thresholding rule be determined; we call this the trinity of the wavelet denoising operation. Identifying the optimal combination of wavelet function, level of decomposition, and thresholding rule challenges the efficiency of the classical methods of denoising data based on the wavelet transform (e.g., DWT and MODWT). An inefficient decomposition of the systematic pattern (the trend) and the noise of the target data will tremendously reduce the efficiency and effectiveness of any decision support system. When working with high-frequency financial data, their irregularity and roughness reinforce the need for more efficient data-mining tools. In this paper we propose a new denoising method for high-frequency financial data, named the smoothness-oriented wavelet denoising algorithm (SOWDA), which optimally determines the combination of denoising factors (i.e., wavelet function, level of decomposition, and thresholding

rule) based on a smoothness-oriented score function designed to detect global and local extrema. The method can be applied with the classic DWT or MODWT approach. When applied to high-frequency financial data, this algorithm preserves a smooth trend and effectively isolates the noise, since all information can be contained optimally in the wavelet multiresolution decomposition after specifying the level of smoothness in reconstructing the wavelet coefficients. In this paper we analytically show SOWDA's properties and propose a new performance evaluation procedure based on the jump detection test. To demonstrate the performance of SOWDA, we first conduct an experiment based on simulations. We consider three stylized data patterns that are often observed in high-frequency financial data, featuring, for example, heavy tails and regime switching. In all simulation settings that we investigated, SOWDA demonstrates robustness regardless of the input data. We then empirically show the potential application of SOWDA by fitting and forecasting real high-frequency financial data (aggregated at the 5-minute level) from German DAX 30 component stock prices with three classic econometric models. The results confirm our conclusion that SOWDA significantly (based on the RMSE comparison) improves the efficiency of data denoising based on both the DWT and MODWT transforms for these classic econometric models. From these results, we conclude that SOWDA is a robust algorithm that enriches the class of intelligent denoising methods in data mining. The proposed algorithm is expected to provide a significant improvement in the accuracy of econometric models for modeling and forecasting high-frequency financial data. We also believe SOWDA can be applied to other data with unknown noise patterns.
Since data quality is fundamental for intelligent decision making, running econometric models efficiently and effectively on quality data will help us build more reliable decision-making systems.

References

Aguiar-Conraria, L., Martins, M., Soares, M., 2012. The yield curve and the macro-economy across time and frequencies. Journal of Economic Dynamics and Control 36, 1950-1970.

Birgé, L., Massart, P., 1998. Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4(3), 329-375.

Cai, T., Silverman, B., 2001. Incorporating information on neighboring coefficients into wavelet estimation. Sankhya: The Indian Journal of Statistics 63, 127-148.

Chun, S., Shapiro, A., Uryasev, S., 2012. Conditional value-at-risk and average value-at-risk: estimation and asymptotics. Operations Research 60(4), 739-756.

Connor, J., Rossiter, R., 2005. Wavelet transforms and commodity prices. Studies in Nonlinear Dynamics & Econometrics 9, 433-465.

Crowley, P., 2007. A guide to wavelets for economists. Journal of Economic Surveys 21(2), 207-267.

Daubechies, I., 1992. Ten Lectures on Wavelets, volume 61 of CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA.

Donoho, D., 1994. Asymptotic minimax risk for sup-norm loss: solution via optimal recovery. Probability Theory and Related Fields 99, 145-170.

Donoho, D., Johnstone, I., 1994. Ideal spatial adaptation by wavelet shrinkage. Biometrika 81, 425-455.

Donoho, D., Johnstone, I., 1995. Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association 90(432), 1200-1224.

Donoho, D., Johnstone, I., 1998. Minimax estimation via wavelet shrinkage. Annals of Statistics 26(3), 879-921.

Esteban-Bravo, M., Vidal-Sanz, J., 2007. Computing continuous-time growth models with boundary conditions via wavelets. Journal of Economic Dynamics and Control 31, 3614-3643.

Fan, J., Wang, Y., 2007. Multi-scale jump and volatility analysis for high-frequency financial data. Journal of the American Statistical Association 102, 1349-1362.

Fan, Y., Gençay, R., 2010. Unit root tests with wavelets. Econometric Theory 26, 1305-1331.

Gençay, R., Gradojevic, N., 2011. Errors-in-variables estimation with wavelets. Journal of Statistical Computation and Simulation 81(11), 1545-1564.

Gençay, R., Gradojevic, N., Selcuk, F., Whitcher, B., 2010. Asymmetry of information flow between volatilities across time scales. Quantitative Finance 10, 895-915.

Gençay, R., Selçuk, F., Whitcher, B., 2005. Multiscale systematic risk. Journal of International Money and Finance 24, 55-70.

Gençay, R., Selçuk, F., Whitcher, B., 2002. An Introduction to Wavelets and Other Filtering Methods in Finance and Economics. Academic Press.

Gençay, R., Selçuk, F., Whitcher, B., 2003. Systematic risk and timescales. Quantitative Finance 3(2), 108-116.

Grubbs, F., 1969.
Procedures for detecting outlying observations in samples. Technometrics 11, 1-21.

Haven, E., Liu, X., Shen, L., 2012. De-noising option prices with the wavelet method. European Journal of Operational Research 222(1), 104-112.

Hong, Y., Kao, C., 2004. Wavelet-based testing for serial correlation of unknown form in panel models. Econometrica 72, 1519-1563.

In, F., Kim, S., Gençay, R., 2011. Investment horizon effect on asset allocation between value and growth strategies. Economic Modelling 28, 1489-1497.

Jensen, M., 2007. An alternative maximum likelihood estimator of long-memory processes using compactly supported wavelets. Journal of Economic Dynamics and Control 24, 361-387.

Keinert, F., 2004. Wavelets and Multiwavelets. Chapman & Hall/CRC.

Kelly, D., Steigerwald, D., 2004. Private information and high-frequency stochastic volatility. Studies in Nonlinear Dynamics & Econometrics 8, 1-18.

Kim, S., In, F., 2008. The relationship between financial variables and real economic activity: evidence from spectral and wavelet analysis. Studies in Nonlinear Dynamics and Econometrics 7, 22-45.

Lada, E., Wilson, J., 2006. A wavelet-based spectral procedure for steady-state simulation analysis. European Journal of Operational Research 174(3), 1769-1801.

Laukaitis, A., 2008. Functional data analysis for cash flow and transactions intensity continuous-time prediction using Hilbert-valued autoregressive processes. European Journal of Operational Research 185(3), 1607-1614.

Lee, S., Mykland, P., 2008. Jumps in financial markets: a new nonparametric test and jump dynamics. Review of Financial Studies 21, 2535-2563.

Mallat, S., 1989. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11, 674-693.

Mallat, S., Hwang, W., 1992. Singularity detection and processing with wavelets. IEEE Transactions on Information Theory 38, 617-643.

Mandelbrot, B., 1982. The Fractal Geometry of Nature. W. H. Freeman & Co. Ltd.

McCulloch, R., Tsay, R., 2004. Nonlinearity in high-frequency financial data and hierarchical models. Studies in Nonlinear Dynamics & Econometrics 5, 1-17.

Morlet, J., 1983. Sampling theory and wave propagation. Issues in Acoustic Signal/Image Processing and Recognition 1, 233-261.

Percival, D., Walden, A., 2006. Wavelet Methods for Time Series Analysis. Cambridge University Press.

Ramsey, J., 2002. Wavelets in economics and finance: past and future.
Studies in Nonlinear Dynamics and Econometrics 6, 1-27.

Ramsey, J., Lampart, C., 1998. The decomposition of economic relationships by time scale using wavelets: expenditure and income. Studies in Nonlinear Dynamics and Econometrics 3, 23-42.

Sun, E., Meinl, T., 2012. A new wavelet-based denoising algorithm for high-frequency financial data mining. European Journal of Operational Research 217, 589-599.

Sun, E., Rezania, O., Rachev, S., Fabozzi, F., 2011. Analysis of the intraday effects of economic releases on the currency market. Journal of International Money and Finance 30(4), 692-707.

Sun, W., Rachev, S., Fabozzi, F., 2007. Fractals or i.i.d.: evidence of long-range dependence and heavy tailedness from modeling German equity market returns. Journal of Economics and Business 59, 575-595.

Sun, W., Rachev, S., Fabozzi, F., 2009. A new approach for using Lévy processes for determining high-frequency value-at-risk predictions. European Financial Management 15, 340-361.

Xue, Y., Gençay, R., Fagan, S., 2014. Jump detection with wavelets for high-frequency financial time series. Quantitative Finance 14, 1427-1444.

Figure 1: Normal Q-Q plots for the three simulated data patterns (panels: data pattern 1, data pattern 2, data pattern 3).

Figure 2: Moving window design for the numerical studies. E is the length of the data used for training (approximation), V is the length for one-step-ahead forecasting (validation), and F is the length for two-step-ahead forecasting. Given T as the total length of the data, the number of window moves is then $\lfloor (T - E)/V \rfloor + 1$.

Figure 3: Comparison of the denoising performance of SOWDA with the alternative methods under the DWT, measured by the mean and variance of the RMSE for the three stylized data patterns (panels: mean and variance of the RMSE for patterns 1, 2, and 3).

Figure 4: Comparison of the denoising performance of SOWDA with the alternative methods under the MODWT, measured by the mean and variance of the RMSE for the three stylized data patterns (panels: mean and variance of the RMSE for patterns 1, 2, and 3).

Table 1: Comparison of denoising performances of SOWDA with other alternative methods for the in-sample approximation, one-step-ahead forecasting (validation), and two-step-ahead forecasting, measured by mean (×10³) and variance (in parentheses, ×10⁵) of RMSE for three different stylized data patterns.

| Horizon | Wavelet | DWT, Pattern 1 | DWT, Pattern 2 | DWT, Pattern 3 | MODWT, Pattern 1 | MODWT, Pattern 2 | MODWT, Pattern 3 |
|---|---|---|---|---|---|---|---|
| In-sample | SOWDA | 4.1027 (0.0680) | 4.8544 (0.0175) | 5.5903 (0.0496) | 4.4399 (0.0161) | 4.8931 (0.0171) | 5.2217 (0.0162) |
| | Haar | 6.4389 (0.0735) | 6.6536 (0.0778) | 7.0652 (0.0674) | 4.5945 (0.0172) | 4.9900 (0.0187) | 5.2275 (0.0166) |
| | DB(4) | 4.7790 (0.0187) | 5.2794 (0.0196) | 5.6391 (0.0190) | 4.4406 (0.0158) | 4.9150 (0.0168) | 5.2560 (0.0155) |
| | DB(8) | 4.7180 (0.0182) | 5.2133 (0.0193) | 5.5923 (0.0184) | 4.4563 (0.0157) | 4.9268 (0.0168) | 5.2981 (0.0155) |
| | Coif(6) | 4.7416 (0.0187) | 5.2003 (0.0203) | 5.5919 (0.0174) | 4.4563 (0.0158) | 4.8973 (0.0168) | 5.2246 (0.0155) |
| | LA(8) | 4.7941 (0.0197) | 5.2454 (0.0215) | 5.6403 (0.0180) | 4.4563 (0.0157) | 4.9268 (0.0168) | 5.2981 (0.0155) |
| 1-step ahead | SOWDA | 73.8879 (23.4620) | 98.0779 (11.5876) | 118.2665 (36.4309) | 66.7697 (22.3380) | 91.7164 (13.6447) | 102.8651 (11.6819) |
| | Haar | 121.2459 (185.8116) | 137.5414 (153.1924) | 157.7349 (145.4305) | 67.4713 (29.3159) | 93.0689 (16.6503) | 108.1393 (14.7383) |
| | DB(4) | 76.2261 (20.6273) | 101.5453 (13.4932) | 119.9381 (13.1264) | 67.9992 (22.6593) | 91.8543 (14.1140) | 106.6679 (12.3436) |
| | DB(8) | 75.8046 (20.6511) | 100.9839 (13.1740) | 119.7063 (13.1896) | 68.8317 (21.4444) | 93.8070 (13.9552) | 110.4068 (12.1662) |
| | Coif(6) | 74.9808 (18.6225) | 99.6079 (14.5735) | 118.7269 (12.1761) | 66.8570 (22.5989) | 92.1139 (14.0654) | 106.8731 (12.3298) |
| | LA(8) | 75.8741 (18.8895) | 100.6805 (14.3112) | 119.5905 (13.2696) | 68.8317 (21.4444) | 93.8070 (13.9552) | 110.4068 (12.1662) |
| 2-step ahead | SOWDA | 73.9336 (22.1610) | 98.0779 (11.7912) | 118.2665 (30.1113) | 68.5847 (21.8080) | 91.2292 (11.9080) | 104.8484 (13.8356) |
| | Haar | 96.5352 (108.6779) | 112.3378 (71.5900) | 134.3725 (74.3082) | 69.5818 (28.4445) | 92.9110 (15.4903) | 110.0385 (18.5306) |
| | DB(4) | 75.1660 (20.1612) | 98.2470 (13.3407) | 118.5803 (13.8208) | 69.7367 (21.9617) | 91.6690 (12.4354) | 108.5175 (14.9600) |
| | DB(8) | 77.1922 (18.6477) | 100.6721 (12.6767) | 121.2457 (14.7408) | 70.4208 (20.7086) | 93.8369 (12.3681) | 112.1951 (14.3683) |
| | Coif(6) | 76.1126 (19.6608) | 99.9550 (11.8431) | 120.1059 (14.5004) | 68.6629 (21.8970) | 91.9427 (12.3819) | 108.7199 (14.9189) |
| | LA(8) | 74.0547 (20.6330) | 103.4440 (11.7441) | 142.4636 (15.2854) | 70.4208 (20.7086) | 93.8369 (12.3681) | 112.1951 (14.3683) |

∗: ×10³.
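As a reference for how the entries in Table 1 are produced, the following sketch (our own function names, not code from the paper) computes the RMSE of each Monte Carlo replication and then the mean and variance of RMSE across replications:

```python
import numpy as np

def rmse(y_true, y_hat):
    """Root-mean-squared error between two equal-length series."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_hat, dtype=float)
    return np.sqrt(np.mean(err ** 2))

def summarize_rmse(pairs):
    """Mean and (sample) variance of RMSE over Monte Carlo replications.

    `pairs` is an iterable of (y_true, y_hat) tuples, one per replication.
    """
    scores = np.array([rmse(t, h) for t, h in pairs])
    return scores.mean(), scores.var(ddof=1)
```

Each denoising method (SOWDA, Haar, DB(4), ...) would supply its own `y_hat` per replication; the summary is then tabulated per data pattern.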

Table 2: Comparing the performances of original data, SOWDA-DWT, and SOWDA-MODWT denoised data (DAX 30 stocks, 5-minute data in 2005) for the in-sample model fitting, measured by RMSE (×10⁵), when using AR(2), ARMA(2,1), and ARMA(2,1)-GARCH(1,1) models. (ARMA = ARMA(2,1); GARCH = ARMA(2,1)-GARCH(1,1).)

| Stock | AR(2) Orig. | AR(2) DWT | AR(2) MODWT | ARMA Orig. | ARMA DWT | ARMA MODWT | GARCH Orig. | GARCH DWT | GARCH MODWT |
|---|---|---|---|---|---|---|---|---|---|
| ADS | 3.3137 | 1.3062 | 1.2433 | 3.3137 | 1.4181 | 27.3670 | 4.3979 | 1.8674 | 1.4139 |
| ALT | 3.8554 | 1.4117 | 1.3102 | 3.8392 | 1.5688 | 1.5923 | 4.9888 | 1.6069 | 1.4642 |
| ALV | 3.2670 | 1.3930 | 1.3063 | 3.2638 | 1.5699 | 1.5978 | 5.0962 | 4.6879 | 1.4465 |
| ARO | 7.7988 | 2.8161 | 2.5273 | 7.7946 | 3.1315 | 3.0297 | 10.3910 | 3.8531 | 2.8625 |
| BAS | 3.3058 | 1.3105 | 1.2277 | 3.3073 | 1.4023 | 1.4661 | 4.0946 | 1.9191 | 1.4371 |
| BAY | 3.9928 | 1.5444 | 1.3630 | 3.9928 | 1.6751 | 1.5858 | 4.8512 | 2.2161 | 1.3593 |
| BMW | 3.5891 | 1.4427 | 1.4243 | 3.5730 | 1.6047 | 1.8604 | 3.8925 | 1.4509 | 1.4152 |
| CBK | 3.7479 | 1.4398 | 1.3446 | 3.7545 | 1.5798 | 1.6065 | 4.4395 | 1.5137 | 1.5702 |
| CON | 3.6809 | 2.0647 | 1.4195 | 3.6727 | 2.3643 | 1.6669 | 4.9855 | 2.1101 | 1.3881 |
| DAI | 3.8097 | 1.4281 | 1.4068 | 3.8152 | 1.6657 | 1.6738 | 5.0421 | 1.5513 | 1.7025 |
| DBK | 3.2331 | 1.2569 | 1.1361 | 3.2275 | 1.4269 | 1.3203 | 4.1104 | 1.5159 | 1.1529 |
| DPT | 3.4895 | 1.2792 | 1.1796 | 3.4802 | 1.4423 | 1.4208 | 3.8320 | 3.6154 | 1.4121 |
| DTK | 3.0022 | 1.5675 | 1.0650 | 3.0089 | 1.9828 | 1.3991 | 3.6364 | 1.5214 | 1.2404 |
| EON | 3.3243 | 1.3488 | 1.3993 | 3.3203 | 1.4708 | 1.5339 | 4.8823 | 1.9070 | 2.6660 |
| EPC | 6.7821 | 2.1369 | 1.9438 | 6.7933 | 2.3464 | 2.4068 | 9.2288 | 2.2322 | 2.0527 |
| FRE | 3.6692 | 1.9935 | 1.3355 | 3.6678 | 2.1336 | 1.5636 | 4.4944 | 2.0193 | 1.4154 |
| HEN | 3.1155 | 1.2389 | 1.1832 | 3.1149 | 1.3310 | 1.4351 | 3.5171 | 1.7588 | 1.3090 |
| HVB | 4.4685 | 1.8539 | 1.6164 | 4.4745 | 2.1330 | 1.9293 | 7.2151 | 2.4163 | 1.5886 |
| HYP | 5.4830 | 1.8578 | 1.7331 | 5.4836 | 2.0442 | 2.1687 | 8.1192 | 2.2331 | 1.8177 |
| IFX | 5.2806 | 1.9926 | 1.7519 | 5.2799 | 2.3086 | 2.0317 | 6.9283 | 2.4243 | 1.6389 |
| LHA | 4.4117 | 1.7150 | 1.5189 | 4.4122 | 1.9247 | 1.9997 | 5.6527 | 2.2715 | 1.7232 |
| LIN | 3.3162 | 1.2673 | 1.1641 | 3.3058 | 1.4248 | 1.4023 | 3.3842 | 4.5712 | 1.3064 |
| MEO | 3.5664 | 1.4359 | 1.2260 | 3.5624 | 1.5697 | 1.4436 | 4.8349 | 2.1549 | 1.2042 |
| MLP | 5.9886 | 2.9315 | 1.8555 | 6.0060 | 3.7584 | 2.2734 | 7.7546 | 2.8513 | 2.1530 |
| MUV | 3.2454 | 1.2491 | 1.1083 | 3.2403 | 1.4390 | 1.2963 | 5.6321 | 1.3738 | 1.0894 |
| RWE | 3.8388 | 1.5000 | 1.6335 | 3.8321 | 1.6377 | 1.6546 | 4.6399 | 1.6607 | 4.2603 |
| SIE | 3.2260 | 1.2488 | 1.1675 | 3.2329 | 1.4132 | 1.3787 | 5.5789 | 1.9311 | 1.3436 |
| TUI | 4.4142 | 1.6936 | 1.5757 | 4.4147 | 1.8570 | 1.9267 | 7.6174 | 1.9247 | 1.7607 |
| TYA | 3.9097 | 1.9896 | 1.3156 | 3.8993 | 2.5014 | 1.6215 | 5.6285 | 1.9983 | 1.4030 |
| VOW | 3.7065 | 1.5869 | 1.3834 | 3.7025 | 1.8009 | 1.6108 | 5.1803 | 2.1464 | 1.3140 |
| Mean | 4.0542 | 1.6382 | 1.4299 | 4.0524 | 1.8569 | 2.7354 | 5.5058 | 2.2314 | 1.6637 |
| Median | 3.6937 | 1.4714 | 1.3538 | 3.6876 | 1.6517 | 1.6162 | 5.0155 | 2.0088 | 1.4418 |
| Variance | 1.2211 | 0.1786 | 0.0894 | 1.2267 | 0.2888 | 21.7073 | 2.8208 | 0.7034 | 0.3790 |
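The in-sample RMSEs above come from fitting AR(2), ARMA(2,1), and ARMA(2,1)-GARCH(1,1) models to each series; in practice one would use a time-series package (e.g. statsmodels or an ARCH library), but the AR(2) case reduces to ordinary least squares and can be sketched directly. A minimal illustration (our own helper, not the paper's code):

```python
import numpy as np

def ar2_insample_rmse(y):
    """Fit y_t = c + phi1*y_{t-1} + phi2*y_{t-2} + e_t by OLS and
    return (coefficients [c, phi1, phi2], in-sample RMSE of the fit)."""
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept, first lag, second lag.
    X = np.column_stack([np.ones(len(y) - 2), y[1:-1], y[:-2]])
    beta, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    resid = y[2:] - X @ beta
    return beta, np.sqrt(np.mean(resid ** 2))
```

Applied to the original and to the SOWDA-denoised series, the same RMSE measure yields the per-stock columns of the table.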

Table 3: Comparing the performances of original data and SOWDA-DWT denoised data (DAX 30 stocks, 5-minute data in 2005) for the out-of-sample forecasting, measured by RMSE (×10⁵), when using AR(2), ARMA(2,1), and ARMA(2,1)-GARCH(1,1) models. (ARMA = ARMA(2,1); GARCH = ARMA(2,1)-GARCH(1,1); "1s"/"2s" = 1-step/2-step ahead.)

| Stock | AR(2) Orig. 1s | AR(2) Orig. 2s | AR(2) SOWDA 1s | AR(2) SOWDA 2s | ARMA Orig. 1s | ARMA Orig. 2s | ARMA SOWDA 1s | ARMA SOWDA 2s | GARCH Orig. 1s | GARCH Orig. 2s | GARCH SOWDA 1s | GARCH SOWDA 2s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ADS | 16.2540 | 15.4940 | 6.4899 | 5.5701 | 15.9860 | 15.2000 | 6.5639 | 5.7943 | 19.0970 | 18.4380 | 7.5379 | 6.9372 |
| ALT | 18.9340 | 17.8780 | 5.8970 | 6.2285 | 18.5490 | 17.9640 | 6.7399 | 6.9472 | 21.1000 | 21.1130 | 8.1499 | 7.7036 |
| ALV | 15.5330 | 15.3070 | 5.3413 | 5.6801 | 15.3430 | 15.3080 | 6.9871 | 6.8745 | 20.3360 | 20.7000 | 18.5880 | 22.3830 |
| ARO | 37.9780 | 33.9710 | 11.7960 | 10.3850 | 38.0220 | 34.3250 | 14.4020 | 13.4170 | 43.8650 | 39.2550 | 17.2610 | 15.2210 |
| BAS | 16.5390 | 15.3470 | 6.3475 | 5.6852 | 16.5270 | 15.2100 | 6.5470 | 6.0321 | 19.2740 | 17.3880 | 8.0922 | 6.8814 |
| BAY | 18.8070 | 19.0620 | 6.6984 | 7.1480 | 18.4920 | 19.1650 | 7.2813 | 7.2936 | 22.0210 | 22.7120 | 8.0986 | 8.1078 |
| BMW | 17.3610 | 16.4310 | 6.5509 | 5.9644 | 17.2450 | 16.3450 | 7.1227 | 6.7461 | 19.8510 | 18.6160 | 7.6341 | 7.0573 |
| CBK | 17.9820 | 17.2370 | 6.2500 | 6.3013 | 18.1850 | 17.1340 | 6.7579 | 6.9565 | 21.1780 | 20.5240 | 7.8595 | 7.8229 |
| CON | 17.9770 | 17.1860 | 10.5220 | 10.2690 | 17.9120 | 17.0130 | 11.2710 | 10.8610 | 20.8550 | 19.8630 | 11.4820 | 11.4460 |
| DAI | 18.0060 | 17.7910 | 5.6175 | 5.1862 | 16.8010 | 17.3490 | 7.3475 | 6.6141 | 20.1400 | 19.7040 | 6.4365 | 6.3687 |
| DBK | 15.5150 | 15.5150 | 5.3349 | 5.0880 | 15.6610 | 15.2670 | 6.5578 | 6.0706 | 19.2450 | 18.5380 | 6.3944 | 6.4516 |
| DPT | 16.9800 | 15.9200 | 5.0528 | 5.0929 | 17.1430 | 16.1570 | 6.8417 | 6.1174 | 19.8710 | 19.4580 | 8.2444 | 23.8920 |
| DTK | 14.5510 | 13.1010 | 9.1429 | 8.4429 | 14.4000 | 12.9360 | 10.3450 | 9.7051 | 17.2420 | 15.1070 | 8.5329 | 7.7946 |
| EON | 16.0630 | 15.5940 | 6.1559 | 5.8530 | 15.7050 | 15.6830 | 6.5661 | 6.1578 | 19.3160 | 18.3810 | 7.1859 | 6.7593 |
| EPC | 32.3490 | 30.2900 | 9.4371 | 8.7476 | 31.8040 | 30.8790 | 10.1830 | 9.6872 | 37.8200 | 35.2450 | 11.5660 | 10.8360 |
| FRE | 18.6400 | 16.6740 | 10.1350 | 9.3643 | 18.3570 | 16.5210 | 10.3040 | 9.5578 | 20.4450 | 18.7000 | 11.0870 | 10.2400 |
| HEN | 14.9580 | 14.5760 | 5.3879 | 5.4703 | 14.9030 | 14.5790 | 5.7215 | 5.8078 | 18.6480 | 17.6510 | 6.6085 | 6.7160 |
| HVB | 21.0430 | 20.7270 | 6.7206 | 6.9938 | 21.2250 | 20.5310 | 9.0357 | 9.3307 | 25.8800 | 24.0840 | 10.5020 | 11.4390 |
| HYP | 27.7480 | 26.9770 | 8.3724 | 8.5527 | 27.7240 | 26.0660 | 8.8325 | 9.1970 | 32.0370 | 30.1680 | 12.0200 | 10.7770 |
| IFX | 25.3350 | 24.4230 | 7.8245 | 7.8147 | 25.3940 | 24.1510 | 10.2340 | 9.5652 | 30.0830 | 29.4590 | 11.0640 | 11.0130 |
| LHA | 21.5790 | 20.2530 | 6.0334 | 6.6585 | 21.4000 | 20.2310 | 8.2455 | 8.6258 | 24.3480 | 23.6020 | 9.4369 | 8.3292 |
| LIN | 16.1370 | 15.4310 | 4.9091 | 5.1527 | 16.1100 | 15.3940 | 6.1373 | 6.1148 | 17.8720 | 17.0690 | 33.9020 | 6.4515 |
| MEO | 17.6360 | 16.4750 | 6.3158 | 6.2368 | 16.9700 | 16.5580 | 6.9784 | 6.7049 | 21.5420 | 20.3860 | 11.6880 | 78.4750 |
| MLP | 29.5450 | 27.5910 | 16.5580 | 16.4350 | 29.5150 | 27.6890 | 20.1020 | 18.9370 | 34.7830 | 32.0240 | 14.9450 | 14.4860 |
| MUV | 15.3820 | 14.8820 | 4.6527 | 4.5068 | 15.3100 | 15.0000 | 6.0619 | 5.7097 | 19.0120 | 18.3410 | 5.4154 | 4.9455 |
| RWE | 17.7390 | 17.9800 | 6.6891 | 6.5195 | 17.8440 | 18.5690 | 6.7505 | 7.1174 | 20.6100 | 21.1110 | 8.8080 | 8.4313 |
| SIE | 15.4040 | 15.0440 | 4.7373 | 4.8621 | 15.3270 | 14.9530 | 6.4151 | 5.8327 | 19.2720 | 18.5320 | 8.5433 | 8.0937 |
| TUI | 22.0020 | 20.1170 | 7.3985 | 6.9897 | 21.7210 | 20.0230 | 8.7586 | 7.6609 | 26.9680 | 24.3860 | 9.3778 | 9.2842 |
| TYA | 18.5980 | 18.1220 | 11.1170 | 11.3840 | 18.2030 | 18.0440 | 12.2530 | 12.5160 | 22.8070 | 22.9650 | 10.4240 | 10.7590 |
| VOW | 18.3880 | 17.2980 | 6.5565 | 6.3007 | 17.6140 | 17.3050 | 8.3161 | 7.6968 | 21.9080 | 21.0060 | 9.0000 | 9.1373 |
| Mean | 19.6459 | 18.7426 | 7.2828 | 7.1226 | 19.4551 | 18.7072 | 8.4652 | 8.1485 | 23.3105 | 22.2367 | 10.5623 | 11.9482 |
| Median | 17.9795 | 17.2115 | 6.5204 | 6.3010 | 17.7290 | 17.0735 | 7.2020 | 7.0370 | 20.9775 | 20.6120 | 8.9040 | 8.3803 |
| Variance | 29.4797 | 23.8337 | 6.4164 | 5.8872 | 29.6815 | 24.3653 | 8.6736 | 7.8003 | 38.2975 | 30.6047 | 27.3584 | 164.9387 |
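The 1-step and 2-step columns are point forecasts iterated from the fitted model. For the AR(2) case the recursion is simple enough to sketch directly (a hypothetical helper with our own names, not the paper's code): the 2-step forecast reuses the 1-step forecast in place of the unobserved next value.

```python
def ar2_forecast(y, c, phi1, phi2):
    """One- and two-step-ahead point forecasts from a fitted AR(2) model
    y_t = c + phi1*y_{t-1} + phi2*y_{t-2} + e_t."""
    y1 = c + phi1 * y[-1] + phi2 * y[-2]   # 1-step ahead
    y2 = c + phi1 * y1 + phi2 * y[-1]      # 2-step: plug in the 1-step forecast
    return y1, y2
```

Repeating this over a rolling out-of-sample window and comparing forecasts with realized values gives the RMSE entries tabulated above.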

Table 4: Comparing the performances of original data and SOWDA-MODWT denoised data (DAX 30 stocks, 5-minute data in 2005) for the out-of-sample forecasting, measured by RMSE (×10⁵), when using AR(2), ARMA(2,1), and ARMA(2,1)-GARCH(1,1) models. (ARMA = ARMA(2,1); GARCH = ARMA(2,1)-GARCH(1,1); "1s"/"2s" = 1-step/2-step ahead.)

| Stock | AR(2) Orig. 1s | AR(2) Orig. 2s | AR(2) SOWDA 1s | AR(2) SOWDA 2s | ARMA Orig. 1s | ARMA Orig. 2s | ARMA SOWDA 1s | ARMA SOWDA 2s | GARCH Orig. 1s | GARCH Orig. 2s | GARCH SOWDA 1s | GARCH SOWDA 2s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ADS | 16.2540 | 15.4940 | 5.0340 | 4.9682 | 15.9860 | 15.2000 | 5.8819 | 3.1792 | 19.0970 | 18.4380 | 5.5335 | 4.7276 |
| ALT | 18.9340 | 17.8780 | 5.3046 | 5.3916 | 18.5490 | 17.9640 | 6.2644 | 5.9929 | 21.1000 | 21.1130 | 5.2142 | 5.1482 |
| ALV | 15.5330 | 15.3070 | 5.0740 | 5.1873 | 15.3430 | 15.3080 | 6.1002 | 5.9782 | 20.3360 | 20.7000 | 5.4681 | 5.3148 |
| ARO | 37.9780 | 33.9710 | 10.3320 | 9.6620 | 38.0220 | 34.3250 | 12.4140 | 11.7800 | 43.8650 | 39.2550 | 11.3910 | 10.4260 |
| BAS | 16.5390 | 15.3470 | 5.0069 | 4.9242 | 16.5270 | 15.2100 | 6.2583 | 5.5720 | 19.2740 | 17.3880 | 5.4816 | 5.1506 |
| BAY | 18.8070 | 19.0620 | 4.9915 | 5.1350 | 18.4920 | 19.1650 | 6.3075 | 6.1591 | 22.0210 | 22.7120 | 4.9835 | 4.9532 |
| BMW | 17.3610 | 16.4310 | 5.8672 | 4.8389 | 17.2450 | 16.3450 | 8.9373 | 6.0236 | 19.8510 | 18.6160 | 5.6675 | 5.0201 |
| CBK | 17.9820 | 17.2370 | 5.2729 | 5.5943 | 18.1850 | 17.1340 | 6.2111 | 6.0754 | 21.1780 | 20.5240 | 5.5826 | 6.4697 |
| CON | 17.9770 | 17.1860 | 5.6068 | 5.6868 | 17.9120 | 17.0130 | 7.3654 | 7.5717 | 20.8550 | 19.8630 | 5.2210 | 5.1621 |
| DAI | 18.0060 | 17.7910 | 6.0532 | 5.1477 | 16.8010 | 17.3490 | 5.9544 | 6.2836 | 20.1400 | 19.7040 | 7.2302 | 5.2226 |
| DBK | 15.5150 | 15.5150 | 4.4197 | 4.1354 | 15.6610 | 15.2670 | 5.7917 | 5.6812 | 19.2450 | 18.5380 | 4.2840 | 4.2167 |
| DPT | 16.9800 | 15.9200 | 4.6115 | 4.8382 | 17.1430 | 16.1570 | 5.4336 | 5.5251 | 19.8710 | 19.4580 | 5.2724 | 5.9609 |
| DTK | 14.5510 | 13.1010 | 4.0912 | 3.9700 | 14.4000 | 12.9360 | 5.1443 | 5.6246 | 17.2420 | 15.1070 | 4.7478 | 4.3739 |
| EON | 16.0630 | 15.5940 | 6.6040 | 4.8656 | 15.7050 | 15.6830 | 6.1082 | 5.9007 | 19.3160 | 18.3810 | 19.6160 | 5.4579 |
| EPC | 32.3490 | 30.2900 | 7.9690 | 7.3990 | 31.8040 | 30.8790 | 9.3482 | 9.1746 | 37.8200 | 35.2450 | 7.0513 | 6.9043 |
| FRE | 18.6400 | 16.6740 | 5.4808 | 5.2417 | 18.3570 | 16.5210 | 6.0816 | 6.0213 | 20.4450 | 18.7000 | 5.2187 | 4.7510 |
| HEN | 14.9580 | 14.5760 | 4.7701 | 4.9905 | 14.9030 | 14.5790 | 5.8581 | 5.4967 | 18.6480 | 17.6510 | 4.1926 | 4.7290 |
| HVB | 21.0430 | 20.7270 | 5.8442 | 6.2092 | 21.2250 | 20.5310 | 7.3699 | 7.6220 | 25.8800 | 24.0840 | 5.8256 | 6.0757 |
| HYP | 27.7480 | 26.9770 | 6.7086 | 6.7950 | 27.7240 | 26.0660 | 7.8780 | 8.0091 | 32.0370 | 30.1680 | 6.3912 | 6.0536 |
| IFX | 25.3350 | 24.4230 | 6.4353 | 6.6020 | 25.3940 | 24.1510 | 8.5812 | 8.5047 | 30.0830 | 29.4590 | 6.9564 | 6.9307 |
| LHA | 21.5790 | 20.2530 | 6.2137 | 6.1575 | 21.4000 | 20.2310 | 8.7273 | 6.8698 | 24.3480 | 23.6020 | 6.3687 | 6.5813 |
| LIN | 16.1370 | 15.4310 | 4.8561 | 4.7612 | 16.1100 | 15.3940 | 5.5163 | 5.3769 | 17.8720 | 17.0690 | 5.1079 | 4.6615 |
| MEO | 17.6360 | 16.4750 | 4.6890 | 4.4892 | 16.9700 | 16.5580 | 6.0812 | 6.0293 | 21.5420 | 20.3860 | 4.6547 | 4.6874 |
| MLP | 29.5450 | 27.5910 | 7.1169 | 7.0087 | 29.5150 | 27.6890 | 8.6080 | 8.8930 | 34.7830 | 32.0240 | 7.8541 | 7.4500 |
| MUV | 15.3820 | 14.8820 | 4.1630 | 4.1650 | 15.3100 | 15.0000 | 5.1088 | 5.2244 | 19.0120 | 18.3410 | 4.2363 | 4.2475 |
| RWE | 17.7390 | 17.9800 | 8.2961 | 5.5140 | 17.8440 | 18.5690 | 6.6714 | 6.2476 | 20.6100 | 21.1110 | 30.1880 | 5.3440 |
| SIE | 15.4040 | 15.0440 | 4.7452 | 4.5185 | 15.3270 | 14.9530 | 5.4737 | 5.3161 | 19.2720 | 18.5320 | 5.1884 | 5.3577 |
| TUI | 22.0020 | 20.1170 | 6.5611 | 6.3603 | 21.7210 | 20.0230 | 7.4776 | 7.9260 | 26.9680 | 24.3860 | 6.4206 | 6.5234 |
| TYA | 18.5980 | 18.1220 | 5.4879 | 5.2750 | 18.2030 | 18.0440 | 6.5989 | 6.1488 | 22.8070 | 22.9650 | 4.9153 | 4.8525 |
| VOW | 18.3880 | 17.2980 | 5.5132 | 5.3653 | 17.6140 | 17.3050 | 6.6986 | 6.3902 | 21.9080 | 21.0060 | 5.1318 | 4.9202 |
| Mean | 19.6459 | 18.7426 | 5.7703 | 5.5169 | 19.4551 | 18.7072 | 6.8786 | 16.8210 | 23.3105 | 22.2367 | 7.0144 | 5.6276 |
| Median | 17.9795 | 17.2115 | 5.4844 | 5.2145 | 17.7290 | 17.0735 | 6.2614 | 6.1540 | 20.9775 | 20.6120 | 5.5076 | 5.2687 |
| Variance | 29.4797 | 23.8337 | 1.6945 | 1.3171 | 29.6815 | 24.3653 | 2.3796 | 3.0261 | 38.2975 | 30.6047 | 25.6537 | 1.4999 |

∗: ×10³; †: ×10¹.

Table 5: Performance measure with the jump detection test for SOWDA. We run a t-test to check whether the mean of the absolute difference series, obtained from the jump detection test statistics for two underlying time series, differs from 1. We reject the null hypothesis (indicated by 1.0000) with significant test statistics when jumps coincide in the two series investigated. (Column pairs compare the original data with the SOWDA trend, the original data with the SOWDA noise, and the SOWDA trend with the SOWDA noise, each under DWT and MODWT.)

| Jumps | Statistic | Orig. vs. Trend (DWT) | Orig. vs. Trend (MODWT) | Orig. vs. Noise (DWT) | Orig. vs. Noise (MODWT) | Trend vs. Noise (DWT) | Trend vs. Noise (MODWT) |
|---|---|---|---|---|---|---|---|
| Overall | Decision | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| | p-value | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | Confidence interval | (0.0056, 0.0071) | (0.0048, 0.0062) | (0.0307, 0.0340) | (0.0307, 0.0340) | (0.0313, 0.0346) | (0.0311, 0.0344) |
| | T-stat∗ | -3.4296 | -3.6971 | -1.5049 | -1.5033 | -1.4902 | -1.4945 |
| | Std. dev. | 0.0797 | 0.0740 | 0.1768 | 0.1770 | 0.1785 | 0.1780 |
| Positive | Decision | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| | p-value | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | Confidence interval | (0.0055, 0.0070) | (0.0047, 0.0060) | (0.0299, 0.0331) | (0.0300, 0.0333) | (0.0304, 0.0337) | (0.0303, 0.0335) |
| | T-stat∗ | -3.4622 | -3.7473 | -1.5247 | -1.5218 | -1.5123 | -1.5149 |
| | Std. dev. | 0.0789 | 0.0730 | 0.1747 | 0.1750 | 0.1760 | 0.1757 |
| Negative | Decision | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| | p-value | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | Confidence interval | (0.0299, 0.0332) | (0.0046, 0.0060) | (0.0299, 0.0332) | (0.0299, 0.0332) | (0.0305, 0.0338) | (0.0302, 0.0335) |
| | T-stat∗ | -1.5237 | -3.7566 | -1.5237 | -1.5241 | -1.5094 | -1.5218 |
| | Std. dev. | 0.1748 | 0.0728 | 0.1748 | 0.1747 | 0.1763 | 0.1750 |

∗: ×10³.
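The decisions in Table 5 come from a standard one-sample t-test of the mean of the absolute difference series against the null value 1. A minimal sketch of that test (our own implementation, using the large-sample normal critical value 1.96 for the 95% confidence interval; in practice one would call a library routine such as scipy.stats.ttest_1samp):

```python
import numpy as np

def one_sample_ttest(x, mu0=1.0):
    """t-statistic and approximate 95% confidence interval for H0: mean(x) = mu0."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    m, s = x.mean(), x.std(ddof=1)
    t = (m - mu0) / (s / np.sqrt(n))        # one-sample t-statistic
    half = 1.96 * s / np.sqrt(n)            # normal-approximation half-width
    return t, (m - half, m + half)
```

With the very large samples of 5-minute data used here, a confidence interval for the mean that excludes 1 (as all intervals in the table do) corresponds to rejecting the null at any conventional level.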