REALIZED VOLATILITY FORECASTING and OPTION PRICING



Similar documents
A note on the impact of options on stock return volatility 1

CAPM, Arbitrage, and Linear Factor Models

THE ECONOMIC VALUE OF TRADING WITH REALIZED VOLATILITY

A Comparison of Option Pricing Models

1 Another method of estimation: least squares

Lecture Notes: Basic Concepts in Option Pricing - The Black and Scholes Model

Chapter 3: The Multiple Linear Regression Model

UBS Global Asset Management has

Real-time Volatility Estimation Under Zero Intelligence. Jim Gatheral Celebrating Derivatives ING, Amsterdam June 1, 2006

Pair Trading with Options

Interlinkages between Payment and Securities. Settlement Systems

Financial Time Series Analysis (FTSA) Lecture 1: Introduction

Stock Splits and Liquidity: The Case of the Nasdaq-100 Index Tracking Stock

Do option prices support the subjective probabilities of takeover completion derived from spot prices? Sergey Gelman March 2005

Option Valuation. Chapter 21

Geometric Brownian Motion, Option Pricing, and Simulation: Some Spreadsheet-Based Exercises in Financial Modeling

Review of Basic Options Concepts and Terminology

Bias in the Estimation of Mean Reversion in Continuous-Time Lévy Processes

Chapter 2. Dynamic panel data models

Department of Economics and Related Studies Financial Market Microstructure. Topic 1 : Overview and Fixed Cost Models of Spreads

Our development of economic theory has two main parts, consumers and producers. We will start with the consumers.

Forecasting Volatility using High Frequency Data

Internet Appendix for Institutional Trade Persistence and Long-term Equity Returns

CHAPTER 21: OPTION VALUATION

INDIAN INSTITUTE OF MANAGEMENT CALCUTTA WORKING PAPER SERIES. WPS No. 688/ November Realized Volatility and India VIX

ARE STOCK PRICES PREDICTABLE? by Peter Tryfos York University

Herding, Contrarianism and Delay in Financial Market Trading

10. Fixed-Income Securities. Basic Concepts

Sensex Realized Volatility Index

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Four Derivations of the Black Scholes PDE by Fabrice Douglas Rouah

Black-Scholes-Merton approach merits and shortcomings

Computational Finance Options

Cointegration Pairs Trading Strategy On Derivatives 1

Technical Analysis and the London Stock Exchange: Testing Trading Rules Using the FT30

EXERCISES FROM HULL S BOOK

1 Volatility Trading Strategies

Panel Data Econometrics

Asymmetric Information (2)

Introduction to Binomial Trees

Research Division Federal Reserve Bank of St. Louis Working Paper Series

Hedging Using Forward Contracts

Jumps and microstructure noise in stock price volatility

VOLATILITY FORECASTING FOR MUTUAL FUND PORTFOLIOS. Samuel Kyle Jones 1 Stephen F. Austin State University, USA sjones@sfasu.

Margin Requirements and Equilibrium Asset Prices

Exact Nonparametric Tests for Comparing Means - A Personal Summary

The Black-Scholes Formula

The VAR models discussed so fare are appropriate for modeling I(0) data, like asset returns or growth rates of macroeconomic time series.

Option pricing. Vinod Kothari

CHAPTER 21: OPTION VALUATION

Patterns Within the Trading Day

FINANCIAL ECONOMICS OPTION PRICING

Chapter 11 Options. Main Issues. Introduction to Options. Use of Options. Properties of Option Prices. Valuation Models of Options.

Pricing Cloud Computing: Inelasticity and Demand Discovery

Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Optimization of technical trading strategies and the profitability in security markets

LOGNORMAL MODEL FOR STOCK PRICES

Corporate Income Taxation

Rise of the Machines: Algorithmic Trading in the Foreign. Exchange Market

Normalization and Mixed Degrees of Integration in Cointegrated Time Series Systems

Quality differentiation and entry choice between online and offline markets

Forecasting increases in the VIX: A timevarying long volatility hedge for equities

VIX, the CBOE Volatility Index

Estimating correlation from high, low, opening and closing prices

Option Portfolio Modeling

Estimating the Leverage Effect Using High Frequency Data 1

Hedging Illiquid FX Options: An Empirical Analysis of Alternative Hedging Strategies

The Best of Both Worlds:

Pricing Currency Options with Intra-Daily Implied Volatility

Changing income shocks or changed insurance - what determines consumption inequality?

How to assess the risk of a large portfolio? How to estimate a large covariance matrix?

Execution Costs. Post-trade reporting. December 17, 2008 Robert Almgren / Encyclopedia of Quantitative Finance Execution Costs 1

WORKING PAPER NO OUT-OF-SAMPLE FORECAST TESTS ROBUST TO THE CHOICE OF WINDOW SIZE

Part V: Option Pricing Basics

Autocorrelation in Daily Stock Returns

EconS Advanced Microeconomics II Handout on Cheap Talk

Topic 5: Stochastic Growth and Real Business Cycles

Modelling Intraday Volatility in European Bond Market

Midterm March (a) Consumer i s budget constraint is. c i b i c i H 12 (1 + r)b i c i L 12 (1 + r)b i ;

The Behavior of Bonds and Interest Rates. An Impossible Bond Pricing Model. 780 w Interest Rate Models

Multiple Linear Regression in Data Mining

Portfolio selection based on upper and lower exponential possibility distributions

Asymmetric Reactions of Stock Market to Good and Bad News

OPTIONS EDUCATION GLOBAL

A comparison between different volatility models. Daniel Amsköld

Transcription:

REALIZED VOLATILITY FORECASTING and OPTION PRICING Federico M. Bandi, Je rey R. Russell, Chen Yang GSB, The University of Chicago First draft: September 12, 26 This draft: September 3, 26 Abstract A growing literature advocates the use of microstructure noise-contaminated high-frequency data for the purpose of volatility estimation. This paper evaluates and compares the quality of several recently-proposed estimators in the context of option pricing. Using forecasts obtained by virtue of alternative volatility estimates, agents price short-term options on the S&P 5 index before trading with each other at average prices. The agents average pro ts and the Sharpe ratios of the pro ts constitute the metrics used to evaluate alternative volatility estimates and corresponding forecasts. For our data, we nd that estimators with superior nite sample mean-squared-error properties generate larger pro ts and higher Sharpe ratios, in general. We con rm that, even from a forecasting standpoint, there is scope for optimizing the nite sample properties of alternative volatility estimators as advocated by Bandi and Russell (25) in recent work. JEL Classi cation: C53, G13 Keywords: Option pricing, microstructure noise, volatility forecasting, economic metrics, realized variance, two-scale estimator, realized kernels. We thank Valeri Voev for his useful comments. We thank the William S. Fishman Faculty Research Fund at the Graduate School of Business of the University of Chicago (Bandi), NYU Stern (Russell), and the Graduate School of Business of the University of Chicago (Bandi, Russell, and Yang) for nancial support. Correspondence can be sent to: federico.bandi@chicagogsb.edu, je rey.russell@chicagogsb.edu, cyang1@chicagogsb.edu. 1

1 Introduction The recent, stimulating work on nonparametric volatility estimation by virtue of high-frequency asset price data has largely focused on designing estimators with satisfactory statistical properties when realistic market microstructure contaminations play a role. Important theoretical emphasis is generally placed on the asymptotic features of the proposed estimators (see, e.g., Barndor -Nielsen, Hansen, Lunde, and Shephard, 25, 26, Kalnina and Linton, 26, Zhang, Mykland, and Aït- Sahalia, 25, and Zhang, 26). A related strand of this literature focuses on the estimators nite sample performance and the importance of optimizing this performance explicitly (Bandi and Russell, 23, 25). Bandi and Russell (26b), Barndor -Nielsen and Shephard (26), and McAleer and Medeiro (26) provide thorough reviews of this growing literature. Relatively sparse work examines the forecasting ability of alternative high-frequency volatility estimators. This paper considers this issue and evaluates volatility forecasting from the vantage point of a relevant economic metric. Speci cally, we evaluate the pro ts/losses that option dealers would derive from trading on the basis of alternative volatility forecasts. To this extent, we employ a methodology proposed by Engle, Hong, and Kane (199) and e ectively operate in the context of an arti cial (or "hypothetical," in their terminology) option market. Consider two generic option traders: Trader A and Trader B. Trader A (B) estimates volatility using Method A (B). Alternative volatility estimates yield alternative volatility forecasts and, hence, di erent option prices. The trades are conducted at the mid-point of the prices. The trader with the highest volatility (option price) forecast will buy a straddle (a call and a put option) from his/her counterpart. All options are executed on the following day. The average dollar pro ts and Sharpe ratios obtained from repeated implementations of this strategy represent the economic metrics used to evaluate alternative variance estimates/forecasts. The logic is simple. If the high volatility forecast is accurate, the straddle (whose price is twice the mid-point of the call/put price forecasts in our framework) is underpriced. The trader who buys the straddle is expected to make a pro t. Since a straddle is just a portfolio containing a call option and a put option, high volatility is expected to result in either option being substantially in the money. If the cost of this position is relatively low, as determined by the above-mentioned underpricing, then a pro t will arise. We work with a variety of variance estimates and forecasts (Methods). Each trader uses a Method and trades with every other trader. The pair-wise trades always occur at the mid-point of the price forecasts derived from the corresponding pair of Methods. For each trader, we report the mean of the average pro ts/losses across the pair-wise trades. We employ the following Methods: 1. Realized variance constructed using 5-minute returns. 2

2. Realized variance constructed using 15-minute returns. 3. Optimally-sampled realized variance as proposed by Bandi and Russell (23). 4. The "near consistent" Bartlett kernel estimator with an optimal (in a nite sample) choice for the number of autocovariances as suggested by Bandi and Russell (25). 5. The two-scale estimator of Zhang, Mykland and Aït-Sahalia (25) with an asymptotically optimal choice for the number of subsamples as suggested by Zhang, Mykland and Aït-Sahalia (25). 6. The two-scale estimator of Zhang, Mykland and Aït-Sahalia (25) with an optimal (in a nite sample) choice for the number of subsamples as suggested by Bandi and Russell (25). 7. The at-top Bartlett kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an asymptotically optimal choice for the number of autocovariances as suggested by Barndor -Nielsen, Hansen, Lunde, and Shephard (26) 8. The at-top Bartlett kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an optimal (in a nite sample) choice for the number of autocovariances as suggested by Bandi and Russell (25). 9. The at-top cubic kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an asymptotically optimal choice for the number of autocovariances as suggested by Barndor -Nielsen, Hansen, Lunde, and Shephard (26). 1. The at-top cubic kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an optimal (in a nite sample) choice for the number of autocovariances as suggested by Bandi and Russell (25). 11. The at-top modi ed Tukey-Hanning kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an asymptotically optimal choice for the number of autocovariances as suggested by Barndor -Nielsen, Hansen, Lunde, and Shephard (26). 12. The at-top modi ed Tukey-Hanning kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an optimal (in a nite sample) choice for the number of autocovariances as suggested by Bandi and Russell (25). 3

We work with short-term options on the S&P 5 index. We use high-frequency data on the Standard and Poor s depository receipts (SPIDERS) to construct the index s intra-daily volatility estimates. Since the volatility estimates (and forecasts) use high-frequency data, they are derived for a 6-hour period. Thus, we consider three scenarios. The rst scenario implies trading 6- hour options to avoid issues posed by the need to account for the overnight returns in volatility forecasting. The second and third scenario utilize 1-day options and two alternative methods for dealing with the overnights. Our ndings suggest that explicit optimization of the nite sample mean-squared-error (MSE) properties of the proposed estimators, as advocated by Bandi and Russell (23, 25), results in important economic gains. In virtually all cases, we nd that the optimized (in a nite sample) at-top unbiased kernels (Bartlett, cubic, and modi ed Tukey-Hanning) constitute the best Methods both in terms of average pro ts and in terms of Sharpe ratios. The "near consistent" Bartlett kernel estimator and the two-scale estimator of Zhang, Mykland and Aït-Sahalia (25) can perform very well, and sometimes as well as the at-top unbiased kernels, when the number of subsamples/autocovariances is carefully chosen using nite sample methods. Choices of subsamples/autocovariances based on asymptotic, rather than nite sample, criteria might lead to excessively suboptimal performance. This is particularly evident in the case of the two-scale estimator due to a potentially large nite sample bias. Optimally-sampled realized variance almost always dominates 5- and 15-minute realized variance. As mentioned, volatility forecasting in the presence of market microstructure noise is still a largely underexplored subject. Bandi and Russell (26a) and Bandi, Russell, and Zhu (26) use reduced-form models to show that optimally-sampled realized variances (covariances) outperform realized variances (covariances) constructed using ad-hoc (5- or 15-minute, say) intervals in predicting variances (covariances) out-of-sample. Ghysels and Sinko (26a) employ the MIDAS approach of Ghysels, Santa-Clara, and Valkanov (26) to evaluate the relative performance of realized variance based on xed intervals, bias-corrected realized variance, and by-power variation. By-power variation is the preferred estimator in their framework. Large (26) uses the HAR-RV model of Corsi (23) to nd that his "alternation estimator" can have better forecasting properties than realized variance constructed using ad-hoc, xed intervals. Aït-Sahalia and Mancini (26) study the forecasting performance of the two-scale estimator and realized variance using a variety of simulation set-ups for stochastic volatility and microstructure noise. In their framework, the two-scale estimator outperforms realized variance both in terms of MSE and in terms of forecasting ability. Conditional predictive densities and con dence intervals for alternative integrated variance estimators are studied in Corradi, Distaso, and Swanson (26). 4

Two economic metrics have been proposed, thus far. Bandi and Russell (26a) consider a portfolio choice problem and the long-run utility that a mean-variance representative investor derives from alternative variance forecasts as the relevant performance criterion. A similar portfoliobased approach has been recently implemented by Bandi, Russell, and Zhu (26) and De Pooter, Martens, and Van Dijk (26) in a multivariate context (see Fleming, Kirby, and Ostdiek, 21, 23, and West, Edison, and Cho, 1993, in the no noise case). Bandi and Russell (25, 26c) study volatility forecasting for the purpose of option pricing as in the current paper. However, their focus is only on optimally-sampled realized variance and xed-interval realized variance (Bandi and Russell, 26c) and on optimally-sample realized variance and the two-scale estimator (Bandi and Russell, 25). The extant approaches focus on subsets of the existing estimators. Their ndings generally point to the preferability of optimally-sampled realized variance over realized variance constructed using xed intervals as well as to the out-of-sample usefulness of the consistent estimators, such as the two-scale estimator, when optimized using nite sample methods. Stimulating recent studies towards a more comprehensive analysis of the forecasting performance of alternative volatility estimation methods have been conducted by Andersen, Bollerslev, and Meddahi (26) and Ghysels and Sinko (26b). These papers use empirically-relevant stochastic volatility models and analytic expressions to evaluate forecasting performance in linear regressions of integrated variance on alternative variance estimates. Mincer-Zarnowitz-style regression models (Andersen, Bollerslev, and Meddahi, 26) and MIDAS regressions (Ghysels and Sinko, 26b) are used to predict variance in practise. In Andersen, Bollerslev, and Meddahi (26) and Ghysels and Sinko (26b), the forecasting metric has a statistical nature. The current paper s goal is to provide a rich comparison between alternative estimation methods in the context of an important economic criterion. We show that estimators with superior nite sample MSE properties generally have superior predictive ability in our set-up. The paper proceeds as follows. Section 2 describes the Methods. In Section 3 we discuss the pricing and trading mechanism. Section 4 is about the data. Section 5 contains the results. Tables and Figures are in the Appendix. 2 The "Methods" Consider, for simplicity, a trading day of length h = 1. Assume availability of M + 1 equispaced, observed logarithmic asset prices over [; 1] and write p j = p j + j 5

or, in terms of continuously-compounded returns, p j p (j {z 1) } r j = pj p (j 1) {z } rj + j (j 1) ; {z } " j where p denotes the unobservable equilibrium price, denotes unobservable market microstructure noise, and = 1 M represents the time distance between adjacent price observations. Assume the equilibrium price process evolves in time as a stochastic volatility local martingale, i.e., p t = Z t s dw s ; where t is a càdlàg stochastic volatility process and W t is a standard Brownian motion. The object of econometric interest is V = R 1 2 sds. The Methods attempt to identify V by mitigating the impact that the presence of the noise component has on nonparametric volatility estimates constructed using intra-daily, noise-contaminated asset returns r. Because we use SPIDERS high-frequency mid-quotes in this study, the price formation mechanism should be understood as the SPIDERS mid-quote formation mechanism. Realized variance constructed using 5-minute returns, b V 5 min The classical realized variance estimator of Andersen, Bollerslev, Diebold, and Labys (23) and Barndor -Nielsen and Shephard (22) is simply de ned as the sum of the squared returns over the trading day, i.e., bv = MX rj 2 : j=1 In the absence of noise, b V is consistent for V as the sampling frequency increases, i.e., as M! 1 over [; 1] (see, e.g., Protter, 1995). In the presence of empirically-reasonable noise contaminations, bv is inconsistent as shown by Bandi and Russell (23) and Zhang, Mykland and Aït-Sahalia (25). To reduce the impact of the noise component, Andersen, Bollerslev, Diebold, and Labys (23) and Andersen, Bollerslev, Diebold, and Ebens (21), among others, suggest using 5-minute returns, rather than returns sampled at the highest frequencies, to estimate V in practise. Thus, bv 5 min is equal to b V with M = 72 (= 6 6=5). Realized variance constructed using 15-minute returns, V b 15 min Since, provided the noise is independent of the equilibrium price process, the bias of the realized variance estimator is equal to ME M (" 2 ), this bias can be further reduced by selecting a lower 6

number of intra-daily returns for the purpose of variance estimation. Relying on plots of realized variance versus alternative sampling frequencies (i.e., "volatility signature plots") and the levelling o of realized variance around the 15-minute frequency, Andersen, Bollerslev, Diebold, and Labys (1999, 2) recommend using 15- or 2-minute frequencies for the purpose of constructing realized variance. Hence, V b 15 min is equal to V b with M = 24 (= 6 6=15): Optimally-sampled realized variance, V b Opt While low sampling frequencies mitigate the extent of the estimator s bias component, they increase its variance. An optimal sampling frequency can then be chosen by optimally trading-o bias and variance as suggested by Bandi and Russell (23, 26a). Bandi and Russell (23) discuss 1 selection of the optimal MSE-based frequency M in the presence of noise dependence but provide opt a particularly simple rule-of-thumb to select this frequency when the noise is assumed to be independent of p and, as an approximation at least, i.i.d. in discrete time. In this case, M opt R 1! 1 4 3 sds (E(" 2 )) 2. Clearly, M opt depends on a signal-to-noise ratio. The larger the signal coming from the equilibrium price process Q = R 1 4 sds relative to the noise component E(" 2 ) 2, the larger M opt. Both quantities in M opt can be evaluated. The quarticity Q can be estimated inconsistently, but with little bias, by calculating Q b MX = M 3 rj 2 (realized quarticity) with low frequency returns, e.g., M = 24 j=1 or 15-minute intervals. Under the same assumptions leading to M opt, the noise second moment can be estimated consistently by computing E(" b MX 2 ) = rj 2 =M at the highest frequency at which the data arrives (see, e.g., Bandi and Russell, 23, and Zhang, Mykland and Aït-Sahalia, 25). Hence, V b Opt = V b with M c 2 1=3 opt = bq= be(" ) 2. j=1 The "near consistent" Bartlett kernel estimator with an optimal (in a nite sample) choice for the number of autocovariances as suggested by Bandi and Russell (25). Here, and below, we assume the data is not necessarily equispaced. Consider a traditional HAC estimator de ned as M 1 bv Bar = M q 1 q b + 2 qx q s=1 q s b s ; 7

where b s = P M s j=1 r jr j+s is the s-th return autocovariance. Clearly, b = V b, hence the estimator weighs the classical realized variance estimator (possibly applied to non-equispaced data) and the rst q return autocovariances by virtue of Bartlett-type kernel weights. This estimator is in the spirit of the rst-order bias-corrected estimator studied by Zhou (1996), Hansen and Lunde (26), and Oomen (25). Under i.i.d. s and independence between p and, Barndor -Nielsen, Hansen, Lunde, and Shephard (25) show that, if M; q! 1 with q q2 M! and M! 1, even though inconsistent for V, V b Bar has a small limiting variance given by 4 E( 2 ) 2. We implement b V Bar by selecting the number of autocovariances on the basis of the MSE-based approximation suggested by Bandi and Russell (25): 1 q 3 2! V 2 1=3 M 2 M: Q As earlier in the case of M opt, q is determined by a bias-variance trade-o. Leaving proportionality factors aside, the term V 2 represents the leading term of the estimator s bias, whereas Q M 2 represents the leading term of its variance. The larger the bias term relative to the variance term, the larger the number of autocovariances. The optimal number of autocovariances is selected as a fraction of the number of observations M for each trading day in the sample. Preliminary, roughly unbiased, estimates of V and Q can be obtained by employing realized variance, b V, and realized quarticity, b Q, with low frequency returns (for instance, 15-minute returns). We use this simple procedure in our study. The two-scale estimator of Zhang, Mykland and Aït-Sahalia (25) with an asymptotically optimal choice for the number of subsamples as suggested by Zhang, Mykland and Aït-Sahalia (25), b V ZMA ZMA Consider q non-overlapping sub-grids (i) of the original grid of M arrival times with i = 1; :::; q. The rst sub-grid starts at t and takes every q-th arrival time, i.e., (1) = (t ; t +q ; t +2q ; :::; ), the second sub-grid starts at t 1 and also takes every q-th arrival time, i.e., (2) = (t 1 ; t 1+q ; t 1+2q ; :::; ); and so on. Given the generic i-th sub-grid of arrival times, the corresponding realized variance estimator is de ned as bv (i) = X t j ;t j+ 2 (i) p tj+ p tj 2 ; (1) 1 This approximation is valid under the same conditions yielding "near consistency" of the estimator as in Barndor - Nielsen, Hansen, Lunde, and Shephard (25). The approximation relies on the nite sample MSE of the estimator. The full MSE can be optimized directly. See Bandi and Russell (25). 8

where t j and t j+ denote adjacent elements in (i). Zhang, Mykland, and Aït-Sahalia s two-scale or subsampling estimator is constructed as P q bv ZMA i=1 V = b (i) q P M j=1(p tj+ p tj ) 2 M b E(" 2 ); (2) where M = M q+1 q ; E(" b 2 ) = M is a consistent estimate of the second moment of the noise return, as discussed earlier, and ME(" b 2 ) is a bias-correction. The estimator averages the realized variance estimates constructed on the basis of subsamples and bias-corrects them. Under i.i.d. s and independence between p and, Zhang, Mykland, and Aït-Sahalia (25) show that, if M; q! 1 with q M q = cm 2=3, M 1=6 b V ZMA! and q2 M V )! 1, the estimator is consistent. Speci cally, if r8c 2 (E( 2 )) 2 + c 43 Q! N(; 1): (3) The constant c can be selected optimally in order to minimize the estimator s limiting variance. This minimization leads to an asymptotically optimal number of subsamples given by q ZMA = c ZMA M 2=3 = 16 E(2 )! 2 1=3 M 2=3 = 3 E("2 )! 2 1=3 M 2=3 (4) Q 4 3 Q (Zhang, Mykland, and Aït-Sahalia, 25). Q and E(" 2 ) can be estimated as in the case of M opt : Hence, b V ZMA ZMA = b V ZMA with q = q ZMA. The two-scale estimator of Zhang, Mykland, and Aït-Sahalia (25), with an optimal (in a nite sample) choice for the number of subsamples as suggested by Bandi and Russell (25), bv ZMA Barndor -Nielsen, Hansen, Lunde, and Shephard (25) have shown that the two-scale estimator can be rewritten as a modi ed Bartlett-type kernel estimator, i.e., bv ZMA = 1 M q + 1 b Mq + 2 qx q s q s=1 b s 1 q # q; (5) where b s = P M s j=1 r jr j+s, # 1 =, and # q = # q 1 + (r 1 + ::: + r q 1 ) 2 + (r M q+2 + ::: + r M ) 2 for q 2. The term 1 q # q, which mechanically derives from subsampling, is crucial for the estimator s consistency and di erentiates b V ZMA from the inconsistent Bartlett kernel estimator b V Bar. Under i.i.d. s and independence between p and ; Bandi and Russell (25) nd that, despite the inconsistency of b V Bar, the nite sample MSEs of b V ZMA and b V Bar are similar in practise. They 9

suggest using the same rule-of-thumb q as for V b Bar in constructing V b ZMA. Hence, V b ZMA equal to V b ZMA with q = q. The at-top Bartlett kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an asymptotically optimal choice for the number of autocovariances as suggested by Barndor -Nielsen, Hansen, Lunde, and Shephard (26), b V BNHLS(Bar) BNHLS In recent work, Barndor -Nielsen, Hansen, Lunde, and Shephard (26) have advocated using unbiased (under i.i.d. s and independence between p and ) at-top symmetric kernels of the type bv BNHLS = b + qx w s (b s + b s ); (6) s=1 where b s = P M j=1 r jr j s with s = q; :::; q, w s = k s 1 q and k is a function on [; 1] satisfying k() = 1 and k(1) =. If q = cm 2=3, this family of estimators is consistent (at rate M 1=6 ) and asymptotically mixed normal. When k(x) = 1 x (the Bartlett case), the limiting variance of bv BNHLS is the same as that of the two-scale estimator. Hence, q can be chosen asymptotically as suggested by Zhang, Mykland, and Aït-Sahalia (25). The at-top Bartlett kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an optimal (in a nite sample) choice for the number of autocovariances as suggested by Bandi and Russell (25), b V BNHLS(Bar) As earlier, Bandi and Russell (25) provide an alternative way to choose the optimal number of observations. Consistent with the case of b V ZMA and b V Bar, for a generic k(x), they advocate choosing q as M with < 1, where minimizes the nite sample MSE of the estimator, i.e., 2 is where MSE = (bias()) 2 + var(); bias() = 2 The MSE is obtained under i.i.d. s and independence between p and, i.e., the same conditions used to prove the limiting properties of this class of estimators as well as the limiting properties of the estimators discussed above (see Barndor -Nielsen, Hansen, Lunde, and Shephard, 25, 26, Zhang, Mykland and Aït-Sahalia, 25, Zhang, 26). In addition, normality of the errors is assumed solely to dispense with the estimation of the fourth noise moment. This assumption can be easily relaxed (Barndor -Nielsen, Hansen, Lunde, and Shephard, 26, and Bandi and Russell, 25). 1

and with var () = Q M w? 1 w + 4 E( 2 ) 2 M(w? 2 w) + 4 E( 2 ) 2 (w? 3 w) + (2E( 2 )V )4(w? 4 w); w = 1 1; 1; k ; :::; k M M 1? ; M and a a = 1; :::; 4 are (M + 1; M + 1) square matrices. For j M, the matrices 1 and 4 are de ned as follows: 1 [1; 1] = 2; 1 [1 + j; 1 + j] = 4; 4 [1; 1] = 1; 4 [2; 1] = 1; 4 [1; 2] = 1; 4 [2; 2] = 2; 4 [1 + j; 1 + j] = 2; 4 [1 + j; j] = 1; 4 [j; j + 1] = 1: For j M 1, the matrices 2 and 3 are de ned as follows: 2 [1; 1] = 3; 2 [1; 2] = 4; 2 [2; 1] = 4; 2 [2; 2] = 7; 2 [2 + j; 2 + j] = 6; 2 [2 + j; 1 + j] = 4; 2 [1 + j; 2 + j] = 4; 2 [2 + j; j] = 1; 2 [j; 2 + j] = 1; 3 [1; 1] = 1; 3 [1; 2] = 2; 3 [2; 1] = 2; 3 [2; 2] = 4:5; 3 [j + 2; j + 2] = 3(j + 1) 1; 3 [2 + j; 1 + j] = 2(j + 1); 3 [1 + j; 2 + j] = 2(j + 1); 3 [2 + j; j] = (j + 1)=2; 3 [j; 2 + j] = (j + 1)=2: Hence, b V BNHLS(Bar) is equal to b V BNHLS with q = M and k(x) = 1 x. The at-top cubic kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an asymptotically optimal choice for the number of autocovariances as suggested by Barndor -Nielsen, Hansen, Lunde, and Shephard (26), b V BNHLS(Cubic) BNHLS Consider again the class of estimators in Eq. (6). Barndor -Nielsen, Hansen, Lunde, and Shephard (26) have shown that, if k(:) is also chosen in such a way as to guarantee that k () = and k (1) =, the number of autocovariances can be selected as q = cm 1=2 and the estimator is consistent at rate M 1=4. Speci cally, 11

) M 1=4 V b BNHLS s V 4ck ; Q 8c 1 k ;2 E( 2 ) V + E(2 ) n + 4 (E( 2 2 )) 2 c 3 k () + k ;4 o! N(; 1); (7) where k ; = R 1 k2 (x)dx; k ;2 = R 1 k(x)k (x)dx; and k ;4 = R 1 k(x)k (x)dx. Importantly, when k(x) = 1 3x 2 + 2x 3, the limiting distribution of V b BNHLS is the same as that of the multi-scale estimator of Zhang (26). We de ne V b BNHLS(Cubic) BNHLS as V b BNHLS(Cubic) with q = cm 1=2 and c chosen as the minimizer of the limiting variance in Eq. (7). The at-top cubic kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an optimal (in a nite sample) choice for the number of autocovariances as suggested by Bandi and Russell (25), b V BNHLS(Cubic) We de ne b V BNHLS(Cubic) as b V BNHLS with q = M and k(x) = 1 3x 2 + 2x 3. The at-top modi ed Tukey-Hanning kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an asymptotically optimal choice for the number of autocovariances as suggested in Barndor -Nielsen, Hansen, Lunde, and Shephard (26), V b BNHLS(T H) BNHLS Finally, we consider the modi ed Tukey-Hanning at-top kernel estimator since it is asymptotically more e cient than the cubic estimator, i.e., k(x) = 1 cos (1 x) 2 =2 (Barndor -Nielsen, Hansen, Lunde, and Shephard, 26). We de ne V b BNHLS(T H) BNHLS as V b BNHLS(T H) with q = cm 1=2 and c chosen as the minimizer of the limiting variance in Eq. (7). The at-top modi ed Tukey-Hanning kernel estimator of Barndor -Nielsen, Hansen, Lunde, and Shephard (26) with an optimal (in a nite sample) choice for the number of autocovariances as suggested in Bandi and Russell (25), V b BNHLS(T H) We de ne V b BNHLS(T H) as V b BNHLS with q = M and k(x) = 1 cos (1 x) 2 =2. 2.1 Out-of-sample forecasting All variance estimates are obtained using intra-daily high-frequency returns over a 6-hour (1am to 4pm) period. The choice variables, such as the optimal sampling frequency of the realized variance estimator, the number of subsamples of the two-scale estimator, and the number of autocovariances of the at-top symmetric kernels, for instance, should be interpreted as time-varying, i.e., they are 12

selected for each trading day in our sample. For each trading day, optimization of the nite sample MSE of the estimators translates into the selection of an optimal number of returns in the case of realized variance and into the selection of an optimal number of autocovariances/subsamples, for a given number of returns, in the case of the alternative estimators. The out-of-sample forecasts are derived using ARFIMA models. 3 All of the 12 Methods above are computed by employing 3 sample lengths to estimate the ARMA parameters: (1) 1, days, (2) 1,5 days, and (3) the entire set of daily data with no less than 1,5 observations. 4 This translates into a total of 3 12 = 36 Methods. As in Engle, Hong, and Kane (199) we add three additional Methods, namely the average of the daily forecasts (Mean), and the daily minimum (Min) and maximum (Max) forecast, for a total of 39 Methods. As emphasized by Engle, Hong, and Kane (199), the average of n forecasts that are conditionally independent and of similar quality will converge to an accurate forecast at a fast rate. Failure to do so indicates very large economic di erences and/or dependence of the forecasts. The use of the minimum (maximum) forecast should highlight systematic upward (downward) biases in the forecasts. Speci cally, if the forecasts are largely downward biased, then the maximum forecast should have better performance than the minimum forecast and the forecasts that are considerably downward biased. Conversely, if the forecasts are largely upward biased, then the minimum forecast should have better performance than the maximum forecast and the forecasts that are very upward biased. To assess the in uence of the overnights in variance forecasting, we consider three scenarios. The rst involves pricing 6-hour options. Since the intra-daily variance estimates are for a 6-hour period, straightforward one-step-ahead forecasting yields the needed variance prediction. The second and third scenario involve pricing 1-day options. We account for the overnights in two alternative ways. The rst procedure multiplies each original 6-hour variance estimate e V (before forecasting) by a constant factor de ned as where RS&P 5 i = nx i=1 R S&P 5 2 i ; (8) nx ev i i=1 is the daily return on the S&P 5 index for day i and n is the total number of 3 A previous draft of the paper used ARMA speci cations. The resulting ranking of the forecasts was virtually the same as that reported below. 4 The fractional d parameter is estimated by virtue of the GPH estimator (Geweke and Porter-Hudak, 1983) using the full sample (see Table 1). Because this parameter can hardly be estimated e ciently, we opt for using the longest span of available data for its estimation. While this choice reduces the genuine out-of-sample nature of our empirical exercise, we nd that alternative, reasonable choices of the d parameter do not a ect our results in any meaningful way. 13

days in our sample. This procedure ensures that the average of the transformed variances, i.e., e V, is equal to the average of the squared returns. Hence, will blow up the 6-hour variance estimates ev. The second procedure simply adds the square of the corresponding overnight returns to each estimate e V. 3 Option pricing and option trading We follow Engle, Hong, and Kane (199) in designing an hypothetical option market. Every agent is associated with a Method. Given his/her Method the agent prices 6-hour or 1-day options on a $1 share of the S&P 5 index. The exercise price of the option is taken to be $1, the risk-free rate is taken to be zero. Under these assumptions, the predicted Black-Scholes call price is given by 1 P t = 2 2 t where is the cumulative normal distribution and t is a speci c volatility forecast, as delivered by a Method. By put/call parity, P t is also the corresponding put price. We are now speci c about the various stages of the pricing and trading process. 1. Given a Method, each agent computes his/her fair option price for either a 6-hour option or a 1-day option on a $1 share of the S&P 5 index. 2. Pair-wise trades take place. The pair-wise trades are conducted at the mid-point of the agents prices. Agents with variance forecasts higher than the mid-point will perceive the options to be underpriced. Hence, they will buy from their counterpart. Importantly, the agents buy or sell straddles (one put and one call). Hence, agents with high variance forecasts e ectively speculate on volatility in that, on the one hand, they perceive the straddle to be underpriced while, on the other hand, they count on either option to end up considerably in the money due to the expected high volatility. Finally, the positions are hedged. The delta, or hedge ratio, of the call option is given by 1 2 t. Hence, a trader who buys a call should go short 1 2 t shares of the stock for a riskless (given small changes in the stock price) position in the option and in the stock. Similarly, a trader who buys a put should go long 1 1 2 t for a riskless position. The hedge ratio of the straddle is the sum of the two hedge ratios, i.e., 1 2 1 2 t. The daily pro t to a trader who buys the straddle is then given by S&P 5 S&P 5 1 Rt 2P t + Rt 1 2 2 t : 14 1;

The daily pro t to a seller is R S&P 5 2P t R t S&P 5 t 1 1 2 2 t : 3. For each trading day, the pro ts/losses are computed for each agent (Method). The total daily pro t for each agent is just the sum of the pair-wise pro ts divided by l 1, where l is the total number of agents (Methods), i.e., 39. The pro ts/losses are then averaged across days. 4 The data We employ high-frequency data on the Standard and Poor s depository receipts (SPIDERS) to construct the index s intra-daily volatility estimates. SPIDERS are shares in a trust which owns stocks in the same proportion as that found in the S&P 5 index. SPIDERS trade like a stock (with the ticker symbol SPY on the Amex) at approximately one-tenth of the level of the S&P 5 index. They are widely used by institutions and traders as bets on the overall direction of the market or as a means of passive management (see, e.g., Elton, Gruber, Comer, Li, 22). Our sample extends from January 2, 1998 to March 31, 26. We use SPIDERS mid-quotes on the NYSE. We remove quotes whose associated price changes and/or spreads are larger than 1%. Table 1 contains descriptive statistics. In our sample, the average duration between quote updates is 11:53 seconds. The average spread and the average price level are :15 and 117:27, respectively. The second moment of the noise E(" 2 ), V, and Q are necessary inputs in our sampling frequency and bandwidth selection procedures. We estimate E(" 2 ) using quote-to-quote continuouslycompounded returns. The variance and quarticity estimates are obtained by using b V (realized variance) and b Q (realized quarticity) with xed, 15-minute, calendar-time intervals. The prevailing quote method is used in the absence of a quote. The out-of-sample forecasts are derived by virtue of an ARFIMA(1; 1) model. This simple speci cation provides a reasonable and parsimonious model for all series. The standard errors used to evaluate the statistical signi cance of the pro ts are Newey-West standard errors. The correlation structures of the various pro ts is mild. We use 6 autocovariances for all pro t series based on their inspection. The results are insensitive to the use of an alternative, reasonable number of autocovariances. The average optimal sampling interval of the realized variance estimator is 5:7 minutes. The average optimal (in a nite sample) number of autocovariances of V b ZMA (and V b Bar ), V b BNHLS(Bar) bv BNHLS(Cub) BNHLS(T H),, and V b is 16; 7; 6; and 1, respectively. The average asymptotically optimal number of autocovariances of V b ZMA ZMA, V b BNHLS(Bar) BNHLS, V b BNHLS(Cub) BNHLS, and V b BNHLS(T H) BNHLS is 5; 15

5; 6; and 8, respectively. These gures are consistent with values obtained by Bandi and Russell (25, 26a) using alternative assets and sample periods. Finite sample criteria lead to a number of autocovariances that is larger on average (but less volatile) than asymptotic criteria. The di erence is particularly evident for the two-scale estimator. The Bartlett kernel estimator and the two-scale estimator have very favorable asymptotic properties but, as discussed in Bandi and Russell (25), display a potentially large nite sample bias in general. The optimal (in a nite sample) choice of the number of autocovariances/subsamples is a ected by this bias component. Speci cally, e ective bias reduction in a nite sample requires the selection of a fairly large number of autocovariances/subsamples. The optimal (average) asymptotic choice (5) is likely to leave a substantial bias component in the resulting estimates. The class of at-top symmetric kernels is, under conditions discussed in Barndor -Nielsen, Hansen, Lunde, and Shephard (26), unbiased in a nite sample. Not surprisingly, the required optimal (in a nite sample) number of autocovariances is smaller than in the two-scale case (7; 6; and 1 versus 16). Also, again not surprisingly, the di erence between nite sample and asymptotic choices is smaller in this case. If the conditions under which Bandi and Russell (25), Zhang, Mykland and Aït-Sahalia (25) and Barndor -Nielsen, Hansen, Lunde and Shephard (25) prove the nite sample and limiting properties of these estimators are satis ed at least as an approximation (independence between p and and MA(1) " s), then (1) the class of at-top symmetric kernels should have no systematic biases, (2) the optimized (using nite sample methods) at-top symmetric kernels should have a smaller variance than their asymptotically optimal counterparts, (3) the optimized (using nite sample methods) two-scale estimator should have a smaller bias component than the asymptotically optimal two-scale estimator. The time variation in the optimal sampling intervals of the realized variance estimator and in the optimal (asymptotic and nite sample) number of autocovariances of the kernel estimators is substantial. Table 1 reports descriptive statistics. Figs. 1 and 2 contain the corresponding daily time series. Since the optimal number of autocovariances is positively related to the number of trades in each day, in all cases we experience a clear upward trend re ecting the corresponding upward trend in the number of trades. Similarly, the downward trend in the optimal sampling intervals of the realized variance estimator are largely due to decreases in the noise moments relative to the quarticity of the underlying. Fig. 3 (levels) and Fig. 4 (di erences) compare nite sample and asymptotic choices of the number of autocovariances. In the case of the two-scale estimator, the optimal (in a nite sample) choice is almost always higher than the asymptotically optimal choice. The only exceptions are for values at the end of the sample. As pointed out earlier, in light of the absence of a large nite 16

sample bias component, asymptotic and nite sample choices are expected to be more similar in the case of the at-top symmetric kernels. Figs. 3 and 4 con rm this intuition. In this case, large deviations are generally associated with a lower, asymptotically optimal number of autocovariances. Fig. 5 contains time-series plots of b V 5 min, b V 15 min, and b V Opt. Consistent with theory, b V 5 min and V b 15 min have slightly more erratic behavior then V b Opt. Fig. 6 presents plots of V b Bar, V b ZMA ZMA, and V b ZMA. Again, in agreement with theory, V b Bar and V b ZMA (the two-scale estimator with an optimal, in a nite sample, number of subsamples) are virtually identical. V b ZMA ZMA, the two-scale estimator with an asymptotically optimal number of subsamples, is less volatile and appears to be downward biased. Fig. 7 and Fig. 8 report the three at-top kernel estimates for asymptotic and nite sample choices of the number of autocovariances. The dynamic evolution of the intra-daily volatility estimates is similar in this case. In Table 2 we report the average variance forecasts, the standard deviations of the forecasts, and the average forecast errors (de ned here as the di erence between the variance forecasts and the squared 6-hour returns) as a percentage of the sample variance of the 6-hour returns. The realized variance estimates tend to lead to higher forecasts than the at-top symmetric kernel estimates. Similarly, the asymptotically optimal two-scale estimator leads to lower forecasts. This is indicative of a likely upward bias in the realized variance estimates and, again, a likely downward bias in the asymptotically optimal two-scale estimator. We now turn to option pricing and the pro ts from trading in an hypothetical option market. 5 Ranking volatility forecasts Table 3 reports the average option prices (in cents), the average pro ts from holding a straddle (in cents), as well as the standard deviations of the pro ts. The relevant horizon is 6 hours. In other words, the intra-daily variance estimates are not adjusted for lack of overnights. The pro t from holding a straddle is R S&P 5 t 2P t, where P t is the call/put price forecast induced by the corresponding Method. Here we are assuming that each agent goes long a straddle and pays a price for it given by the agent s volatility forecast. If the forecast is accurate, the corresponding pro t should be small in absolute value. For the longest data lengths, the smallest pro ts are guaranteed by the at-top Bartlett kernel estimator b V BNHLS(Bar) in its two versions (asymptotically optimal and nite sample optimal). The largest pro ts are given by 5-minute realized variance, b V 5 min. The magnitudes of the pro ts associated with Max, Min, and Mean suggest lack of systematic (across estimators) downward or upward biases. However, the upward-biased realized variance-based forecasts translate into upward-biased price estimates (and large losses), 17

whereas the downward-biased forecasts induced by the asymptotically optimal two-scale estimator lead to downward-biased price estimates (and large positive pro ts). The standard deviations of the pro ts do not vary much across alternative Methods. We now turn to trading, as described in Section 3, and consider rst the case of options with a 6-hour expiration time. We rank the Methods based on average pro ts (Table 4) as well as Sharpe ratios (Table 5). The Sharpe ratios are simply de ned as the average pro ts divided by the standard deviations of the pro ts. In what follows, the symbols V 1; ; V 1; 5, and V 1; 5M, de ne the generic Method V with ARMA parameters estimated using the three sample lengths described in Section 2, i.e., 1, observations, 1,5 observations, and the entire sample of data with at least 1,5 observations, respectively. When ranked based on average pro ts, 5 Methods perform better than Mean. The crosssectional dispersion of the pro ts is substantial, thereby indicating economically signi cant di erences between alternative variance forecasts. We con rm that, even though there are no systematic biases in the forecasts (Max and Min give pro ts equal to 17:4 and 4:4 with ranks equal to 39 and 34; respectively), upward biases are associated with the realized variance measures ( V b Opt, bv 5 min, and V b 15 min ), while downward biases are associated with V b ZMA ZMA. The windows on the right of Table 4 break down the Methods based on their data lengths. In all cases the largest pro ts are given by the optimized (in a nite sample) at-top symmetric kernels. The optimized (in a nite sample) two-scale estimator performs well and, sometimes, virtually as well as the optimized (in a nite sample) at-top symmetric kernels in our sample. The performance of the asymptoticallyoptimal at-top symmetric kernels is quite satisfactory, mainly in the Bartlett and cubic cases. For the longest data lenghts, b V Opt performs better than b V 5 min and b V 15 min. Di erently from the pricing exercise in Table 3, the standard deviation of the pro ts is quite variable across di erent Methods. The smallest standard deviation is associated with Mean, thus suggesting some degree of independence between alternative forecasts. Not surprisingly, ranking based on Sharpe ratios favors the Mean (see Table 5). The subsequent four Methods in the ranking are V b BNHLS(Cub) 1; 5, V b BNHLS(Bar) 1; 5, V b ZMA 1; 5, and V b BNHLS(Cub) BNHLS 1; 5: Importantly, when we break down the Methods on the basis of data length, we largely nd the same ranking as earlier. The largest Sharpe ratios are generally associated with the optimized (in a nite sample) at-top symmetric kernels. The optimized (in a nite sample) two-scale estimator performs well. Again, b V Opt outperforms b V 5 min and b V 15 min for the longest data lengths. We now deal with the overnights explicitly and price 1-day options (Table 6 and Table 7). First, we multiply each variance estimate by in Eq. (8). The adjustment guarantees that the average of the volatility estimates coincides with average of the S&P 5 squared daily returns. 18

Naturally, this adjustment has the potential to partly reduce the economic signi cance of alternative high-frequency volatility estimates/forecasts. Despite the adjustment, the overall picture remains fairly unchanged. Mean is now the fth best forecast. The optimized (in a nite sample) at-top symmetric kernels largely dominate all other forecasts for every choice of sample length. The asymptotically-optimal at-top symmetric kernels, the optimized (in a nite sample) Bartlett kernel, and the optimized (in a nite sample) two-scale estimator continue to fare well. Similarly, V b Opt continues to outperform V b 5 min and V b 15 min. Interestingly, the asymptotically-optimal two-scale estimator is now outperformed by all realized variance estimators. This is consistent with ndings in Bandi and Russell (25), where the same adjustment for lack of overnights is employed. Examining the Sharpe ratios, rather than the average pro ts, does not modify our results. The picture does not change when explicitly using the overnight returns, rather than the adjustment, to obtain daily variance forecasts for the purpose of pricing daily options (Table 8 and Table 9). If anything, notwithstanding the obvious contamination that this correction entails, the resulting ranking appears even cleaner, mainly in longest data length case. Barring a couple of exceptions, the ranking is: (1) optimized (in a nite sample) at-top symmetric kernels, (2) asymptotically-optimal at-top symmetric kernels, (3) optimized (in a nite sample) Bartlett kernel and two-scale estimator, (3) asymptotically-optimal two-scale estimator and optimally-sampled realized variance, and (4) realized variance based on ad-hoc intervals. Within the class of at-top symmetric kernels, the Bartlett and cubic kernel are the preferred kernels in all of our experiments. The modi ed Tukey-Hanning kernel is virtually always inferior to the alternative kernels, despite its favorable theoretical properties. Importantly, the di erences between alternative average pro ts/sharpe ratios can be quite large. This result is consistent across experiments and indicates that, from an economic standpoint, there is scope for employing nite sample adjustments even in situations where the performance of the asymptotically-optimal estimators is, in terms of ranking, similar to the performance of the optimal (in a nite sample) estimators. Not only are the pro ts economically signi cant, they are also statistically signi cant. Table 4, 6, and 8 contain the corresponding results. Joint Chi-squared tests of the null of zero pro ts reject the null overwhelmingly in all cases (i.e., when considering 6-hour options as well as when considering 1-day options with di erent adjustments for the overnights). The 5% critical value of a Chi-squared test with 39 degrees of freedom, as in our case, is about 55. The tests values are 37, 94, and 2779, respectively. Hence, at least one strategy earns pro ts statistically signi cantly di erent from zero. Pairwise t-tests of the null of equal pro ts between optimal (in a nite sample) kernel estimators and asymptotically-optimal kernel estimators favor the former. The pro ts are 19

generally statistically di erent and, importantly, larger in the case of the optimal (in a nite sample) kernel estimators. Interestingly, even in situations where a certain asymptotically-optimal kernel estimator performs as well as (or better) than an alternative optimal (in a nite sample) kernel estimator, the corresponding optimal (in a nite sample) version of that estimator delivers pro ts that are, in general, statistically higher. Hence, it seems valuable to optimize the performance of estimators that would perform satisfactorily even when implemented using asymptotic bandwidth selection criteria. We nd it important that our reported ranking almost perfectly mirrors that found by Bandi and Russell (25) using nite sample MSE expansions as the relevant metric. Bandi and Russell (25) emphasize that evaluating volatility estimators solely based on their limiting properties is, in general, not the right criterion. Certain "near consistent" and consistent estimators, such as the Bartlett kernel estimator and the two-scale estimator, have favorable asymptotic properties but can display large nite sample biases. Equivalently, estimators with the same asymptotic properties, such as the two-scale estimator and the unbiased at-top Bartlett kernel estimator, can have drastically di erent nite sample features, largely due to bias considerations. The nite sample bias needs to be reduced for certain estimators to perform at their full potential. Finite sample bandwidth selection methods, such as the one proposed by Bandi and Russell (25), can yield bias reduction while optimally trading-o bias and variance. Because consistent estimators are asymptotically unbiased, asymptotic bandwidth selection methods can be very detrimental in practise. In general, these methods do not capture the nite sample bias of these estimators. In addition, we show that, even in the case of estimators with no obvious biases, there is scope for optimizing their nite sample variance properties by appropriately selecting the corresponding number of autocovariances. If we were to consider reduced-form linear forecasting models as in Andersen, Bollerslev, and Meddahi (26), Bandi and Russell (26a), and Ghysels and Sinko (26a, 26b), among others, we would expect the nite sample bias of alternative Methods for the regressor, when largely constant across days, to play no role. If the regressand is roughly unbiased, then using regressors with smaller variance would lead to (roughly) unbiased forecasts and forecasting gains from an R-squared perspective (Andersen, Bollerslev, and Meddahi, 26, and Ghysels and Sinko, 26b). Consistent with this view, Ghysels and Sinko (26a) nd that the bias-correction suggested by Hansen and Lunde (26) does not lead, when applied to the regressor, to substantial forecasting gains in their regression framework. This said, an unbiased regressand is necessary in a linear regression context for the forecast to be unbiased, i.e., for the level of volatility to be correct. While emphasis is placed on variance considerations in Andersen, Bollerslev, and Meddahi (26) 2

and Ghysels and Sinko (26b), important bias considerations lead to their choice of unbiased regressands for the purpose of volatility prediction. It is of interest for future work to analyze how the e ective separation between bias issues (leading to the level of the volatility forecast) and variance issues (leading to the accuracy of the forecast) in a linear regression forecasting model would fare in the context of our option pricing metric. Here we employ classical time-series models to predict out-of-sample. Bias and variance considerations are dealt with by directly evaluating the nite sample MSEs of the individual estimators. We nd it important, from an applied standpoint, that explicit optimization of the nite sample MSE properties of these estimators, as suggested by Bandi and Russell (23, 25), leads to substantial economic gains and a somewhat expected ("ex-ante") ranking between alternative forecasts. Optimizing the adopted economic criterion directly could be bene cial, and is of interest for future work, but would lead to theoretical and empirical complications which practitioners, at least, would likely prefer to avoid. 6 Conclusions The relative success of alternative high-frequency variance measures depends on their out-of-sample forecasting performance. Yet, a variety of loss functions can be proposed in practise. Either statistical or economic metrics could be entertained. Within each broad category, several alternative criteria can be invoked. It is legitimate to argue that well-posed economic loss functions represent crucial measures of the success of di erent forecasts. We take this position in this paper. Speci cally, we attempt to provide a rich comparison between recently-proposed high-frequency measures of variance in the context of an important, nonlinear economic criterion. In the context of our proposed metric, we do not nd a drastic di erence between the "exante" ranking that one would derive from careful evaluation (and optimization) of the nite sample MSE properties of various estimators and the "ex-post" ranking of the estimators out-of-sample forecasts. Future work will further study this issue in the context of alternative metrics. The pricing and trading of longer-maturity options, and the resulting ranking of long-term volatility estimates/forecasts, is also left for future research. 21

References [1] Aït-Sahalia, Y., and L. Mancini (26). Out-of-sample forecasts of quadratic variation. Working paper. [2] Andersen, T.G., T. Bollerslev, and N. Meddahi (26). Realized volatility forecasting and market microstructure noise. Working paper. [3] Andersen, T.G., T. Bollerslev, F.X. Diebold, and H. Ebens (21). The distribution of realized stock return volatility. Journal of Financial Economics 61, 43-76. [4] Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys (1999). (Understanding, optimizing, using, and forecasting) Realized volatility and correlation. Working paper. [5] Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys (2). Great Realizations. Risk Magazine 13, 15-18. [6] Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys (23). Modeling and forecasting realized volatility. Econometrica 71, 579-625. [7] Bandi, F.M., and J.R. Russell (23). Microstructure noise, realized volatility, and optimal sampling. Working paper. [8] Bandi, F.M., and J.R. Russell (25). Market microstructure noise, integrated variance estimators, and the accuracy of asymptotic approximations. Working paper. [9] Bandi, F.M., and J.R. Russell (26a). Separating microstructure noise from volatility. Journal of Financial Economics 79, 655-692. [1] Bandi, F.M., and J.R. Russell (26b). Volatility. In J. Birge and V. Linetski (Eds.) Handbook of nancial engineering. Elsevier, North Holland, forthcoming. [11] Bandi, F.M., and J.R. Russell (26c). Microstructure noise, realized variance, and optimal sampling. Working paper. [12] Bandi, F.M., J.R. Russell, and Yinghua Zhu (26). Using high-frequency data in dynamic portfolio choice. Econometric Reviews, forthcoming. [13] Barndor -Nielsen, O.E., and N. Shephard (22). Econometric analysis of realized volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society, Series B 64, 253-28. 22

[14] Barndor -Nielsen, O.E., and N. Shephard (26). Variation, jumps, market frictions and highfrequency data in nancial econometrics. In R. Blundell, P. Torsten and W. K. Newey (Eds.) Advances in Economics and Econometrics. Theory and Applications. Ninth World Congress. Econometric Society Monographs, Cambridge University Press, forthcoming. [15] Barndor -Nielsen, O.E., P. Hansen, A. Lunde and N. Shephard (25). Regular and modi ed kernel-based estimators of integrated variance: the case with independent noise. Working paper. [16] Barndor -Nielsen, O.E., P. Hansen, A. Lunde and N. Shephard (26). Designing realized kernels to measure ex-post variation of equity prices in the presence of noise. Working paper. [17] Corradi, V., W. Distaso, and N. Swanson (26). Predictive inference for integrated volatility. Working paper. [18] Corsi (23). A simple long-memory model of realized volatility. Working paper. [19] De Pooter, M., M. Martens, and D. Van Dijk (26). Predicting the daily covariance matrix for S&P1 stocks using intraday data: but which frequency to use? Econometric Reviews, forthcoming. [2] Elton, E.J., M. Gruber, G. Comer, K. Li (22). Spiders: Where are the bugs? Journal of Business 75, 453-472 [21] Engle, R.F., C.H. Hong, and A. Kane (199). Valuation of variance forecasts with simulated option markets. NBER Working paper No: 335. [22] Fleming, J., C. Kirby, and B. Ostdiek (21). The economic value of volatility timing. Journal of Finance 56, 329-352. [23] Fleming, J., C. Kirby, and B. Ostdiek (23). The economic value of volatility timing using realized volatility. Journal of Financial Economics 67, 473-59. [24] Geweke, J. and S. Porter-Hudak (1983). The estimation and application of long memory time series models. Journal of Time Series Analysis 4, 221-238. [25] Ghysels, E. and A. Sinko (26a). Comment on Hansen and Lunde. Journal of Business and Economic Statistics 24, 192-194. [26] Ghysels, E. and A. Sinko (26b). Volatility forecasting and microstructure noise. Working paper. 23

[27] Ghysels, E., P. Santa-Clara, and R. Valkanov (26). Predicting volatility: getting the most out of return data sampled at di erent frequencies. Journal of Econometrics, forthcoming. [28] Hansen, P.R., and A. Lunde (26). Realized variance and market microstructure noise (with discussions). Journal of Business and Economic Statistics 24, 127-161. [29] Kalnina, I. and O. Linton (26). Estimating quadratic variation consistently in the presence of correlated measurement error. Working paper. [3] Large (26). Estimating quadratic variation when quoted prices change by constant increments. Working paper. [31] McAleer, M., and M. Medeiros (26). Realized volatility: A review. Working paper. [32] Oomen, R.C.A. (25). Properties of bias-corrected realized variance under alternative sampling schemes. Journal of Financial Econometrics 3, 555 577. [33] Protter, P. (1995). Stochastic integration and di erential equations. A new approach. Springer. [34] West, K.D., H.J. Edison, and D. Cho (1993). A utility-based comparison of some models of exchange rate volatility. Journal of International Economics 35, 23-45. [35] Zhang, L., P. Mykland, and Y. Aït-Sahalia (25). A tale of two time scales: determining integrated volatility with noisy high-frequency data. Journal of the American Statistical Association 1, 1394-1411. [36] Zhang, L. (26a). E cient estimation of stochastic volatility using noisy observations: a multi-scale approach. Bernoulli, forthcoming. [37] Zhou, B. (1996). High-frequency data and volatility in foreign-exchange rates. Journal of Business and Economic Statistics 14, 45-52. 24

Appendix 2 q-v ZMA ZMA 6 q-v ZMA 15 4 1 5 2 5 1 15 2 25 5 1 15 2 25 8 6 4 2 Opt. Sampling Interval 2.5 x 14 m 2 1.5 1.5 5 1 15 2 25 5 1 15 2 25 Figure 1. The asymptotic and finite sample optimal number of autocovariances of the two-scale estimator, the optimal number of observations of the realized variance estimator, and the number of daily trades. 25

2 q-v BNHLS(Bar) BNHLS 3 q-v BNHLS(Bar) 1 2 1 5 1 15 2 25 1 5 5 1 15 2 25 2 1 q-v BNHLS(Cub) BNHLS q-v BNHLS(TH) BNHLS 5 1 15 2 25 5 1 15 2 25 15 1 5 q-v BNHLS(Cub) 5 1 15 2 25 3 2 1 q-v BNHLS(TH) 5 1 15 2 25 Figure 2. The asymptotic and finite sample optimal number of autocovariances of the flat-top symmetric kernel estimators (Bartlett, cubic, and modified Tukey-Hanning). 26

2 15 1 q-v ZMA ZMA q-v ZMA 25 2 15 1 q-v BNHLS(Bar) BNHLS q-v BNHLS(Bar) 5 5 5 1 15 2 25 5 1 15 2 25 15 1 q-v BNHLS(Cub) BNHLS q-v BNHLS(Cub) 25 2 15 q-v BNHLS(TH) BNHLS q-v BNHLS(TH) 5 1 5 5 1 15 2 25 5 1 15 2 25 Figure 3. The asymptotic and finite sample optimal number of autocovariances of the two scale estimator and of the flat-top symmetric kernel estimators (Bartlett, cubic, and modified Tukey-Hanning). 27

5 qv ZMA -qvzma ZMA 3 qv BNHLS(Bar) -qv BNHLS(Bar) BNHLS 2-5 -1 1-15 -2 5 1 15 2 25-1 5 1 15 2 25 3 qv BNHLS(Cub) -qv BNHLS(Cub) BNHLS 3 qv BNHLS(Tukey) -qv BNHLS(Tukey) BNHLS 2 2 1 1-1 5 1 15 2 25-1 5 1 15 2 25 Figure 4. The difference between the asymptotic and finite sample optimal number of autocovariances of the two-scale estimator and of the flat-top symmetric kernel estimators (Bartlett, cubic, and modified Tukey-Hanning). 28

2 x 1-3 V Opt 1 5 1 15 2 25 2 x 1-3 V 5min 1 5 1 15 2 25 2 x 1-3 V 15min 1 5 1 15 2 25 Figure 5. The realized variance estimates (optimally-sampled, 5-minute, and 15-minute). 29

2 x 1-3 V Bar 1 5 1 15 2 25 ZMA 2 x V 1-3 ZMA 1 5 1 15 2 25 ZMA 2 x V 1-3 1 5 1 15 2 25 Figure 6. Variance estimates obtained by virtue of the Bartlett kernel estimator and the two-scale estimator (asymptotically optimal and optimal in a finite sample). 3

BNHLS(Bar) 2 x V 1-3 BNHLS 1 5 1 15 2 25 BNHLS(Cub) 2 x V 1-3 BNHLS 1 5 1 15 2 25 BNHLS(TH) 2 x V 1-3 BNHLS 1 5 1 15 2 25 Figure 7. Variance estimates obtained by virtue of the flat-top symmetric kernel estimators (Bartlett, cubic, and modified Tukey-Hanning) with an asymptotically optimal number of autocovariances. 31

BNHLS(Bar) 2 x V 1-3 1 5 1 15 2 25 BNHLS(Cub) 2 x V 1-3 1 5 1 15 2 25 BNHLS(TH) 2 x V 1-3 1 5 1 15 2 25 Figure 8. Variance estimates obtained by virtue of the flat-top symmetric kernel estimators (Bartlett, cubic, and modified Tukey-Hanning) with an optimal (in finite sample) number of autocovariances. 32

Table 1: Summary Statistics / 6 hours Avg. Dur Avg.Sprd Avg Price M Da(min) 11.53.15 117.27 3648 5.7235 Asy. q Bart ZMA FlatBart FlatCubic FlatTukey Mean 5 5 6 8 Max 185 28 129 22 Min 1 1 1 1 Std 1.2281 1.9988 6.9931 1.8549 Finite q Bart ZMA FlatBart FlatCubic FlatTukey Mean 16 16 7 6 1 Max 57 57 185 99 189 Min 3 3 1 1 1 Std 6.4522 6.4522 1.1848 6.5324 9.1976 We report summary statistics about the data (mid-quotes on SPIDERS between January 2, 1998 and March 31, 26), i.e., average duration between mid-quote updates, average prices, and average number of daily observations. Da(min) denotes the average optimal sampling interval (in minutes) of the realized variance estimator. The remaining statistics summarize the distribution of the optimal (finite sample and asymptotic) number of autocovariances of the Bartlett kernel estimator, of the two-scale estimator, and of the flat-top kernel estimators (Bartlett, cubic, and modified Tukey-Hanning). 33

The d estimates (Table 1 continued) d Vopt.4964 VBart.4268 VFinZMA.429 VFlatBart.4799 VFlatCubic.4335 VFlatTukey.4474 VZMA.4891 V5min.2927 V15min.3613 VFinFlatBart.4618 VFinFlatCubic.4619 VFinFlatTukey.4528 We report estimates of the fractional d parameter obtained by virtue of the GPH estimator (Geweke and Porter-Hudak, 1983). 34

Table 2: Average Forecasts, Stds forecasts, and Forecast Errors Avg Vol % Std Vol % MSE/AV* SD OptV1 9.4719 1.5145.1186 1.597 OptV15 9.6889 1.3334.1628 1.5918 OptV15M 9.7547 1.336.1785 1.5935 BartV1 8.8737 1.6383 -.1 1.5935 BartV15 9.2153 1.3381.542 1.5851 BartV15M 9.411 1.2352.953 1.5849 FinZMAV1 8.8489 1.6444 -.151 1.5933 FinZMAV15 9.283 1.3234.522 1.5847 FinZMAV15M 9.3747 1.2413.872 1.5855 FlatBart1 8.3975 1.3465 -.124 1.5861 FlatBart15 8.727 1.898 -.68 1.583 FlatBart15M 8.8236 1.19 -.384 1.5849 FlatCubic1 8.8232 1.3887 -.3 1.5838 FlatCubic15 9.1745 1.939.379 1.581 FlatCubic15M 9.2681 1.867.587 1.5815 FlatTukey1 8.7674 1.4222 -.48 1.594 FlatTukey15 9.127 1.1369.231 1.5847 FlatTukey15M 9.26 1.1412.462 1.5865 ZMAV1 7.5928 1.361 -.278 1.5944 ZMAV15 7.6441 1.2375 -.277 1.5923 ZMAV15M 7.736 1.2416 -.2595 1.5935 V5min1 1.263 1.7488.2592 1.696 V5min15 1.895 1.626.4565 1.5842 V5min15M 11.1393.9844.52 1.5858 V15min1 9.5219 1.3767.1253 1.5853 V15min15 9.991 1.749.2273 1.582 V15min15M 1.865 1.74.257 1.584 FinFlatBart1 8.5915 1.3367 -.87 1.5777 FinFlatBart15 8.925 1.762 -.174 1.5766 FinFlatBart15M 9.299 1.875.58 1.5781 FinFlatCubic1 8.7978 1.3211 -.377 1.5759 FinFlatCubic15 9.194 1.821.232 1.575 FinFlatCubic15M 9.254 1.932.448 1.5763 FinFlatTukey1 8.7762 1.387 -.43 1.5844 FinFlatTukey15 9.1123 1.167.245 1.588 FinFlatTukey15M 9.262 1.1163.456 1.5825 Max 11.2371 1.1482.558 1.5944 Min 7.4888 1.219 -.3 1.5896 Mean 9.1982 1.1374.444 1.5799 *: The ratio of the average forecast error (i.e., the variance forecast minus the 6-h. squared return) divided by the variance of the 6-h. returns. 35

Table 3: Hypothetical Option Prices and Average Profits from Straddles Avg. Option Avg. Prof. Profits Rank Avg. Abs Price (cents) Straddle HAC SD Breakdown Profits OptV1.2384 -.3721.1557 Forecast Length 1 OptV15.24349 -.4812.1562 FlatTukey1.18 OptV15M.24514 -.5143.1563 FinFlatTukey1.22 BartV1.2231 -.715.1571 FinFlatCubic1.33 BartV15.23159 -.2432.1576 FlatCubic1.46 BartV15M.23651 -.3416.1582 FinZMAV1.59 FinZMAV1.22238 -.59.1573 FinFlatBart1.7 FinZMAV15.23141 -.2397.1577 BartV1.71 FinZMAV15M.23559 -.3233.1585 FlatBart1.168 FlatBart1.2114.1679.1563 OptV1.372 FlatBart15.21916.54.1583 V15min1.397 FlatBart15M.22175 -.463.1586 ZMAV1.572 FlatCubic1.22174 -.461.154 V5min1.651 FlatCubic15.2356 -.2227.1561 Forecast Length 15 FlatCubic15M.23292 -.2697.1564 FlatBart15.5 FlatTukey1.2233 -.18.1553 FinFlatBart15.97 FlatTukey15.22876 -.1866.1575 FlatTukey15.187 FlatTukey15M.23136 -.2385.1577 FinFlatCubic15.19 ZMAV1.1981.5723.1623 FinFlatTukey15.191 ZMAV15.1921.5466.1626 FlatCubic15.223 ZMAV15M.1936.5166.1629 FinZMAV15.24 V5min1.25197 -.658.1556 BartV15.243 V5min15.2738 -.1874.1554 OptV15.481 V5min15M.27994 -.1212.1561 ZMAV15.547 V15min1.23929 -.3973.1534 V15min15.633 V15min15.2516 -.6326.1545 V5min15.187 V15min15M.25348 -.681.1545 Forecast Length >15 FinFlatBart1.21591.74.1545 FlatBart15M.46 FinFlatBart15.22429 -.973.1562 FinFlatBart15M.15 FinFlatBart15M.22693 -.15.1564 FinFlatCubic15M.238 FinFlatCubic1.2211 -.333.1533 FlatTukey15M.238 FinFlatCubic15.22893 -.19.1551 FinFlatTukey15M.239 FinFlatCubic15M.23134 -.2382.1552 FlatCubic15M.27 FinFlatTukey1.2255 -.225.1546 FinZMAV15M.323 FinFlatTukey15.229 -.1914.1562 BartV15M.342 FinFlatTukey15M.23136 -.2386.1564 OptV15M.514 Max.2824 -.12594.1557 ZMAV15M.517 Min.1882.6246.168 V15min15M.681 Mean.23116 -.2346.1547 V5min15M.121 36

Scenario I: Using intra-daily returns over 6 hours Table 4: Rank by Annualized Daily Profits Avg. Prof HAC SD Rank Avg. Prof OptV1 -.1381 2.3779 26 Forecast Length 1 OptV15 -.4786 2.3335 28 FinFlatCubic1 4.4551 OptV15M -2.1213 2.5254 3 FinFlatBart1 3.875 BartV1 2.3616 2.2452 16 FlatCubic1 3.392 BartV15 3.6653 1.576 8 FinFlatTukey1 2.9724 BartV15M 1.2814 1.774 23 FinZMAV1 2.596 FinZMAV1 2.596 2.2995 15 BartV1 2.3616 FinZMAV15 4.542 1.487 3 FlatTukey1 1.551 FinZMAV15M 1.9694 1.6157 18 FlatBart1 1.4815 FlatBart1 1.4815 2.7779 22 V15min1.757 FlatBart15 3.6338 2.1475 9 OptV1 -.1381 FlatBart15M 2.2775 1.9732 17 ZMAV1-3.978 FlatCubic1 3.392 1.7159 13 V5min1-5.17 FlatCubic15 3.531 1.1549 11 Forecast Length 15 FlatCubic15M 1.224 1.5894 24 FinFlatBart15 6.1644 FlatTukey1 1.551 2.1374 21 FinFlatCubic15 5.527 FlatTukey15 1.7559 1.748 19 FinZMAV15 4.542 FlatTukey15M -.1575 1.8728 27 BartV15 3.6653 ZMAV1-3.978 3.5846 33 FlatBart15 3.6338 ZMAV15-2.426 3.3586 31 FlatCubic15 3.531 ZMAV15M -1.766 3.1616 29 FinFlatTukey15 3.1981 V5min1-5.17 2.755 35 FlatTukey15 1.7559 V5min15-12.444 3.4121 37 OptV15 -.4786 V5min15M -16.158 3.6948 38 ZMAV15-2.426 V15min1.757 2.375 25 V15min15-3.499 V15min15-3.499 2.6112 32 V5min15-12.444 V15min15M -5.1714 2.6676 36 Forecast Length >15 FinFlatBart1 3.875 2.578 7 FinFlatBart15M 4.2323 FinFlatBart15 6.1644 1.642 1 FinFlatCubic15M 3.5692 FinFlatBart15M 4.2323 1.58 5 FlatBart15M 2.2775 FinFlatCubic1 4.4551 1.8289 4 FinZMAV15M 1.9694 FinFlatCubic15 5.527 1.128 2 FinFlatTukey15M 1.653 FinFlatCubic15M 3.5692 1.375 1 BartV15M 1.2814 FinFlatTukey1 2.9724 1.9449 14 FlatCubic15M 1.224 FinFlatTukey15 3.1981 1.279 12 FlatTukey15M -.1575 FinFlatTukey15M 1.653 1.3959 2 ZMAV15M -1.766 Max -17.4132 3.81 39 OptV15M -2.1213 Min -4.4398 3.8455 34 V15min15M -5.1714 Mean 3.9549.717 6 V5min15M -16.158 37

Table 5: Rank by Sharp Ratio (Avg. Profits / HAC SD) Avg. Prof./HAC SD Rank Avg. Prof./HAC SD OptV1 -.581 26 Forecast Length 1 OptV15 -.251 28 FinFlatCubic1 2.436 OptV15M -.84 31 FlatCubic1 1.7711 BartV1 1.518 19 FinFlatBart1 1.5452 BartV15 2.4311 1 FinFlatTukey1 1.5283 BartV15M.7238 23 FinZMAV1 1.1289 FinZMAV1 1.1289 18 BartV1 1.518 FinZMAV15 3.2242 4 FlatTukey1.7256 FinZMAV15M 1.219 15 FlatBart1.5333 FlatBart1.5333 24 V15min1.3167 FlatBart15 1.6921 12 OptV1 -.581 FlatBart15M 1.1542 17 ZMAV1-1.198 FlatCubic1 1.7711 11 V5min1-1.8796 FlatCubic15 3.333 5 Forecast Length 15 FlatCubic15M.7565 21 FinFlatCubic15 4.8999 FlatTukey1.7256 22 FinFlatBart15 3.7543 FlatTukey15 1.299 2 FinZMAV15 3.2242 FlatTukey15M -.841 27 FlatCubic15 3.333 ZMAV1-1.198 32 FinFlatTukey15 2.5165 ZMAV15 -.7223 3 BartV15 2.4311 ZMAV15M -.5398 29 FlatBart15 1.6921 V5min1-1.8796 35 FlatTukey15 1.299 V5min15-3.6354 37 OptV15 -.251 V5min15M -4.3732 38 ZMAV15 -.7223 V15min1.3167 25 V15min15-1.359 V15min15-1.359 34 V5min15-3.6354 V15min15M -1.9386 36 Forecast Length >15 FinFlatBart1 1.5452 13 FinFlatBart15M 2.865 FinFlatBart15 3.7543 3 FinFlatCubic15M 2.643 FinFlatBart15M 2.865 6 FinZMAV15M 1.219 FinFlatCubic1 2.436 9 FinFlatTukey15M 1.1822 FinFlatCubic15 4.8999 2 FlatBart15M 1.1542 FinFlatCubic15M 2.643 7 FlatCubic15M.7565 FinFlatTukey1 1.5283 14 BartV15M.7238 FinFlatTukey15 2.5165 8 FlatTukey15M -.841 FinFlatTukey15M 1.1822 16 ZMAV15M -.5398 Max -4.5813 39 OptV15M -.84 Min -1.1546 33 V15min15M -1.9386 Mean 5.5157 1 V5min15M -4.3732 38

Tests (Table 4 continued) Joint test all b's (profits) are equal to zero: b'*inv(cov(b))*b=37.1 Pair-wise tests between finite sample profits and asymptotic profits: 1 15 >15 t(finbart-asybart) 1.79 2.11 1.53 t(fincubic-asycubic) 1.24 2.61 2.68 t(fintukey-asytukey) 2.56 2.5 2.59 t(finzma-asyzma) 2.24 2.36 1. 39

Scenario II: 1-day options - first adjustment Table 6: Rank by Annualized Daily Profits Avg. Prof HAC SD Rank Avg. Prof OptV1.593 2.8414 25 Forecast Length 1 OptV15 1.291 2.3967 15 FinFlatBart1 4.4758 OptV15M.7381 2.371 2 FinFlatCubic1 4.4498 BartV1.27 2.5915 27 FinFlatTukey1 2.555 BartV15 1.494 2.928 16 FlatCubic1 2.433 BartV15M 1.33 1.789 17 FinZMAV1.138 FinZMAV1.138 2.5883 24 OptV1.593 FinZMAV15 1.3613 2.583 14 FlatTukey1.485 FinZMAV15M.894 1.868 19 BartV1.27 FlatBart1 -.8286 2.5392 3 V15min1 -.1273 FlatBart15-1.9875 2.454 33 FlatBart1 -.8286 FlatBart15M -2.516 2.5659 34 V5min1-1.8499 FlatCubic1 2.433 2.1738 8 ZMAV1-7.5583 FlatCubic15 3.5119 1.4192 4 Forecast Length 15 FlatCubic15M 2.2426 1.3836 1 FinFlatCubic15 3.896 FlatTukey1.485 2.2673 26 FlatCubic15 3.5119 FlatTukey15.4375 1.9977 21 FinFlatBart15 2.475 FlatTukey15M 1.28 2.1686 18 FinFlatTukey15 2.2416 ZMAV1-7.5583 3.2691 37 FinZMAV15 1.3613 ZMAV15-7.98 3.113 36 OptV15 1.291 ZMAV15M -7.5782 3.23 38 BartV15 1.494 V5min1-1.8499 3.214 32 FlatTukey15.4375 V5min15.164 2.7513 23 V15min15.1711 V5min15M -.8853 2.8187 31 V5min15.164 V15min1 -.1273 2.8919 29 FlatBart15-1.9875 V15min15.1711 2.7325 22 ZMAV15-7.98 V15min15M -.667 2.549 28 Forecast Length >15 FinFlatBart1 4.4758 1.9981 1 FinFlatCubic15M 3.493 FinFlatBart15 2.475 1.5253 9 FlatCubic15M 2.2426 FinFlatBart15M 1.987 1.8272 12 FinFlatBart15M 1.987 FinFlatCubic1 4.4498 2.795 2 FinFlatTukey15M 1.6849 FinFlatCubic15 3.896 1.4447 3 BartV15M 1.33 FinFlatCubic15M 3.493 1.5535 6 FlatTukey15M 1.28 FinFlatTukey1 2.555 2.1932 7 FinZMAV15M.894 FinFlatTukey15 2.2416 1.517 11 OptV15M.7381 FinFlatTukey15M 1.6849 1.6519 13 V15min15M -.667 Max -7.6944 3.8746 39 V5min15M -.8853 Min -6.2825 3.9159 35 FlatBart15M -2.516 Mean 3.1296.816 5 ZMAV15M -7.5782 4

Table 7: Rank by Sharp Ratio (Avg. Prof. / HAC SD) Avg. Prof./HAC SD Rank Avg. Prof./HAC SD OptV1.29 26 Forecast Length 1 OptV15.5383 16 FinFlatBart1 2.24 OptV15M.3114 2 FinFlatCubic1 2.1399 BartV1.1 27 FinFlatTukey1 1.165 BartV15.514 17 FlatCubic1 1.1192 BartV15M.58 15 FinZMAV1.41 FinZMAV1.41 24 FlatTukey1.214 FinZMAV15.6614 14 OptV1.29 FinZMAV15M.4333 19 BartV1.1 FlatBart1 -.3263 31 V15min1 -.44 FlatBart15 -.8262 33 FlatBart1 -.3263 FlatBart15M -.9749 34 V5min1 -.5756 FlatCubic1 1.1192 11 ZMAV1-2.3121 FlatCubic15 2.4745 3 Forecast Length 15 FlatCubic15M 1.629 7 FinFlatCubic15 2.6369 FlatTukey1.214 25 FlatCubic15 2.4745 FlatTukey15.219 21 FinFlatBart15 1.5784 FlatTukey15M.4741 18 FinFlatTukey15 1.4838 ZMAV1-2.3121 38 FinZMAV15.6614 ZMAV15-2.2821 37 OptV15.5383 ZMAV15M -2.366 39 BartV15.514 V5min1 -.5756 32 FlatTukey15.219 V5min15.596 23 V15min15.626 V5min15M -.3141 3 V5min15.596 V15min1 -.44 29 FlatBart15 -.8262 V15min15.626 22 ZMAV15-2.2821 V15min15M -.262 28 Forecast Length >15 FinFlatBart1 2.24 4 FinFlatCubic15M 1.9629 FinFlatBart15 1.5784 8 FlatCubic15M 1.629 FinFlatBart15M 1.841 12 FinFlatBart15M 1.841 FinFlatCubic1 2.1399 5 FinFlatTukey15M 1.2 FinFlatCubic15 2.6369 2 BartV15M.58 FinFlatCubic15M 1.9629 6 FlatTukey15M.4741 FinFlatTukey1 1.165 1 FinZMAV15M.4333 FinFlatTukey15 1.4838 9 OptV15M.3114 FinFlatTukey15M 1.2 13 V15min15M -.262 Max -1.9859 36 V5min15M -.3141 Min -1.644 35 FlatBart15M -.9749 Mean 3.8354 1 ZMAV15M -2.366 41

Tests (Table 6 continued) Joint test all b's (profits) are equal to zero: b'*inv(cov(b))*b=94.26 Pair-wise tests between finite sample profits and asymptotic profits: 1 15 >15 t(finbart-asybart) 2.71 2.67 2.45 t(fincubic-asycubic) 1.71.31.77 t(fintukey-asytukey) 2.38 2.5.19 t(finzma-asyzma) 1.85 2.35 2.26 42

Scenario III: 1-day options - second adjustment Table 8: Rank by Annualized Daily Profits Avg. Prof HAC SD Rank Avg. Prof OptV1 1.4534 2.4217 2 Forecast Length 1 OptV15.6932 2.399 25 FinFlatCubic1 4.2955 OptV15M -.2368 2.5959 28 FlatCubic1 3.3829 BartV1.9799 2.257 23 FinFlatBart1 3.2273 BartV15 1.891 1.6835 17 FinFlatTukey1 3.1455 BartV15M -.16 1.71 27 FlatTukey1 1.849 FinZMAV1.958 2.39 24 OptV1 1.4534 FinZMAV15 2.315 1.653 15 FlatBart1.9859 FinZMAV15M.4375 1.6374 26 BartV1.9799 FlatBart1.9859 2.6367 22 FinZMAV1.958 FlatBart15 2.1663 2.574 16 V15min1 -.6199 FlatBart15M 1.5513 1.9454 19 V5min1-4.6364 FlatCubic1 3.3829 1.7327 9 ZMAV1-6.577 FlatCubic15 3.8347 1.1139 6 Forecast Length 15 FlatCubic15M 2.5737 1.481 13 FinFlatCubic15 5.3257 FlatTukey1 1.849 2.176 18 FinFlatBart15 4.7966 FlatTukey15 2.329 1.5484 14 FlatCubic15 3.8347 FlatTukey15M 1.3885 1.7928 21 FinFlatTukey15 3.4887 ZMAV1-6.577 3.521 35 FlatTukey15 2.329 ZMAV15-4.8529 3.3362 34 FinZMAV15 2.315 ZMAV15M -4.44 3.1629 32 FlatBart15 2.1663 V5min1-4.6364 2.6222 33 BartV15 1.891 V5min15-9.273 3.3122 37 OptV15.6932 V5min15M -12.118 3.5741 38 V15min15-3.243 V15min1 -.6199 2.3816 29 ZMAV15-4.8529 V15min15-3.243 2.558 3 V5min15-9.273 V15min15M -3.7183 2.5395 31 Forecast Length >15 FinFlatBart1 3.2273 2.3538 1 FinFlatCubic15M 4.828 FinFlatBart15 4.7966 1.5385 3 FinFlatBart15M 4.2739 FinFlatBart15M 4.2739 1.4122 5 FinFlatTukey15M 2.832 FinFlatCubic1 4.2955 1.7782 4 FlatCubic15M 2.5737 FinFlatCubic15 5.3257 1.877 1 FlatBart15M 1.5513 FinFlatCubic15M 4.828 1.2491 2 FlatTukey15M 1.3885 FinFlatTukey1 3.1455 1.9464 11 FinZMAV15M.4375 FinFlatTukey15 3.4887 1.1397 7 BartV15M -.16 FinFlatTukey15M 2.832 1.385 12 OptV15M -.2368 Max -12.5931 3.7542 39 V15min15M -3.7183 Min -7.1419 3.7532 36 ZMAV15M -4.44 Mean 3.393.7394 8 V5min15M -12.118 43

Table 9: Rank by Sharp Ratio (Avg. Prof. / HAC SD) Avg. Prof./HAC SD Rank Avg. Prof./HAC SD OptV1.61 21 Forecast Length 1 OptV15.2899 25 FinFlatCubic1 2.4156 OptV15M -.912 28 FlatCubic1 1.9524 BartV1.4354 22 FinFlatTukey1 1.6161 BartV15 1.1228 16 FinFlatBart1 1.3711 BartV15M -.94 27 FlatTukey1.8315 FinZMAV1.4118 23 OptV1.61 FinZMAV15 1.4 14 BartV1.4354 FinZMAV15M.2672 26 FinZMAV1.4118 FlatBart1.3739 24 FlatBart1.3739 FlatBart15 1.529 17 V15min1 -.263 FlatBart15M.7974 19 ZMAV1-1.725 FlatCubic1 1.9524 1 V5min1-1.7681 FlatCubic15 3.4427 4 Forecast Length 15 FlatCubic15M 1.7378 11 FinFlatCubic15 4.8964 FlatTukey1.8315 18 FlatCubic15 3.4427 FlatTukey15 1.4989 13 FinFlatBart15 3.1178 FlatTukey15M.7745 2 FinFlatTukey15 3.611 ZMAV1-1.725 34 FlatTukey15 1.4989 ZMAV15-1.4546 32 FinZMAV15 1.4 ZMAV15M -1.2775 31 BartV15 1.1228 V5min1-1.7681 35 FlatBart15 1.529 V5min15-2.7988 37 OptV15.2899 V5min15M -3.396 39 V15min15-1.1823 V15min1 -.263 29 ZMAV15-1.4546 V15min15-1.1823 3 V5min15-2.7988 V15min15M -1.4642 33 Forecast Length >15 FinFlatBart1 1.3711 15 FinFlatCubic15M 3.8652 FinFlatBart15 3.1178 5 FinFlatBart15M 3.265 FinFlatBart15M 3.265 7 FinFlatTukey15M 2.52 FinFlatCubic1 2.4156 8 FlatCubic15M 1.7378 FinFlatCubic15 4.8964 1 FlatBart15M.7974 FinFlatCubic15M 3.8652 3 FlatTukey15M.7745 FinFlatTukey1 1.6161 12 FinZMAV15M.2672 FinFlatTukey15 3.611 6 BartV15M -.94 FinFlatTukey15M 2.52 9 OptV15M -.912 Max -3.3544 38 ZMAV15M -1.2775 Min -1.929 36 V15min15M -1.4642 Mean 4.5885 2 V5min15M -3.396 44

Tests (Table 8 continued) Joint test all b's (profits) are equal to zero: b'*inv(cov(b))*b=2779.2 Pair-wise tests between finite sample profits and asymptotic profits: 1 15 >15 t(finbart-asybart) 1.61 1.9 1.8 t(fincubic-asycubic).74 1.83 2.53 t(fintukey-asytukey) 2.46 1.62 2.6 t(finzma-asyzma) 2.14 2.14 1.19 45