Value, size and momentum on Equity indices: a likely example of selection bias




WINTON Working Paper, January 2015

Value, size and momentum on Equity indices: a likely example of selection bias

Allan Evans, PhD, Senior Researcher
Carsten Schmitz, PhD, Head of Research (Zurich)

Value, size and momentum have a long history as stock price predictors, and similar indicators have been applied to stock indices in order to predict the performance of one national index against another. Published back-tests of trading systems based on these ideas have shown impressive performance, but in this paper we find that this performance does not continue past the publication dates. We argue that selection bias at the time of publication has a part to play in the disappointing out-of-sample performance of these indicators. We show how the combination of estimation uncertainty and selective reporting can readily explain the observed deterioration in performance. Importantly, with a fuller understanding of these effects, the long-term poor performance of the indicators could have been anticipated at the time.

Introduction

Efforts to find predictors for stock returns have a long history. Quantitative work on momentum goes back at least to the 1960s, with the observation that, over timescales of a few months, stocks that performed well in the past tend to also perform well in the future [1]. Later work showed some evidence for a negative effect (mean reversion) on a longer timescale [2, 3].

Valuation ratios also have a long history. The controversy over the Value Line system in the 1960s [4, 5] is a well-known example. Fama and French recently reviewed the evidence for value, size and momentum factors across the world's stock markets [6].

Starting in the 1990s, a series of papers was published suggesting that similar effects could be seen in national stock indices [2, 3, 7, 8].
By analogy with cross-sectional equity systems [9], countries were ranked on size, value or momentum indicators to form portfolios with long positions in the indices for the highest-ranked countries and short positions for the lowest-ranked countries. The published work showed excellent historical performance for such trading systems.

This paper reviews the performance of these cross-country equity portfolios. Nearly twenty years have passed, so we have the advantage of a large set of historical data that was not available to the authors of the listed papers. In Section 1 we replicate the published evaluations in the period before 1995, but in Section 2 we show that the performance of the trading systems on more recent data is disappointing. We believe this is because of selection bias in the published results, and in Sections 3 and 4 we show how this may have led to over-optimistic assessments of the systems. More importantly, we demonstrate how this could have been avoided at the time.

Enquiries: researchpapers@wintoncapital.com, +44(0)20 8576 5800

1. Replicating the published work

The papers [2, 3, 8] use national stock market indices provided by MSCI [10] across 18 developed countries in the period 1970-1995. Table 1 lists the four indicators calculated from this data set that were used in these papers.

The momentum indicator (MOM) is the total return of an index over the last year, ignoring the return from the last month. The value indicator (V) is the book-to-market ratio of the index and size (S) is the inverse free-float market capitalisation. The book value and market capitalisation are each calculated for the index as a whole by summing the values for the component stocks. These three indicators are the same as those used in [8]. The mean reversion indicator (MR) is defined in a number of different ways in [2] and [3]. We choose the fractional return between 3 years and 1 year ago so that the MR indicator is nearly independent of the MOM indicator.

Each indicator defines a distinct trading system which will be associated with a series of portfolios through time. The indicator is calculated for each country at the end of a calendar month. The countries are then ranked by the values of their indicator. Out of the 18 countries, we take a positive position in the top six countries and a negative position in the bottom six. All positions are equal in their allocation, so the portfolio is dollar neutral. The positions are held for one month.

Table 1 shows Sharpe ratios from the published sources and from our own tests over the period 1970-1995. We believe that the remaining differences between the published values and our own are due to small differences in the definition of the trading systems (for example, the way in which ties in the ranks are resolved).

Table 1. Indicators used in the cited publications to construct long-short trading systems on national equity indices. The published Sharpe ratios are compared with the Sharpe ratios from our tests.
Indicator             Publication   Sharpe ratio (published)   Sharpe ratio (replicated)
Momentum (MOM)        [8]           0.85                       0.85
Mean reversion (MR)   [2, 3]        -                          0.51
Value (V)             [8]           0.84                       0.86
Size (S)              [8]           0.63                       0.60

2. What happened next: out-of-sample performance

It should be noted that the Sharpe ratios in Table 1 are so-called in-sample results. The same data set (1970-1995) that was used to develop or select the trading systems was used to calculate the results.

We have an advantage over the researchers who published the original work. Working in 2014, we can compute the performance of the systems on the market data from 1995 to 2014. This is an out-of-sample test. The results are shown in Figure 1 and Table 2.

Figure 1. Cumulative profit (positive) or loss (negative) in billion USD of the published systems over the in-sample and out-of-sample periods (left and right of the vertical line). A $100M long or short position is taken in each single country selected. The performance replications for MOM and MR start later due to the required price history, and the V replication starts in 1975 when the book price data becomes available. The combined performance of the four systems is shown in the bottom section.

All four systems have worse performance in the out-of-sample period than in the original test period. Two of the four systems lose money after 1995, and even the momentum system shows almost no profit after 2000.
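The ranking procedure of Section 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, positions are normalised to weights of plus or minus 1/6 rather than $100M per country, ties are broken by a stable sort (the paper notes that tie handling may differ between studies), and the Sharpe ratio is annualised from monthly returns with a factor of sqrt(12) and no risk-free adjustment.

```python
import numpy as np

def long_short_weights(indicator, n_long=6, n_short=6):
    """Equal-allocation, dollar-neutral weights from one month's indicator values.

    Assumption: ties are broken by original position (stable sort).
    """
    indicator = np.asarray(indicator, dtype=float)
    order = np.argsort(indicator, kind="stable")  # ascending rank order
    weights = np.zeros(indicator.size)
    weights[order[-n_long:]] = 1.0 / n_long       # long the highest-ranked countries
    weights[order[:n_short]] = -1.0 / n_short     # short the lowest-ranked countries
    return weights

def backtest(indicators, next_month_returns):
    """Monthly portfolio returns.

    Both inputs have shape (months, countries); row t of `indicators` is known
    at the end of month t and row t of `next_month_returns` is the return
    earned over the following month, while the positions are held.
    """
    return np.array([
        long_short_weights(ind) @ np.asarray(ret, dtype=float)
        for ind, ret in zip(indicators, next_month_returns)
    ])

def annualised_sharpe(monthly_returns):
    """Annualised Sharpe ratio of a monthly return series (assumed convention)."""
    r = np.asarray(monthly_returns, dtype=float)
    return np.sqrt(12.0) * r.mean() / r.std(ddof=1)
```

Given hypothetical arrays `indicators` and `next_month_returns` covering 18 countries, `annualised_sharpe(backtest(indicators, next_month_returns))` would give a figure comparable in spirit to those in Table 1, subject to the assumptions above.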

Table 2. Sharpe ratios for the four published systems, in-sample and out-of-sample. The Sharpe ratios for the combined system are also given. All systems show a decline in performance.

Indicator             Sharpe ratio (1970-1995)   Sharpe ratio (1995-2014)
Momentum (MOM)        0.85                       0.57
Mean reversion (MR)   0.51                       0.15
Value (V)             0.86                       0.05
Size (S)              0.60                       0.40
TOTAL                 1.36                       0.42

3. A family of trading systems: evidence for selection bias

As we saw in the previous section, the performance of the national stock index portfolios is disappointing after the publication of these papers. This might be explained in various different ways. Perhaps the poor out-of-sample performance is due to bad luck and will be reversed in the future. Or the market environment might have changed in a way which makes the systems perform more poorly. However, the most likely explanation is selection bias. In fact, we should have expected a drop in performance even without the benefit of hindsight.

We can understand this better by looking at the family of trading systems which the authors selected from. Usually it is difficult to know exactly which trading systems were considered before one was selected. In this case, we can make a reasonable guess. The MSCI data set that contains the book-to-market ratio and the market capitalisation for each country also contains a selection of other indicators, which have likely been studied alongside the published indicators. As further evidence, we find that in [7] (which is referred to by [8]) the correlations between index returns and some of these additional indicators are studied. The book-to-market (V) indicator is assessed as the most promising.

Furthermore, the momentum and mean reversion systems use specific timescales suggested by successful systems trading individual equities. But other timescales, from one quarter to five years, have also been used in systems trading both indices and individual equities [9, 11].
These considerations suggest a wider group of ten trading systems which might have been tested by researchers in the 1990s alongside the published systems. They are listed in Table 3. To compare the performance of this family of systems across the entire time range, we use the 13 countries with data for all the indicators across the period 1975-2014. The in-sample results are shown in Figure 2.

Table 3. Indicators for ten trading systems which could have been tested in the 1990s, including the four already tested.

Symbol   Description                 Published
V        Book value/market cap       [8]
EM       Earnings/market cap
S        1/market cap                [8]
DIVYLD   Dividend yield
CEM      Cash earnings/market cap
MOM      1-year momentum             [8]
MOM6M    6-month momentum
MOM2Y    2-year momentum
MR       Mean reversion (3 years)    [2, 3]
MR2      Mean reversion (2 years)

Figure 2. In-sample (pre-1995) Sharpe ratios of the ten equity index trading systems listed in Table 3. Those published and discussed in section 2 are highlighted in red. The mean (0.34) is indicated by a black line, and the one standard deviation range (σ=0.23) is shown as the red band around the mean. Error bars show the estimated sampling error from a formula given in [12].

Strikingly, all four of the published Sharpe ratios are above the average. This is very suggestive of active performance-based selection of trading systems.

If the published trading systems were picked because they were among the best in a wider set of systems, then we would expect their out-of-sample performance to be worse than their in-sample performance [13]. This phenomenon of regression to the mean applies in many other contexts and often leads to controversy [14]. Students with the best test scores in a particular school year are expected to decline in performance the next year (leading to accusations that they are a neglected group). Patients selected because they have high blood pressure will tend to show a decrease in blood pressure in the next stage of a clinical trial (leading to a positive assessment of any treatment they are given even if it is ineffective).
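The selection effect described in this section can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes, as a deliberately extreme case, that all ten systems share the same true Sharpe ratio (0.34) and that measured Sharpe ratios differ only by independent Gaussian sampling noise (0.23), then "publishes" the four best in-sample systems in each simulated history. The function name and default parameters are hypothetical.

```python
import numpy as np

def simulate_selection(n_systems=10, n_select=4, true_sharpe=0.34,
                       sampling_error=0.23, n_trials=20000, seed=0):
    """Average in-sample and out-of-sample Sharpe ratio of the selected systems.

    Assumption: every system has the same true Sharpe ratio, and the measured
    values in each period are independent Gaussian draws around it.
    """
    rng = np.random.default_rng(seed)
    shape = (n_trials, n_systems)
    in_sample = true_sharpe + sampling_error * rng.standard_normal(shape)
    out_sample = true_sharpe + sampling_error * rng.standard_normal(shape)

    # Column indices of the n_select best in-sample systems in each trial.
    best = np.argsort(in_sample, axis=1)[:, -n_select:]

    selected_in = np.take_along_axis(in_sample, best, axis=1).mean()
    selected_out = np.take_along_axis(out_sample, best, axis=1).mean()
    return selected_in, selected_out

selected_in, selected_out = simulate_selection()
print(f"selected systems, in-sample mean Sharpe:     {selected_in:.2f}")
print(f"selected systems, out-of-sample mean Sharpe: {selected_out:.2f}")
```

Under these assumptions the selected systems look clearly better than the family average in-sample, yet their out-of-sample average falls back to the common true value: selecting the best picks out the luckiest, exactly as in the student and patient examples.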

The reason for this decline is easy to understand. In each case there is a large random variation in the quantity measured. Trying to select the very best, we select the very lucky. And the very lucky are not likely to be as lucky next time.

4. Estimating the effect of selection bias

We now understand why trading systems selected for their performance in-sample are expected to show a decline in performance outside the sample period. If we can estimate the size of this effect using only an in-sample test, then it will not be necessary to wait until out-of-sample data are available; we can make a corrected estimate of the future performance of a trading system immediately. This type of calculation is a valuable tool for the analysis of trading systems and portfolio construction.

To do this, we need to know how much of the difference in performance between the systems is caused by real differences in the systems' effectiveness and how much is due to luck. The luck here is random sampling error, caused by the use of a finite amount of data to estimate the Sharpe ratios. Real differences in effectiveness should be consistent between the in-sample and out-of-sample periods, but luck will not persist into the out-of-sample period.

The standard formula [12] for the sampling error of Sharpe ratios gives values of between 0.22 and 0.25 for all the systems in the family (Table 3). These are the error bars in Figure 2. In every case, the estimated sampling error is close to the standard deviation σ=0.23 of the set of Sharpe ratios (the measured range of Sharpe ratios across the systems). Differences between the Sharpe ratios can therefore be attributed to random sampling error alone. They do not indicate true differences in performance. This has an important consequence.
Given only in-sample data, our best estimate of the future (out-of-sample) Sharpe ratio of one of the systems is not the in-sample Sharpe ratio of that system: it is the mean in-sample Sharpe ratio of the whole family (0.34 in this case). The Tweedie formula [15], a well-known method for correcting selection bias, confirms this conclusion.

We can now look at the out-of-sample data to see whether they confirm this conclusion. Figure 3 shows the in-sample and out-of-sample Sharpe ratios for the ten systems. It is clear that the in-sample and out-of-sample results are not correlated, and a statistical test confirms this. In other words, exceptionally good systems in-sample are not likely to remain good in out-of-sample tests. This is exactly what the analysis of the sampling error led us to expect. We have managed to foresee the drop in performance, due to selection bias, without requiring the extra 20 years of data.

A large change in the overall mean would suggest some change in the world's financial systems between the two periods. In fact, the small decrease in the mean Sharpe ratio (0.34 to 0.25) is of the same order of magnitude as the change expected from random variations (footnote 1), so there is no statistical evidence for such a change.

Figure 3. In-sample (pre-1995) and out-of-sample (post-1995) Sharpe ratios for the family of ten trading systems. The one standard deviation ranges and the mean values are shown in the side bars.

5. Conclusion

In this paper we have seen how published systems on national stock indices from the 1990s have underperformed in the following 20 years, and we have shown strong evidence that the decrease in performance was caused by cherry-picking a set of indicators based on in-sample performance.

This selection is not always done by an individual researcher or group. For example, it is possible that different researchers tried the different systems, and only the ones who obtained positive results published their work. Or one group of researchers may evaluate a set of possible indicators (as in [7]), leading a different group of researchers (such as [8]) to make a particular choice of trading systems.

Investment managers will always select trading systems which performed well in the past, and we do not argue that this is a bad policy. But as this paper shows, the selection introduces a bias, which should be corrected so that the performance of individual trading systems is not overstated. Methods to do this are an important part of the arsenal of scientific quantitative investment. Recent concerns expressed in the academic literature and in the financial community [16, 17, 18] make it clear that not everyone working in finance has fully adopted these methods yet.

Footnote 1: The sampling error (0.23) divided by the square root of the number of systems (10) gives a rough estimate of 0.07 for the standard error of the mean.

6. Acknowledgements

The authors are grateful to the Winton research department for discussions, suggestions and support to access data, and in particular to William Cobern for help with data preparation.

For correspondence please email researchpapers@wintoncapital.com

References

1. R. Levy, Relative strength as a criterion for investment selection, Journal of finance, pp. 595-610, 1967.
2. R. Balvers, Mean reversion across national stock markets and parametric contrarian investment strategies, Journal of finance, pp. 745-772, 2000.
3. A. Richards, Winner-loser reversals in national stock market indices: can they be explained?, IMF Working paper, 1997.
4. J. Shelton, The Value Line contest: a test of predictability of stock price changes, Journal of business, pp. 251-269, 1967.
5. F. Black, Yes Virginia there is hope: tests of the Value Line ranking system, Financial analysts journal 29, 1973.
6. E. Fama and K. French, Size, value and momentum in international stock returns, Journal of financial economics, pp. 457-472, 2012.
7. L. Heckman, Valuation ratios and cross-country equity allocation, Journal of investing, pp. 54-63, 1996.
8. C. Asness, J. Liew and R. Stevens, Parallels Between the Cross-Sectional Predictability of Stock and Country Returns, The Journal of Portfolio Management, pp. 79-87, 1997.
9. N. Jegadeesh and S. Titman, Returns to buying winners and selling losers: implications for stock market efficiency, Journal of finance, pp. 65-91, 1993.
10. MSCI country and regional indices, [Online]. Available: www.msci.com/products/indexes/country_and_regional/dm/.
11. F. DeBondt and R. Thaler, Does the stock market overreact?, Proceedings of the 43rd annual meeting of the American Finance Association, pp. 28-30, 1985.
12. A. Lo, The statistics of Sharpe Ratios, Financial analysts journal, pp. 36-52, 2002.
13. M. Roulston and D. Hand, Blinded by optimism, Winton Capital Management Working Paper, December 2013.
14. J. Kruger, Superstition and the regression effect, Skeptical enquirer, March/April 1999.
15. B. Efron, Tweedie's formula and selection bias, Journal of the American Statistical Association, pp. 1602-1614, 2011.
16. D. Bailey, Pseudo-mathematics and financial charlatanism, Notices of the American Mathematical Society, pp. 458-471, 2014.
17. Wall Street Journal Online, Huge returns at low risk? Not so fast, 27 June 2014. [Online]. Available: blogs.wsj.com/moneybeat/2014/06/27/huge-returns-at-low-risk-not-so-fast/.
18. Vanguard, Joined at the hip: ETF and index development, July 2012. [Online]. Available: pressroom.vanguard.com/nonindexed/7.23.2012_Joined_at_the_hip.pdf.

Legal Disclaimer

This document has been prepared by Winton Capital Management Limited ("WCM"), which is authorised and regulated by the UK Financial Conduct Authority, registered as an investment adviser with the US Securities and Exchange Commission, registered with the US Commodity Futures Trading Commission and a member of the National Futures Association. This document is provided for information purposes only and the information herein does not constitute an offer to sell or the solicitation of any offer to buy any securities. The information herein is subject to updating and further verification and may be amended at any time and WCM is under no obligation to provide an updated version. WCM has used information in this document that it believes to be accurate and complete as of the date of this document. However, WCM does not make any representation or warranty, express or implied, as to the information's accuracy or completeness, and accepts no liability for any inaccuracy or omission. No reliance should be placed on the information herein and WCM does not recommend that it serves as the basis of any investment decision.

This document may contain results based on simulated or hypothetical performance results that have certain inherent limitations. Unlike the results shown in an actual performance record, such results do not represent actual trading. Also, because such trades have not actually been executed, these results may have under- or overcompensated for the impact, if any, of certain market factors, such as lack of liquidity. Simulated or hypothetical trading programs in general are also subject to the fact that they are designed with the benefit of hindsight. No representation is being made that any investment will or is likely to achieve profits or losses similar to those being shown using simulated data.

Unauthorised dissemination, copying, reproducing or transmitting of this information is strictly prohibited. Winton Capital Management Limited 2015. All rights reserved. 130415 0067