Applications Development




Portfolio Backtesting: Using SAS to Generate Randomly Populated Portfolios for Investment Strategy Testing

Xuan Liu, Mark Keintz
Wharton Research Data Services

Abstract

One of the most regularly used SAS programs at our business school assesses the investment returns from a randomly populated set of portfolios covering a student-specified historic period, rebalancing frequency, portfolio count, size, and type. The SAS program demonstrates how to deal with dynamically changing conditions, including periodic rebalancing, replacement of delisted stocks, and shifting of stocks from one type of portfolio to another. The application is a good example of the effective use of hash tables, especially for tracking holdings and investment returns.

1. What is Backtesting and how does it work?

Backtesting is the process of applying an investment strategy to historical financial information to assess the results (i.e. change in value). That is, it answers the question "what if I had applied investment strategy X during the period Y?". The backtest application developed at the Wharton School, used for instructional rather than research purposes, is currently applied only to publicly traded stocks.

Later, in the Creating a Backtest section, we go over the more important considerations in creating a backtest program. As an example, however, a user might request a backtest of 4 portfolios, each with 20 stocks, for the period 2000 through 2008. The four portfolios might come from a cross-classification of (a) the top 20% and bottom 20% of market capitalization with (b) the top 20% and bottom 20% of book-to-market ratios. The user might rebalance the stocks every 3 months (i.e. redivide the investment equally among the stocks) and refill (i.e. replace no-longer-eligible stocks) every 6 months.

2. Source file for Backtesting

The source file used for Backtesting is prepared by merging monthly stock data (for monthly prices and returns), event data (to track when stocks stopped or restarted trading), and annual accounting data filed with the SEC (for book equity), as shown in Figure 2.1.
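The merge described above can be pictured with a small Python sketch (Python stands in for SAS here purely for illustration). The paper does not say how annual accounting records are aligned with monthly records, so the rule used below, "take the latest annual record dated on or before the month", is an assumption, as are the function name `attach_book_equity` and the field names `fiscal_date`, `ret`, and `book_equity`. No reporting lag is applied, which is a simplification.

```python
from bisect import bisect_right


def attach_book_equity(monthly, annual):
    """Attach to each monthly record the latest annual book equity dated on or before it.

    monthly: list of dicts with keys stock_id, date (yyyymmdd int), ret
    annual:  list of dicts with keys stock_id, fiscal_date (yyyymmdd int), book_equity
    The alignment rule is an assumption made for this sketch, not the paper's method.
    """
    # Group the annual records by stock, each group sorted by fiscal date.
    by_stock = {}
    for rec in sorted(annual, key=lambda r: (r["stock_id"], r["fiscal_date"])):
        by_stock.setdefault(rec["stock_id"], []).append(rec)

    merged = []
    for row in sorted(monthly, key=lambda r: (r["stock_id"], r["date"])):
        history = by_stock.get(row["stock_id"], [])
        dates = [r["fiscal_date"] for r in history]
        pos = bisect_right(dates, row["date"])  # count of annual records dated <= row date
        book = history[pos - 1]["book_equity"] if pos else None
        merged.append({**row, "book_equity": book})
    return merged
```

A monthly record that precedes every annual record for its stock gets `None`, which plays the role of a SAS missing value.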

[Figure: the monthly stock and event files are merged with the annual accounting data file to produce the source file for backtesting.]
Fig. 2.1 Data file used for Backtesting

This yields a monthly file with changes in price, monthly cumulative returns, and yearly changes in book value. Because the data is sorted by stock identifier (STOCK_ID) and DATE, a monthly cumulative return (CUMRET0) can be calculated for each stock in the dataset from the single-month returns (RETURN), as below. CUMRET0 will be used later to determine the actual performance of each portfolio.

    /* Calculation of monthly cumulative returns */
    data monthly_cumreturns;
       set monthly_file;
       by stock_id date;
       retain cumret0;
       if first.stock_id then do;
          /* start each stock at 1, or at (1+return) if a return is present */
          if missing(return)=0 then cumret0=(1+return);
          else cumret0=1;
       end;
       else do;
          /* a missing return leaves the cumulative return unchanged */
          if missing(return)=0 then cumret0=cumret0*(1+return);
          else cumret0=cumret0;
       end;
    run;

As mentioned earlier, users may restrict portfolios to specific percentile ranges of variables like market capitalization (MARKETCAP). These percentiles (deciles in this example) are generated via PROC RANK for each refill date, as below:

    /* Portfolio deciles using the market cap criterion */
    proc sort data=monthly_cumreturns out=source;
       by date stock_id;
    run;

    proc rank data=source out=temp groups=10;
       by date;
       var marketcap;
       ranks rmarketcap;
    run;

The resulting dataset looks like this:

    Example of the data file used for backtesting

    stock_id  date      return    cumret0  marketcap  rmarketcap
                                                      (rank for marketcap)
    50091     19711031  0.010000  0.67726  33270.00   5
    50104     19711031  0.019802  2.34271  49723.25   6

Table 2.1 Data file used for Backtesting

3. Creating a Backtest

Once the primary file has been created, the backtest can be defined through these parameters:

Structural Parameters:
- Date range of the investment.
- Number of portfolios and amount invested in each portfolio.
- Number of stocks in each portfolio.
- Rebalancing Frequency: For portfolios designated as equally weighted, the stocks in each portfolio are periodically reallocated so they have equal value. (Portfolios that are value weighted are not rebalanced.)
- Refilling Frequency: The frequency of determining whether a stock still qualifies for a portfolio (see Portfolio Criteria below) and replacing the stock if it doesn't.

Portfolio Criteria (these are specified as date-specific percentiles, not absolute values):
- Market capitalization: the total value of all publicly traded shares for a firm.
- Book-to-Market Ratio: the accounting value of a firm vs. its market capitalization.
- Lagged Returns: the return for the previous fiscal period.
- P/E Ratio: the ratio of the price of each share to the company's earnings per share.
- Price: the price of a share.

The process of taking these parameters and generating a backtest is displayed in the following figure:

[Figure: the inputs (initial cash; start and end date; stocks per portfolio; type of portfolio weighting, market cap or equal weighted; rebalance and refill period) and the stages of the process: source file setup; screening (optional; screen by deciles of metrics such as market cap, book-to-market ratio, earnings-to-price ratio, lagged returns, or price); partition (optional; either keep everything in one portfolio or use different metrics to divide securities into multiple portfolios); and analysis.]
Fig. 3.1 Creating a Backtest
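Because the portfolio criteria are date-specific percentiles, the ranking is redone within every date. The following Python sketch (Python rather than SAS, for illustration only) shows one way to compute such within-date decile groups. It assumes the grouping rule floor(rank * k / (n + 1)) with tied values sharing their average rank, which is our reading of how PROC RANK assigns groups; treat both the rule and the function name `decile_ranks` as assumptions rather than a reproduction of the SAS procedure.

```python
from collections import defaultdict
from math import floor


def decile_ranks(records, var, groups=10):
    """Return {(stock_id, date): group}, groups numbered 0..groups-1 within each date.

    records: list of dicts with keys stock_id, date, and the ranking variable `var`.
    Records whose ranking variable is None are left unranked.
    """
    by_date = defaultdict(list)
    for rec in records:
        if rec[var] is not None:
            by_date[rec["date"]].append(rec)

    out = {}
    for date, recs in by_date.items():
        n = len(recs)
        ordered = sorted(recs, key=lambda r: r[var])
        i = 0
        while i < n:
            # ordered[i:j] is a run of tied values; all of them get the mean rank.
            j = i
            while j < n and ordered[j][var] == ordered[i][var]:
                j += 1
            mean_rank = (i + 1 + j) / 2  # average of the ranks i+1 .. j
            group = floor(mean_rank * groups / (n + 1))
            for rec in ordered[i:j]:
                out[(rec["stock_id"], date)] = group
            i = j
    return out
```

A screen such as "top 20% of market capitalization" then becomes a filter on the two highest groups for the date in question.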

This introduces a number of programming tasks. The primary tasks are:

1. For the start date and each refill date, generate percentiles for the portfolio criteria.
2. At the start date, randomly draw stocks for each portfolio from the qualifying stocks.
3. Track the monthly cumulative return (i.e. cumulative increase or decrease) in the value of each stock in each portfolio. Each stock is tracked so that rebalancing can be done, if needed.
4. If a stock stops trading at any point, reallocate its residual value to the rest of the portfolio.
5. At every refill point, keep all stocks in the portfolio that are still eligible (buy and hold) and randomly select replacements for all stocks no longer eligible.

By default, all available securities are considered for inclusion in the backtest. The universe can be filtered by adding one or more screens based on the portfolio criteria (expressed in deciles in this paper). Multiple portfolios can be created by dividing securities into distinct partitions based on the value of one or two metrics. For example, using two metrics, book-to-market and price, with 2 partitions for book-to-market and 3 partitions for price will result in 6 portfolios. Once the portfolios are constructed, the performance of each portfolio is analyzed.

4. Portfolios are populated by randomly selected securities

During the creation of a backtest, the securities within a portfolio are randomly selected. This is made possible by generating a random number for each stock_id:

    /* Randomization of the stocks */
    proc sort data=inds out=outds;
       by stock_id date;
    run;

    %let seed=10;

    data randomized_stocks / view=randomized_stocks;
       set outds;
       by stock_id;
       retain ru;
       /* draw once per stock and carry the value across its dates */
       if first.stock_id then ru=ranuni(&seed);
       output;
    run;

inds is the input dataset with one record per stock_id and date. outds is the same data sorted by stock_id and date; the view randomized_stocks adds the random variable. ru is the random variable generated from the seed. A single random value is generated for each stock_id and held constant across all of its dates.

Each call with a different seed generates a new set of random numbers for the stock_ids (see Table 4.1).

    stock_id  date      ru (seed=10)  ru (seed=30)
    10042     20050831  0.70089       0.10266
    10042     20060831  0.70089       0.10266
    10042     20070831  0.70089       0.10266
    10078     20050831  0.99824       0.99473
    10078     20060831  0.99824       0.99473
    10078     20070831  0.99824       0.99473

Table 4.1 Sample outputs with different seeds
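The idea behind Table 4.1, one seeded random priority per stock_id that stays fixed across dates, can be sketched in Python (used here only for illustration; the paper's implementation is the SAS code above). The helper names `assign_priorities` and `draw_portfolio` are hypothetical, and Python's generator will not reproduce the RANUNI values shown in the table. Drawing the portfolio as the n stocks with the smallest priority follows the "sort by random number, take the first n" rule the paper describes for the refill step.

```python
import random


def assign_priorities(stock_ids, seed):
    """Give every distinct stock_id one uniform(0, 1) draw that is fixed for a given seed."""
    rng = random.Random(seed)
    # Iterate in sorted order so the result does not depend on input order.
    return {sid: rng.random() for sid in sorted(set(stock_ids))}


def draw_portfolio(eligible_ids, priorities, n):
    """Pick the n eligible stocks with the smallest random priority."""
    return sorted(eligible_ids, key=lambda sid: priorities[sid])[:n]


if __name__ == "__main__":
    ids = [10042, 10078, 10107, 10138, 10145]
    pri = assign_priorities(ids, seed=10)
    print(draw_portfolio(ids, pri, n=2))
```

Rerunning with the same seed reproduces the same portfolio, while a different seed gives an independent draw, which is what lets a user repeat a backtest exactly or vary it at will.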

5. The refill process

People buy and hold securities for a certain period of time. During the holding period, some stocks may disappear due to delisting, or may no longer satisfy the criteria used to set up the portfolio. In either case, the size of the portfolio shrinks. To bring the portfolio back to its original size, a refill process is performed on each user-specified date.

One problem that can distort the refill process is that a stock can cease trading (become "delisted") and later reappear on the market. If the stock retained the same randomly assigned priority used in the initial sampling, it would be included in the refill event after its re-entry on the market. To avoid this problem we use the following approach: generate a random number tied to the date variable (the date plus a uniform random draw), and assign a stage variable to record each on-off appearance of the stock. Whenever the stock reappears, generate a new random number for it. Sort the stock pool by date and random number. When it is time to refill, the first n stocks (n being the number of stocks requested by the user) are selected to form the desired portfolio.

    /* Randomization procedure used for the portfolio Buy & Hold and Refill process */
    data stocks_held(drop=lagdate);
       set stocks;
       by stock_id;
       retain ru stage;
       lagdate = lag(date);
       if first.stock_id then do;
          stage = 1;
          ru = date + ranuni(&seed);
       end;
       /* a gap of more than one month means the stock left and re-entered */
       else if intck('month', lagdate, date) > 1 then do;
          stage = stage + 1;
          ru = date + ranuni(&seed);
       end;
    run;

    proc sort data=stocks_held;
       by date ru;
    run;

6. Rebalance

Rebalancing brings your portfolio back to your original asset allocation mix. This is necessary because over time some of your investments may drift out of alignment. Table 6.1 illustrates a simple example for an equal-weighted portfolio with two stocks.

    On Jan 31, 1990: initial cash $120. Bought 12 shares of stock 1 and 15 shares of stock 2.

              ---------- stock_id=1 -----------   ---------- stock_id=2 -----------   total
    date      price  return  cumret0  money       price  return  cumret0  money       amount in
                                      invested                            invested    portfolio
    19900131  $5     .       1        $60         $4     .       1        $60         $120
    19900228  $6     0.2000  1.2000   $72         $5     0.2500  1.2500   $75         $147
    19900331  $10    0.6667  2.0000   $120        $6     0.2000  1.5000   $90         $210

    On April 1, 1990, the portfolio is rebalanced. Initial cash: $210. Sold 1.5 shares of stock 1 and purchased 2.5 shares of stock 2.

    19900430  $12    0.2000  2.4000   $126        $10    0.6667  2.5000   $175        $301
    19900531  $15    0.2500  3.0000   $157.5      $10    0.0000  2.5000   $175        $332.5
    19900630  $18    0.2000  3.6000   $189        $12    0.2000  3.0000   $210        $399

    On July 1, 1990, the portfolio is rebalanced. Initial cash: $399. Bought 0.583 shares of stock 1 and sold 0.875 shares of stock 2.

    Note: $399 = $210 * [(1+0.2000)*(1+0.2500)*(1+0.2000) + (1+0.6667)*(1+0.0000)*(1+0.2000)] / 2
               = $210 * (3.6000/2.0000 + 3.0000/1.5000) / 2

Table 6.1 Equal-weighted portfolio with two stocks

The task is to calculate cumret0 divided by the cumret0 at the beginning of the rebalance period (denoted by eq_rebal_wgt in the following SAS code). The code uses a SAS hash object, which can quickly retrieve cumret_rebal_start (cumret0 at the beginning of the rebalance period). The hash object is uniquely suited to this step in the process. Not only does it provide a quick lookup of the starting value for each stock, it also easily accommodates the changing composition of a portfolio and the updating of those values in place. The result is listed in Table 6.2.

    /* Equal-weighted portfolio rebalance weight calculation for a single stock */
    data bal_source;
       if _n_=1 then do;
          declare hash ht();
          ht.definekey("stock_id");
          ht.definedata("cumret_rebal_start");
          ht.definedone();
       end;
       set source_sample end=done;
       if rebal_flag=1 then do;
          /* cumulative return just before this month's return is applied */
          cumret_rebal_start = cumret0 / (1+return);
          rc = ht.replace();
       end;
       else do;
          /* look up the starting value stored for this stock */
          rc = ht.find();
       end;
       drop rc;
       eq_rebal_wgt = cumret0 / cumret_rebal_start;
    run;

    Stock_id=1

    date      return  cumret0  rebal_flag  cumret_rebal_start  eq_rebal_wgt
    19900131  .       1        1
    19900228  0.2000  1.2000   0
    19900331  0.6667  2.0000   0
    19900430  0.2000  2.4000   1           2.0000
    19900531  0.2500  3.0000   0           2.0000
    19900630  0.2000  3.6000   0           2.0000              1.8

Table 6.2 Calculation of eq_rebal_wgt
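The lookup can be mirrored in a few lines of Python, with a dict playing the role of the SAS hash object (a sketch for illustration, not the paper's code; the function name `rebalance_weights` and the field name `ret` are hypothetical, and `None` stands in for a SAS missing value). Applied to the April to June 1990 rows of Table 6.1, it reproduces the 1.8 of Table 6.2 and the $399 of the note to Table 6.1.

```python
def rebalance_weights(rows):
    """Add cumret_rebal_start and eq_rebal_wgt to rows sorted by stock_id and date.

    rows: dicts with keys stock_id, date, ret (None if missing), cumret0, rebal_flag.
    """
    start = {}  # stock_id -> cumret0 at the start of the current rebalance period
    out = []
    for row in rows:
        sid = row["stock_id"]
        if row["rebal_flag"] == 1:
            # Cumulative return just before this month's return is applied.
            start[sid] = None if row["ret"] is None else row["cumret0"] / (1 + row["ret"])
        base = start.get(sid)
        wgt = None if base is None else row["cumret0"] / base
        out.append({**row, "cumret_rebal_start": base, "eq_rebal_wgt": wgt})
    return out


# Second rebalance period of Table 6.1 (2/3 is the unrounded 0.6667 return).
rows = [
    {"stock_id": 1, "date": 19900430, "ret": 0.20, "cumret0": 2.4, "rebal_flag": 1},
    {"stock_id": 1, "date": 19900531, "ret": 0.25, "cumret0": 3.0, "rebal_flag": 0},
    {"stock_id": 1, "date": 19900630, "ret": 0.20, "cumret0": 3.6, "rebal_flag": 0},
    {"stock_id": 2, "date": 19900430, "ret": 2 / 3, "cumret0": 2.5, "rebal_flag": 1},
    {"stock_id": 2, "date": 19900531, "ret": 0.00, "cumret0": 2.5, "rebal_flag": 0},
    {"stock_id": 2, "date": 19900630, "ret": 0.20, "cumret0": 3.0, "rebal_flag": 0},
]
result = rebalance_weights(rows)
june = [r["eq_rebal_wgt"] for r in result if r["date"] == 19900630]
portfolio_value = 210 * sum(june) / len(june)  # equal weights: mean of 1.8 and 2.0
```

Because the dict is keyed by stock_id, a stock that enters the portfolio at a later rebalance date simply gets a new entry, which is the property the paper highlights for the hash object.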

Once eq_rebal_wgt is calculated for all the stocks, the rebalance weight for the portfolio (p_rebal_wt) can easily be calculated by running proc means on eq_rebal_wgt, as follows:

    /* Equal-weighted portfolio rebalance weight calculation */
    proc means data=bal_source;
       class portfolio_id date;   /* date here corresponds to the rebalance date */
       var eq_rebal_wgt;
       output out=outds mean(eq_rebal_wgt)=p_rebal_wt;
    run;

Conclusion

This paper focuses on the randomization procedure used for portfolio construction in backtesting, as well as on how the portfolio is refilled and rebalanced during its evolution. The randomization procedure is designed to accommodate the buy-and-hold strategy of portfolio management. We also illustrate how a SAS hash object is used for fast and simple retrieval of stock cumulative returns, which reduces the calculation of multi-stock portfolio returns to a simple use of proc means.

CONTACT INFORMATION

Author: Xuan Liu
Address: Wharton Research Data Services, 216 Vance Hall, 3733 Spruce St, Philadelphia, PA 19104-6301
Email: xuanliu@wharton.upenn.edu

Author: Mark Keintz
Address: Wharton Research Data Services, 216 Vance Hall, 3733 Spruce St, Philadelphia, PA 19104-6301
Email: mkeintz@wharton.upenn.edu

TRADEMARKS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.