CONTRIBUTORS

Jeff Bacidore
Managing Director, Head of Algorithmic Trading, ITG, Inc.
Jeff.Bacidore@itg.com
+1.212.588.4327

Kathryn Berkow
Quantitative Analyst, Algorithmic Trading, ITG, Inc.
Kathryn.Berkow@itg.com
+1.212.444.6146

Ben Polidore
Director, Algorithmic Trading, ITG, Inc.
Benjamin.Polidore@itg.com
+1.212.323.3408

Nigam Saraiya
Vice President, Algorithmic Trading, ITG, Inc.
Nigam.Saraiya@itg.com
+1.212.444.6479

CONTACT

Asia Pacific +852.2846.3500
Canada +1.416.874.0900
EMEA +44.20.7670.4000
United States +1.212.588.4000
info@itg.com
www.itg.com

Cluster Analysis for Evaluating Trading Strategies [1]

ABSTRACT

In this paper, we introduce a new methodology to empirically identify the primary strategies used by a trader using only post-trade fill data. To do this, we apply a well-established statistical clustering technique called k-means to a sample of progress charts, each of which records the portion of an order completed by each point in the day as a measure of the trade's aggressiveness. Our methodology identifies the primary strategies used by a trader and determines which strategy the trader used for each order in the sample. Once the strategy used for each order has been identified, trading cost analysis (TCA) can be done by strategy. We also discuss ways to exploit this technique to characterize trader behavior, assess trader performance, and suggest the appropriate benchmarks for each distinct trading strategy.

BACKGROUND

Assessing trader performance is challenging because traders often vary their strategies depending on the objectives of each trade. For example, when orders are benchmarked to the open, traders may front-load their trades, perhaps executing a large portion of the trade in the opening auction. For larger, more impactful orders, traders may choose to trade more passively, stretching the order over a longer period of time. Ideally, TCA should take into account the trader's underlying strategy.
In reality, doing so is challenging because (1) it is often unclear how to characterize the underlying strategies used by the trader and (2) even if the strategies were known, determining which orders belong to which strategy can be difficult if that information is not captured in post-trade databases.

In light of these challenges, one common approach to assessing trader performance is to group trades by algorithm as a proxy for the trader's underlying strategy. If traders use specific algorithms to meet their objectives (e.g., close algorithms for trades benchmarked to the close, VWAP algorithms for trades benchmarked to VWAP), this approach makes sense because the algorithm is the strategy. However, high-touch traders often use algorithms as tactics rather than strategies, switching between different algorithms within a given order. As a result, TCA by algorithm will not yield information about the effectiveness of the trader's hybrid strategy.

[1] This is the submitted version of the following article: Cluster Analysis for Evaluating Trading Strategies, Jeff Bacidore, Kathryn Berkow, Ben Polidore, and Nigam Saraiya, The Journal of Trading Vol. 7 No. 3, 2012, Institutional Investor, Inc., which has been published in final form at: www.iijournals.com/doi/abs/10.3905/jot.2012.7.3.006

Another commonly used approach is to assess a trader's performance in the context of average aggressiveness. For example, one could look at the average progress chart of a trader to see how passively or aggressively the trader tends to work orders, and assess performance in that context. Such averages may not be meaningful, however, because they aggregate across underlying strategies. For example, Figure 1 shows the aggregate fill progress chart for a single trader. From the graph, it would appear that this trader's underlying strategy is VWAP. In reality, however, this trader may have used multiple strategies that resemble VWAP only in aggregate, even if the trader never actually targeted full-day VWAP on a single order.

Figure 1. Example of the aggregate fill progress chart for all orders in a sample dataset. The horizontal axis represents time, from 9:30 AM to 9:45 AM (bin 1) through 3:45 PM to 4:00 PM (bin 26); the vertical axis represents the percent of the order completed.

Analyzing trader performance correctly requires first identifying the different underlying strategies used by a trader and then aggregating orders by these strategies. In this paper, we present a new methodology that allows us both to identify the core trading strategies used by a trader and to classify each of the trader's orders into these strategies empirically, without having to tag orders prior to execution. To do this, we first create a progress chart for each order and then apply a well-established statistical clustering methodology called k-means to identify the primary strategies used to execute these orders. The k-means methodology assigns each order to one of the strategies, allowing for analysis by strategy.

This new approach to identifying trading strategies can be very useful when doing TCA, especially for high-touch trading.
First, our methodology can identify the underlying strategies used by each trader. Because the strategies are estimated from the fill data rather than defined in advance, new strategies will be uncovered even as traders change them over time. Second, for desks with multiple traders, our approach can be used to report which strategies are used by the desk as a whole and to break down strategy usage by trader. Third, this type of granular trader-level analysis allows desks to assess relative trader performance as a means to share best practices, instead of simply measuring which trader is best. In particular, this analysis not only identifies which traders outperformed, but also helps explain why they outperformed. Finally, since these strategies can be represented graphically, we are able to infer what the trader's benchmark may have been for a given trade. For example, for highly front-loaded trades, the open may be the most relevant benchmark, while for back-loaded trades, the closing price may be more appropriate.

As noted before, all of this can be done empirically on a post-trade basis, so our approach does not require traders to enter additional data or systems to be adapted to accommodate new post-trade strategy information.
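
The earlier point that an aggregate progress chart can hide the strategies underneath it can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration: the two square-root and quadratic curves are hypothetical front-loaded and back-loaded strategies, and a straight line stands in for a VWAP-like profile (a real VWAP schedule follows the intraday volume curve, not a line).

```python
import numpy as np

n_bins = 26
# Fraction of the trading day elapsed at the end of each 15-minute bin.
t = np.arange(1, n_bins + 1) / n_bins

# Hypothetical progress charts (assumptions, not taken from the paper's data).
front_loaded = np.sqrt(t)   # well ahead of schedule early in the day
back_loaded = t ** 2        # most of the order traded near the close
linear = t                  # constant-rate schedule, a crude VWAP-like stand-in

# Aggregate chart for a trader who uses each strategy equally often.
aggregate = (front_loaded + back_loaded) / 2

# Largest gap between each chart and the constant-rate schedule.
gap_front = np.abs(front_loaded - linear).max()
gap_back = np.abs(back_loaded - linear).max()
gap_aggregate = np.abs(aggregate - linear).max()

print(f"front-loaded gap: {gap_front:.3f}")
print(f"back-loaded gap:  {gap_back:.3f}")
print(f"aggregate gap:    {gap_aggregate:.3f}")
```

Under these assumptions the aggregate sits much closer to the constant-rate schedule than either component does, even though no single order followed that schedule.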
METHODOLOGY

Our methodology uses the intuition of a progress chart when characterizing a trading strategy, but applies a common clustering technique called k-means to divide the aggregate strategy into its component strategies, in the same way that a prism divides light into its component colors (as shown in Figure 2). The process begins by creating a progress chart for each order. Specifically, for each 15-minute period in the trading day (26 in total), we compute the cumulative fraction of the order that was completed by the end of that period, i.e., the progress of the order at that point. The trading strategy itself is represented by the collection of these 26 progress points, an example of which is given in Figure 1. These charts always begin at 0% and end at 100%, and never decrease as we move from left to right along the x-axis, because they represent the order's cumulative fill progress over the day. We then apply k-means to group the charts into k distinct trading strategies.

Figure 2. The methodology takes an aggregate progress chart and splits it into its underlying component strategies.

To understand how k-means works intuitively, assume that we break the trading day into 3 bins instead of 26. For each order, we determine the percent of the order that was complete at the end of each bin. For example, suppose the trader executed a 10,000-share order by executing 2,000 shares in bin 1, 1,000 shares in bin 2, and 7,000 shares in bin 3. Our methodology would characterize this order as a progress chart with the values 20%, 30%, and 100%, representing the percent complete at the end of each bin. Since all orders are completed by the end of the last bin, all orders will have a value of 100% in bin 3. For this reason, we only need to look at the progress at the end of the first two bins when attempting to distinguish between strategies. [2] In Figure 3, we plot a sample of orders, where each black dot on the graph represents an order.
The x-axis represents the percent of the order completed by the end of bin 1, and the y-axis represents the percent completed by the end of bin 2. In the example of the 10,000-share order above, the order can be represented graphically as the dot labeled X in Figure 3A. Since this order was 20% complete at the end of bin 1 and 30% complete by the end of bin 2, the point has an x-axis value of 20% and a y-axis value of 30%.

[2] Adding the third bin, where all orders take on a value of 100%, to the k-means methodology provides no useful information for differentiating how the orders were traded, so it can be excluded without influencing the results.

Figure 3. Illustration of the k-means algorithm. In Figure 3A, the black dots are the existing, classified observations. The triangle in Figure 3B represents a new order that must be classified, and the squares represent the centers of the two existing clusters. The grey arrows show the distance between the new point and the existing cluster centers. The algorithm assigns the new point to the cluster whose center is the shortest distance from it. The black squares in Figure 3C represent the original cluster centers. The grey square is the updated center of the cluster with the additional order.

Looking at Figure 3A, there are clearly two distinct groups of dots: one cluster in the lower left quadrant and another in the upper right quadrant. Intuitively, these clusters represent the two distinct strategies that the trader used. The former represents orders that are executing slowly, i.e., those that have made relatively little progress after both bin 1 (x-axis) and bin 2 (y-axis). The latter represents orders that are being executed more quickly, where progress in both bin 1 and bin 2 is significantly higher.

In two dimensions with a small amount of data, one could do cluster analysis visually, as in Figure 3A. When the data set is large or the number of dimensions is higher, as is the case here, where we could have thousands of orders each split into 26 distinct bins, one must rely on statistical techniques to manage the clustering. This is where the k-means methodology comes into play. The k-means algorithm begins by assigning k initial cluster centers, which can be specified by the user or selected randomly by the algorithm.
Iteratively, the algorithm works through the sample, using a distance metric to assign each observation to the nearest cluster. Figure 3B provides an example of an iteration of k-means. Suppose we were to add a new observation, represented by the triangle in Figure 3B. K-means computes the distance between that point and the two existing cluster centers, represented by the squares in Figure 3B, to determine the nearest cluster. Since the triangle is closer to the left cluster, k-means assigns it to the left cluster. With the addition of a new data point, however, k-means must now compute a new cluster center. Figure 3C shows the new cluster center, represented by the grey square, which has shifted in the direction of the new observation.

The algorithm stops when the cluster centers and the assignments of observations no longer change materially. At this point, the output contains the k cluster centers, which can be used to characterize each group, as well as the assignment of each observation to a cluster. [3] In our specific application, the center of a group characterizes the average progress chart of that strategy, and the assignments indicate the strategy that each order most closely resembles.

[3] See Johnson & Wichern (2007) and MacQueen (1967) for a detailed discussion of k-means.
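
The two steps described in this section, building a progress chart per order and clustering the charts, can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `progress_chart`, `nearest`, and `kmeans` are hypothetical, fills are assumed to arrive as zero-based bin indices with share counts, distance is assumed to be Euclidean, and the centers are updated in batch after every pass over the sample rather than one observation at a time as pictured in Figure 3.

```python
import numpy as np


def progress_chart(fill_bins, fill_shares, n_bins=26):
    """Cumulative fraction of an order completed by the end of each time bin.

    fill_bins holds the zero-based bin index of each fill (assumed input layout).
    """
    per_bin = np.bincount(fill_bins, weights=fill_shares, minlength=n_bins)
    return np.cumsum(per_bin) / per_bin.sum()


def nearest(X, centers):
    """Index of the closest center (Euclidean distance) for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(charts, k, init=None, n_iter=100, tol=1e-9, seed=0):
    """Batch k-means on progress charts; returns (centers, labels)."""
    # The last bin is always 100%, so it carries no information and is dropped.
    X = np.asarray(charts, dtype=float)[:, :-1]
    if init is None:
        # Random start: pick k distinct orders as the initial centers.
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
    else:
        # User-specified start; each center omits the final bin as well.
        centers = np.asarray(init, dtype=float)
    for _ in range(n_iter):
        labels = nearest(X, centers)
        # New center = mean chart of the orders assigned to the cluster;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:  # centers have stopped moving
            break
    return centers, nearest(X, centers)


# The 10,000-share order from the text, using 3 bins.
order_x = progress_chart([0, 1, 2], [2000, 1000, 7000], n_bins=3)

# Hypothetical slow and fast orders, clustered from user-specified centers.
slow = [[0.10, 0.20, 1.0], [0.15, 0.25, 1.0], [0.20, 0.30, 1.0]]
fast = [[0.70, 0.90, 1.0], [0.75, 0.85, 1.0], [0.80, 0.95, 1.0]]
centers, labels = kmeans(slow + fast, k=2, init=[[0.0, 0.0], [1.0, 1.0]])
```

Here `order_x` reproduces the 20%, 30%, 100% chart from the three-bin example, and each row of `centers` is the average progress chart of one cluster over the first two bins. In practice an established library routine would likely be preferable to hand-written iteration; the loop is spelled out only to mirror the assign-then-update description above.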
EXAMPLE

To demonstrate the methodology's effectiveness, we apply it to a sample of orders sent to two different algorithms over two different trading horizons to determine whether it can identify the four distinct combinations of algorithm and trading horizon. Specifically, the sample includes both half-day and full-day [4] not-held market orders sent to either a VWAP or an implementation shortfall (IS) algorithm [5] between January 1, 2011 and September 31, 2011. We limit our sample to orders greater than five hundred shares, to ensure that orders were worked over time and not executed in one slice by the algorithm.

With no strategy context, k-means identified the four trading strategies and classified orders within them with a high degree of accuracy. The results in Figure 4 show the trading strategies identified in the sample that together make up the VWAP-like aggregate progress chart shown in Figure 1. Figure 4A represents half-day VWAP orders, Figure 4B represents full-day VWAP orders, Figure 4C represents full-day IS algo orders (those starting before 9:40 AM), and Figure 4D represents half-day IS algo orders. K-means was able to classify over 98% of the orders correctly. As shown in Table 1, VWAP orders were correctly identified more than 99.5% of the time, and IS orders were correctly identified more than 98% of the time. Therefore, k-means was able both to identify the four different strategies correctly and to assign orders to each strategy with precision.

Figure 4. Trading styles identified from post-trade data; example results for sample full- and half-day VWAP and IS algo orders.

Table 1. Accuracy of k-means in assigning orders to strategies.

Order Type       Accuracy
Half-Day VWAP    99.73%
Full-Day VWAP    99.54%
Full-Day IS      98.58%
Half-Day IS      98.19%

[4] Orders considered full-day arrived before 9:40 AM; orders considered half-day arrived between 12:00 and 12:50 PM. All VWAP orders ended after 3:20 PM, but there was no restriction on the end time of IS orders.

[5] Specifically, we include orders sent to ITG Active Algorithm, a single-stock implementation shortfall algorithm.

APPLICATIONS

This methodology can be used to assess trader performance in several ways. First, k-means can be used to identify underlying trading strategies for large client orders. Figure 5 shows the output for a hypothetical client. For this client, we see three distinct fill trajectories: trading into the close (strategy A), front-loaded trading (strategy B), and participation-based trading throughout the day (strategy C). Another benefit of k-means is the ability to uncover less dominant strategies used by a trader. This is evidenced in Table 2, which shows that only 5% of value was executed via strategy C. Here, k-means uncovered a minority strategy that may have been overlooked in a traditional analysis. In effect, our methodology gives traders the ability to experiment with trading strategies in real time without having to change their workflow to capture any strategy-level information.

Figure 5. Hypothetical client trades aggregated over the day and grouped by style via k-means. Three distinct trading strategies emerge from the data.

Second, for desks with multiple traders, k-means can be used to help characterize strategies by trader. The diagrams in Figure 6 show trader usage of the strategies identified by k-means. For example, we can see that Trader 1 is the dominant user of strategy C, but C makes up only 25% of Trader 1's trading. Using the k-means results, we can report how often each strategy was used and understand the trades composing each strategy by trader, fund, order size, market capitalization, time period, market conditions, or any combination thereof.

Figure 6. Breakdown of trader usage of strategies for the hypothetical client analysis shown in Figure 5 and Table 2. Traders within strategies (Figure 6B) and strategies within traders (Figure 6A).

Beyond usage patterns, the k-means output allows us to evaluate trades according to appropriate benchmarks, identifying which strategies are most successful. Why compare all executions to the close benchmark if 10% of orders were actually front-loaded and 5% traded in a VWAP algorithm? The k-means results implicitly suggest the benchmark a given trader may have been targeting, which can help to better evaluate performance. For example, Trader 1 may use strategy A when benchmarked to the close, B when benchmarked to the open, and C when benchmarked to VWAP.
Table 2 indicates that strategy A is performing well versus the close benchmark, strategy B is performing well versus arrival and open, and strategy C is performing well versus VWAP benchmarks. These results are intuitive since traders likely target different benchmarks with different strategies. The ability to infer benchmarks is especially useful for traders whose systems do not permit benchmark information to flow to their post-trade databases.

Table 2. Performance results (bps) for hypothetical client orders grouped into the trading styles illustrated in Figure 5.

Strategy  Orders   % Value  Arrival  Open  Close  Prev. Close  Day VWAP  Interval VWAP
A         10,334   46%      -3       -1    1      1            -8        8
B         17,957   49%      -6       4     -2     -12          -2        2
C         3,940    5%       -17      -13   -6     -9           2         1

Finally, our methodology can help to evaluate trader performance in the context of the underlying trading strategies. If a given trader is under- or outperforming his peers, our methodology can help identify the strategies driving his relative performance. For example, if Trader 1 strongly underperforms his peers, it may be due to his overuse of strategy C, which Table 2 shows is the worst-performing strategy relative to the pre-trade cost benchmark. More generally, Table 2 shows which strategies do best against each benchmark, implicitly suggesting how to execute future trades.

CONCLUSION

In this paper, we provide a new methodology for identifying trading strategies using only post-trade data. Specifically, we apply a well-established statistical technique called k-means both to identify the primary strategies used by a trader and to classify each order into one of these strategies. This approach is particularly useful since it does not require changes to trader workflows or post-trade systems to capture strategy or benchmark information. Once the underlying strategies have been identified and orders classified, TCA can be done by strategy. Analysis by strategy is crucial because the choice of strategy can often be the primary determinant of a trader's performance. Visual representations of the underlying strategies naturally suggest the trader's benchmark, yielding relevant and useful analysis.
Results can be communicated both visually and numerically, making this a practical tool for any trader.
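
As a closing illustration of TCA by strategy, the sketch below aggregates classified orders into a per-strategy report. It is only a sketch under assumptions: the order records are made-up sample data, the function name `tca_by_strategy` is hypothetical, and performance is assumed to be value-weighted, which the paper does not specify.

```python
from collections import defaultdict

# Hypothetical classified orders (illustrative data only):
# (strategy label from k-means, traded value, performance in bps vs. a benchmark)
orders = [
    ("A", 1_000_000, 2.0),
    ("A", 3_000_000, -1.0),
    ("B", 2_000_000, 4.0),
    ("C", 500_000, -6.0),
]


def tca_by_strategy(orders):
    """Share of value and value-weighted performance (bps) for each strategy."""
    value = defaultdict(float)
    weighted_bps = defaultdict(float)
    for strategy, traded_value, bps in orders:
        value[strategy] += traded_value
        weighted_bps[strategy] += traded_value * bps
    total = sum(value.values())
    return {
        s: {"pct_value": value[s] / total, "perf_bps": weighted_bps[s] / value[s]}
        for s in sorted(value)
    }


report = tca_by_strategy(orders)
for strategy, row in report.items():
    print(f"{strategy}: {row['pct_value']:.1%} of value, {row['perf_bps']:+.2f} bps")
```

Adding a trader identifier to each record and grouping on the pair (trader, strategy) would extend the same pattern to the trader-level breakdowns discussed under APPLICATIONS.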
REFERENCES

Johnson, R. A., and D. W. Wichern. Applied Multivariate Statistical Analysis, Sixth Edition. Upper Saddle River, New Jersey: Pearson Prentice Hall, 2007.

MacQueen, J. B. Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, 1, Berkeley, CA: University of California Press (1967), 281-297.

© 2012 Investment Technology Group, Inc. All rights reserved. Not to be reproduced or retransmitted without permission. 50112-22067

Broker-dealer products and services offered by ITG Inc., member FINRA, SIPC. These materials are for informational purposes only, and are not intended to be used for trading or investment purposes or as an offer to sell or the solicitation of an offer to buy any security or financial product. The information contained herein has been taken from trade and statistical services and other sources we deem reliable but we do not represent that such information is accurate or complete and it should not be relied upon as such. No guarantee or warranty is made as to the reasonableness of the assumptions or the accuracy of the models or market data used by ITG or the actual results that may be achieved. These materials do not provide any form of advice (investment, tax or legal). ITG Inc. is not a registered investment adviser and does not provide investment advice or recommendations to buy or sell securities, to hire any investment adviser or to pursue any investment or trading strategy. The positions taken in this document reflect the judgment of the individual author(s) and are not necessarily those of ITG.