Whitepaper Latency Management For Co-Location Trading
Contents
Introduction
Intra Co-location Latency
The Latency Matrix
Inter Co-Location Latency
Exchange-Side Latency Monitoring and Transparency
Implementing Co-Lo Latency Management
Introduction

Co-location and proximity services allow market participants to significantly reduce the impact of latency on trading, typically taking market access times down into the microsecond range, and soon into the nanosecond range. But co-location does not eliminate latency from other sources, including processing times within the exchange and reaction times within the participant's own installed systems. Network latency also remains a critical factor affecting multi-venue trading and traders using multiple co-location centers.

Understanding and providing insight into these aspects of latency can bring significant benefits to co-location participants. Finding and eliminating latency bottlenecks within installed systems ensures they are fast enough to reap the full benefits of co-location. Monitoring the speed of interactions with the exchange can reveal faster ways to trade, as well as showing how to adapt strategy behavior in response to changing latency conditions in different parts of the market. For multi-venue and multi-co-location traders, deciding where to deploy strategies and how to route order-flow is easier when you have accurate knowledge of the latency matrix between different venues and co-location centers.

This paper discusses these benefits in more detail and explains how to instrument the co-location environment for high-precision latency management. We have divided the discussion into four sections:

1. Intra Co-location Latency: within the participant's own installation
2. The Latency Matrix: for multi-venue trading strategies
3. Inter Co-location Latency: how to monitor network latency between co-lo points
4. Exchange-Side Latency Monitoring and Transparency: rationale and requirements for provision of latency transparency

Intra Co-location Latency

Intra co-location latency refers to all aspects of latency occurring within the installed systems at a co-location point.
For traders, intra-co-location latency means the time it takes their strategies to identify and respond to opportunities presented by the market. For a co-located direct market access (DMA) provider, it means the time taken to process and forward buy-side client orders and responses from the exchange. To reap maximum benefit from co-location, the latency within the installation must be small compared to the time taken to access the market, and typically this means microseconds.

Monitoring intra-co-location latency allows participants to verify that their systems are appropriately fast and to demonstrate speed to their customers or strategy developers. Any slow-down in system performance, whether caused by equipment degradation or particular patterns of market activity, can be spotted as soon as it happens. Using the right instrumentation also allows co-lo clients to identify and eliminate system bottlenecks.
Figure 1: End-to-end latency metrics for co-located trading and DMA installations. (Diagram labels: Exchange, Market Data, Orders, Tick to Trade Latency, Co-Lo Trading Plant, Co-Lo DMA, Session, Risk Gateway, Customer Order-Flow, Order Forwarding Latency.)

Best practice is to ensure, at a minimum, that the end-to-end latency across the installation can be determined, including all contributions from the network and the network stack. For traders, relevant end-to-end latency metrics include the time from the arrival into the co-location installation of a market data tick that triggers an order to the delivery of the resulting order back into the exchange. For a DMA participant, a key end-to-end metric will be the latency from arrival of a buy-side customer order at the edge of their system to the delivery and execution of that order by the exchange. These metrics can be augmented where appropriate with hop-by-hop measurements that reveal how long each component within the installation takes to do its job.

Correct measurement of these metrics requires the use of a specialized network-attached monitoring system that can capture and timestamp messages at the very edge of the co-location installation, thus capturing latency due to network effects as well as application-level latency. Hardware time-stamping is required in preference to software time-stamping to accurately measure microsecond latencies.
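As a rough illustration of the end-to-end and hop-by-hop metrics described above, the following Python sketch derives both from one set of timestamps. It assumes the monitoring system has already correlated a market data tick with the order it triggered and recorded a hardware timestamp (in nanoseconds) at each tap point. The `Capture` type and the tap names `edge_in`, `strategy_out` and `edge_out` are hypothetical and not taken from any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    """One observation of a correlated message at a tap point (hypothetical type)."""
    point: str   # hypothetical tap name, e.g. "edge_in"
    ts_ns: int   # hardware timestamp in nanoseconds (assumed already synchronized)


def hop_latencies_ns(captures, path):
    """Return {(from_point, to_point): latency_ns} for consecutive points on path."""
    by_point = {c.point: c.ts_ns for c in captures}
    return {(a, b): by_point[b] - by_point[a] for a, b in zip(path, path[1:])}


def end_to_end_ns(captures, first, last):
    """Return the latency between the first and last tap points."""
    by_point = {c.point: c.ts_ns for c in captures}
    return by_point[last] - by_point[first]


# Illustrative values only: a tick enters the installation, the strategy emits
# an order, and the order leaves the installation towards the exchange.
captures = [
    Capture("edge_in", 1_000_000),
    Capture("strategy_out", 1_012_500),
    Capture("edge_out", 1_018_000),
]
path = ["edge_in", "strategy_out", "edge_out"]

hops = hop_latencies_ns(captures, path)
tick_to_trade = end_to_end_ns(captures, "edge_in", "edge_out")
```

With these illustrative values the tick-to-trade latency is 18,000 ns, split into 12,500 ns and 5,500 ns across the two hops. Because both views come from the same captures, the hop values always sum to the end-to-end figure.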
While network-attached monitoring is needed for accuracy reasons, the monitoring system must also be able to understand application-level logic within the messages that it sees in order to track latency across complex transformations. Following messages through the tick-to-trade process, or across a DMA system that transforms orders, requires a system that can understand the order-flow, market data and internal messaging protocols used in trading and how they relate to each other.

The ability to measure both end-to-end and hop-by-hop performance allows the co-location participant to review latency on a per-customer or per-strategy basis, and also to see how latency breaks out across different infrastructure components. The first view is important for understanding end-user performance, while the second is important for improving operations. Latency management should support multiple views into the measured data so that the same underlying measurements can provide both of these views and others.

The Latency Matrix

Traders moving to multi-venue and multi-asset strategies are faced with the challenge of managing latency between their various co-location points, at multiple venues. We refer to this as inter co-location latency.

Figure 2: The latency matrix captures order-flow and market data latency between each co-lo point and each trading venue. (Diagram labels: Market 1, NYSE Co-Lo Mahwah; Market 2, Exchange 2 Co-Lo Carteret; Market 3, Exchange 3 Co-Lo Secaucus; Market 4, Exchange 4 Co-Lo Weehawken; CNE; Network; Market Data & Order Flow; Corvil Management Console Unified View.)
Strategies must be deployed in locations where they have fast access to the venues they need to trade and the market data feeds they need to consume. Identifying the best location for a multi-venue strategy that uses information from several feeds can be difficult. In addition, each location may have access to several copies of a market data feed, for example A and B sides of the feed and copies carried over different networks. Selecting the fastest copy is important to avoid putting your strategies at an unnecessary disadvantage.

Traders can make informed decisions to meet these challenges when they have access to the venue/co-lo latency matrix, i.e. detailed measurements of latency between each co-location point and each trading venue. The latency matrix provides both the order-entry latency and the relative latency of each market data feed at each location. Latency matrix data, built up over a period of time, tells traders where the fastest deployment point is for their strategy and which copy of each feed is the fastest at that location. In live form, real-time access to the latency matrix tells you how to route order-flow for fastest execution.

The order-flow component of the latency matrix can be assembled by monitoring request-response times and message rates at each co-location point on a per-venue basis. At the application level, the relevant request-response transactions include order-to-acknowledgement, quote-to-acknowledgement, cancel-to-confirm (U-R-OUT), and order to market data update. Any or all of these transactions can be important, depending on trading style.

Market data latency is traditionally considered harder to measure due to the unidirectional nature of the application (and higher volumes).
The required elements of the latency matrix can however be assembled by monitoring the relative latency of feeds at different locations. Normally, the co-location point that is closest to the feed source will receive the data first. Relative latency at other locations can be determined by comparing the arrival time of market data updates to their arrival time at the closest co-lo. Where multiple copies of a feed are received, it's sufficient to compare update arrival times locally. This approach does not tell you the absolute latency of your market data, but it does allow you to compare the relative speeds of all feeds at all locations.
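The two components of the latency matrix described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any particular product works. Order-flow latency is taken as the time between an order and its acknowledgement, paired by a hypothetical `order_id`. Market data latency is taken relative to a chosen reference location, matching updates by a hypothetical sequence number and assuming the clocks at all locations are already synchronized. All function and field names are illustrative.

```python
from collections import defaultdict
from statistics import median


def order_to_ack_latency_us(events):
    """Pair each order with its acknowledgement and group latencies by venue.

    events: iterable of (ts_us, venue, order_id, kind), kind is "order" or "ack".
    Returns {venue: [latency_us, ...]}.
    """
    pending = {}
    result = defaultdict(list)
    for ts_us, venue, order_id, kind in sorted(events):
        key = (venue, order_id)
        if kind == "order":
            pending[key] = ts_us
        elif kind == "ack" and key in pending:
            result[venue].append(ts_us - pending.pop(key))
    return dict(result)


def relative_feed_latency_us(arrivals, reference):
    """Median arrival delay of each location relative to the reference location.

    arrivals: {location: {sequence_number: arrival_ts_us}}.
    Only updates seen at both the location and the reference are compared.
    """
    ref = arrivals[reference]
    out = {}
    for location, seen in arrivals.items():
        deltas = [seen[seq] - ref[seq] for seq in seen.keys() & ref.keys()]
        if deltas:
            out[location] = median(deltas)
    return out


# Illustrative values only.
events = [
    (100, "venue_a", "o1", "order"), (180, "venue_a", "o1", "ack"),
    (200, "venue_b", "o2", "order"), (450, "venue_b", "o2", "ack"),
    (300, "venue_a", "o3", "order"), (390, "venue_a", "o3", "ack"),
]
order_row = {v: median(l) for v, l in order_to_ack_latency_us(events).items()}

arrivals = {
    "colo_near": {1: 1000, 2: 2000, 3: 3000},
    "colo_far": {1: 1350, 2: 2340, 3: 3360},
}
feed_column = relative_feed_latency_us(arrivals, reference="colo_near")
```

The median is used here only as one reasonable summary; percentiles or maxima may matter more for a given trading style. Comparing two copies of the same feed at a single location works the same way, with one copy chosen as the reference, and in that case no cross-site clock synchronization is needed.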
Implementing these comparisons involves sharing information and timestamps about market data updates across different sites. A very lightweight communication protocol must be used for this purpose to avoid adding network load. The issue of clock synchronization must also be tackled, since the timestamps come from different clocks. Two different approaches have emerged to tackle precision clock synchronization:

1. External synchronization: this uses a separate, external clock synchronization infrastructure across all installations, based for example on GPS-synchronized time distributed locally via PPS (note that the millisecond-level precision provided by NTP is not sufficient for trading purposes).
2. Auto synchronization: this uses a self-synchronizing monitoring solution that handles synchronization between its own components internally, without requiring an external synchronization source.

Corvil supports both methods, but in practice we have found external synchronization infrastructures with universal coverage to be expensive and time-consuming to build. Because of this, self-synchronized monitoring is normally preferable everywhere except those places where external synchronization is already available.

Inter Co-Location Latency

Multi co-location trading requires a high performance, low latency network to interconnect strategies, venues and feeds. Today there is a plethora of different network offerings to choose from, but the cost/performance trade-off of different choices is often far from obvious. The truth is that network performance is affected by a broad range of factors, ranging from the underlying technology used to the geographical distances involved and the amount of load on the network relative to the provisioned capacity.
Direct monitoring of the network under actual production load provides a reliable way to:

- determine true performance
- assess the different options available
- assure expected performance

For the purpose of service level monitoring between their own data centers and co-location points, traders should aim to have greater visibility into network performance than the service operators themselves. This is feasible today using technology that tracks latency and loss with microsecond precision for every single packet sent between network end-points. The techniques used are the same as those discussed in the previous section for monitoring the relative arrival times of market data updates at different places.
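Per-packet tracking of latency and loss between two end-points can be illustrated with a short Python sketch. It assumes each packet can be identified at both ends by a hypothetical `packet_id`, and that the offset between the two capture clocks is known (zero if they are synchronized). The function and parameter names are illustrative only.

```python
def one_way_stats(sent, received, clock_offset_us=0):
    """Compute per-packet one-way latency and loss between two capture points.

    sent, received: {packet_id: ts_us} captured at the sending and receiving end.
    clock_offset_us: receiver clock minus sender clock (assumed known).
    Returns (latencies, lost, loss_ratio).
    """
    latencies = {
        pid: received[pid] - clock_offset_us - ts
        for pid, ts in sent.items()
        if pid in received
    }
    lost = sorted(pid for pid in sent if pid not in received)
    loss_ratio = len(lost) / len(sent) if sent else 0.0
    return latencies, lost, loss_ratio


# Illustrative values only: packet 3 never arrives, and the receiving clock
# runs 20 microseconds ahead of the sending clock.
sent = {1: 0, 2: 100, 3: 200, 4: 300}
received = {1: 520, 2: 618, 4: 835}
latencies, lost, loss_ratio = one_way_stats(sent, received, clock_offset_us=20)
```

In this example an uncorrected offset of 20 microseconds would inflate every one-way latency by the same amount, which is why clock synchronization has to be addressed before one-way figures from different sites can be compared.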
Figure 3: Exchange latency monitoring typically looks at performance across different sessions, transaction types and segments of the market. (Diagram labels: Away Markets, Exchange Matching Engine, Gateways, Co-Lo, Market Order, Limit Order, New, Cancel, Replace, Ack, Reject, Fill, U-R-OUT, Market Data, Order Flow Latency.)

The quality of inbound market data can also be assessed by looking for sequence gaps in the feed as it enters the infrastructure. Gap detection at the edge is essential for tracking down where missing data has been lost; very often, trading firms struggle to eliminate packet drops within their own systems without realizing that the gaps they see were already present in the inbound feed.

Apart from service level assurance, a second important use case for network monitoring is to understand the contribution of the network to trading latency. Identifying this contribution tells you whether a high measured latency value to a particular trading venue can be eliminated by moving your strategy closer. Conversely, if total latency is high but the network contribution is low, then the problem is due to application performance and eliminating network latency will bring no advantage.

Visibility into network impact on order-flow latency can be achieved by monitoring TCP-level transaction times and retransmissions. Note however that many general-purpose TCP monitoring solutions are not suitable for use with long-lived trading connections, because they derive their results mainly from observation of the connection set-up handshake (an event that happens only once per day in trading environments). An appropriate solution must monitor connection performance continuously throughout the day, meaning that it must have intelligence built in to deal with stack idiosyncrasies such as delayed acknowledgements that general-purpose solutions prefer to avoid.

Finally, traders will need to determine the right capacity for their network connections given the quantity of market data, order-flow and other traffic load that exists at each installed co-lo point. A key point to be aware of here is that trading traffic of all types is dominated by short-timescale microbursts, i.e. traffic rates can hit very high values over short periods of time even when the average rate over seconds and minutes is relatively low. The network (and indeed, other systems in the data path) needs to be sized to accommodate these short bursts. Otherwise messages arriving in bursts will be forced to queue while they wait to be processed, which adds latency.

A starting point for capacity sizing is to instrument for microburst detection by measuring bit-rates at timescales of microseconds or below. If a network connection is struggling to cope with microbursts, you will be able to see this as the bit-rate repeatedly hits the connection speed. It is also possible to determine the capacity needed to keep queuing latency below a specified target for the actual traffic load you have, using techniques from queuing theory. Advanced monitoring systems provide both of these capabilities.

Exchange-Side Latency Monitoring and Transparency

While latency within exchange systems is unavoidable even for traders who are co-located, most will not be concerned so long as levels are low and affect everyone equally. Understanding how latency affects various transactions and responses within the exchange helps traders to verify the potential impact on different trading styles and strategies.
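Returning briefly to the network monitoring techniques discussed in the Inter Co-Location Latency section, the Python sketch below illustrates sequence gap detection, microburst detection and a simple capacity check. It rests on several assumptions that are not taken from the text: feed sequence numbers increase by one per message, packets are given as `(timestamp_us, size_bytes)` pairs, and queuing is modelled as a single first-in first-out queue drained at the link rate. That queue model is a deliberately simple stand-in for the queuing theory techniques mentioned, and all names are hypothetical.

```python
from collections import defaultdict


def find_sequence_gaps(seq_numbers):
    """Return [(first_missing, last_missing), ...] for a feed seen in arrival order."""
    gaps = []
    expected = None
    for seq in seq_numbers:
        if expected is not None and seq > expected:
            gaps.append((expected, seq - 1))
        expected = seq + 1 if expected is None else max(expected, seq + 1)
    return gaps


def windowed_rates_bps(packets, window_us):
    """Bit-rate per time window. Returns {window_index: bits_per_second}."""
    bits = defaultdict(int)
    for ts_us, size_bytes in packets:
        bits[ts_us // window_us] += size_bytes * 8
    return {w: b * 1_000_000 / window_us for w, b in sorted(bits.items())}


def max_queue_delay_us(packets, link_bps):
    """Worst waiting time of any packet in a FIFO queue drained at link_bps."""
    link_free_at = 0.0  # time at which the link finishes all earlier packets
    worst = 0.0
    for ts_us, size_bytes in sorted(packets):
        start = max(ts_us, link_free_at)
        worst = max(worst, start - ts_us)
        link_free_at = start + size_bytes * 8 * 1_000_000 / link_bps
    return worst


# Illustrative burst: ten 1250-byte packets, one per microsecond, then silence.
burst = [(t, 1250) for t in range(10)]

rates = windowed_rates_bps(burst, window_us=100)          # rate inside the burst window
delay_1g = max_queue_delay_us(burst, link_bps=1_000_000_000)
delay_10g = max_queue_delay_us(burst, link_bps=10_000_000_000)
gaps = find_sequence_gaps([1, 2, 3, 6, 7, 10])
```

In this example the burst carries only 100,000 bits in total, so its average rate over a full second is negligible, yet measured over a 100 microsecond window it reaches 1 Gb/s. On a 1 Gb/s link the last packet waits 81 microseconds in the queue, while on a 10 Gb/s link no packet waits at all. Repeating the delay calculation for candidate link speeds against captured production traffic gives one way to find the capacity that keeps queuing latency under a chosen target.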
Exchange-side latency monitoring aims to provide the following views to the co-located trader:

- Latency across different trading sessions, to verify equitable trading speeds;
- Latency broken down by security/symbol, providing insight into performance in different parts of the market;
- Latency for different order types and interaction styles, allowing impact on different strategies to be compared.

The required views can be assembled by continuously monitoring request-response and trade-to-tick latency at the edge of the co-lo installation. Results can be collected for every order/interaction and then classified according to trading session, traded security and request/response type to provide the above breakdowns. Producing results in real-time is an advantage as they can be used immediately to move order-flow to faster sessions, or adapt strategy behavior.
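The classification step described above amounts to grouping the same latency samples by different keys. A minimal Python sketch follows, assuming each measured interaction has been reduced to a record with hypothetical fields `session`, `symbol`, `type` and `latency_us`. The median is used only as an example summary statistic.

```python
from collections import defaultdict
from statistics import median


def breakdown(samples, key):
    """Group latency samples by one field and return {field_value: median_latency_us}."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[key]].append(sample["latency_us"])
    return {value: median(latencies) for value, latencies in groups.items()}


# Illustrative values only.
samples = [
    {"session": "S1", "symbol": "ABC", "type": "new", "latency_us": 120},
    {"session": "S1", "symbol": "XYZ", "type": "cancel", "latency_us": 140},
    {"session": "S2", "symbol": "ABC", "type": "new", "latency_us": 310},
    {"session": "S2", "symbol": "XYZ", "type": "new", "latency_us": 290},
]

by_session = breakdown(samples, "session")
by_symbol = breakdown(samples, "symbol")
by_type = breakdown(samples, "type")

# One possible real-time use: prefer the session that is currently fastest.
fastest_session = min(by_session, key=by_session.get)
```

Because all three views are derived from the same underlying samples, they stay consistent with each other, and new breakdowns can be added without changing how the measurements are collected.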
Implementing Co-Lo Latency Management

We've discussed numerous different latency metrics that are beneficial to co-location participants, for the purposes of:

- Designing and planning strategies
- Monitoring operations and rapid troubleshooting
- Reporting on speed and using latency for trade optimization

Producing these metrics in an environment of ever-growing message rates requires considerable processing power. Dedicated latency management systems that use passive monitoring provide an independent way to produce the data without interfering with the trading process itself. A dedicated system for co-lo deployment must be compact and efficient in terms of power and space requirements.

We believe that the best solution will be one that supports all of the use cases discussed in this document on a single platform. To achieve this goal, key system capabilities should include:

- Accurate measurement of microsecond intra-co-location latency, with the ability to decode and track complex business logic in multi-protocol environments.
- Ability to monitor the latency matrix for both market data and order-flow traffic between multiple venues and co-lo sites, including self-synchronization support for sites that don't have external clock-sync infrastructure.
- Comprehensive transactional latency measurement for order-flow traffic, including support for order-to-acknowledgement, order-to-fill, tick-to-trade and trade-to-tick latency.
- Built-in support for low-latency network monitoring, including detection of microbursts, sequence gaps, bandwidth requirements, and one-way and round-trip network latency.

To support operational alerting and trading functions such as smart order routing, the latency management system must be able to present its data in real-time. In today's environment this can mean producing and classifying real-time output for millions of messages per second at each instrumentation point. Performance and scalability are therefore key system requirements.
CorvilNet is an example of a latency management system that meets the requirements listed above, among others. It is implemented using network-attached appliances in each co-location point that provide large amounts of computational power in a compact form factor. Each multi-functional appliance delivers a broad range of application-level and network-level measurements for up to several million messages per second. In a multi-co-location setting the appliances communicate with each other using lightweight protocols to determine distributed metrics such as relative market data latency. The data produced can be accessed through a web GUI, through a web-services API, or through a variety of alerting functions, and the entire system can be managed from a single centralized platform. CorvilNet demonstrates the increasingly sophisticated capabilities of dedicated latency management and its applicability to co-located high performance trading.
Corvil Ltd, 6 George's Dock, IFSC, Dublin 1, Ireland
T +353 1 859 1000 | Tech Support +353 1 859 1010 | E info@corvil.com | W www.corvil.com
CWP-1011-1
Copyright 2010 Corvil Ltd. Corvil is a registered trademark of Corvil Ltd. All other brand or product names are trademarks of their respective holders.