WHITE PAPER Understanding Flow and Packet Deduplication Riverbed Technical Marketing
© 2012 Riverbed Technology. All rights reserved. Riverbed, Cloud Steelhead, Granite, Interceptor, RiOS, Steelhead, Think Fast, Virtual Steelhead, Whitewater, Mazu, Cascade, Cascade Pilot, Shark, AirPcap, SkipWare, TurboCap, WinPcap, Wireshark, and Stingray are trademarks or registered trademarks of Riverbed Technology, Inc. in the United States and other countries. Riverbed and any Riverbed product or service name or logo used herein are trademarks of Riverbed Technology. All other trademarks used herein belong to their respective owners. The trademarks and logos displayed herein cannot be used without the prior written consent of Riverbed Technology or their respective owners. Akamai and the Akamai wave logo are registered trademarks of Akamai Technologies, Inc. SureRoute is a service mark of Akamai. Apple and Mac are registered trademarks of Apple, Incorporated in the United States and in other countries. Cisco is a registered trademark of Cisco Systems, Inc. and its affiliates in the United States and in other countries. EMC, Symmetrix, and SRDF are registered trademarks of EMC Corporation and its affiliates in the United States and in other countries. IBM, iseries, and AS/400 are registered trademarks of IBM Corporation and its affiliates in the United States and in other countries. Linux is a trademark of Linus Torvalds in the United States and in other countries. Microsoft, Windows, Vista, Outlook, and Internet Explorer are trademarks or registered trademarks of Microsoft Corporation in the United States and in other countries. Oracle and JInitiator are trademarks or registered trademarks of Oracle Corporation in the United States and in other countries. UNIX is a registered trademark in the United States and in other countries, exclusively licensed through X/Open Company, Ltd. VMware, ESX, ESXi are trademarks or registered trademarks of VMware, Incorporated in the United States and in other countries.
This paper describes two different concepts used by the Riverbed Cascade product family architecture: flow deduplication, which is used by Riverbed Cascade Gateway software and Riverbed Cascade Profiler appliances, and packet deduplication, which is used by Riverbed Cascade Shark products and Riverbed Cascade Sensor appliances.

What is a flow?

A flow is a set of IP packets in the network that all share a common set of attributes. A typical flow is based on the 5-tuple:

1. Source IP
2. Destination IP
3. Protocol
4. Source Port
5. Destination Port

It also includes additional information such as:

- Number of bytes transmitted
- Number of packets transmitted
- Inbound and outbound interfaces
- CoS/QoS markings
- TCP flags used

In general, a flow is unidirectional, e.g. describing only half of a TCP connection. A flow may be defined by only a subset of the available attributes, such as just <SrcIP, DstIP>.

Who exports a flow?

- Most enterprise-class routers
- Cascade Sensor appliance
- Cascade Shark appliance
- Some switches (Layer 3 switches)
- WAN optimizers (Riverbed Steelhead products, Juniper)
- Some other devices (Packeteer, nProbe)

Types of flow Riverbed Cascade supports

Type of Flow: NetFlow v5
Description: Widely in use, supported by multiple vendors. Fixed-content flow record with basic counters/info. Generally supports ingress only.
Supported Vendors: Cisco

Type of Flow: NetFlow v9
Description: Drastic increase in available fields. Templates allow customization of data collected. Official support for ingress and egress flows.
Supported Vendors: Cisco

Type of Flow: J-Flow
Description: NetFlow-like variants generally look like NetFlow v5.
Supported Vendors: Juniper

Type of Flow: Bluecoat Packeteer FDR
Description: Includes flow record values plus layer-7 identifier.
Supported Vendors: Bluecoat

Type of Flow: sFlow
Description: sFlow uses sampled packets for network monitoring.
Supported Vendors: HP, Brocade, Extreme Networks, Nortel

Type of Flow: IPFIX (IP Flow Information Export)
Description: Similar to the NetFlow protocol. IPFIX considers a flow to be any number of packets observed in a specific timeslot and sharing a number of properties, e.g. same source, same destination, same protocol, etc.

Type of Flow: VMware NetFlow
Description: Similar to NetFlow v5. VxLAN (Virtual Extensible LAN) information.
Supported Vendors: VMware

Type of Flow: Steelhead Cascade Flow
Description: Performance metrics (network RTT / response time). WAN interface identification. TCP retransmissions.

Type of Flow: Cascade Sensor Flow
Description: NetFlow variant for Cascade use. Includes L7 application tag. Includes performance metrics (network RTT / response time). TCP retransmissions.

Type of Flow: Cascade Shark Flow
Description: NetFlow-like variant for Cascade use. Includes performance metrics (network RTT / response time). Includes TCP retransmissions.

What is Flow Deduplication?

Flow deduplication is the process of collating and normalizing reports from multiple sources about the same flow. Multiple flow exporters in the network can report to the same flow collector (i.e. Cascade Gateway), which can result in multiple flow records describing the same network traffic.

Why Flow Deduplication and Coalescing

A typical client-to-server connection will traverse several segments of the network, often both LAN and WAN. Each segment has the possibility of congestion or packet loss, and each router may cause queuing loss or QoS changes. A connection may also take an asymmetric path, meaning the client-to-server path isn't the same as the server-to-client path. All of these factors are part of the daily realities that engineers must deal with during the course of troubleshooting.

While all vendors enable you to report on the traffic seen at a given observation point (a NetFlow source or a probe on the wire), this leaves operators with two, three, or even eight or more individual reports to examine for each connection. Even the simple example diagram above would give operators 4 different values, based on which observation point was being reported on. Needing to know what path a conversation took, and manually reconciling reports from each observation point, is a very cumbersome and lengthy process.
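As a rough illustration of the idea, the sketch below groups flow records from several exporters by their 5-tuple and keeps one record per flow. The FlowReport and DedupedFlow types, the exporter and interface names, and the rule of taking the largest reported byte count as the flow's volume are all assumptions made for this example; this paper does not describe how Cascade Gateway or Cascade Profiler implement deduplication internally.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, NamedTuple


class FlowKey(NamedTuple):
    """The 5-tuple that identifies a unidirectional flow."""
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


@dataclass
class FlowReport:
    """One exporter's record of a flow (hypothetical, simplified schema)."""
    exporter: str      # device that exported the record
    interface: str     # interface on which the flow was observed
    key: FlowKey
    byte_count: int
    packet_count: int


@dataclass
class DedupedFlow:
    """A single logical flow plus every observation that described it."""
    key: FlowKey
    byte_count: int
    packet_count: int
    observations: List[FlowReport]


def deduplicate(reports: Iterable[FlowReport]) -> Dict[FlowKey, DedupedFlow]:
    """Collapse reports that share a 5-tuple into one record per flow."""
    grouped: Dict[FlowKey, List[FlowReport]] = defaultdict(list)
    for report in reports:
        grouped[report.key].append(report)

    flows: Dict[FlowKey, DedupedFlow] = {}
    for key, group in grouped.items():
        # Assumption for this sketch: the report with the largest byte
        # count represents the flow's volume. Volumes are never summed
        # across exporters, because each exporter saw the same traffic.
        best = max(group, key=lambda r: r.byte_count)
        flows[key] = DedupedFlow(key, best.byte_count, best.packet_count, group)
    return flows


# Three routers along the path report the same TCP flow (protocol 6).
key = FlowKey("10.0.0.5", "192.168.1.20", 6, 49152, 443)
reports = [
    FlowReport("router-a", "Gi0/1", key, 1500, 10),
    FlowReport("router-b", "Gi0/2", key, 1500, 10),
    FlowReport("router-c", "Gi0/0", key, 1480, 10),
]
flows = deduplicate(reports)
```

In this example, summing the three reports naively would attribute 4480 bytes to a conversation that carried about 1500. Keeping the individual observations alongside the single record means the per-exporter and per-interface detail is still available for reporting.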
When a Cascade Profiler appliance sees multiple flow exporters in the network, each reporting the same conversation, it automatically deduplicates that traffic into a single record. Note that Cascade takes care to preserve any per-interface data.

Further, Cascade recognizes that data beyond NetFlow may have value. Cascade Sharks, Sensors, WAN optimizers, shapers, capture appliances, load balancers, and more may all also see the conversation and have valuable additional data to share about it. This process of integrating additional connection metrics, such as network round-trip time, server delay, layer-7 application name, and more, is called data coalescing. Cascade can examine a conversation end-to-end and report the path taken in each direction, as well as byte counts, QoS markings, round-trip times, etc. And because some of these factors, such as round-trip time, are pervasive for the connection, even an interface that did not measure RTT (a simple NetFlow exporter) can still be aware of it when reporting.

Benefits of flow deduplication and coalescing

- End-to-end visibility for a connection
- Report on any element or component in the network (IP address of server, TCP port, application, QoS, etc., or any combination thereof) without having to first select an interface or observation point
- Identify the path a conversation has taken, as well as all metrics along the way, in a single report
- Accurately report on a conversation even if it takes an asymmetric path
- Identify changing QoS tags per hop and in each direction
- Greatly simplified and more powerful monitoring: with conversations now identified as a single entity, anomaly and policy-based alarming is much simpler to configure and more comprehensive, and it eliminates duplicate notification of the same event
- Continuous drill-down and pivoting between different data views: with the ability to associate all records of a conversation together, manipulation of that data can take on many new dimensions not available when you must report interface by interface
- Simplified and automated WAN optimization bandwidth reduction reporting
- Shared knowledge as reported by different sources: tagging by one source associates the tag with the flows from all other sources in the aggregate flow
- Minimizes storage requirements: common information is stored only once

Why Packet Deduplication?

It is common for a single packet capturing device (such as Cascade Shark or Cascade Sensor) to be fed copies of the same data from the same network. For an easy deployment model, it is typical to use SPAN (also referred to as port mirroring) technology to send multiple VLANs or multiple ports on a switch/router to the same packet capturing device.
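Because a capture device fed this way can receive the same packet more than once, one way to picture the filtering involved is a short-window duplicate check. The sketch below is only an illustration: the PacketDeduplicator class, the 5 ms window, and the choice to hash the bytes of the IP packet are assumptions made for this example, not a description of how Cascade Shark or Cascade Sensor perform deduplication. It also assumes that the mirrored copies carry identical IP packet bytes and that packets are presented in non-decreasing timestamp order.

```python
import hashlib
from collections import OrderedDict


class PacketDeduplicator:
    """Flags IP packets whose exact bytes were already seen very recently."""

    def __init__(self, window_seconds: float = 0.005):
        # Assumption: mirrored copies of one packet arrive within a few
        # milliseconds of each other; anything later is treated as new.
        self.window_seconds = window_seconds
        self._seen = OrderedDict()  # digest -> timestamp of first sighting

    def is_duplicate(self, timestamp: float, ip_packet: bytes) -> bool:
        """Return True if the same bytes were seen within the window."""
        # Drop entries older than the window. Entries are stored in
        # arrival order, so the oldest one is always first.
        while self._seen:
            _, first_seen = next(iter(self._seen.items()))
            if timestamp - first_seen <= self.window_seconds:
                break
            self._seen.popitem(last=False)

        # Hash the IP packet only (no Ethernet header or VLAN tag), so
        # copies mirrored from different VLANs compare as equal.
        digest = hashlib.sha256(ip_packet).digest()
        if digest in self._seen:
            return True
        self._seen[digest] = timestamp
        return False


dedup = PacketDeduplicator()
packet = bytes.fromhex("4500001c000100004006") + b"example-payload"

first_copy = dedup.is_duplicate(0.000, packet)    # first sighting
second_copy = dedup.is_duplicate(0.001, packet)   # mirrored copy, 1 ms later
much_later = dedup.is_duplicate(0.300, packet)    # same bytes, 300 ms later
```

With these inputs the second copy is flagged as a duplicate and can be left out of packet counts, while the same bytes arriving 300 ms later fall outside the window and are kept, so a genuine retransmission would still be counted.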
The problem with the SPAN deployment model is that the packet capturing device may see the exact same IP packet multiple times even though there was no errant network behavior. This happens because the switch being monitored may send a copy of the IP packet as it enters one VLAN (or port), and a second copy of the same IP packet as it leaves the VLAN (or port) or as it enters the next VLAN. If all this traffic is going to the same packet capturing device, the device may conclude that data is being retransmitted by the sender and may also over-count the volume of data associated with the conversation.

Riverbed Cascade Sensor/Shark uses packet deduplication to avoid counting the same IP packet multiple times and misreporting it as a retransmission. This is an optional feature which may be enabled or disabled on a per-port basis. Enabling this feature when multiple VLANs are SPANed ensures that conversations have correct packet counts and that only true retransmissions are reported.

About Riverbed

Riverbed delivers performance for the globally connected enterprise. With Riverbed, enterprises can successfully and intelligently implement strategic initiatives such as virtualization, consolidation, cloud computing, and disaster recovery without fear of compromising performance. By giving enterprises the platform they need to understand, optimize and consolidate their IT, Riverbed helps enterprises to build a fast, fluid and dynamic IT architecture that aligns with the business needs of the organization. Additional information about Riverbed (NASDAQ: RVBD) is available at www.riverbed.com

Riverbed Technology, Inc.
199 Fremont Street
San Francisco, CA 94105
Tel: (415) 247-8800
www.riverbed.com

Riverbed Technology Ltd.
One Thames Valley
Wokingham Road, Level 2
Bracknell. RG42 1NG
United Kingdom
Tel: +44 1344 31 7100

Riverbed Technology Pte. Ltd.
391A Orchard Road #22-06/10
Ngee Ann City Tower A
Singapore 238873
Tel: +65 6508-7400

Riverbed Technology K.K.
Shiba-Koen Plaza Building 9F
3-6-9, Shiba, Minato-ku
Tokyo, Japan 105-0014
Tel: +81 3 5419 1990