Benchmark Study
October 2014

Radisys Leads with DPDK Performance
Line Rate Processing on Intel E5-2600v3 Processors

Introduction: DPDK Performance Numbers are In

Engineers and architects have been keeping a close watch on Data Plane Development Kit (DPDK) performance news with the release of Intel's E5-2600v3 processor family and Ethernet controller, which is expected to take x86 performance to levels previously held only by customized silicon and specialty processors. The recently announced Radisys T-100 Series Platform, containing E5-2600v3-based processing blades, has been combined with Intel's DPDK v1.7, and the benchmark numbers are worth the wait. This new technology combination provides the performance and price point that enables a technology shift from specialized processors and silicon to standard x86 blades to meet the demanding switching performance required by Network Functions Virtualization Infrastructure (NFVI).

HIGHLIGHTS
- Radisys A4700 processing blades deliver full wirespeed DPDK Layer 3 forwarding at up to 80 Gbps
- DPDK 2 x 40GbE line rate performance required only 4 cores, leaving 83% of cores for application processing
- The Radisys T-100 Series Platform showcases this exceptional DPDK performance

Radisys continues to make strides in delivering optimized Network Functions Virtualization Infrastructure, including improving virtual switching performance. In support of this effort, Radisys is delivering the right performance and price point to enable a technology shift from specialized processors and silicon to x86, speeding up the progress towards NFV with homogeneous, hardware-based carrier infrastructure.
Radisys Leads E5 2600v3 Benchmarks

Radisys is sharing DPDK 1.7 L3 forwarding benchmark testing on the newly released Radisys A4700 blade, tested within the T-Series Platform. The results show why the new processor, Ethernet controller and DPDK combination can be used in applications that previously required specialized hardware or silicon. System throughput for the 12-core, 10-core, and 8-core E5-2600v3 all show 2 x 40GbE line rate performance using only 4 cores for packet sizes of 128 bytes and larger. Line rate performance using 4 cores leaves 20 cores available for application processing on a dual 12-core E5-2658v3 blade such as the A4700. Using the Radisys T-Series Platform with 12 A4700 blades translates to 240 cores in one platform available and ready for application processing.

L3 Forwarding, 2x 40GbE Ports, 128-byte Packets

Product          Dual Processors   Frequency   Cores (x2)   Throughput   Total cores used   % cores remaining
A4745-CPU-BASE   E5-2658v3         2.2GHz      12           79.70Gbps    4                  83%
A4742-CPU-BASE   E5-2628Lv3        1.9GHz      10           79.83Gbps    4                  80%
A4741-CPU-BASE   E5-2618Lv3        2.3GHz      8            79.89Gbps    4                  75%

[Figure: A4700 E5-2658v3 Compute Blade. Socket 0 and Socket 1, each labeled with L3 Forwarding, CLI, and Cores 0 through 11.]

An IXIA Test Engine was used for traffic generation (2x 40GbE QSFP connections). Traffic was routed through the A2340 switches in the Radisys T-Series Platform to reach the network controllers on the blade over the backplane fabric. Red Hat ES 6.4 was used. Hyperthreading was disabled. In the A4700 blade, each socket has dual local PCIe ports.
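The core-availability figures quoted above follow from simple arithmetic on the per-socket core counts. The short Python sketch below reproduces them; the function and variable names are illustrative only, and the sketch assumes two sockets per blade and 4 forwarding cores per blade, as stated in the benchmark description.

```python
# Reproduce the "% cores remaining" figures from the benchmark table.
# Assumptions (from the text): dual-socket blades, 4 cores used for forwarding.

def cores_remaining(cores_per_socket, sockets=2, forwarding_cores=4):
    """Return (total cores, free cores, percent free rounded to an integer)."""
    total = cores_per_socket * sockets
    free = total - forwarding_cores
    return total, free, round(100 * free / total)

# Product name -> cores per processor, as listed in the table.
blades = {
    "A4745-CPU-BASE": 12,
    "A4742-CPU-BASE": 10,
    "A4741-CPU-BASE": 8,
}

for product, cores in blades.items():
    total, free, pct = cores_remaining(cores)
    print(f"{product}: {free} of {total} cores free ({pct}%)")

# A platform populated with 12 dual 12-core blades.
BLADES_PER_PLATFORM = 12
_, free_per_blade, _ = cores_remaining(12)
platform_free_cores = BLADES_PER_PLATFORM * free_per_blade
print(f"Free cores per platform: {platform_free_cores}")
```

For the dual 12-core blade this gives 20 of 24 cores free (83%), and 12 such blades give 240 free cores, matching the figures in the text.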
Radisys benchmarks for the E5-2600v3 processors explain why many packet processing applications are adopting the Radisys A4700 using DPDK v1.7: line rate performance with 80%+ of cores available for applications leaves enough headroom for most data plane applications. Specialized functions such as security, regular expression matching, and search based on TCAM lookup may still require non-x86 solutions via custom silicon or packet processors. However, a broad range of performance is now covered with the Radisys T-Series Platform and the A4700 using DPDK.

Optimizing Your DPDK Implementation

The DPDK benchmarks for the Radisys A4700 and DPDK 1.7 provide a useful tool for understanding the CPU capacity available to customer applications beyond basic packet transmit and receive functions. Getting the maximum performance and capacity from your CPU cores can depend on how flows are classified and assigned.

One way to distribute the traffic flows is to use the flow classification and distribution provided natively by the Intel Ethernet Controller. These features include packet distribution to cores based on MAC/VLAN and flow director. This standard distribution tool provides an efficient method for classifying packets into flows and setting the affinity of flows to cores (receive queues). The standard distribution is not difficult to implement, as the structure and tables are already in place.

However, such a distribution has certain limitations that may be an issue for some applications. For example, Head of Line (HOL) blocking occurs when certain cores are not able to process packets in a receive queue fast enough. This can result in underutilization of other cores and in packets being dropped. In some applications, the packet fields needed for flow classification are not present in the 5-tuple (TEID for GTP flows, the inner packet header in the case of tunneled packets, etc.).
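The 5-tuple limitation described above can be illustrated with a small Python sketch. It is not the controller's actual hashing scheme: CRC32 stands in for the hardware hash, and the queue count, addresses, and TEID values are hypothetical. It only shows that when every tunneled packet shares the same outer 5-tuple, a hash over that 5-tuple sends all of them to one receive queue, whereas including the TEID in the key spreads them out.

```python
import zlib
from collections import Counter

NUM_QUEUES = 4  # hypothetical number of receive queues (one per worker core)

def queue_for(key_fields, num_queues=NUM_QUEUES):
    """Map a tuple of header fields to a receive queue index.

    CRC32 is only a stand-in for a hardware hash; it is used here because
    it is deterministic and available in the standard library.
    """
    key = "|".join(str(field) for field in key_fields).encode()
    return zlib.crc32(key) % num_queues

# Hypothetical GTP-U traffic: all packets travel between the same two tunnel
# endpoints, so the outer 5-tuple is identical for every subscriber flow.
OUTER = ("10.0.0.1", "10.0.0.2", 2152, 2152, "UDP")
packets = [{"outer": OUTER, "teid": teid} for teid in range(1000, 1064)]

# Classification on the outer 5-tuple only.
by_outer = Counter(queue_for(p["outer"]) for p in packets)

# Classification on the outer 5-tuple plus the tunnel identifier (TEID).
by_teid = Counter(queue_for(p["outer"] + (p["teid"],)) for p in packets)

print("outer 5-tuple only:", dict(by_outer))
print("5-tuple plus TEID: ", dict(by_teid))
```

With the outer 5-tuple as the only key, a single queue receives all 64 packets, which is the kind of single-core bottleneck that leads to head-of-line blocking and drops. Adding the TEID to the key distributes the same traffic over several queues.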
80%
Radisys benchmark tests showed that the improved line rate performance left over 80% of the processor cores available for application and packet exception processing.

[Figure: A4700 E5-2658v3 Compute Blade with memory and CPU Socket 0 / CPU Socket 1. On each socket, lcore 0 runs the DPDK Distributor (Configuration/Statistics, Initialization of Interface and Memory Pool, Radisys Distribution Hash, PMD) and feeds DPDK Ring 1 through DPDK Ring 15 toward the remaining lcores up to lcore 15. Memory Pools 0 to 2 hold Application Data. Also labeled: Linux, Switched Network.]
Another method for distributing traffic is to have a DPDK-based distributor module act as a load balancer for CPU cores. In this case, the Intel Ethernet controller forwards all packets to a set of cores running this distribution function, and the distributor assigns flows to cores via a DPDK ring or KNI interface. Such a solution improves application core utilization, provides better control over packet loss by providing deeper queues to handle temporary overload, and provides the ability to define stateless or stateful load balancing using many different types of packet fields. This method can be extended to provide load balancing among VMs in a virtualized environment.

Many customers are finding that a customized distribution manager is a must-have. For example, one customer using the standard flow manager faced a bottleneck in which all packets were presented to one core, resulting in dropped packets. The packets were fairly generic and flow distribution (based on the 5-tuple) was uneven. Moving to a distributor-based solution that could identify flows based on the tunneled packet header enabled flows to be distributed across multiple cores.

Summary

The DPDK Layer 3 forwarding benchmarks provided in this document demonstrate that the Radisys T-Series Platform with Intel E5-2600v3 series processors and 40GbE Ethernet controllers can provide 2 x 40GbE line rate packet processing while leaving roughly 80% of the cores available for the application. This performance level, combined with the standard x86 pricing model, enables a move from specialized processors and silicon to standard x86 blades for most demanding switching applications. The adoption of Radisys T-Series telecom-grade products will enable and propel data applications to be incorporated as a standard NFV and SDN implementation, replacing the need for specialized hardware and software.
In addition, Radisys professional services has a wide array of technical knowledge, problem solving skills, and developed code to assist customers who are adopting and optimizing DPDK for their high-performance packet processing, packet filtering and networking security applications.

Professional Services

Radisys has developed a variety of custom distribution managers to effectively optimize and manage unique customer application flows, and has years of experience developing and creating customer-specific DPDK solutions. The Radisys professional services group maintains a library of load balancing and DPDK application enhancement programs that are modularized and can be applied to rapidly enable customer-specific solutions. Radisys professional services enables faster development cycles by providing DPDK training classes, supplying developed code, optimizing performance for a Linux distribution, or creating data flows and classifications applicable to unique customer applications. Decreasing costs and time to market for customer products, by providing innovative solutions and expert technical development, remains the prime objective of your Radisys professional services team.
About T-100 Series

The new T-100 Series Platform is designed to deliver the Network Functions Virtualization Infrastructure (NFVI) for next-generation central office and telecom data centers. Based on a high-reliability 100G ATCA architecture and COTS silicon, the T-100 Series can host thousands of Virtualized Network Functions (VNFs) on the latest Intel Architecture processors. The T-100 intelligent switch, pre-integrated with Radisys FlowEngine, offers 100 Gbps interfaces and over 2 Tbps of throughput between a telecom cloud network and the VNFs, overcoming the toughest data plane processing challenges in an SDN or NFV deployment.

[Figure: T-100 Ultra, 14U; T-100 Pro, 6U]

About A4700, Intel E5-2600v3 Series Processor Blade

Ideal for data plane workloads, the Radisys A4700 Series blade offers up to 160Gb of I/O with 4x Intel Ethernet Controllers. The A4700 features a mezzanine module for acceleration coprocessors such as crypto acceleration (Coleto Creek) or graphics acceleration. For compute-intensive applications, the A4700 can be configured with 2x or 4x interfaces to maximize cost-performance effectiveness.

About T-Series Compact

The T-Series Compact is a high-capacity, 2U carrier-grade rack mount server that can be configured for powerful compute, packet processing, and media processing functionality.
The T-Series Compact, based on the latest Intel Xeon E5-2600v3 Series Processor with Intel Data Plane Development Kit (DPDK) optimizations, is a carrier-grade server solution optimized for virtualization platforms. Its rich feature set boasts support for up to 1.5TB of memory as well as 12 3.5-inch HDDs, and takes advantage of Intel's highest-bin CPUs.

[Figure: A4700 Intel E5-2600v3 Processor Blade; T-Series Compact RS220v3 Carrier-grade Server]

About the Intel Internet of Things Solutions Alliance

From modular components to market-ready systems, Intel and the 250+ global member companies of the Intel Internet of Things Solutions Alliance provide scalable, interoperable solutions that accelerate deployment of intelligent devices and end-to-end analytics. Close collaboration with Intel and each other enables Alliance members to innovate with the latest technologies, helping developers deliver first-in-market solutions.

Corporate Headquarters
5435 NE Dawson Creek Drive
Hillsboro, OR 97124 USA
503-615-1100
Fax 503-615-1121
Toll-Free: 800-950-0044
www.radisys.com
info@radisys.com

© 2014 Radisys Corporation. Radisys and Trillium are registered trademarks of Radisys Corporation. *All other trademarks are the properties of their respective owners.