High-Density Network Flow Monitoring


Petr Velan <petr.velan@cesnet.cz>
High-Density Network Flow Monitoring
IM2015, 12 May 2015, Ottawa

Motivation

What is high-density flow monitoring?
- Monitoring high traffic volumes in as few rack units as possible

Why do we want high-density flow monitoring?
- Flow monitoring is deployed on many lines
- The number of flow probes is growing
- Management and operational costs are growing
- One probe per link does not scale

Our Approach

- Use our custom-made network interface cards to monitor multiple 10 G links
- See what throughput can be achieved in one machine
- Test how advanced features of our NICs help the monitoring
- Identify performance bottlenecks

Monitoring Setup: The Testbed

The Server

- Dell PowerEdge R720 server (2 rack units)
- 2× E5-2670 v1 CPUs (8 cores, 3 GHz)
- 64 GB DDR3 RAM (1600 MHz)
- Scientific Linux 6.5 with 2.6.32-41 kernel
- 2× COMBO-80G cards

COMBO-80G NIC

- FPGA-based programmable hardware
- Two QSFP+ interfaces, each in 40 G or 4×10 G Ethernet mode
- 80 G per card
- PCI-Express gen3 x8 bus (64 Gb/s)
- Additional features:
  - Accurate timestamps
  - Hash-based packet distribution
  - Packet trimming
  - Packet feature extraction into a Unified Header
- Our setup allows monitoring of 16× 10 G links
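The hash-based packet distribution keeps all packets of a flow on one DMA channel, so each CPU core sees complete flows and no flow state has to be shared between cores. A minimal sketch of the idea, using CRC32 over the 5-tuple as a stand-in (the actual FPGA hash function is not specified in the slides):

```python
import zlib

def channel_for_packet(src_ip, dst_ip, src_port, dst_port, proto, n_channels):
    """Map a packet's 5-tuple to one of n_channels DMA channels.

    CRC32 is an illustrative stand-in for the hash the COMBO-80G
    computes in the FPGA; the point is that the mapping is
    deterministic, so every packet of a flow reaches the same core.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % n_channels

# All packets of one flow land on the same channel:
c1 = channel_for_packet("10.0.0.1", "10.0.0.2", 1234, 80, 6, 8)
c2 = channel_for_packet("10.0.0.1", "10.0.0.2", 1234, 80, 6, 8)
assert c1 == c2 and 0 <= c1 < 8
```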

COMBO-80G NIC

[Block diagram of the COMBO-80G datapath: packets arrive on two QSFP+ interfaces (8×10 GE), receive a timestamp, and pass through trimming and parsing in the FPGA. A hash selects a DMA channel, and the data (Unified Headers or packets) is transferred over PCI-Express into RAM ring buffers served by CPU cores 0-7.]

Flow Exporter Architecture

- Multi-threaded design
- Utilizes 2N+1 CPU cores, where N is the number of ring buffers

[Diagram: threads 1...N read packets from ring buffers 0...N-1 (input), threads N+1...2N maintain flow caches, and thread 2N+1 performs export.]
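The 2N+1 core assignment above can be sketched as follows (the thread names are illustrative, not the exporter's actual identifiers):

```python
def core_layout(n_ring_buffers):
    """2N+1 core assignment from the slide: threads 1..N read packets
    from the ring buffers, threads N+1..2N maintain the flow caches,
    and thread 2N+1 exports flow records."""
    n = n_ring_buffers
    inputs = [f"input-{i}" for i in range(n)]
    caches = [f"cache-{i}" for i in range(n)]
    return inputs + caches + ["export"]

# With 8 ring buffers, the exporter occupies 2*8 + 1 = 17 cores:
assert len(core_layout(8)) == 2 * 8 + 1
```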

Data Generation

- Spirent TestCenter hardware
- 1× 10 G stream repeated to all 16 interfaces
- IPv4 UDP packets
- Packet sizes: 64 B, 128 B, 256 B, 512 B
- Flow counts: 2^11, 2^18, 2^21
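As a sanity check on the offered load, the theoretical line rate of one 10 G link for each tested packet size follows from the 20 B of per-frame wire overhead (7 B preamble, 1 B start-of-frame delimiter, 12 B inter-frame gap):

```python
def max_pps_10g(packet_size):
    """Theoretical packet rate of one 10 Gb/s Ethernet link.
    Every frame occupies packet_size + 20 B on the wire
    (7 B preamble + 1 B SFD + 12 B inter-frame gap)."""
    return 10e9 / ((packet_size + 20) * 8)

for size in (64, 128, 256, 512):
    print(f"{size:4d} B: {max_pps_10g(size)/1e6:6.2f} Mpps per link, "
          f"{16 * max_pps_10g(size)/1e6:7.2f} Mpps over 16 links")
```

At 64 B this gives roughly 14.88 Mpps per link, i.e. about 238 Mpps aggregate across the 16 monitored links.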

Results: The Measurement

Basic Performance on Full Packets

[Chart: full-packet processing performance in packets/s vs. packet size (64 B to 512 B) for 128, 16384, and 131072 flows on Card 0 and Card 1, with the maximum Ethernet and maximum PCIe rates shown for reference.]

Basic Performance on Full Packets

- The PCI-Express limit is reached only for the longest packets
- The NUMA architecture affects performance
- The number of flows has a large impact, mainly on short packets
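The PCIe observation can be checked arithmetically: the 20 B of per-frame wire overhead never crosses the bus, so at 64 B packets one card's eight 10 G links demand less than the 64 Gb/s of a PCIe gen3 x8 slot even at full line rate, while for longer packets the slot itself becomes the ceiling. (For short packets the achieved rate is bounded by per-packet processing cost before PCIe matters.) A sketch:

```python
def pcie_demand_gbps(packet_size, links=8, link_gbps=10):
    """Data rate one COMBO-80G must push over PCIe with full packets:
    only the frame bytes cross the bus; the 20 B of wire overhead
    per frame (preamble + SFD + inter-frame gap) do not."""
    frames_per_sec = links * link_gbps * 1e9 / ((packet_size + 20) * 8)
    return frames_per_sec * packet_size * 8 / 1e9

for size in (64, 128, 256, 512):
    verdict = "exceeds" if pcie_demand_gbps(size) > 64 else "fits in"
    print(f"{size:4d} B: {pcie_demand_gbps(size):5.1f} Gb/s "
          f"{verdict} the 64 Gb/s PCIe gen3 x8 slot")
```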

Hardware Accelerated Performance

[Chart: packet processing performance in packets/s for 2^18 flows vs. packet size (64 B to 512 B), comparing full packets, trimmed packets, and Unified Headers, with the maximum Ethernet and maximum PCIe rates shown for reference.]

Hardware Accelerated Performance

- Packet trimming and Unified Headers help significantly
- Full-throughput monitoring for 256 B and longer packets
- Packet trimming and Unified Headers solve the problem of PCI-Express throughput
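How trimming relieves the PCIe bus can be estimated with a small model (the 64 B trim length and 32 B Unified Header size below are illustrative assumptions, not figures from the slides):

```python
def pcie_gbps(wire_size, dma_bytes, links=8, link_gbps=10):
    """PCIe bandwidth one card needs when each frame of wire_size
    bytes is reduced to dma_bytes before DMA (packet trimming or a
    fixed-size Unified Header)."""
    return links * link_gbps * dma_bytes / (wire_size + 20)

# 512 B packets trimmed to 64 B, or reduced to a 32 B Unified Header,
# need only a few Gb/s instead of the ~77 Gb/s of the full frames:
print(round(pcie_gbps(512, 64), 1))
print(round(pcie_gbps(512, 32), 1))
```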

Impact of CPU Choice

- Comparison of the E5-2670 with the E5-2620 (only 6 cores, 2 GHz)

[Chart: performance on trimmed packets in packets/s vs. packet size (64 B to 512 B) for 128, 16384, and 131072 flows on each of the two CPUs.]

Impact of CPU Choice

- The faster CPU helps greatly for smaller flow counts
- A large number of flows has a greater impact on memory bus utilization
- Both CPUs do well for longer packets

Conclusions & Future Work

Conclusions

- It is possible to monitor 16× 10 G links in one 2U box
- Hardware acceleration can significantly improve performance
- PCI-Express can be limiting for commodity cards
- The NUMA architecture must be taken into consideration
- The number of flows has a significant impact on performance

Future Work

- Test the performance with COMBO-100G cards (PCIe gen3 x16)
- High-speed experiments with application flow measurements
- Build a better framework for measurements:
  - Different flow lengths
  - More complex packets and flows
  - Different packets in one flow

Thanks for your attention!

High-Density Network Flow Monitoring
IM2015, 12 May 2015, Ottawa
Petr Velan <petr.velan@cesnet.cz>