WireCAP: a Novel Packet Capture Engine for Commodity NICs in High-speed Networks



Similar documents
Packet-based Network Traffic Monitoring and Analysis with GPUs

Network Traffic Monitoring and Analysis with GPUs

GPU-Based Network Traffic Monitoring & Analysis Tools

Network Traffic Monitoring & Analysis with GPUs

10 Gbit Hardware Packet Filtering Using Commodity Network Adapters. Luca Deri Joseph Gasparakis

MIDeA: A Multi-Parallel Intrusion Detection Architecture

The Lagopus SDN Software Switch. 3.1 SDN and OpenFlow. 3. Cloud Computing Technology

Performance of Software Switching

High-Performance Many-Core Networking: Design and Implementation

MoonGen. A Scriptable High-Speed Packet Generator Design and Implementation. Paul Emmerich. January 30th, 2016 FOSDEM 2016

Network Virtualization Technologies and their Effect on Performance

High-Performance Network Traffic Processing Systems Using Commodity Hardware

ncap: Wire-speed Packet Capture and Transmission

Comparing and Improving Current Packet Capturing Solutions based on Commodity Hardware

Haetae: Scaling the Performance of Network Intrusion Detection with Many-core Processors

Wire-speed Packet Capture and Transmission

Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck

Towards 100 Gbit Flow-Based Network Monitoring. Luca Deri Alfredo Cardigliano

Collecting Packet Traces at High Speed

Infrastructure for active and passive measurements at 10Gbps and beyond

The Performance Analysis of Linux Networking Packet Receiving

Chronicle: Capture and Analysis of NFS Workloads at Line Rate

Intel DPDK Boosts Server Appliance Performance White Paper

Exploiting Commodity Multi-core Systems for Network Traffic Analysis

On Multi Gigabit Packet Capturing With Multi Core Commodity Hardware

The Role of Virtual Routers In Carrier Networks

Bridging the Gap between Software and Hardware Techniques for I/O Virtualization

COUNTERSNIPE

I/O Virtualization Using Mellanox InfiniBand And Channel I/O Virtualization (CIOV) Technology

High-Density Network Flow Monitoring

NicPic: Scalable and Accurate End-Host Rate Limiting

Scheduling. Scheduling. Scheduling levels. Decision to switch the running process can take place under the following circumstances:

Scaling Up Your Network Monitoring: From the Garden Hose to the Fire Hose

End-to-End Network/Application Performance Troubleshooting Methodology

Assessing the Performance of Virtualization Technologies for NFV: a Preliminary Benchmarking

Bivio 7000 Series Network Appliance Platforms

High Speed Network Traffic Analysis with Commodity Multi-core Systems

NetVM: High Performance and Flexible Networking using Virtualization on Commodity Platforms

Challenges in high speed packet processing

Optimizing Network Virtualization in Xen

Optimizing Network Virtualization in Xen

Hardware Acceleration for High-density Datacenter Monitoring

High-performance vnic framework for hypervisor-based NFV with userspace vswitch Yoshihiro Nakajima, Hitoshi Masutani, Hirokazu Takahashi NTT Labs.

Improved Forwarding Architecture and Resource Management for Multi-Core Software Routers

Scaling Networking Applications to Multiple Cores

Virtual device passthrough for high speed VM networking

Wireshark in a Multi-Core Environment Using Hardware Acceleration Presenter: Pete Sanders, Napatech Inc. Sharkfest 2009 Stanford University

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build

Stream Oriented Network Packet Capture. Hot topics in Security Research the Red Book. Evangelos Markatos FORTH-ICS

10 Gbit/s Line Rate Packet Processing Using Commodity Hardware: Survey and new Proposals

Network Function Virtualization Using Data Plane Developer s Kit

DPDK Summit 2014 DPDK in a Virtual World

Network Monitoring on Multicores with Algorithmic Skeletons

High-Performance IP Service Node with Layer 4 to 7 Packet Processing Features

A Research Study on Packet Sniffing Tool TCPDUMP

OpenFlow with Intel Voravit Tanyingyong, Markus Hidell, Peter Sjödin

Architecture of the Kernel-based Virtual Machine (KVM)

Networking Virtualization Using FPGAs

vpf_ring: Towards Wire-Speed Network Monitoring Using Virtual Machines

Proposal for Virtual Private Server Provisioning

High-performance vswitch of the user, by the user, for the user

A Deduplication File System & Course Review

Effects of Interrupt Coalescence on Network Measurements

Linux KVM Virtual Traffic Monitoring

Virtualization. Jukka K. Nurminen

I3: Maximizing Packet Capture Performance. Andrew Brown

COS 318: Operating Systems

Revisiting Software Zero-Copy for Web-caching Applications with Twin Memory Allocation

Virtualization Technologies

Cray Gemini Interconnect. Technical University of Munich Parallel Programming Class of SS14 Denys Sobchyshak

VMWARE WHITE PAPER 1

Accelerating High-Speed Networking with Intel I/O Acceleration Technology

Chapter 5 Cloud Resource Virtualization

Performance Testing. Configuration Parameters for Performance Testing

Quantum StorNext. Product Brief: Distributed LAN Client

Open Source in Network Administration: the ntop Project

Delivering 160Gbps DPI Performance on the Intel Xeon Processor E Series using HyperScan

Bricata Next Generation Intrusion Prevention System A New, Evolved Breed of Threat Mitigation

Leveraging NIC Technology to Improve Network Performance in VMware vsphere


Performance Comparison of Fujitsu PRIMERGY and PRIMEPOWER Servers

VALE, a Switched Ethernet for Virtual Machines

Open-source routing at 10Gb/s

IxChariot Virtualization Performance Test Plan

mesdn: Mobile Extension of SDN Mostafa Uddin Advisor: Dr. Tamer Nadeem Old Dominion University

Tyche: An efficient Ethernet-based protocol for converged networked storage

An Oracle Technical White Paper November Oracle Solaris 11 Network Virtualization and Network Resource Management

A Middleware Strategy to Survive Compute Peak Loads in Cloud

Network Functions Virtualization on top of Xen

Firewalls P+S Linux Router & Firewall 2013

Efficient real-time Linux interface for PCI devices: A study on hardening a Network Intrusion Detection System

Datacenter Operating Systems

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance

Windows Server 2008 R2 Hyper V. Public FAQ

Achieving Mainframe-Class Performance on Intel Servers Using InfiniBand Building Blocks. An Oracle White Paper April 2003

Hardware acceleration enhancing network security

Transcription:

WireCAP: a Novel Packet Capture Engine for Commodity NICs in High-speed Networks Wenji Wu, Phil DeMar Fermilab Network Research Group wenji@fnal.gov, demar@fnal.gov ACM IMC 2014 November 5-7, 2014 Vancouver, BC, Canada 1

Packet Capture is an essential function for many network applications Security analysis (e.g., IDS/IPS) Snort, www.snort.org Suricata, http://suricata-ids.org Network and application performance analysis Riverbed SteelCentral NetProfiler Netscout Sniffer Analysis Wildpackets OmniPeek Network Analyzer Traffic characterization studies Benson, IMC 10 And more 2

A general packet capture process Networks capture Data capture buffer Delivery Apps Typically computationally and I/O throughput intensive Significant performance challenges in high-speed networks 3

Packet drop is a major problem with packet capture in high-speed networks Type I packet drop: packet capture drops Networks capture Buffer Delivery Apps Drop! Type II packet drop: packet delivery drops Buffer Networks capture Delivery Apps Packet drops degrade the accuracy & integrity of network monitoring applications! Fundamental design goal in packet capture tools: Avoid packet drops! 4 Drop!

Packet capture approaches 1. Use dedicated packet capture card Pros The least amount of CPU intervention Lossless packet capture and delivery Cons: costly, relatively inflexible, and not very scalable 2. Use a commodity system with a commodity NIC A commodity NIC in promiscuous mode to intercept pkts A capture engine provides capture and delivery services Pros: flexible and cost effective Cons: significant system CPU and memory resources The 2 nd approach becomes more appealing with recent advances in multicore systems and multi-queue NICs: A new paradigm in packet capturing and processing. This is our research focus. 5

A new packet capturing and processing paradigm 6 Assumption: the hardware-based traffic-steering mechanism is capable of evenly distributing the incoming traffic among cores.

Question 1 Can NIC hardware-based traffic-steering mechanisms evenly distribute the incoming traffic among cores? Question 2 Can existing packet capture engines support this new packet capture and processing paradigm? 7

Experiment Proof Intel 82599 10GE NIC Traffic captured at FNAL border router used as experiment data System A runs as a traffic generator to replicate captured data 8 both content & spacing On System B, two experiments are run: Exp1. Profiling traffic from each receive queue Exp2: A single-threaded application is run at each core to capture & process pkts from its associated receive queue We evaluate existing packet capture engines. -- DNA, PF_RING (Luca et al) -- NETMAP (Luigi ATC 12)

Observation 1: load imbalance occurs frequently in a multicore system Two types of load imbalance Short-term load imbalance (short-term burst of packets) Long-term load imbalance (queue 0 receives much more traffic than queue 3) Existing 9 NICs distribute traffic on a per-flow basis!

Observation 2: existing packet capture engines suffer significant packet drops NETMAP DNA PF_RING Receive Queue 0 Packet Capture Drops 46.5% 50.1% 0% Packet Delivery Drops 0% 0% 56.8% Receive Queue 3 Packet Capture Drops 33.4% 9.3% 0.8% Packet Delivery Drops 0% 0% 0% Packet drop rates with a heavy packet-processing load Existing packet capturing engines suffer significant packet drops with load imbalance of either type! 10

Problems with existing packet capture engines Engines Deficiencies Can it handle Shortterm load imbalance? Can it handle Longterm load imbalance? PF_RING Copying in packet capture Receive livelock problem No offloading mechanism DNA Limited buffering capability No offloading mechanism NETMAP Limited buffering capability No offloading mechanism 11

How to avoid packet drops that are caused by load imbalance? 1. To apply a round-robin traffic steering mechanism at NIC level for traffic distribution Does not preserve application logic Results in poor system performance 2. To use existing packet capture engines (e.g. DNA) and to address load imbalance in application layer An application has little knowledge of low-layer conditions Must involve copying Making the application complex and difficult to design 3. Our solution is to design a new packet capture engine to address load imbalance at packet-capture level 12

Our Solution WireCAP: a Novel Packet Capture Engine for Commodity NICs in High-speed Networks 13

WireCAP Design Goals Provide lossless packet-capture services in high-speed networks Provide efficient packet delivery Be efficient Facilitate design & operation of packetprocessing applications in user space Wide applicability Support middlebox-type applications Needs to implement a transmit function 14

WireCAP Key Techniques The ring-buffer-pool mechanism To handle short-term load imbalance The buddy-group-based offloading mechanism To handle long-term load imbalance Optimization techniques Pre-allocated large packet buffers Zero-copy Packet-level batching processing 15

The ring-buffer pool mechanism: How does a NIC receive packets? Receive ring 16

The ring-buffer pool mechanism Ring buffer pool (R*M) A receive ring is divided into descriptor segments Each segment consists of M receive pkt descriptors Ring is preallocated with R pkt buffer chunks (ie. ring buffer pool) A pkt buffer chunk consists of M packet buffers Each descriptor segment must be attached with an empty pkt buffer chunk to receive pkts Supports four operations through ioctl interface -- Open -- Capture -- Recycle -- Close 17

The buddy-group mechanism Executes traffic offloading in an application-aware manner Receive queues accessed by a single application can form a buddy group Traffic offloading is only permitted within a buddy group 18

WireCAP Architecture A Libpcap-compatible interface for low-level network access The buddy-group-based offloading mechanism Low-level packet capture and transmit service The ring-buffer-pool mechanism 19

WireCAP Operations Lossless zero-copy packet capture and delivery Threshold: T Zero-copy packet forwarding Only involving metadata operations 20

WireCAP Implementation The current implementation consists of a kernelmode driver and a user-mode library OSes Supported Linux kernel 3.16 We are working to support other OSes Commodity NICs supported Intel 82599-based 10GigE NIC Working on support for 40 GigE NICs 21

22 WireCAP Evaluation Exp1 - Packet capture in the basic mode System A generates P 64B packets at wire speed (14.88 million p/s) NIC1 is configured with one RQ A single-threaded application captures and processes pkts at System B Exp2 - Packet capture in the advanced mode System A replays the captured data NIC1 is configured with multiple RQs, all RQs form a buddy group A multiple-threaded application captures and processes pkts at System B Exp3 - Packet forwarding System A replays the captured data We use traffic captured from Fermilab border router as experiment data System A runs as a traffic generator: (or) generates 64B packets at wire speed replays captured data as recorded NIC1 is configured with multiple RQs, all RQs from a buddy group A multi-threaded application captures and processes pkts at System B, ACM processed IMC 14 pkts are forwarded to System C via NIC2

Packet Capture in the basic mode R M WireCAP vs. existing packet capture engines R and M are varied, R*M is fixed WireCAP demonstrates superior buffering capabilities for short-term bursts of packets WireCAP s buffering capability is proportional to the overall ring buffer capacity R*M 23

Packet Capture in the advanced mode R T M WireCAP vs. existing packet capture engines R and M are fixed, T is varied The buddy-group-based offloading mechanism achieved significant improved performance In general, WireCAP performs better when T is set to a relatively lower value 24

Packet Forwarding WireCAP vs. existing packet capture engines WireCAP s packet forwarding function is capable of supporting middlebox applications The buddy-group-based offloading mechanism can achieve a significant improved performance 25

Summary WireCAP provides zero-copy lossless packet capture and delivery services by exploiting multi-queue NICs and multicore WireCAP provides a new packet I/O framework for commodity NICs in high-speed networks 26

Future Work Compare WireCAP with DPDK Adapt WireCAP for 40GE NICs 27

Questions? WireCAP website: http://wirecap.fnal.gov Source code is available: Please contact Fermilab s Office of Partnerships and Technology Transfer (OPTT) <optt@fnal.gov> to obtain a copy of WireCAP 28