Improving Passive Packet Capture: Beyond Device Polling

Luca Deri <deri@>

Packet Capture: State of the Art

- Many available passive network monitoring tools are based on libpcap (http://www.tcpdump.org) or winpcap (http://winpcap.polito.it).
- Although libpcap offers the same programming interface across operating systems, its performance varies widely depending on the platform being used.
- Some library components (e.g. BPF packet filtering) have been implemented in the kernel for better performance.

Packet Capture: Open Issues

- Monitoring low-speed (100 Mbit) networks is already possible using commodity hardware and tools based on libpcap.
- Sometimes even at 100 Mbit there is some (severe) packet loss: we have to shift from thinking in terms of speed to thinking in terms of the number of packets/second that can be captured and analyzed.
- Problem statement: monitor high-speed (1 Gbit and above) networks with common PCs (64 bit/66 MHz PCI bus) without the need to purchase custom capture cards or measurement boxes.
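To make the shift from link speed to packets/second concrete, the sketch below computes the theoretical upper bound on Ethernet frames per second for a given link speed and frame size. It assumes standard Ethernet framing overhead on the wire (preamble, start-of-frame delimiter and inter-frame gap, 20 bytes in total); the function name is chosen for this illustration only.

```python
# Per-frame overhead on the wire that is not part of the Ethernet frame itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def max_packets_per_second(link_bits_per_second: float, frame_bytes: int) -> float:
    """Upper bound on frames per second for a link speed and frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bits_per_second / bits_per_frame


if __name__ == "__main__":
    for label, speed in (("100 Mbit", 100e6), ("1 Gbit", 1e9)):
        for size in (64, 1518):  # minimum and maximum standard frame sizes
            pps = max_packets_per_second(speed, size)
            print(f"{label:>8}, {size:>4}-byte frames: {pps:>12,.0f} packets/sec")
```

The same link can therefore carry very different packet rates: small frames stress the capture path far more than large ones at an identical bit rate.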

Libpcap Performance on a Vanilla OS

Testbed:
- Sender: dual 1.8 GHz Athlon, 3Com 3c59x Ethernet card
- Collector: VIA C3 533 MHz, Intel 100 Mbit Ethernet card
- Network switch: Cisco Catalyst 3548 XL
- Traffic generator: tcpreplay (http://tcpreplay.sourceforge.net/)

Traffic Capture Application   Linux 2.4.x   FreeBSD 4.8   Windows 2K
Standard libpcap              0.2 %         34 %          68 %
mmap libpcap                  1 %           -             -
Kernel module                 4 %           -             -

Percentage of captured packets [~80K packets/sec, ~45 Mbit]

Libpcap Performance Using Polling

Traffic Capture Application   Linux 2.6 with NAPI   FreeBSD 4.8 with Polling
Standard libpcap              5.6 %                 99.9 %
mmap libpcap                  Not (yet) working     -
Kernel module                 99.5 %                -

Percentage of captured packets (same test setup)

- Device polling significantly improved performance on a 100 Mbit Ethernet card.
- Linux still performs much worse than FreeBSD in userspace.
- Linux performance inside the kernel is basically the same as FreeBSD performance in userspace.

Libpcap Performance at Gbit Speeds

Traffic Capture Application   FreeBSD 4.8 with Polling
Standard libpcap              148 000 (~3.7%)

Dropped packets [4 000 000 packet sample] (165-195k pkt/sec, ~480 Mbit)

- While FreeBSD is usable at 100 Mbit, at Gbit speeds the packet loss is greater (Linux still performs much worse).
- Neither device polling nor card interrupt mitigation is enough to capture almost everything.
- The traffic capture architecture has to be rethought to achieve better speeds.
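As a rough sanity check on the two test workloads, the sketch below derives the average packet size implied by the quoted bit rates and packet rates. It assumes the Mbit figures count packet bytes only; whether the slides include framing overhead in those figures is not stated, so the results are approximate.

```python
def average_packet_bytes(bits_per_second: float, packets_per_second: float) -> float:
    """Average packet size implied by a bit rate and a packet rate."""
    return bits_per_second / 8 / packets_per_second


if __name__ == "__main__":
    # 100 Mbit test: ~45 Mbit at ~80K packets/sec.
    small = average_packet_bytes(45e6, 80e3)
    print(f"100 Mbit test: ~{small:.0f} bytes/packet")

    # Gbit test: ~480 Mbit at 165-195k packets/sec.
    low = average_packet_bytes(480e6, 195e3)
    high = average_packet_bytes(480e6, 165e3)
    print(f"Gbit test:     ~{low:.0f}-{high:.0f} bytes/packet")
```

Under that assumption the 100 Mbit test used very small packets, which is the hard case for a capture path whose cost is dominated by per-packet work.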

Polling vs. non Polling

[Figure: no polling vs. device polling, plotted against input rate]

Sidenote: polling is usually disabled by default in OSs because:
- Slow polling can have negative effects on packet timestamp precision.
- Fast polling can take over all or most of the available CPU cycles.
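The difference between the two modes can be illustrated with a toy model. This is not the measurement behind the slide: it is a simplified sketch assuming a single CPU, a fixed per-packet interrupt cost that always preempts packet processing, and a card that silently drops packets the poller does not fetch. All cost values are hypothetical.

```python
def interrupt_driven_throughput(input_rate: float,
                                interrupt_cost: float,
                                processing_cost: float) -> float:
    """Packets/sec delivered when every packet raises a preempting interrupt.

    Costs are in seconds of CPU time per packet.
    """
    # CPU time left in each second after servicing all interrupts.
    cpu_left = max(0.0, 1.0 - input_rate * interrupt_cost)
    return min(input_rate, cpu_left / processing_cost)


def polling_throughput(input_rate: float,
                       poll_cost: float,
                       processing_cost: float) -> float:
    """Packets/sec delivered when the host fetches only what it can process."""
    return min(input_rate, 1.0 / (poll_cost + processing_cost))


if __name__ == "__main__":
    INTERRUPT_COST = 5e-6   # hypothetical: 5 microseconds per interrupt
    POLL_COST = 1e-6        # hypothetical: 1 microsecond per polled packet
    PROCESSING_COST = 5e-6  # hypothetical: 5 microseconds per packet

    print(f"{'input':>10} {'no polling':>12} {'polling':>12}")
    for rate in (50_000, 100_000, 150_000, 250_000):
        irq = interrupt_driven_throughput(rate, INTERRUPT_COST, PROCESSING_COST)
        poll = polling_throughput(rate, POLL_COST, PROCESSING_COST)
        print(f"{rate:>10,} {irq:>12,.0f} {poll:>12,.0f}")
```

In this model the interrupt-driven curve rises, peaks, and then collapses as interrupt handling consumes the CPU, while the polling curve flattens at the host's processing capacity.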

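Proposed Solution: Driver Packet Ring

[Diagram: incoming packets go from the network adapter to the adapter driver, which stores them in a circular buffer (ring) at the write pointer; userspace reaches the ring via mmap() and takes outgoing packets from the read pointer.]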
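The read-pointer/write-pointer arrangement can be sketched as a single-producer, single-consumer circular buffer. The class below is an illustration only, not the driver's code: the names, the fixed slot size and the choice to drop new packets when the ring is full are assumptions made for this example.

```python
from typing import Optional


class PacketRing:
    """Fixed-size slots with one writer (the 'driver') and one reader."""

    def __init__(self, slots: int, slot_size: int) -> None:
        self.slots = slots
        self.slot_size = slot_size
        self.buffer = bytearray(slots * slot_size)  # preallocated once
        self.lengths = [0] * slots                  # captured length per slot
        self.write_index = 0                        # next slot the producer fills
        self.read_index = 0                         # next slot the consumer reads
        self.dropped = 0

    def put(self, packet: bytes) -> bool:
        """Copy a packet into the next free slot; drop it if the ring is full."""
        next_index = (self.write_index + 1) % self.slots
        if next_index == self.read_index:  # one slot stays empty to mark "full"
            self.dropped += 1
            return False
        data = packet[: self.slot_size]    # truncate oversized packets
        start = self.write_index * self.slot_size
        self.buffer[start:start + len(data)] = data
        self.lengths[self.write_index] = len(data)
        self.write_index = next_index
        return True

    def get(self) -> Optional[bytes]:
        """Return the oldest unread packet, or None if the ring is empty."""
        if self.read_index == self.write_index:
            return None
        start = self.read_index * self.slot_size
        data = bytes(self.buffer[start:start + self.lengths[self.read_index]])
        self.read_index = (self.read_index + 1) % self.slots
        return data


if __name__ == "__main__":
    ring = PacketRing(slots=4, slot_size=64)
    for i in range(5):
        ring.put(f"packet-{i}".encode())
    print(ring.get(), ring.dropped)
```

Because the producer only advances the write index and the consumer only advances the read index, neither side needs to allocate memory or move existing packets.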

Driver Packet Ring: Features [1/2]

- The card driver (not the kernel) fills up the buffer.
- Features like packet sampling and filtering can be implemented efficiently in the driver (otherwise the packet reaches the kernel/userland and is only then filtered or discarded).
- The bucket copy operation is very fast (memcpy() with no memory handling needed), so even with a kernel that does not support device polling the system remains usable under heavy traffic.
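The benefit of filtering and sampling at the driver is that rejected packets are never copied anywhere. The sketch below models that idea with a hypothetical `DriverStage` class: a filter runs first, then 1-in-N sampling, and only surviving packets are stored. The IPv4 check relies on the Ethernet II layout (EtherType at bytes 12-13, 0x0800 for IPv4); the plain list stands in for the ring.

```python
from typing import Callable, List

ETHERTYPE_IPV4 = b"\x08\x00"


def is_ipv4(frame: bytes) -> bool:
    """True if an Ethernet II frame carries IPv4 (EtherType at offset 12)."""
    return frame[12:14] == ETHERTYPE_IPV4


class DriverStage:
    """Filter and sample packets before they are copied into the ring."""

    def __init__(self, accept: Callable[[bytes], bool], sample_every: int = 1) -> None:
        self.accept = accept
        self.sample_every = sample_every
        self.matched = 0              # packets that passed the filter
        self.ring: List[bytes] = []   # stand-in for the circular buffer

    def receive(self, packet: bytes) -> bool:
        """Return True only if the packet was stored."""
        if not self.accept(packet):
            return False              # filtered out: no copy performed
        self.matched += 1
        if (self.matched - 1) % self.sample_every != 0:
            return False              # sampled out: no copy performed
        self.ring.append(packet)
        return True


def make_frame(ethertype: bytes, payload: bytes) -> bytes:
    """Build a minimal test frame with zeroed MAC addresses."""
    return bytes(12) + ethertype + payload


if __name__ == "__main__":
    stage = DriverStage(is_ipv4, sample_every=2)
    frames = [make_frame(ETHERTYPE_IPV4, bytes([i])) for i in range(4)]
    frames.append(make_frame(b"\x08\x06", b"arp"))
    stored = sum(stage.receive(f) for f in frames)
    print(f"stored {stored} of {len(frames)} frames")
```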

Driver Packet Ring: Features [2/2]

- Straight path from the driver to userspace with no kernel overhead (besides the PCI bus). Note that packets are transferred over the PCI bus via DMA and that interrupts arrive only when the packet is already available to the driver.
- Packets are not queued in kernel network data structures but are directly accessible to userspace applications.
- The mmap() primitive allows userspace applications to access the circular buffer without the system call overhead incurred by, for example, socket calls.
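The mmap() idea can be sketched with a ring laid out inside a single memory mapping: once the mapping exists, reading a packet is plain memory access with no per-packet system call. This is an illustration under assumptions, not the driver's actual layout: the header format, slot size and function names are hypothetical, and an anonymous mapping stands in for the memory a real driver would export.

```python
import mmap
import struct
from typing import Optional

INDEX = struct.Struct("<I")        # one 32-bit index
WRITE_INDEX_OFFSET = 0             # advanced only by the producer
READ_INDEX_OFFSET = INDEX.size     # advanced only by the consumer
HEADER_SIZE = 2 * INDEX.size
SLOT_LENGTH = struct.Struct("<H")  # captured length stored at the slot start
SLOTS = 8                          # hypothetical ring geometry
SLOT_SIZE = 128


def create_ring() -> mmap.mmap:
    """Anonymous, zero-filled mapping standing in for driver memory."""
    return mmap.mmap(-1, HEADER_SIZE + SLOTS * SLOT_SIZE)


def slot_offset(index: int) -> int:
    return HEADER_SIZE + index * SLOT_SIZE


def produce(mem: mmap.mmap, packet: bytes) -> bool:
    """Producer side: store a packet, or report a drop if the ring is full."""
    (write_index,) = INDEX.unpack_from(mem, WRITE_INDEX_OFFSET)
    (read_index,) = INDEX.unpack_from(mem, READ_INDEX_OFFSET)
    next_index = (write_index + 1) % SLOTS
    if next_index == read_index:
        return False
    data = packet[: SLOT_SIZE - SLOT_LENGTH.size]
    offset = slot_offset(write_index)
    SLOT_LENGTH.pack_into(mem, offset, len(data))
    start = offset + SLOT_LENGTH.size
    mem[start:start + len(data)] = data
    INDEX.pack_into(mem, WRITE_INDEX_OFFSET, next_index)  # publish the slot
    return True


def consume(mem: mmap.mmap) -> Optional[bytes]:
    """Consumer side: read the oldest packet straight from mapped memory."""
    (write_index,) = INDEX.unpack_from(mem, WRITE_INDEX_OFFSET)
    (read_index,) = INDEX.unpack_from(mem, READ_INDEX_OFFSET)
    if read_index == write_index:
        return None
    offset = slot_offset(read_index)
    (length,) = SLOT_LENGTH.unpack_from(mem, offset)
    start = offset + SLOT_LENGTH.size
    data = mem[start:start + length]
    INDEX.pack_into(mem, READ_INDEX_OFFSET, (read_index + 1) % SLOTS)
    return data


if __name__ == "__main__":
    ring = create_ring()
    produce(ring, b"first packet")
    produce(ring, b"second packet")
    print(consume(ring), consume(ring), consume(ring))
    ring.close()
```

Keeping the two indices in the shared header means producer and consumer coordinate purely through memory, which is what removes the per-packet socket call from the capture path.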

Driver Packet Ring: Main Advantages

- It enables people to capture packets at (almost) wire speed without having to purchase a dedicated card or measurement box.
- It exploits both polling and interrupt mitigation (if available).
- The common belief that coding in the kernel is more efficient than coding in userspace is not completely correct. In fact, a kernel module is slower than a userland application on top of the ring, because packets have to traverse kernel structures before they reach the kernel module.

Driver Packet Ring: Validation

Traffic Capture Application   Linux 2.6 with NAPI      FreeBSD 4.8 with Polling
Standard libpcap              -                        96.3% (148 000 drops)
Circular buffer libpcap       > 99.99% (160 drops)     -

Captured packets [4 000 000 packet sample] (165-195k pkt/sec, ~480 Mbit)

- Linux drops most of the packets at startup (polling issue?), with little loss over time.
- The CPU load on the Linux PC is < 10% (i.e. there is room for improvement), whereas it is very high on FreeBSD (little CPU time is left for measurement applications).
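The capture percentages follow directly from the drop counts and the 4 000 000 packet sample; the short sketch below reproduces the arithmetic. The function name is chosen for this illustration only.

```python
def captured_percentage(total_packets: int, dropped_packets: int) -> float:
    """Share of the sample that reached the capture application, in percent."""
    return 100.0 * (total_packets - dropped_packets) / total_packets


if __name__ == "__main__":
    SAMPLE = 4_000_000
    for label, drops in (("Standard libpcap", 148_000),
                         ("Circular buffer libpcap", 160)):
        print(f"{label:<24} {captured_percentage(SAMPLE, drops):.3f}%")
```

With these inputs the ring loses roughly three orders of magnitude fewer packets than standard libpcap on the same traffic sample.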

Driver Packet Ring: Current Status

- The code has been ported to Linux 2.6.
- It is currently available only for Intel cards (10/100/1Gb/10Gb), but it is easy to port to other card drivers because the code is device independent.
- libpcap has been enhanced to support the ring seamlessly.
- Several applications (e.g. tcpdump) have been successfully tested on top of the ring in order to validate the code.
- Both ntop and nprobe (a homegrown NetFlow v5/v9 probe for IPv4/6) run on top of the ring.

Work in Progress / Future Work

- Testing at 10 Gbit using a commercial card (e.g. Intel 10 Gbit) and comparing its performance with custom cards (DAG, Combo6).
- Wishlist: port it to *BSD in order to evaluate the overall performance gain on a non-Linux platform.

Availability

- Paper and documentation: http://luca./ring.pdf
- Code (GNU GPL2 license): http://sourceforge.net/projects/ntop/