SO_REUSEPORT Scaling Techniques for Servers with High Connection Rates. Ying Cai ycai@google.com



Problems. Servers with high connection/transaction rates (TCP servers such as web servers, UDP servers such as DNS servers) run on multi-core systems with multiple servicing threads, e.g. one thread per servicing core. The single server socket becomes the bottleneck: cache lines bounce between cores, and load balance is hard to achieve. Things will only get worse with more cores.

Scenario

Single TCP Server Socket - Solution 1. Use a listener thread to dispatch established connections to server threads. The single listener thread becomes the bottleneck at high connection rates, with cache misses on the socket structure. Load balance is not an issue here.

Single TCP Server Socket - Solution 2. All server threads accept() on the single server socket. This causes lock contention on the server socket and cache line bouncing of the socket structure. Loads (the number of accepted connections per thread) are usually not balanced, with larger latency on busier CPUs. Balance can almost be achieved by calling accept() at random intervals, but it is hard to choose the interval value, and doing so may introduce latency.

Single UDP Server Socket. Has the same issues as TCP. SO_REUSEADDR allows multiple UDP sockets to bind() to the same local IP address and UDP port, but it does not distribute packets among them; it was not designed to solve this problem.

New Socket Option - SO_REUSEPORT. Allows multiple sockets to bind()/listen() on the same local address and TCP/UDP port, so every thread can have its own server socket and there is no locking contention on the server socket. Load balance is achieved by the kernel: it randomly picks a socket to receive each TCP connection or UDP request. For security reasons, all these sockets must be opened by the same user, so other users cannot "steal" packets.

SO_REUSEPORT

How to enable: 1. sysctl net.core.allow_reuseport=1. 2. Before bind(), setsockopt() SO_REUSEADDR and SO_REUSEPORT. 3. Then proceed as with a normal socket: bind()/listen()/accept().

Status. Developed by Tom Herbert at Google. Submitted upstream, but not yet accepted. Deployed internally at Google; it will be deployed on Google Front End servers and is already deployed on Google DNS servers, where tests show an improvement from 50k requests/s with some losses to 80k requests/s without loss.

Known Issues - Hashing. The hash is based on the 4-tuple and the number of server sockets, so if that number changes (a server socket is opened or closed), a packet may hash to a different socket and a TCP connection cannot be established. Solution 1: use a fixed number of server sockets. Solution 2: allow multiple server sockets to share the TCP request table. Solution 3: do not use a hash; pick the local server socket on the same CPU.

Known Issues - Cache. The cache line bouncing problem is not completely solved. Solved: the accepting thread is the processing thread. Unsolved: the packets being processed can still arrive on another CPU. Instead of distributing randomly, deliver to the thread/socket on the same CPU.

Silo'ing

Interactions with RFS/RPS/XPS-mq - TCP. Bind server threads to CPUs. RPS (Receive Packet Steering) distributes the TCP SYN packets across CPUs, and each TCP connection is accept()ed by the server thread bound to that CPU. Use XPS-mq (Transmit Packet Steering for multiqueue) to send replies on the transmit queue associated with that CPU. Either RFS (Receive Flow Steering) or RPS can guarantee that subsequent packets of the same connection are delivered to that CPU.

Interactions with RFS/RPS/XPS-mq - TCP. RFS/RPS is not needed if RxQs are set up per CPU, but hardware may not support as many RxQs as there are CPUs.

Interactions with RFS/RPS/XPS-mq - UDP Similar to TCP

Interactions with scheduler. Some scheduler mechanisms may harm performance, e.g. affine wakeup, which is too aggressive under certain conditions and causes cache misses.

Other Scalability Issues. Locking contention; HTB Qdisc.

Questions?