Scaling to Millions of Simultaneous Connections

Rick Reed, WhatsApp
Erlang Factory SF, March 30, 2012

About...
- Joined WhatsApp in 2011
- New to Erlang
- Background in performance of C-based systems on FreeBSD and Linux
- Prior work at Yahoo!, SGI

Overview
- The good problem to have
- Performance Goals
- Tools and Techniques
- Results
- General Findings
- Specific Scalability Fixes

The Problem
- A good problem, but a problem nonetheless
- Growth, Earthquakes, and Soccer!
[Chart: message rates for the past four weeks, annotated with soccer goals, half-time (HT), full-time (FT), and the Mexican earthquake]

The Problem
- Initial server loading: ~200k connections
- Discouraging prognosis for growth
- Cluster brittle in the face of failures/overloads

Performance Goals
- 1 Million connections per server!
- Resilience against disruptions under load
  - Software failures
  - Hardware failures (servers, network gear)
  - World events (sports, earthquakes, etc.)

Performance Goals
- Our standard configuration
  - Dual Westmere Hex-core (24 logical CPUs)
  - 100GB RAM, SSD
  - Dual NIC (user-facing, back-end/distribution)
  - FreeBSD 8.3
  - OTP R14B03

Tools and Techniques
- System activity monitoring (wsar)
  - OS-level
  - BEAM

Tools and Techniques
- Processor hardware perf counters (pmcstat)
- dtrace, kernel lock-counting, gprof

Tools and Techniques
- fprof (w/ and w/o cpu_timestamp)
- BEAM lock-counting (invaluable!!!)

Tools and Techniques
- Synthetic workload
  - Good for subsystems with simple interfaces
  - Limited value for user-facing systems

Tools and Techniques
- Tee'd workload
  - Where side-effects can be contained
  - Extremely useful for tuning

Tools and Techniques
- Diverted workload
  - Add additional production load to server
  - DNS via extra IP aliases
    - TTL issues
  - IPFW forwarding
  - Ran into a few kernel panics at high conn counts

Results
- Initial bottlenecks appeared around 425k
- First round of fixes got us to 1M conns
- Fruit was hanging pretty low

Results
- Continued attacking similar bottlenecks
- Achieved 2M conns about a month later
- Put further optimizations on back burner

Results
- Began optimizing app code after New Years
- Unintentional record attempt in Feb
  - Peaked at 2.8M conns before we intervened
  - 571k pkts/sec, >200k dist msgs/sec

Results
- Still trying to obtain elusive 3M conns
- St. Patrick's Day wasn't as lucky as hoped

General Findings
- Erlang has awesome SMP scalability
  - >85% cpu utilization across 24 logical cpus
- FreeBSD shines as well

General Findings
[Chart: CPU% vs. # Conns]

General Findings
- Contention, contention, contention
  - From 200k to 2M were all contention fixes
- Some issues are internal to BEAM
  - Some addressable with app changes
  - Most required BEAM patches
- Some required app changes
  - Especially: partitioning workload correctly
  - Some common Erlang idioms come at a price
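One idiom that fits "comes at a price" is timing code with erlang:now/0, which the talk later advises replacing with os:timestamp/0. erlang:now/0 guarantees unique, strictly increasing values, and that guarantee requires synchronization across schedulers; os:timestamp/0 makes no such promise. The sketch below is an illustration only, not code from the talk; the module name `elapsed` and function `time_call/1` are made up for this example.

```erlang
-module(elapsed).
-export([time_call/1]).

%% Run Fun and return {ElapsedMicroseconds, Result}.
%% os:timestamp/0 reads the OS clock and does not promise unique or
%% strictly increasing values, so it avoids the VM-wide synchronization
%% that erlang:now/0 needs to keep that promise.
time_call(Fun) when is_function(Fun, 0) ->
    Start = os:timestamp(),
    Result = Fun(),
    Stop = os:timestamp(),
    %% timer:now_diff(T2, T1) returns T2 - T1 in microseconds.
    {timer:now_diff(Stop, Start), Result}.
```

Because os:timestamp/0 follows the OS clock, the measured interval can be skewed if the system time is adjusted during the call, so this suits statistics better than correctness-critical ordering.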

Specific Scalability Fixes
- FreeBSD
  - Backported TSC-based kernel timecounter
    - gettimeofday(2) calls much less expensive
  - Backported igb network driver
    - Had issues with MSI-X queue stalls
  - sysctl tuning
    - Obvious limits (e.g., kern.ipc.maxsockets)
    - net.inet.tcp.tcphashsize=524288

Specific Scalability Fixes
- BEAM metrics
  - Scheduler (%util, csw, waits, sleeps, ...)
  - statistics(message_queues)
    - Msgs queued, #non-empty queues, longest queue
  - process_info(message_queue_stats)
    - Enq/deq/send count & rates (1s, 10s, 100s)
  - statistics(message_counts)
    - Aggregation of message_queue_stats
  - Enable fprof cpu_timestamp for FreeBSD
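The statistics(message_queues) item above is described as an addition to BEAM, so it is not available in a stock runtime. As a rough approximation under that assumption, the same three figures (messages queued, non-empty queues, longest queue) can be gathered with the standard erlang:process_info/2 BIF. The module name `mq_stats` and the shape of the result are choices made for this sketch, not the talk's interface.

```erlang
-module(mq_stats).
-export([summary/0]).

%% Returns [{queued, Total}, {non_empty, Count}, {longest, Max}],
%% computed from the message_queue_len of every live process.
summary() ->
    %% process_info/2 returns 'undefined' for a process that exited
    %% after processes/0 was called; the generator pattern skips those.
    Lens = [Len || Pid <- erlang:processes(),
                   {message_queue_len, Len}
                       <- [erlang:process_info(Pid, message_queue_len)]],
    [{queued, lists:sum(Lens)},
     {non_empty, length([L || L <- Lens, L > 0])},
     {longest, lists:max([0 | Lens])}].
```

This walks every process on each call, so its cost grows with the process count; on a node holding millions of connections it is something to sample occasionally rather than poll.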

Specific Scalability Fixes
- BEAM metrics (cont.)
  - Make lock-counting work for larger async thread counts (e.g., +A 1024)
  - Add suspend, location, and port_locks options to erts_debug:lock_counters
  - Enable/disable process/port lock counting at runtime
  - Fix missing accounting for outbound dist bytes

Specific Scalability Fixes
- BEAM tuning
  - +swt low
    - Avoid scheduler perma-sleep
  - +Mummc/mmmbc/mmsbc 99999
    - Prefer mseg over malloc
  - +Mut 24
    - Want allocator instance per scheduler

Specific Scalability Fixes
- BEAM tuning
  - +Mulmbcs 32767 +Mumbcgs 1 +Musmbcs 2047
    - Want large 2M-aligned mseg allocations to maximize superpage promotions
  - Run with real-time scheduling priority
  - +ssct 1 (via patch; scheduler spin count)

Specific Scalability Fixes
- BEAM contention
  - timeofday lock (esp., timeofday delivery)
  - Reduced slot traversals on timer wheel
  - Widened bif timer hash table
    - Ended up moving bif timers to receive timeouts
  - Improved check_io allocation scalability
  - Added prim_file:write_file/3 & /4 (port reuse)
  - Disable mseg max check
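The talk does not show what "moving bif timers to receive timeouts" looked like in code. One plausible reading, offered here as an assumption, is replacing a per-message erlang:start_timer/3 and erlang:cancel_timer/1 pair with the `after` clause of `receive`, so an idle timeout never touches the shared BIF timer table. The module `idle_conn` and its message shapes are hypothetical.

```erlang
-module(idle_conn).
-export([start/2]).

%% Spawn a process that acknowledges {data, From, Payload} messages and
%% notifies Owner once it has been idle for IdleMs milliseconds.
start(Owner, IdleMs) ->
    spawn(fun() -> loop(Owner, IdleMs) end).

loop(Owner, IdleMs) ->
    receive
        {data, From, Payload} ->
            From ! {ack, Payload},
            %% Re-entering receive restarts the idle timeout; there is
            %% no timer reference to cancel.
            loop(Owner, IdleMs)
    after IdleMs ->
        %% Nothing arrived within IdleMs: report and let the process end.
        Owner ! {idle, self()}
    end.
```

For example, `Pid = idle_conn:start(self(), 50)` followed by `Pid ! {data, self(), hello}` yields `{ack, hello}` and then, after 50 ms of silence, `{idle, Pid}`.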

Specific Scalability Fixes
- BEAM contention (cont.)
  - Reduce setopts calls in prim_inet:accept and in inet:tcp_controlling_process

Specific Scalability Fixes
- OTP throughput
  - Add gc throttling when message queue is long
  - Increase default dist receive buffer from 4k to 256k (and make configurable)
  - Patch mnesia_tm to dispatch async_dirty txns to separate per-table procs for concurrency
  - Add pg2 denormalized group member lists to improve lookup throughput
  - Increase max configurable mseg cache size

Specific Scalability Fixes
- Erlang usage
  - Prefer os:timestamp to erlang:now
  - Implement cross-node gen_server calls without using monitors (reduces dist traffic and proc link lock contention)
  - Partition ets and mnesia tables and localize access to smaller number of processes
  - Small mnesia clusters
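To make the table-partitioning advice concrete, here is a minimal sketch that spreads keys over several ETS tables by hash, so operations on different keys usually hit different tables and their locks. It is an illustration under assumptions, not WhatsApp's code: the module name `part_table`, the function names, and the choice of erlang:phash2/2 for routing are all made up for this example.

```erlang
-module(part_table).
-export([new/1, store/3, lookup/2]).

%% Create N independent ETS tables and return their ids as a tuple.
%% The tables belong to the calling process and vanish when it exits.
new(N) when is_integer(N), N > 0 ->
    list_to_tuple([ets:new(part, [set, public]) || _ <- lists:seq(1, N)]).

%% Insert or overwrite Key in the partition it hashes to.
store(Parts, Key, Value) ->
    ets:insert(pick(Parts, Key), {Key, Value}).

%% Look Key up in the single partition that can contain it.
lookup(Parts, Key) ->
    case ets:lookup(pick(Parts, Key), Key) of
        [{Key, Value}] -> {ok, Value};
        [] -> not_found
    end.

%% erlang:phash2(Key, N) returns 0..N-1; tuple elements are 1-based.
pick(Parts, Key) ->
    element(erlang:phash2(Key, tuple_size(Parts)) + 1, Parts).
```

This covers only the partitioning half of the advice. To also localize access, each partition could be owned by one dedicated process that serves all requests for its keys, with callers routed by the same hash.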

Specific Scalability Fixes
- Operability fixes
  - Added [prepend] option to erlang:send
  - Added process_flag(flush_message_queue)

Questions? Comments?
rr@whatsapp.com