Adaptive Overload Control for Busy Internet Servers
Matt Welsh, Intel Research and Harvard University
David Culler, Intel Research and U.C. Berkeley
The Problem: Overload in the Internet
- Overload is an inevitable aspect of systems connected to the Internet
  - (Approximately) infinite user populations
  - Large correlation of user demand (e.g., flash crowds)
  - Peak load can be orders of magnitude greater than average
- Modern Internet services are highly dynamic
  - Web servers do much more than serve up static pages
  - e.g., server-side scripts (CGI, PHP), SSL, database access
  - Requests have highly unpredictable CPU, memory, and I/O demands
  - Makes overload very difficult to predict and manage
- Some high-profile (and low-profile) examples of overload
  - CNN on Sept. 11th: 30,000 hits/sec, down for 2.5 hours
  - E*Trade failure to execute trades during overload
  - Final Fantasy XI launch in Japan: all servers down for 2 days
  - Slashdot effect: daily frustration to nerds everywhere
Outline
- Traditional approaches to overload
- The Staged Event-Driven Architecture
- Adaptive overload control in SEDA
- Service differentiation and degradation
- Performance evaluation under massive load spikes
- Conclusions
Common approaches to overload control
- Prior work on bounding system performance metrics such as:
  - CPU utilization, memory, network bandwidth
  - No connection to user-perceived performance
- Instead, we focus on 90th-percentile response time
  - Meaningful to users, closely tied to SLAs
- Overload management often based on static resource limits
  - e.g., fixed limits on number of clients or CPU utilization
  - Can underutilize resources (if limits set too low) or lead to oversaturation (if limits set too high)
- Static page loads or simple performance models
  - e.g., assume linear overhead in size of Web page
  - Can't account for dynamic services (scripts, SSL, etc.)
- Many techniques studied only under simulation
The Staged Event-Driven Architecture (SEDA)
[Figure: pipeline of stages: accept connection, read packet, parse packet, list folders, show message, delete/refile message, database access, send response (some stages not shown)]
- Decompose service into stages separated by queues
  - Each stage performs a subset of request processing
  - Stages use lightweight event-driven concurrency internally
  - Each stage embodies a set of states from the FSM
  - Queues make load management explicit
- Stages contain a thread pool to drive the event handler (a sketch follows)
  - Small number of threads per stage
  - Dynamically adjust thread pool sizes (dynamic resource control)
  - Apps don't allocate, schedule, or manage threads
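To make the stage structure concrete, here is a minimal Java sketch, with hypothetical names (not the actual SEDA/Sandstorm API): a bounded event queue feeding a small thread pool that runs the stage's event handler over batches of events.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Sketch of a SEDA-style stage: a bounded queue plus a small
    // thread pool driving an event handler over batches of events.
    class Stage<E> {
        interface EventHandler<T> { void handle(List<T> batch); }

        private final BlockingQueue<E> queue = new ArrayBlockingQueue<>(1024);
        private final EventHandler<E> handler;

        Stage(EventHandler<E> handler, int numThreads) {
            this.handler = handler;
            for (int i = 0; i < numThreads; i++) {
                new Thread(this::runLoop).start();
            }
        }

        // The admission control point: offer() fails when the queue is
        // full, signaling overload to the upstream stage.
        boolean enqueue(E event) {
            return queue.offer(event);
        }

        private void runLoop() {
            List<E> batch = new ArrayList<>();
            while (true) {
                try {
                    batch.add(queue.take());   // block for at least one event
                    queue.drainTo(batch, 31);  // batch up to 32 events per call
                } catch (InterruptedException e) {
                    return;
                }
                handler.handle(batch);         // run the stage's event handler
                batch.clear();
            }
        }
    }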
Exposing overload to applications
- Overload is explicit in the programming model
- Every stage is subject to an admission control policy
  - e.g., thresholding, rate control, credit-based flow control
- Enqueue failure is an overload signal
  - Block on full queue: backpressure
  - Drop rejected events: load shedding
  - Can also degrade service, redirect the request, etc.

    for (Request req : batch) {
        // Process request...
        try {
            nextStage.enqueue(req);
        } catch (RejectedException e) {
            // Must respond to enqueue failure!
            // e.g., send error, degrade service, etc.
        }
    }
Alternatives for Overload Control
- Basic idea: apply admission control to each stage (sketched below)
  - Expensive stages throttled more aggressively
- Reject the request (e.g., error message or "Please wait...")
  - Social engineering possible: fake or confusing error message
- Redirect the request to another server (e.g., HTTP redirect)
  - Can couple with front-end load balancing across a server farm
- Degrade service (e.g., reduce image quality or service complexity)
- Deliver differentiated service
  - Give some users better service; don't reject users with a full shopping cart!
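As a rough illustration of these alternatives (the response strings and class names are hypothetical, not the SEDA API), a stage's reaction to an admission-control rejection might be dispatched like this:

    // Hypothetical sketch of the per-request reaction choices above.
    class OverloadResponder {
        enum Action { REJECT, REDIRECT, DEGRADE }

        // Decide how to answer a request that failed admission control.
        String respond(Action action, String requestUrl) {
            switch (action) {
                case REJECT:   // error page or "Please wait..."
                    return "503 Service Unavailable: please wait...";
                case REDIRECT: // send the client to another server
                    return "302 Found: http://mirror.example.com" + requestUrl;
                case DEGRADE:  // serve a cheaper variant of the page
                    return "200 OK (low-fidelity variant of " + requestUrl + ")";
                default:
                    throw new AssertionError("unknown action");
            }
        }
    }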
Feedback-driven response time control
[Figure: a controller observes the stage's response time distribution and adjusts the token-bucket rate λ on the stage's input queue to meet the RT target]
- Adaptive admission control at each stage
  - 90th %tile RT target supplied by administrator
  - Measure stage latency and throttle incoming event rate to meet target
- Additive-increase/multiplicative-decrease (AIMD) controller design (sketched below)
  - Slight overshoot in input rate can lead to large response time spikes!
  - Clamp down quickly on input rate when over target
  - Increase incoming rate slowly when below target
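A minimal Java sketch of this AIMD loop (constants are illustrative placeholders, not tuned values): clamp the token-bucket rate down multiplicatively when the measured 90th-percentile response time exceeds the target, and raise it additively when comfortably below.

    // Sketch of the per-stage AIMD admission-rate controller.
    class AimdAdmissionController {
        private final double targetMs;   // administrator-supplied 90th %tile RT target
        private double rate;             // token-bucket refill rate, requests/sec

        AimdAdmissionController(double targetMs, double initialRate) {
            this.targetMs = targetMs;
            this.rate = initialRate;
        }

        // Invoked periodically with the stage's measured 90th-percentile RT.
        double update(double rt90Ms) {
            if (rt90Ms > targetMs) {
                rate = Math.max(0.05, rate / 1.2);     // over target: clamp down quickly
            } else if (rt90Ms < 0.9 * targetMs) {
                rate = Math.min(5000.0, rate + 2.0);   // below target: creep up slowly
            }
            return rate;   // new refill rate for the stage's token bucket
        }
    }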
Arashi: A Web-based e-mail service
- Yahoo Mail clone: a real-world service
  - Dynamic page generation, SSL
  - New Python-based Web scripting language
  - Mail stored in back-end MySQL database
- Realistic client load generator
  - Traces taken from departmental IMAP server
  - Markov chain model of user behavior
- Overload control applied to each request type separately
[Figure: Arashi stage pipeline: accept connection, read packet, parse packet, list folders, show message, delete/refile message, database access, send response (some stages not shown), with per-stage admission control that can degrade service, send an error page, etc.]
Overload prevention during a load spike
[Figure: 90th-percentile response time (sec) over time (5 sec intervals), with and without overload control; reject rate (0 to 1) on second axis; load spike of 1000 users begins and ends during the run]
- Sudden spike of 1000 users hitting the Arashi service
- 7 request types, handled by separate stages with the overload controller
- 90th %tile response time target: 1 second
- Rejected requests cause clients to pause for 5 sec
- Overload controller has no knowledge of the service!
Overload control with scaling load
[Figure: 90th-percentile response time (sec) vs. number of clients (2 to 1024): target, 90th %tile RT with overload control, and 90th %tile RT without overload control; reject rate (0 to 1) on second axis]
Service Differentiation
[Figure: multiclass controller observes per-class response time distributions and adjusts a separate token bucket (rate λ) for each class against per-class targets]
- Differentiate users into multiple classes
  - Give certain users higher priority than others
  - Based on IP address, cookie, header field, etc.
- Multiclass admission controller design (sketched below)
  - Gather RT distributions for each class, compare to target
  - If RT below target, increase rate for this class
  - If RT above target, reduce rate of lower-priority classes
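A rough sketch of the multiclass rule just described (the array layout and constants are assumptions): each class keeps its own token-bucket rate; a class under its target grows its own rate, while a class over its target causes lower-priority classes to be throttled first.

    // Sketch of the multiclass controller: when a class misses its
    // response-time target, throttle lower-priority classes before it.
    class MulticlassController {
        // rates[c] is the token-bucket rate for class c; index 0 = lowest priority.
        static void update(double[] rt90, double[] target, double[] rates) {
            for (int c = rates.length - 1; c >= 0; c--) {   // highest priority first
                if (rt90[c] < 0.9 * target[c]) {
                    rates[c] += 2.0;                        // additive increase
                } else if (rt90[c] > target[c]) {
                    // Pick the lowest-priority class that still has rate to
                    // give; clamp class c itself only if all below it are
                    // already at the floor.
                    int victim = 0;
                    while (victim < c && rates[victim] <= 0.05) victim++;
                    rates[victim] = Math.max(0.05, rates[victim] / 1.2);
                }
            }
        }
    }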
Service differentiation at work
[Figure: 90th-percentile response time (sec) for low- and high-priority classes vs. time (5 sec intervals) against the 1 second target; fraction of rejected requests on second axis]
- Two classes of users with a 1 second response time target
- 128 users in each class
- High-priority requests suffer fewer rejections
- Without differentiation, both classes treated equally
Service degradation
[Figure: 90th-percentile response time (sec) vs. time (5 sec intervals) during a load spike, for no overload control, degrade only, and degrade + reject; quality level and reject rate on second axis]
- Degrade fidelity of service in response to overload
  - Artificial benchmark: stage crunches numbers at a varying quality level
  - Stage performs AIMD control on service quality under overload
  - Enable/disable admission controller based on response time and quality (see the sketch below)
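A sketch of this degradation loop (the quality floor and step sizes are assumptions): AIMD on a quality knob in [0, 1], with the admission controller re-enabled only once quality has bottomed out and response time is still over target.

    // Sketch: degrade service quality under overload; fall back to
    // rejecting requests only when quality has already bottomed out.
    class DegradationController {
        double quality = 1.0;              // 1.0 = full fidelity, 0.1 = minimum
        boolean admissionControlEnabled = false;

        void update(double rt90Ms, double targetMs) {
            if (rt90Ms > targetMs) {
                quality = Math.max(0.1, quality / 1.2);    // degrade quickly
                // Shed load only as a last resort, at the quality floor.
                admissionControlEnabled = (quality <= 0.1);
            } else if (rt90Ms < 0.9 * targetMs) {
                quality = Math.min(1.0, quality + 0.05);   // recover slowly
                admissionControlEnabled = false;
            }
        }
    }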
Related Overload Management Techniques
- Dynamic listen queue thresholding [Voigt, Cherkasova, Kant]
  - Threshold or token-bucket rate limiting of incoming SYN queues
  - Problem: dropping or limiting TCP connections is bad for clients!
- Specialized scheduling techniques [Crovella, Harchol-Balter]
  - e.g., shortest-connection-first or shortest-remaining-processing-time
  - Often assumes 1-to-1 mapping of client request to server process
- Class-based service differentiation [Bhoj, Voigt, Reumann]
  - Kernel- and user-level techniques for classifying user requests
  - Sometimes requires pushing application logic into the kernel
  - Adjust connection/request acceptance rate per class
  - No feedback: static assignment of acceptance rates
- We argue that overload management should be an application design primitive and not simply tacked onto existing systems
Control theoretic resource management
- Increasing amount of theoretical and applied work in this area
  - Control theory for physical systems with (sometimes) well-understood behaviors
  - Capture model of system behavior under varying load
  - Design controllers using standard techniques (e.g., pole placement)
  - e.g., PID control of Internet service parameters [Diao, Hellerstein]
  - Feedback-driven scheduling [Stankovic, Abdelzaher, Steere]
- Accurate system models difficult to derive
  - Capturing realistic models is difficult; highly dependent on test loads
  - Model parameters change over time: upgrading hardware, introducing new functionality, bit-rot
- Much control theory based on linear assumptions
  - Real software systems highly nonlinear
Future Work
- Automatic profiling, modeling, and tuning of the overload controller
  - Capture traces of stage performance vs. traffic and load mix
  - Offline or online tuning of admission control parameters
  - Use learning algorithms?
- Extend local overload controller to global actions
  - Adjust front-door admission rate based on back-end bottleneck
  - Prioritize stages that release resources
  - Use global information, e.g., memory availability, to help
- Further explore tradeoff between request rejection and degradation
  - How to build general-purpose degradation into a service?
  - Tie in with complex set of service level agreements
Summary
- SEDA programming model exposes overload to the application
  - Break event-driven application into stages connected with queues
  - Queues can reject new requests: an overload signal
- Adaptive overload control at each stage
  - Attempt to meet a 90th-percentile response time target
  - Adjust admission rate at each stage's queue
- Differentiated service using multiple admission controllers
- Extensive evaluation in a realistic overload setting
  - Arashi service with highly dynamic behavior
  - Realistic client load generator
  - Evaluated overload control, service differentiation, and service degradation
- For more information, software, and papers: http://www.cs.berkeley.edu/~mdw/proj/seda/
Backup slides follow
Adaptive overload control algorithm
- Monitor response time for each request in the system
  - Tag with current time on entry to service
  - Gather distribution of accumulated response times at each stage
- Controller adjusts admission rate of requests using a token bucket
  - Controller invoked after N requests processed or on timeout
  - EWMA filter used on 90th-percentile RT estimate
  - Calculates error err between current RT estimate and target
  - If err > err_d, token bucket rate reduced by multiplicative factor adj_d
  - If err < err_i, token bucket rate increased by additive factor (err/c_i) * adj_i
- Parameters determined through extensive experimentation (see the sketch below)
  - N = 100, timeout = 1 sec
  - EWMA filter alpha = 0.7
  - err_i = -0.5, err_d = 0.0, adj_i = 2.0, adj_d = 1.2
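Putting this slide together as code: a reconstruction of the update rule, since the additive term above is partly garbled; the value of c_i is an assumption (it is not legible on the slide) and is marked as such below.

    // Reconstructed sketch of the token-bucket rate update.
    class TokenBucketController {
        static final double ALPHA = 0.7;             // EWMA smoothing constant
        static final double ERR_D = 0.0, ERR_I = -0.5;
        static final double ADJ_D = 1.2, ADJ_I = 2.0;
        static final double C_I = -0.1;              // ASSUMED value; scales the additive step

        double smoothedRt90 = 0.0;   // EWMA of the 90th-percentile RT estimate
        double rate = 10.0;          // token-bucket rate, requests/sec

        // Invoked after N = 100 requests or a 1 sec timeout.
        void update(double rt90Sample, double target) {
            smoothedRt90 = ALPHA * smoothedRt90 + (1 - ALPHA) * rt90Sample;
            double err = (smoothedRt90 - target) / target;   // relative error
            if (err > ERR_D) {
                rate = rate / ADJ_D;                 // multiplicative decrease
            } else if (err < ERR_I) {
                rate = rate + (err / C_I) * ADJ_I;   // additive increase (err/C_I > 0 here)
            }
        }
    }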
Response Time Controller Operation
[Figure: 90th-percentile response time (sec) and admission rate (reqs/sec) vs. time (5 sec intervals), against the 1 second target]
- Adjust incoming token bucket rate using AIMD control
  - Target response time: 1 second
- Sample response times of requests through the stage
- After 100 samples or 1 second:
  - Sort measurements and take the 90th percentile
  - If 90th %tile RT < 0.9 * target RT, add f(err) to rate
  - If 90th %tile RT > target RT, divide rate by 1.2
Overload control by request type
[Figure: 90th-percentile response time (ms) per request type (login, list folders, list messages, show message, delete message, refile message, search) for 1024 users, with and without admission control, against the target response time]
Without response time control
[Figure: 90th-percentile response time (sec) for low- and high-priority classes vs. time (5 sec intervals), against the target]
- Two classes of users without overload control enabled
- 128 users in each class
- Terrible response time performance
Without service differentiation
[Figure: 90th-percentile response time (sec) for low- and high-priority classes vs. time (5 sec intervals) against the 1 second target; fraction of rejected requests on second axis]
- Two classes of users with a 1 second response time target
- 128 users in each class
- No differentiation between classes of users
- High-priority users see the same loss rate as low-priority users
SEDA Scales Well with Increasing Load
[Figure: throughput (Mbit/sec) vs. number of clients (1 to 1024) for Apache, Flash, and SEDA]
- 4-way Pentium III 500 MHz, Gigabit Ethernet, 2 GB RAM, Linux 2.2.14, IBM JDK 1.3
- SEDA throughput 10% higher than Apache and Flash (which are in C!)
  - But higher efficiency was not really the goal!
- Apache accepts only 150 clients at once: no overload despite thread model
  - But as we will see, this penalizes many clients
Max Response Time vs. Apache and Flash
[Figure: maximum response time (sec) vs. number of clients (1 to 1024); at high load, Apache reaches 1.7 minutes, Flash 37 sec, SEDA 3.8 sec]
- Apache and Flash are very unfair when overloaded
- Long response times due to exponential backoff in TCP SYN retransmit
- Not accepting connections is the wrong approach to overload management
User-level vs. kernel-level resource management
- SEDA is a user-level solution: no kernel changes
  - Runs on commodity systems (Linux, Solaris, BSD, Win2k, etc.)
  - In contrast to extensive work on specialized OSs, schedulers, etc.
- Explore resource control on top of an imperfect OS interface
  - "Grey box" approach: infer properties of the underlying system from observed behavior
- What would a SEDA-based dream OS look like?
  - Scalable I/O primitives: remove emphasis on blocking ops
  - SEDA stage-aware scheduling algorithm?
  - Greater exposure of performance monitors and knobs
Scalable concurrency and I/O interfaces
- Threads don't scale, but are the wrong interface anyway
  - Too coarse-grained: don't reflect internal structure of a service
  - Little control over thread behavior (priorities, kill -9)
- I/O interfaces typically don't scale
[Figure: bandwidth (Mbit/sec) vs. number of connections (1 to 16384) for the SEDA asynchronous socket layer and a thread-based asynchronous socket layer; the thread-based layer can't run beyond 400 connections]
Distributed programming models and protocols
- HTTP pushes overload into the network
  - Relies on TCP connection backoff rather than more explicit mechanisms
  - Simultaneous connections, progressive download, and out-of-order requests complicate matters
  - Protocol design should consider service availability
- Distributed computing models generally do not express overload
  - CORBA, RPC, RMI, .NET all based on RPC with generic error conditions
  - On error, should the app fail, retry, or invoke an alternate function?
- Single bottleneck in a large distributed system causes cascading failure in the network
[Figure: browser, web service, and storage service in a chain, with a back-end bottleneck propagating back through the network]
Playing dodgeball with the kernel
- OS resource management abstractions often inadequate
  - Resource virtualization hides overload from applications (e.g., malloc() returns NULL when no memory)
  - Forces system designers to focus only on capacity planning
- Internet services require careful control over resource usage
  - e.g., avoid exhausting physical memory to avoid paging
  - Back off on processing heavyweight requests when saturated
- SEDA approach: application-level monitoring & throttling
  - Service performance monitored at a per-stage level: request throughput, service rate, latency distributions
  - Staged model permits careful control over resource consumption
  - Throttle number of threads, admission control on each stage
  - Cruder than kernel modifications, but very effective (and clean!)