Using Certes to Infer Client Response Time at the Web Server




DAVID OLSHEFSKI
IBM T.J. Watson Research Center and Columbia University
JASON NIEH
Columbia University
and
DAKSHI AGRAWAL
IBM T.J. Watson Research Center

As businesses continue to grow their World Wide Web presence, it is becoming increasingly vital for them to have quantitative measures of the mean client perceived response times of their web services. We present Certes (CliEnt Response Time Estimated by the Server), an online server-based mechanism that allows web servers to estimate mean client perceived response time, as if measured at the client. Certes is based on a model of TCP that quantifies the effect that connection drops have on mean client perceived response time by using three simple server-side measurements: connection drop rate, connection accept rate, and connection completion rate. The mechanism does not require modifications to HTTP servers or web pages, does not rely on probing or third party sampling, and does not require client-side modifications or scripting. Certes can be used to estimate response times for any web content, not just HTML. We have implemented Certes and compared its response time estimates with those obtained with detailed client instrumentation. Our results demonstrate that Certes provides accurate server-based estimates of mean client response times in HTTP 1.0/1.1 environments, even with rapidly changing workloads. Certes runs online in constant time with very low overhead. It can be used at websites and server farms to verify compliance with service level objectives.

Categories and Subject Descriptors: D.4.8 [Operating Systems]: Performance measurements, models, operational analysis

Parts of this work appeared as OLSHEFSKI, D., NIEH, J., AND AGRAWAL, D. Inferring client response time at the web server. In ACM Sigmetrics Conference Proceedings (Marina Del Rey, Calif.). ACM, New York, 2002, pp. 160-171.
This work was supported in part by an NSF Career Award, NSF grant ANI-0117738, and an IBM SUR Award.
Authors' addresses: D. Olshefski, IBM T. J. Watson Research, 3S-F32, 19 Skyline Drive, Hawthorne, N.Y. 10532, email: olshef@us.ibm.com; J. Nieh, Columbia University, 450 Computer Science MC0401, 500 West 120th Street, New York, NY 10027; D. Agrawal, IBM T. J. Watson Research, 3S-E53, 19 Skyline Drive, Hawthorne, N.Y. 10532.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 1515 Broadway, New York, NY 10036 USA, fax: +1 (212) 869-0481, or permissions@acm.org.
© 2004 ACM 0734-2071/04/0200-0049 $5.00
ACM Transactions on Computer Systems, Vol. 22, No. 1, February 2004, Pages 49-93.

General Terms: Algorithms, Management, Measurement, Performance, Experimentation
Additional Key Words and Phrases: Web server, client perceived response time

1. INTRODUCTION

The focus of web server performance is shifting from throughput and utilization benchmarks [Nahum et al. 1999; Barford and Crovella 1999; Nielsen et al. 1997] to guaranteeing delay bounds for different classes of clients [Lu et al. 2001; Voigt et al. 2001; Kanodia and Knightly 2000; Parekh et al. 2001; Eggert and Heidemann 1999; Almeida et al. 1998; Pandey et al. 1998; Chen et al. 2001; Bhatti and Friedrich 1999]. Providers of web services are faced with the challenge of providing differentiated services that guarantee bounds on client perceived response times while at the same time maximizing throughput. In order for a website to guarantee delay bounds for its clients, it should be able to determine, in real time, the client perceived response time. This information can then be used to verify compliance with service-level objectives and to identify potential problems that may exist on the server or in the network. Unfortunately, the problem of obtaining an accurate measure of client response time remains a key factor preventing delay bounded web services from being realized.

We have created Certes (CliEnt Response Time Estimated by the Server), an online mechanism that accurately estimates mean client perceived response time using only information available at the web server. Certes combines a model of TCP retransmission and exponential back-off mechanisms with three simple server-side measurements: connection drop rate, connection accept rate, and connection completion rate. The model and measurements are used to quantify the time due to failed connection attempts and determine their effect on mean client perceived response time. Certes then measures both the time spent waiting in kernel queues and the time to retrieve requested web data.
It achieves this by going beyond application-level measurements to using a kernel-level measure of the time from the very beginning of a successful connection until it is completed. Our approach does not require probing or third party sampling, and does not require modification of web pages, HTTP servers, or clients. Certes uses a model that is inherently able to decompose response time into various server and network components to help determine whether server or network providers are responsible for performance problems. Certes can be used to measure response times for any web content, not just HTML.

We have implemented Certes and verified its response time measurements against those obtained via detailed client-side instrumentation. Our results demonstrate that Certes provides accurate server-based measurements of mean client response times in HTTP 1.0/1.1 environments, even with rapidly changing workloads. Our results show that Certes is particularly useful under overloaded server conditions, when web server application-level and kernel-level measurements can be grossly inaccurate. We further demonstrate the need for Certes measurement accuracy in web server control mechanisms that manipulate inbound kernel queuing or that perform admissions control to achieve response time goals.

This article is outlined as follows: Section 2 provides some necessary background on the components of response time and discusses related work. Section 3 presents an overview of the Certes approach, the mathematical construction of the Certes model focusing on how it accounts for time attributed to failed connection attempts, and a fast online implementation of the Certes model. Section 4 describes our implementation of Certes on Linux. Section 5 presents experimental results demonstrating the effectiveness of Certes in estimating mean client response time at the server with various dynamic workloads for both HTTP 1.0 and HTTP 1.1. Finally, we present some concluding remarks.

2. BACKGROUND AND RELATED WORK

To understand the issues involved in measuring response time, we begin by presenting an anatomical view of the client/server behavior that occurs when a web client accesses a remote Internet website. Once a URL, such as http://www.cnn.com/US/index.html, is entered into a web browser, the following ten steps occur to download and display the web page:

(1) URL parsing. The client browser parses the URL to obtain the name of the remote host, www.cnn.com, from which to obtain the web page, /US/index.html. Web browsers maintain a cache of web pages, so if the web page is in the cache and has not expired, processing can be performed locally and Steps (2) through (7) below can be skipped.

(2) DNS lookup. In order to contact the website (i.e., www.cnn.com), the browser must first obtain its IP address from DNS [Mockapetris 1987a, 1987b]. Since the browser maintains a local cache containing the IP addresses of frequently accessed websites, contacting the DNS server for this information is only performed on a cache miss, which often implies that the website is being visited for the first time.

(3) TCP connection setup. The client establishes a TCP connection with the remote web server.
Before a client can send the HTTP request to the web server, a TCP connection must first be established via the TCP three-way handshake mechanism [Cardwell et al. 2000; Almeida et al. 1998]. First, the client sends a SYN packet to the server. Second, the server acknowledges the client request for connection by sending a SYN/ACK back to the client. Third, the client responds by sending an ACK to the server, completing the process of establishing a connection. Note that if the client's web browser already had an established TCP connection to the server and persistent HTTP connections are used, the browser may reuse this connection, skipping this step.

(4) HTTP request sent. The browser requests the web content, /US/index.html, from the remote site by sending an HTTP request over the established TCP connection.

(5) HTTP request received. When the web server machine receives an HTTP packet, the operating system determines which application should receive the message. The HTTP request is then passed to an HTTP server application such as Apache, which is typically executing in user space.

(6) HTTP request processed. The HTTP server application processes the request by obtaining the content from a disk file, a CGI script, or another such program.

(7) HTTP response sent. The HTTP server application passes the content to the operating system, which in turn sends the content to the client.

(8) HTTP response processed. Upon receiving the response to the HTTP request, the client processes the web content. If the content consists of an HTML page, the browser parses the HTML, identifies any embedded objects such as images, and begins rendering the web page on the display.

(9) Embedded objects retrieved. The browser opens additional connections to retrieve any embedded objects, allowing the browser to make multiple, simultaneous requests for the embedded objects. This parallelism helps to reduce overall latency. Depending on where the embedded objects are located, connections may be to the same server, other web servers, or content delivery networks (CDNs). If the connections are persistent and embedded objects are located on the same server, then several embedded objects will be obtained over each connection. Otherwise, a new connection will be established for each embedded object.

(10) Rendering. Once all the embedded objects have been obtained, the browser can fully render the web page on the display.

This ten-step process may repeat itself at any point in time, preempting any of the Steps (2) through (10). For example, the user may click on a hyperlink when the web page is not fully rendered. Such behavior causes an immediate halt to the current activity and a jump to Step (1).

A complete measure of the time to download and display a web page would account for the time spent across all ten steps. The only way to completely measure the actual client perceived response time is to measure the response time on the client machine.
This requires the ability to instrument the web browser on every client, and requires that most users use the instrumented browser. Furthermore, for websites to use such information, clients would need to include mechanisms to send the measurements back to the respective websites so that they can verify compliance with service-level objectives. Unfortunately, this direct browser instrumentation is not possible in practice.

As a result, several pragmatic approaches have been developed to determine client response time without requiring client browser modification. These approaches must be considered as methods to estimate client perceived response time, though some may be more accurate than others. We provide an overview of these approaches and how they account for response time associated with each of the ten steps for downloading and displaying a web page. We also discuss other related work below.

One approach being taken by a number of companies [KeyNote; Mercury Interactive; Exodus; StreamCheck] is to periodically measure response times obtained by a geographically distributed set of monitors. These monitors can be fully instrumented to provide a complete measurement of response time across all of the ten steps previously discussed, as perceived by the monitors. However, this approach suffers from five important limitations. First, no actual client transactions are being measured; only the response time for transactions generated by the monitors is reported. Second, any approach based on coarse-grained sampling may suffer from statistical biases. Third, monitors are limited to performing transactions that do not affect other users or modify state in backend databases. For example, it would be unwise to configure a monitor to actually purchase an airline ticket or trade stock on an open exchange. Fourth, the information gathered by monitors is generally not available at the web server in real time, limiting the ability of a web server to respond to changes in response time to meet delay bound guarantees. Lastly, CDN providers are known to place servers near monitors used by these companies to artificially improve their own performance measurements [Danzig 2001].

A second approach is to instrument existing web pages with client-side scripting in order to gather client response time statistics [Rajamony and Elnozahy 2001]. The approach can be used to account for actual client transactions. However, client-side scripting will always consider the start of the transaction to be sometime after the first byte of HTTP data is received by the client and the client begins processing the HTTP response. A post-connection approach such as this does not account for any delays that occur in Steps (1) through (7), including time due to TCP connection setup or waiting in kernel queues on the web server. Client-side scripting also cannot be applied to non-HTML files that cannot be instrumented, such as PDF and PostScript files. It may also not work for older browsers or browsers with scripting capabilities disabled.
Client browser measurements cannot accurately decompose the response time into server and network components and therefore provide no insight into whether server or network providers would be responsible for problems.

A third approach is to have the web server application track when requests arrive and complete service [Li and Jamin 2002; Lu et al. 2001; Kanodia and Knightly 2000; Almeida et al. 1998]. This approach has the desirable properties that it only requires information available at the web server and can be used for non-HTML content. However, this approach only measures Step (6) of the total response time. Server latency measures at the application level do not properly include network interactions and provide no information on network problems that might occur and affect client perceived response time. They also do not account for overheads associated with the TCP protocol underlying HTTP, including the time due to TCP connection setup or waiting in kernel queues. These times can be significant, especially for servers which discard connection attempts to avoid overloading the server [Voigt et al. 2001], or for servers which limit input queue lengths of an application server [Eggert and Heidemann 1999] in order to provide a bound on the time spent in the application layer.

A fourth approach is to capture network packet traffic to the web server and use those traces to reconstruct the client response time. This can be done either offline, by analyzing packet trace logs [Smith et al. 2001], or online, as the network packets are passively captured from the communication line [NetQoS].
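As a rough illustration of the offline packet-trace approach, the sketch below groups captured packets by connection and reports, for each connection, the span between the first packet seen from the client and the last data packet sent by the server. The `Packet` record and the `connection_durations` function are hypothetical names invented for this example; they are not the format or method used by any of the cited tools, and a real analyzer would parse capture files rather than in-memory tuples.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified record for one captured packet."""
    timestamp: float    # capture time in seconds
    connection: tuple   # (client_ip, client_port, server_ip, server_port)
    from_server: bool   # True if the server sent the packet
    payload_len: int    # bytes of TCP payload (0 for pure SYN/ACK packets)


def connection_durations(trace):
    """Map each connection to (last server data packet) - (first client packet)."""
    first_client = {}
    last_server_data = {}
    for pkt in sorted(trace, key=lambda p: p.timestamp):
        if not pkt.from_server:
            # Remember only the earliest packet seen from the client.
            first_client.setdefault(pkt.connection, pkt.timestamp)
        elif pkt.payload_len > 0:
            # Keep overwriting so the final value is the last data packet.
            last_server_data[pkt.connection] = pkt.timestamp
    # Connections that never received server data are left out.
    return {
        conn: last_server_data[conn] - start
        for conn, start in first_client.items()
        if conn in last_server_data
    }
```

A measure of this kind reflects only what the capture point observes for each connection, so client-side delays that occur before the first captured packet are outside its view.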

To use packet traces to determine web page response time, EtE [Fu et al. 2002] provides an offline server-side approach for obtaining a value for client response time, based on correlating the activity across multiple connections. This approach does account for Steps (4) to (9), but does not account for TCP connection setup time. Since EtE is a server-based approach like Certes, it also does not account for DNS lookup times and browser rendering times. The approach also does not account for any delays that may be due to web objects that reside outside of the web server, such as objects stored on other servers or CDNs. Scalability can also be a drawback with the offline approach since the packet capturing and analysis may not be able to keep pace with the high traffic rate entering and leaving a busy server farm, requiring a number of packet monitoring machines. The cost of buying and managing monitoring machines may be prohibitive. In addition, offline analysis fails to provide information in real time.

In addition to the above approaches that focus on web performance, a number of analytical models have been proposed for modeling TCP behavior [Padhye and Floyd 2001; Padhye et al. 1998; Cardwell et al. 2000]. For example, Padhye et al. [1998] derived the steady state throughput of a TCP bulk transfer for a given loss rate and round trip time. This model is further extended in Cardwell et al. [2000] to include the effects of the TCP three-way handshake and TCP slow start. The extended model can accurately estimate throughput for TCP transfers of any length. These analytical models focus on estimating TCP transfer throughput instead of estimating client perceived response time. They also assume a fixed packet loss rate that remains constant over time and is known a priori. These assumptions are often not valid in measuring web server performance. For example, SYN packet loss rates may change frequently due to server load or if a web server uses SYN drops to manipulate its quality of service (QoS).
Appendix 6 discusses further some of the queuing modeling issues.

Many recent approaches have proposed methods for controlling QoS at web servers. One approach entails implementing kernel mechanisms that differentiate among TCP connections of different service classes during the TCP connection establishment phase. For example, Voigt et al. [2001] proposed TCP SYN policing and prioritized accept queues to support different service classes. Another approach is to dynamically manage a system resource by using a control feedback that depends on the measurements of client perceived response time [Parekh et al. 2001; Chen and Mohapatra 1999]. Such approaches can result in a high probability of TCP SYN drops, resulting in failed connection attempts that increase the client perceived response time. For these QoS mechanisms to work as desired, it is important that the effect of TCP SYN drops on the client perceived response time is measured accurately by accounting for the time in Step (3) above. Unfortunately, previous response time measurement methods, including client-side scripting, web server application-level measurements, and packet traces, do not accurately account for the time due to TCP connection setup in the presence of failed connection attempts.

If failed connection attempts did not contribute significantly to client perceived response time, omitting the time due to TCP connection setup would not be an issue. However, Figures 1 and 2 illustrate that failed connection attempts can contribute to substantial increases in response time. The figures show a TCP-oriented view of the process of downloading and displaying a web page with a focus on TCP connection establishment. URL parsing and web page rendering do not usually account for a significant portion of response time, and DNS lookups are often cached to reduce their impact on response time. As a result, Figures 1 and 2 only illustrate Steps (3) to (9) with a single client and server to simplify our discussion.

Fig. 1. Typical TCP client-server interaction.

Figure 1 shows the more common case in which there is no packet loss. Here, the TCP connection is established via the TCP three-way handshake mechanism, then a series of HTTP requests are sent to the web server to request data. In this scenario, the time due to TCP connection setup for transmitting and processing the initial SYN, SYN/ACK, and ACK is likely to be small compared to the time for processing the HTTP requests and may not contribute much to the overall client perceived response time.

Figure 2 shows the same client-server interaction in the presence of SYN drops at the server due to server overload or admissions control [Stevens 1994]. When the initial SYN is dropped, the server does not send the corresponding SYN/ACK packet. As a result, the client incurs a TCP timeout and retransmits the initial SYN to the server. Due to TCP timeout and exponential back-off mechanisms, the client may have to wait 3 seconds, 9 seconds, 21 seconds, etc., before its SYN packet is accepted by the server [Braden 1989]. This wait time to initiate a TCP connection is often larger than the time required to transfer the actual web data and will be a dominant factor in the overall client perceived response time. Dropping a SYN does not represent a denial of access in this case, but rather a delay in establishing the connection. The latency associated with this behavior needs to be quantified.
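The 3, 9, and 21 second waits quoted above follow from a retransmission timeout that starts at 3 seconds and doubles after each dropped SYN. The sketch below reproduces that arithmetic. The function names are hypothetical, and the 3-second initial timeout with strict doubling is an assumption taken from the figures in the text; other TCP stacks may use different timeout values or cap the number of retries.

```python
def syn_retry_offsets(drops, initial_timeout=3.0):
    """Times (seconds after the first SYN) at which each retried SYN is sent.

    Assumes the timeout starts at `initial_timeout` and doubles after
    every dropped SYN (exponential back-off).
    """
    offsets = []
    elapsed = 0.0
    timeout = initial_timeout
    for _ in range(drops):
        elapsed += timeout       # wait out the current timeout
        offsets.append(elapsed)  # the SYN is retransmitted at this moment
        timeout *= 2             # back off before the next attempt
    return offsets


def connection_wait(drops, initial_timeout=3.0):
    """Total delay before the accepted SYN is sent, given `drops` dropped SYNs."""
    offsets = syn_retry_offsets(drops, initial_timeout)
    return offsets[-1] if offsets else 0.0


print(syn_retry_offsets(3))  # [3.0, 9.0, 21.0]
```

Under these assumptions the wait after k dropped SYNs is 3(2^k - 1) seconds, so even two drops add 9 seconds before any HTTP data is exchanged.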

Fig. 2. Effect of SYN drops on client response time.

3. THE CERTES MODEL

The main contribution of Certes is to provide a server-side measure of mean client perceived response time that includes the impact of failed TCP connection attempts on web server performance. To simplify our discussion and focus on the issue of failed TCP connection attempts, we make the following assumptions:

(1) We focus on measuring the response time due to TCP connection setup through retrieving embedded objects, Steps (3) through (9) in Section 2. We do not consider Steps (1), (2), and (10). We assume that URL parsing and web page rendering times are small and DNS lookups are generally cached to reduce their impact on response time.

(2) We focus on determining the contribution to client perceived response time due to the performance of a given web server. We do not quantify delays that may be due to web objects residing on other servers or CDNs.

(3) We limit our discussion to an estimate of response time based on the duration of a TCP connection. For nonpersistent connections where each HTTP request uses a separate TCP connection, this estimate corresponds to measuring the response time for individual HTTP requests. For persistent connections where multiple HTTP requests may be served over a single connection, this estimate will include the time for multiple requests. Since a web page with embedded objects may require multiple HTTP requests in order to be displayed, determining the response time for downloading a web page may require correlating the response times of multiple HTTP requests. Complementary work [Fu et al. 2002] on correlating connections to web pages can be used for this purpose.

Although important, these issues are orthogonal to the focus of this article. Given these assumptions, a measure of client perceived response time should include the time starting from when the first SYN packet is sent from the client to the server until the last HTTP response data packet is received from the server by the client.

For a given connection, we define CONN-FAIL as the time between when the first SYN packet is sent from the client and when the last SYN packet is sent from the client. This is the time due to failed TCP connection attempts. When there are no failed connection attempts, CONN-FAIL is zero. For a given connection, we define SYN-to-END as the time between when the server receives the last SYN packet and when the server sends the last data packet. This is essentially the server's perception of response time in the absence of SYN drops. The client perceived response time is equal to CONN-FAIL and SYN-to-END plus one round trip time (RTT), to account for the time it takes to send the SYN packet from the client to the server plus the time it takes to send the last data packet from the server to the client. The client perceived response time over the connection is:

    CLIENT_RT = CONN-FAIL + SYN-to-END + RTT.    (1)

Determining client perceived response time then reduces to determining CONN-FAIL, SYN-to-END, and RTT. Note that any failure to complete the three-way handshake after the SYN is accepted by the server is captured by SYN-to-END. For example, delays caused by dropped SYN/ACKs from the server to the client (the second part of the three-way handshake) are accounted for in the SYN-to-END time (as shown in Figure 3). The equation also holds if the server terminates the connection before sending any data by sending a FIN or RST.

Fig. 3. Dropped SYN/ACK from server to client captured in SYN-to-END time.

Determining the SYN-to-END component of the client perceived response time is relatively straightforward. The SYN-to-END time can be decomposed into two components: the time taken to establish the TCP connection after receiving the initial SYN, and the time taken to receive and process the HTTP request(s) from the client. In certain circumstances, for example when the web server is lightly loaded and the data transfer is large, the first component of the SYN-to-END time can be ignored, and the second component can be used as an approximation to the processing time spent in the application-level server. In such cases, measuring the processing time in the application-level server can provide a good estimate of the SYN-to-END time. In general, however, the processing time in the application-level server is not a good estimate of the SYN-to-END time. If the web server is heavily loaded, it may delay sending the SYN/ACK back to the client, or it may delay delivering the HTTP request from the client to the application-level server. In such cases, the time to establish the TCP connection may constitute a significant component of the SYN-to-END time. Thus, to obtain an accurate measure of the SYN-to-END time, measurements must be done at the kernel level. A simple way to measure SYN-to-END is by timestamping in the kernel when the last SYN packet is received by the server and when the last data packet is sent from the server. If the kernel does not already provide such a packet timestamp mechanism, it can be added with minor modifications. Section 4 describes in further detail how we measured SYN-to-END for our Certes Linux implementation.

Determining the RTT component of the client perceived response time is also relatively straightforward. RTT can be determined at the server by measuring the time from when the SYN/ACK is sent from the server to the time when the server receives the ACK back from the client. The RTT measured in this way includes the time spent by the client in processing the SYN/ACK and preparing its reply. Our experience indicates that the time taken by clients to process a SYN/ACK packet and send a reply is typically not significant, and this method yields an accurate measure of RTT. Other approaches for estimating RTT can also be used [Allman 2000]. For both SYN-to-END and RTT measurements, the kernel at the web server must provide the respective timestamps. As discussed in Section 4, these timestamps can be added with minor modifications.

However, determining CONN-FAIL is a difficult problem. The problem is that when a server accepts a SYN and processes the connection, the server is unaware of how many failed connection attempts have been made by the client prior to this successful attempt. The TCP header [Postel 1981] and the data payload of a SYN packet do not provide any indication of which attempt the accepted SYN represents. As a result, the server cannot examine the accepted SYN to determine whether it is an initial attempt at connecting, a first retry at connecting, or an Nth retry at connecting. Even in the cases where the server is responsible for dropping the initial SYN and causing a retry, it is difficult for the server to remember the time the initial SYN was dropped and correlate it with the eventually accepted SYN for a given connection. For such a correlation, the server would be required to retain additional state for each dropped SYN at precisely the time when the server's input network queues are probably near capacity, which could result in performance scalability problems for the server.

Certes solves this problem by taking advantage of two properties of server mechanisms for supporting SYNs. First, since the server cannot distinguish between whether a SYN packet is an initial attempt or Nth retry, it must treat them all equally.
Second, it is easy for a server to simply count the number of SYNs that are dropped versus accepted since it only requires a small amount of state. As a result, Certes can compute the probability that a SYN is dropped and apply that probability equally to all SYNs during a given time period to estimate the number of SYN retries that occur. This information is then combined with an understanding of the TCP exponential backoff mechanism to correlate accepted SYNs with the number of SYN drops that occurred to determine how many retries were needed before establishing a connection. Certes can then determine CONN-FAIL based on how many retries were needed and the amount of time necessary for those retries to occur. In particular, due to TCP timeout and exponential backoff mechanisms specified in RFC 1122 [Braden 1989], the first SYN retry occurs 3 seconds after the initial SYN, the second SYN retry occurs 6 seconds after the first retry, the third SYN retry occurs 12 seconds after the second retry, etc. Certes does assume that all clients adhere to this exact exponential behavior on SYN retries from RFC 1122. This is a reasonable assumption given that RFC 1122 is supported by all major operating systems, including Microsoft operating systems [Microsoft], Linux [RedHat], FreeBSD [FreeBSD], NetBSD 1.5 [NetBSD], AIX 5.x, and Solaris. OneStat.com [OneStat 2002] estimates that 97.46% of the web server accesses on the Internet are from users running a Windows operating system. The rest they attribute to Macintosh and Linux users (1.43% and 0.26%, respectively).

Section 3.1 presents a more detailed step-by-step construction of the Certes model. In particular, we discuss the impact of the variance of RTT on when retries arrive at the server and how Certes accounts for this variability. Section 3.2 describes a more simplified Certes model that can be implemented efficiently and yet still yields good response time results.

3.1 Mathematical Construction of The Certes Model

Certes determines the mean client perceived response time by accounting for CONN-FAIL using a statistical model that estimates the number of first, second, third, etc., retries that occur during a specified time interval. Certes divides time into discrete intervals for grouping connections by their temporal relationship. Without loss of generality, we will assume that time is divided into one second intervals, but in general any interval size less than the initial TCP retry timeout value of three seconds may be used. For ease of exposition, let $m = 3$ be the number of discrete time intervals that occur during the initial TCP retry timeout value of three seconds. Certes determines the number of retries that occurred before a SYN is accepted by using simple counters to take three aggregate server-side measurements for each time interval. The measurements are:

$\mathrm{DROPPED}_i$: the total number of SYN packets that the server dropped during the $i$th interval.
$\mathrm{ACCEPTED}_i$: the total number of SYN packets that the server did not drop during the $i$th interval.
$\mathrm{COMPLETED}_i$: the total number of connections that completed during the $i$th interval.

Using these three measurements, we can compute for a given interval the offered load at the server, which is the number of SYN packets arriving at the server. The offered load in the $i$th interval is:

$$\mathrm{OFFERED\_LOAD}_i = \mathrm{ACCEPTED}_i + \mathrm{DROPPED}_i. \tag{2}$$

Certes decomposes each of these measured quantities, $\mathrm{OFFERED\_LOAD}_i$, $\mathrm{DROPPED}_i$, $\mathrm{ACCEPTED}_i$, and $\mathrm{COMPLETED}_i$, as a sum of terms that have associations to connection attempts. Let $R_i^j$ be the number of SYNs that arrived at the server as a $j$th retry during the $i$th interval, starting with $R_i^0$ as the number of initial attempts to connect to the server during interval $i$. Let $D_i^j$ be the number of SYNs that arrived at the server as a $j$th retry during the $i$th interval but were dropped by the server. Let $A_i^j$ be the number of SYNs that arrived at the server as a $j$th retry during the $i$th interval and were accepted by the server. Let $C_i^j$ be the number of connections completed during the $i$th interval that were accepted by the server as a $j$th retry. Let $k$ be the maximum number of retries attempted by any client. For each interval, we have the following decomposition:

$$\begin{aligned}
\mathrm{OFFERED\_LOAD}_i &= \sum_{j=0}^{k} R_i^j \\
\mathrm{DROPPED}_i &= \sum_{j=0}^{k} D_i^j \\
\mathrm{ACCEPTED}_i &= \sum_{j=0}^{k} A_i^j \\
\mathrm{COMPLETED}_i &= \sum_{j=0}^{k} C_i^j.
\end{aligned} \tag{3}$$

For each time interval, Certes determines the mean client perceived response time for those web transactions that are completed during the time interval. This includes both connections that are completed during the time interval as well as connections that give up during the interval after exceeding the maximum number of retries attempted by any client. $\mathrm{COMPLETED}_i$ is the number of transactions that completed during the $i$th interval and $R_i^{k+1}$ is the number of clients that gave up during the interval. Applying Eq. (1) to a time interval, Certes computes the mean client response time for the $i$th interval as:

$$\mathrm{CLIENT\_RT}_i = \frac{R_i^{k+1}\,3[2^{k+1}-1] + \sum_{j=1}^{k} C_i^j\,3[2^{j}-1] + \sum \text{SYN-to-END}_i + \sum \mathrm{RTT}_i}{\mathrm{COMPLETED}_i + R_i^{k+1}}. \tag{4}$$

Equation (4) essentially divides the sum of the response times by the number of transactions to obtain mean response time. In the denominator, Eq. (4) sums the total number of transactions that completed and clients that gave up. In the numerator, there are four terms summed together. The first term $R_i^{k+1}\,3[2^{k+1}-1]$ is the amount of time that clients waited before giving up based on the TCP exponential backoff mechanism. The second term $\sum_{j=1}^{k} C_i^j\,3[2^{j}-1]$ represents the total CONN-FAIL time experienced by those clients that completed in the $i$th interval. The third term $\sum \text{SYN-to-END}_i$ is the sum of the measured SYN-to-END times for all transactions completed in the $i$th interval. The fourth term $\sum \mathrm{RTT}_i$ is the sum of one round trip time for all transactions completed during the $i$th interval. For example, if $k = 2$, then Eq. (4) reduces to:

$$\mathrm{CLIENT\_RT}_i = \frac{\sum \text{SYN-to-END}_i + \sum \mathrm{RTT}_i + 21 R_i^{k+1} + 9 C_i^2 + 3 C_i^1}{\mathrm{COMPLETED}_i + R_i^{k+1}}.$$

$C_i^1$ indicates the number of clients that waited an additional 3 seconds due to a SYN drop, $C_i^2$ is the number of clients that waited an additional 9 seconds due to two SYN drops, and $R_i^{k+1}$ is the number of clients that gave up after waiting 21 seconds.
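As a rough illustration of Eq. (4), the following Python sketch computes the mean client response time for one interval. The function and parameter names are hypothetical and are not taken from the Certes implementation, and returning 0.0 for an interval with no transactions is an assumption made here for convenience.

```python
def conn_fail_seconds(retries, initial_timeout=3.0):
    """Time spent in failed attempts before the given retry is sent: 3[2^j - 1] seconds."""
    return initial_timeout * (2 ** retries - 1)


def mean_client_rt(completed_by_retry, gave_up, syn_to_end_sum, rtt_sum, k):
    """Mean client response time for one interval, following Eq. (4).

    completed_by_retry[j] is C_i^j for j = 0..k and gave_up is R_i^{k+1}.
    syn_to_end_sum and rtt_sum are the per-interval sums of the measured values.
    """
    completed = sum(completed_by_retry)
    transactions = completed + gave_up
    if transactions == 0:
        return 0.0  # assumption: an empty interval reports zero
    give_up_wait = gave_up * conn_fail_seconds(k + 1)
    conn_fail = sum(completed_by_retry[j] * conn_fail_seconds(j)
                    for j in range(1, k + 1))
    return (give_up_wait + conn_fail + syn_to_end_sum + rtt_sum) / transactions
```

With k = 2 the three weights produced by `conn_fail_seconds` are 3, 9 and 21 seconds, which matches the reduced form of Eq. (4) given above.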
To compute the mean client perceived response time for each interval, Certes uses Eq. (3) to derive the values of $C_i^j$ and $R_i^{k+1}$ from the measured quantities $\mathrm{OFFERED\_LOAD}_i$, $\mathrm{DROPPED}_i$, $\mathrm{ACCEPTED}_i$, and $\mathrm{COMPLETED}_i$. We start from the observation that the TCP header [Postel 1981] and the data payload of a SYN packet do not provide any indication of which connection attempt a dropped SYN represents. As a result, the server's TCP implementation cannot distinguish a SYN packet containing a $j$th SYN retry from a SYN packet containing a $k$th SYN retry. This implies that all types of SYN packets are dropped or accepted with equal probability. The mean SYN drop rate at the server for the $i$th interval can be computed from $\mathrm{OFFERED\_LOAD}_i$ and $\mathrm{DROPPED}_i$:

$$DR_i = \mathrm{DROPPED}_i / \mathrm{OFFERED\_LOAD}_i. \tag{5}$$

A key hypothesis of Certes is that the drop rate must therefore be equal for all $R_i^j$ in the $i$th interval. This results in the following relations between $R_i^j$ and $D_i^j$:

$$\begin{aligned}
D_i^0 &= DR_i\, R_i^0 \\
D_i^1 &= DR_i\, R_i^1 \\
D_i^2 &= DR_i\, R_i^2 \\
&\ \,\vdots \\
D_i^k &= DR_i\, R_i^k.
\end{aligned} \tag{6}$$

Each individual connection that completes during the $i$th interval was accepted during the $(i - \text{SYN-to-END})$th interval. Because each connection may have a different SYN-to-END time, connections that complete during the $i$th interval may have been accepted during different intervals. Let $\mathrm{ACCEPTED}_{p,i}$ be the number of connections that were accepted during the $p$th interval and completed during the $i$th interval. Therefore,

$$\mathrm{COMPLETED}_i = \sum_{p} \mathrm{ACCEPTED}_{p,i}. \tag{7}$$

Let

$$\mathrm{ACCEPTED}_{p,i} = \sum_{j=0}^{k} A^{j}_{p,i}, \tag{8}$$

where $A^{j}_{p,i}$ is the number of SYNs that were accepted during the $p$th interval as a $j$th retry and completed during the $i$th interval. Therefore,

$$C_i^j = \sum_{p} A^{j}_{p,i}. \tag{9}$$

As mentioned above, when a server accepts a SYN and processes the connection, the server is unaware of how many failed connection attempts have been made by the client prior to this successful attempt. Therefore, there is no direct method for determining the number of retries associated with a specific connection. As such, there is no direct method for obtaining $A^{j}_{p,i}$. We estimate the value of $A^{j}_{p,i}$ from the ratio of $A_p^j$ to $\mathrm{ACCEPTED}_p$:

$$A^{j}_{p,i} = \left[\frac{A_p^j}{\mathrm{ACCEPTED}_p}\right] \mathrm{ACCEPTED}_{p,i}. \tag{10}$$

Since the SYNs that do not get dropped get accepted, Eq. (6) implies that $A_i^j$ is:

$$A_i^j = R_i^j - D_i^j = R_i^j - [DR_i\, R_i^j]. \tag{11}$$
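Equations (2), (5), (6) and (11) can be illustrated with a short Python sketch. It takes the arrivals per retry number for one interval and splits them into dropped and accepted SYNs using the single drop rate for that interval. The function name and the list-based layout are assumptions made for this example only.

```python
def split_by_drop_rate(arrivals_by_retry, dropped, accepted):
    """Apply the per-interval drop rate uniformly to every retry class.

    arrivals_by_retry[j] is R_i^j; dropped and accepted are the measured
    DROPPED_i and ACCEPTED_i. Returns (DR_i, [D_i^j], [A_i^j]).
    """
    offered_load = accepted + dropped                              # Eq. (2)
    drop_rate = dropped / offered_load if offered_load else 0.0    # Eq. (5)
    drops = [drop_rate * r for r in arrivals_by_retry]             # Eq. (6)
    accepts = [r - d for r, d in zip(arrivals_by_retry, drops)]    # Eq. (11)
    return drop_rate, drops, accepts
```

For example, with 80 initial SYNs and 20 first retries arriving in an interval where 25 SYNs were dropped and 75 accepted, the drop rate is 0.25, so 20 initial SYNs and 5 first retries are attributed to drops.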

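Fig. 4. Variance in RTT affects arrival time of retries.

Combining Eq. (10) and (11) allows us to rewrite Eq. (9) as:

$$C_i^j = \sum_{p} \left[\frac{R_p^j - [DR_p\, R_p^j]}{\mathrm{ACCEPTED}_p}\right] \mathrm{ACCEPTED}_{p,i}. \tag{12}$$

Equation (12) solves for $C_i^j$ in terms of $R_p^j$, $DR_p$ and $\mathrm{ACCEPTED}_{p,i}$. We can substitute Eq. (12) into our equation for calculating $\mathrm{CLIENT\_RT}_i$, effectively removing $C_i^j$ from Eq. (4).

We now turn our attention to solving for $R_i^j$. Drops occurring during the $i$th interval return as retries in future intervals. Based on the TCP exponential backoff mechanism, the timing of the return depends on whether it was an initial SYN, a 1st retry, a 2nd retry, etc. As a result, the number of retries arriving during the $i$th interval is a function of the number of drops that occurred in prior intervals:

$$\begin{aligned}
R_i^1 &= D^0_{i-m} \\
R_i^2 &= D^1_{i-2m} \\
R_i^3 &= D^2_{i-4m} \\
&\ \,\vdots \\
R_i^{k+1} &= D^k_{i-2^{k}m}.
\end{aligned} \tag{13}$$

Equation (13) assumes that retries arrive at the server exactly when expected based on the TCP specification (i.e., in 3 seconds, 6 seconds, etc.). Due to variance in RTT, this assumption may not hold in practice. Such a scenario is shown in Figure 4, where the network delay changes between connection attempts for a specific client. This has the effect of skewing the estimates for $R_i^j$, since retries may not always arrive at the server exactly when expected (i.e., in 3 seconds, 6 seconds, etc.). Note that it is the variance in RTT for a specific client that affects the model and not the differences in RTT between clients. For example, the server will observe the 3-second, 6-second, 12-second, etc. retry delay for each client with a consistent RTT, regardless of the magnitude of the RTT.

This effect can be accounted for by treating $R_i^j$ as a weighted distribution over the $D^{j-1}$ of past intervals instead of just using a single interval. Let $w^{j}_{p,i}$ be the portion of $D_p^j$ that will return as $R_i^{j+1}$. The following holds:

$$1 = \sum_{i} w^{j}_{p,i}. \tag{14}$$

Using these weights, we can modify Eq. (13) so that $R_i^j$ is a combination of drops occurring in a small set of prior intervals, rather than the number of drops that occurred in one specific prior interval:

$$\begin{aligned}
R_i^1 &= \cdots + [w^{0}_{i-m-1,i}\, D^0_{i-m-1}] + [w^{0}_{i-m,i}\, D^0_{i-m}] + [w^{0}_{i-m+1,i}\, D^0_{i-m+1}] + \cdots \\
R_i^2 &= \cdots + [w^{1}_{i-2m-1,i}\, D^1_{i-2m-1}] + [w^{1}_{i-2m,i}\, D^1_{i-2m}] + [w^{1}_{i-2m+1,i}\, D^1_{i-2m+1}] + \cdots \\
R_i^3 &= \cdots + [w^{2}_{i-4m-1,i}\, D^2_{i-4m-1}] + [w^{2}_{i-4m,i}\, D^2_{i-4m}] + [w^{2}_{i-4m+1,i}\, D^2_{i-4m+1}] + \cdots \\
&\ \,\vdots \\
R_i^k &= \cdots + [w^{k-1}_{i-2^{k-1}m-1,i}\, D^{k-1}_{i-2^{k-1}m-1}] + [w^{k-1}_{i-2^{k-1}m,i}\, D^{k-1}_{i-2^{k-1}m}] + [w^{k-1}_{i-2^{k-1}m+1,i}\, D^{k-1}_{i-2^{k-1}m+1}] + \cdots
\end{aligned} \tag{15}$$

Equation (6) allows us to rewrite Eq. (15) in terms of $DR_i$, $w^{j}_{p,i}$ and $R_i^j$ by substituting $DR_i R_i^j$ for $D_i^j$:

$$\begin{aligned}
R_i^1 &= \cdots + [w^{0}_{i-m-1,i}\, DR_{i-m-1} R^0_{i-m-1}] + [w^{0}_{i-m,i}\, DR_{i-m} R^0_{i-m}] + [w^{0}_{i-m+1,i}\, DR_{i-m+1} R^0_{i-m+1}] + \cdots \\
R_i^2 &= \cdots + [w^{1}_{i-2m-1,i}\, DR_{i-2m-1} R^1_{i-2m-1}] + [w^{1}_{i-2m,i}\, DR_{i-2m} R^1_{i-2m}] + [w^{1}_{i-2m+1,i}\, DR_{i-2m+1} R^1_{i-2m+1}] + \cdots \\
R_i^3 &= \cdots + [w^{2}_{i-4m-1,i}\, DR_{i-4m-1} R^2_{i-4m-1}] + [w^{2}_{i-4m,i}\, DR_{i-4m} R^2_{i-4m}] + [w^{2}_{i-4m+1,i}\, DR_{i-4m+1} R^2_{i-4m+1}] + \cdots \\
&\ \,\vdots \\
R_i^k &= \cdots + [w^{k-1}_{i-2^{k-1}m-1,i}\, DR_{i-2^{k-1}m-1} R^{k-1}_{i-2^{k-1}m-1}] + [w^{k-1}_{i-2^{k-1}m,i}\, DR_{i-2^{k-1}m} R^{k-1}_{i-2^{k-1}m}] + [w^{k-1}_{i-2^{k-1}m+1,i}\, DR_{i-2^{k-1}m+1} R^{k-1}_{i-2^{k-1}m+1}] + \cdots
\end{aligned} \tag{16}$$

By recursive substitution of the $R_i^j$ terms, we can transform these $k$ equations into terms of the unknowns $R_i^0$ and $w^{j}_{p,i}$. For $k = 2$ and $m = 3$, the result is:

$$\begin{aligned}
R_i^1 = {} & [w^{0}_{i-4,i}\, DR_{i-4} R^0_{i-4}] + [w^{0}_{i-3,i}\, DR_{i-3} R^0_{i-3}] + [w^{0}_{i-2,i}\, DR_{i-2} R^0_{i-2}] \\
R_i^2 = {} & w^{1}_{i-7,i}\, DR_{i-7} \big[ [w^{0}_{i-11,i-7}\, DR_{i-11} R^0_{i-11}] + [w^{0}_{i-10,i-7}\, DR_{i-10} R^0_{i-10}] + [w^{0}_{i-9,i-7}\, DR_{i-9} R^0_{i-9}] \big] \\
& + w^{1}_{i-6,i}\, DR_{i-6} \big[ [w^{0}_{i-10,i-6}\, DR_{i-10} R^0_{i-10}] + [w^{0}_{i-9,i-6}\, DR_{i-9} R^0_{i-9}] + [w^{0}_{i-8,i-6}\, DR_{i-8} R^0_{i-8}] \big] \\
& + w^{1}_{i-5,i}\, DR_{i-5} \big[ [w^{0}_{i-9,i-5}\, DR_{i-9} R^0_{i-9}] + [w^{0}_{i-8,i-5}\, DR_{i-8} R^0_{i-8}] + [w^{0}_{i-7,i-5}\, DR_{i-7} R^0_{i-7}] \big].
\end{aligned} \tag{17}$$

From Eq. (3), we have:

$$\mathrm{OFFERED\_LOAD}_i = R_i^0 + R_i^1 + R_i^2 \tag{18}$$

and by substituting Eq. (17) into Eq. (18), we get:

$$\begin{aligned}
\mathrm{OFFERED\_LOAD}_i = {} & R_i^0 + [w^{0}_{i-4,i}\, DR_{i-4} R^0_{i-4}] + [w^{0}_{i-3,i}\, DR_{i-3} R^0_{i-3}] + [w^{0}_{i-2,i}\, DR_{i-2} R^0_{i-2}] \\
& + w^{1}_{i-7,i}\, DR_{i-7} \big[ [w^{0}_{i-11,i-7}\, DR_{i-11} R^0_{i-11}] + [w^{0}_{i-10,i-7}\, DR_{i-10} R^0_{i-10}] + [w^{0}_{i-9,i-7}\, DR_{i-9} R^0_{i-9}] \big] \\
& + w^{1}_{i-6,i}\, DR_{i-6} \big[ [w^{0}_{i-10,i-6}\, DR_{i-10} R^0_{i-10}] + [w^{0}_{i-9,i-6}\, DR_{i-9} R^0_{i-9}] + [w^{0}_{i-8,i-6}\, DR_{i-8} R^0_{i-8}] \big] \\
& + w^{1}_{i-5,i}\, DR_{i-5} \big[ [w^{0}_{i-9,i-5}\, DR_{i-9} R^0_{i-9}] + [w^{0}_{i-8,i-5}\, DR_{i-8} R^0_{i-8}] + [w^{0}_{i-7,i-5}\, DR_{i-7} R^0_{i-7}] \big].
\end{aligned} \tag{19}$$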

Equation (19) provides one equation for each interval, in terms of $\mathrm{OFFERED\_LOAD}_i$ (which is measured), $DR_i$ (which is measured), $R_i^0$ (which is unknown) and $w^{j}_{p,i}$ (which is unknown). Once solutions for $R_i^0$ are found, they can be used to calculate $R_i^j$, $\forall i, j$. Additionally, the presence of $w^{j}_{p,i}$ introduces nonlinearity. Each interval contains seven unknowns: $R_i^0$, $w^{0}_{i,i+2}$, $w^{0}_{i,i+3}$, $w^{0}_{i,i+4}$, $w^{1}_{i,i+5}$, $w^{1}_{i,i+6}$, and $w^{1}_{i,i+7}$. From Eq. (14), we have the following equations for each interval $i$:

$$\begin{aligned}
1 &= w^{0}_{i,i+2} + w^{0}_{i,i+3} + w^{0}_{i,i+4} \\
1 &= w^{1}_{i,i+5} + w^{1}_{i,i+6} + w^{1}_{i,i+7}.
\end{aligned} \tag{20}$$

All values in Eq. (19) must be nonnegative, and hence we have the constraints:

$$R_i^0 \geq 0, \quad w^{j}_{p,i} \geq 0, \quad \forall i, j, p. \tag{21}$$

Of course, if the values for $w^{j}_{p,i}$ were somehow magically known, then Eq. (19) could be solved directly since it reduces to a linear system of $N$ equations in $N$ unknowns. In practice, however, $w^{j}_{p,i}$ are unknown and need to be estimated. We describe one approach to a solution whose general steps are as follows:

(1) Determine an initial estimate for all $w^{j}_{p,i}$ over a window of prior intervals. Errors in the estimates for $w^{j}_{p,i}$ are directly related to the errors in $R_i^0$. As such, determining the bounds for this error is a known solved problem: bounding the error in solving a system of linear equations whose coefficients may contain experimental error [Golub and Loan 1996].
(2) Solve Eq. (19) using these $w^{j}_{p,i}$ estimated values.
(3) If there is no solution in Step (2) (i.e., Eq. (21) is not satisfied) or there is a positive change in the optimization objective, then change the values for $w^{j}_{p,i}$ and iterate. Let $I$ be the initial vector of $w^{j}_{p,i}$ estimated values. The objective of the optimization may be to minimize $\lVert I - S \rVert$, where $S$ is the final solution vector of weights. In other words, assuming that the initial best estimate is based on prior fact, the solution vector ought not to deviate significantly from it.

Step (1). One approach for determining $I$ to account for the impact of variance in RTT shown in Figure 4 would be to base $I$ on average historical measures of the changes in RTT over time. Let $\chi_k$ be the probability density function of the change in RTT over a period of length $3[2^{k}]m$. Given that the arrivals of $R_i^0$ are uniformly distributed over the $i$th interval (defined by the probability density function $t_i$), then

$$\begin{aligned}
E[w^{0}_{i,i+2}] &= \int_{i+1}^{i+2} f_{\chi_0}(x)\,dx \\
E[w^{0}_{i,i+3}] &= \int_{i+2}^{i+3} f_{\chi_0}(x)\,dx \\
E[w^{0}_{i,i+4}] &= \int_{i+3}^{i+4} f_{\chi_0}(x)\,dx \\
E[w^{1}_{i,i+5}] &= \int_{i+4}^{i+5} f_{\chi_1}(x)\,dx \\
E[w^{1}_{i,i+6}] &= \int_{i+5}^{i+6} f_{\chi_1}(x)\,dx \\
E[w^{1}_{i,i+7}] &= \int_{i+6}^{i+7} f_{\chi_1}(x)\,dx,
\end{aligned} \tag{22}$$

where $f_{\chi_k}(t)$ is the convolution of $t_i$ and $\chi_k$:

$$\begin{aligned}
f_{\chi_0}(t) &= 3 + \int_{-\infty}^{\infty} \chi_0(x)\, t_i(t - x)\,dx \\
f_{\chi_1}(t) &= 9 + \int_{-\infty}^{\infty} \chi_1(x)\, t_i(t - x)\,dx.
\end{aligned} \tag{23}$$

In other words, $E[w^{0}_{i,i+2}]$ is the mean portion of $R_i^0$ that is expected to return during the $(i + 2)$nd interval as $R^1_{i+2}$. Note that, in Eq. (22), the $E[w^{j}_{p,i}]$ terms are independent of $p$. We now set $I$ to $E[w^{j}_{p,i}]$, in effect replacing $w^{j}_{p,i}$ in Eq. (19) with its historical mean, $E[w^{j}_{p,i}]$. By replacing the variables $w^{j}_{p,i}$ by their means, the error can be quantified using Chernoff's Bound [Papoulis and Pillai 2001].

Step (2). Substituting the current estimated values of $w^{0}_{p,i}$ and $w^{1}_{p,i}$ into Eq. (19) translates the problem into a linear system of $N$ equations in $N$ unknowns, for $N$ intervals (i.e., since $w^{0}_{p,i}$ and $w^{1}_{p,i}$ are now constants, the only unknowns left are $R_i^0$). During system initialization, note that all SYNs arriving, accepted or dropped during the first interval are initial SYNs. Likewise, $R_i^j = 0$ for $1 \leq j \leq k$, $1 \leq i \leq 3$ (no 1st, 2nd, 3rd, ..., $k$th retries can occur in the first three intervals) and $R_i^j = 0$ for $2 \leq j \leq k$, $4 \leq i \leq 9$ (no 2nd, 3rd, ..., $k$th retries can occur from the 4th through the 9th intervals). In general,

$$\begin{aligned}
R_i^j &= 0 \quad \text{for } i \leq 3(2^{z} - 1),\ j \geq z,\ 1 \leq z \leq k, \\
R_i^j &= 0 \quad \text{for } i \leq 0,\ \forall j.
\end{aligned} \tag{24}$$
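The boundary conditions of Eq. (24) can be checked with a small Python sketch. It assumes one-second intervals numbered from 1, so that the 3-second initial timeout spans three intervals. The function names are hypothetical.

```python
def retry_can_arrive(i, j, initial_timeout=3):
    """True if a jth retry can arrive in interval i (1-based), per Eq. (24).

    A jth retry needs 3(2^j - 1) seconds of backoff after the initial SYN,
    so it cannot appear in any interval i <= 3(2^j - 1).
    """
    if i <= 0:
        return False
    return i > initial_timeout * (2 ** j - 1)


def possible_retries(i, k, initial_timeout=3):
    """Retry numbers j in 0..k for which R_i^j is not forced to zero."""
    return [j for j in range(k + 1) if retry_can_arrive(i, j, initial_timeout)]
```

For k = 2 this gives only initial SYNs in intervals 1 to 3, initial SYNs and first retries in intervals 4 to 9, and all three classes from interval 10 onward.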

For the initial $N$ intervals, there are only $N$ unknowns:

$$\begin{aligned}
\mathrm{OFFERED\_LOAD}_1 &= R_1^0 \\
\mathrm{OFFERED\_LOAD}_2 &= R_2^0 \\
\mathrm{OFFERED\_LOAD}_3 &= R_3^0 \\
\mathrm{OFFERED\_LOAD}_4 &= R_1^0 [w^{0}_{1,4}\, DR_1] + R_2^0 [w^{0}_{2,4}\, DR_2] + R_4^0 \\
\mathrm{OFFERED\_LOAD}_5 &= R_1^0 [w^{0}_{1,5}\, DR_1] + R_2^0 [w^{0}_{2,5}\, DR_2] + R_3^0 [w^{0}_{3,5}\, DR_3] + R_5^0 \\
&\ \,\vdots
\end{aligned} \tag{25}$$

Step (3). If Step (2) does not produce a satisfactory solution, an adjustment is made to the values of $w^{0}_{p,i}$ and $w^{1}_{p,i}$. There are several ways to perform this adjustment. One method is based on the partial derivatives of $R_i^0$ with respect to $w^{0}_{p,i}$ and $w^{1}_{p,i}$, as defined by the gradient matrix:

$$G = \left[ \frac{\partial R_i^0}{\partial w^{j}_{p,q}} \right], \tag{26}$$

with one row for each weight $w^{j}_{p,q}$ and one column for each interval $i$. The number of columns in $G$ is equal to the number of intervals in the sliding window and the number of rows in $G$ is equal to the total number of $w^{j}_{p,i}$ in the sliding window. Using $G$ we can formulate a linear program to determine $w^{j}_{p,i}$ for the next iteration:

$$G^{T}\, \Delta w = \Delta R, \qquad \Delta w = \begin{bmatrix} \vdots \\ \Delta w^{j}_{p,q} \\ \vdots \end{bmatrix}, \qquad \Delta R = \begin{bmatrix} \Delta R_1^0 \\ \Delta R_2^0 \\ \Delta R_3^0 \\ \Delta R_4^0 \\ \Delta R_5^0 \\ \vdots \end{bmatrix}. \tag{27}$$

The column vector $\Delta w$ is the amount of (unknown) change to apply to the $w^{j}_{p,i}$ for the next iteration. The column vector $\Delta R$ is the amount of change we would like to witness for each $R_i^0$ by applying the new values for $w^{j}_{p,i}$. In this case,

$$\Delta R_i^0 = \begin{cases} -R_i^0 & \text{if } R_i^0 < 0 \\ 0 & \text{otherwise.} \end{cases} \tag{28}$$

Essentially, Eq. (27) uses the gradient matrix $G^{T}$ to determine how much each weight ought to be changed in order to achieve a viable solution. Equation (27) can be solved using a linear least squares method [Press et al. 1992] to obtain a best fit solution for the $\Delta w$.

Final Step. Once Step (2) produces a satisfactory solution for $R_i^0$ and $w^{j}_{p,i}$, these values can be plugged into Eq. (16) to obtain the values for $R_i^j$. The values for $R_i^j$ can then be used in Eq. (12) to determine $C_i^j$. Having determined the values for $R_i^j$ and $C_i^j$ for the $i$th interval, we use these values in Eq. (4) to obtain the mean client response time.

3.2 Fast Online Approximation of The Certes Model

Section 3.1 describes a computationally expensive algorithm: solving a system of nonlinear equations. We now present a fast, online implementation of Certes that produces near optimal results based on a noniterative approach. We simplify the mathematical approach in two ways:

(1) We assume that all transactions that complete during the $i$th interval have roughly the same SYN-to-END time. If variance in SYN-to-END time leads to an inconsistency in the model, we make an online adjustment similar to Eq. (12) but based on the mean SYN-to-END time for a given interval. For the remainder of the article, when referring to SYN-to-END time, we imply the mean SYN-to-END time for a given interval.
(2) We compute an initial estimate of weights, $I$, by assuming RTT has no variance. If this assumption leads to an inconsistency in the model, we make simple online adjustments to $w^{j}_{p,i}$ in the current and future time intervals.

What follows is a step-by-step example exposing this approach.

Step (1). An alternative to the approach given in the prior section for determining $I$ is to begin with the assumption that the RTT has no variance.
Given an assumption of zero variance in the RTT, the initial values for $I$ become:

$$\begin{aligned}
0 &= w^{0}_{i-m-1,i} = w^{1}_{i-2m-1,i} = w^{2}_{i-4m-1,i} = \cdots = w^{k-1}_{i-2^{k-1}m-1,i} \\
1 &= w^{0}_{i-m,i} = w^{1}_{i-2m,i} = w^{2}_{i-4m,i} = \cdots = w^{k-1}_{i-2^{k-1}m,i} \\
0 &= w^{0}_{i-m+1,i} = w^{1}_{i-2m+1,i} = w^{2}_{i-4m+1,i} = \cdots = w^{k-1}_{i-2^{k-1}m+1,i}.
\end{aligned} \tag{29}$$

If, by using this assumption, a solution cannot be found, we add in or adjust for RTT variance by increasing or decreasing the values for $w^{j}_{p,i}$ using simple online heuristics in Step (3). These adjustments serve as an alternative to iterating over Eq. (19) to determine optimal values for $w^{j}_{p,i}$.
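To make the role of the weights concrete, the Python sketch below evaluates the weighted sum of Eq. (15) over the three intervals around the expected return time, with the zero-variance weights of Eq. (29) as the default. Under those default weights the sum collapses to the single-interval form of Eq. (13). The function names, the 0-based interval indices and the callable weight interface are assumptions made for this illustration.

```python
def zero_variance_weight(j, p, i, m=3):
    """Weight of Eq. (29): every drop of a jth attempt in interval p
    returns exactly 2^j * m intervals later."""
    return 1.0 if p == i - (2 ** j) * m else 0.0


def retries_arriving(i, j, drops, weight=zero_variance_weight, m=3):
    """Estimate of R_i^{j+1} as the weighted sum of Eq. (15).

    drops[p][j] holds D_p^j for 0-based interval index p. Only the three
    intervals around i - 2^j * m are considered; intervals outside the
    recorded history contribute nothing.
    """
    centre = i - (2 ** j) * m
    total = 0.0
    for p in (centre - 1, centre, centre + 1):
        if 0 <= p < len(drops):
            total += weight(j, p, i, m) * drops[p][j]
    return total
```

A different `weight` callable, for example one that spreads a quarter of the drops to each neighbouring interval, can be passed in to model RTT variance without changing the summation.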

Fig. 5. Initial connection attempts that get dropped become retries three seconds later.

Step (2). The following demonstrates how to efficiently solve Eq. (19) via online direct substitution over a sliding window of intervals. Assume that the server is booted at time $t_0$ (or there is a period of inactivity prior to $t_0$), as shown in Figure 5. Certes assumes that all SYNs arriving during the first interval $[t_0, t_1]$ are initial SYNs. During the first interval $[t_0, t_1]$, the server measures $\mathrm{ACCEPTED}_1$ and $\mathrm{DROPPED}_1$ and can use those measurements to determine $A_1^0 = \mathrm{ACCEPTED}_1$, $D_1^0 = \mathrm{DROPPED}_1$, and $R_1^0 = \mathrm{OFFERED\_LOAD}_1$. Appendix 6 shows the results when Certes is applied when SYNs in the first interval are not all initial SYNs. The dropped SYNs, $D_1^0$, will return to the server as 1st retries three seconds later as $R_4^1$ during interval $[t_3, t_4]$.

Moving ahead in time to interval $[t_3, t_4]$, as shown in Figure 6, the server measures $\mathrm{ACCEPTED}_4$ and $\mathrm{DROPPED}_4$ and calculates the SYN drop rate for the 4th interval, $DR_4$, using Eq. (5). The web server cannot distinguish between an initial SYN and a 1st retry; therefore, the drop rate applies to both $R_4^0$ and $R_4^1$ equally, giving $D_4^1 = DR_4 R_4^1$, and then $A_4^1 = R_4^1 - D_4^1$. From Eq. (3), $A_4^0 = \mathrm{ACCEPTED}_4 - A_4^1$ and $D_4^0 = \mathrm{DROPPED}_4 - D_4^1$. Finally, the number of initial SYNs arriving during the 4th interval is $R_4^0 = A_4^0 + D_4^0$. We have determined the values for all terms in Figure 6. Note that the $D_4^1$ dropped SYNs will return to the server as 2nd retries six seconds later during interval $[t_9, t_{10}]$, as $R_{10}^2$, when those clients experience their second TCP timeout, and that the $D_4^0$ dropped SYNs will return to the server as 1st retries, as $R_7^1$, three seconds later during interval $[t_6, t_7]$.

By continuing in this manner it is possible to recursively compute all values of $R_i^j$, $A_i^j$ and $D_i^j$ for all intervals, for a given $k$. Figure 7 depicts the 10th interval, including those intervals that directly contribute to the values in the 10th interval. Clients that give up after $k$ connection attempts are depicted as ending the transaction.
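The direct-substitution procedure of Step (2) can be sketched in Python as follows. This is a minimal illustration under the zero-variance assumption of Eq. (29), not the Certes implementation itself. The function name, the list-of-lists layout and the 0-based indices (index 0 is the first interval) are assumptions. The sketch makes no attempt to repair negative values, which the text handles separately in Step (3).

```python
def online_certes(accepted, dropped, k=2, m=3):
    """Recursively derive R, A and D per retry class from aggregate counters.

    accepted[i] and dropped[i] are the measured per-interval counters.
    Returns (R, A, D, gave_up), where R[i][j], A[i][j], D[i][j] cover
    j = 0..k and gave_up[i] corresponds to R_i^{k+1}.
    """
    n = len(accepted)
    R = [[0.0] * (k + 1) for _ in range(n)]
    A = [[0.0] * (k + 1) for _ in range(n)]
    D = [[0.0] * (k + 1) for _ in range(n)]
    gave_up = [0.0] * n
    for i in range(n):
        offered = accepted[i] + dropped[i]
        drop_rate = dropped[i] / offered if offered else 0.0   # Eq. (5)
        # Retries arriving now are drops from earlier intervals, Eq. (13).
        for j in range(1, k + 1):
            p = i - (2 ** (j - 1)) * m
            if p >= 0:
                R[i][j] = D[p][j - 1]
        p = i - (2 ** k) * m
        if p >= 0:
            gave_up[i] = D[p][k]
        # The same drop rate applies to every retry class, Eqs. (6) and (11).
        for j in range(1, k + 1):
            D[i][j] = drop_rate * R[i][j]
            A[i][j] = R[i][j] - D[i][j]
        # Whatever is left over is attributed to initial SYNs, Eq. (3).
        A[i][0] = accepted[i] - sum(A[i][1:])
        D[i][0] = dropped[i] - sum(D[i][1:])
        R[i][0] = A[i][0] + D[i][0]
    return R, A, D, gave_up
```

Each interval costs a fixed number of operations for a given k, because it only reads values from intervals that are already closed.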

Fig. 6. A second attempt at connection, that gets dropped, becomes a retry six seconds later.

Fig. 7. After three connection attempts the client gives up.

Figure 8 shows the final model defining the relationships between the incoming, accepted, dropped and completed connections during the $i$th interval. Connections accepted during the $i$th interval complete during the $(i + \text{SYN-to-END})$th interval. The client frustration timeout (FTO) is specified in seconds and the term $R^{j}_{i+[\mathrm{FTO}-2^{k-1}m]}$ indicates that clients who do not get accepted during the $i$th interval on the $k$th retry will cancel their attempt for service during the $[i + \mathrm{FTO} - 2^{k-1}m]$th interval. The model in Figure 8 can be implemented in a web server by using a simple data structure with a sliding window. Note that during each time interval, only the aggregate counters for $\mathrm{DROPPED}_i$, $\mathrm{ACCEPTED}_i$, and $\mathrm{COMPLETED}_i$ are incremented. At the end of each time interval, the more detailed counters for $R_i^j$, $A_i^j$, $D_i^j$, $C_i^j$ are computed using a fixed number of computations.
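One possible shape for such a sliding-window structure is sketched below in Python. The class and method names are hypothetical, and the window length of $2^k m + 1$ closed intervals is an assumption chosen so that the oldest drops that can still return as retries or give-ups remain available.

```python
from collections import deque


class CounterWindow:
    """Aggregate per-interval counters kept in a bounded sliding window."""

    def __init__(self, k=2, m=3):
        # Enough closed intervals to look back 2^k * m intervals.
        self.window = deque(maxlen=(2 ** k) * m + 1)
        self._reset()

    def _reset(self):
        self.accepted = 0
        self.dropped = 0
        self.completed = 0

    def on_syn(self, was_accepted):
        """Called once per arriving SYN during the current interval."""
        if was_accepted:
            self.accepted += 1
        else:
            self.dropped += 1

    def on_complete(self):
        """Called once per connection that completes in the current interval."""
        self.completed += 1

    def close_interval(self):
        """Store the finished interval and start counting a new one."""
        self.window.append((self.accepted, self.dropped, self.completed))
        self._reset()

    def lookback(self, intervals_ago):
        """Counters of an earlier closed interval (0 is the most recent), or None."""
        if 0 <= intervals_ago < len(self.window):
            return self.window[-1 - intervals_ago]
        return None
```

Only three integer increments happen on the packet path. The per-retry counters would be derived from the stored tuples when `close_interval` is called.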

Fig. 8. Relationship between incoming, accepted, dropped, completed requests.

Fig. 9. The smaller the interval, the more difficult to accurately discretize events.

Step (3). As mentioned in Section 3.1, due to inconsistencies in network delays the 1st retry from a client may not arrive at the server exactly three seconds later; rather, it may arrive in the interval prior to or after the interval in which it was expected to arrive. Likewise, since the measurement for SYN-to-END is not constant, there will be instances where $C^{j}_{i+\text{SYN-to-END}} \neq A_i^j$; in other words, some of the $j$ retries accepted in the $i$th interval may complete prior to or after the $(i + \text{SYN-to-END})$th interval. These occurrences relate to an interesting aspect of the choice for interval length. In general, when sampling techniques are used, the smaller the sampling period (the more frequent the sampling), the more accurate the result. Certes is not a sampling based approach, yet one might intuit that using shorter intervals would somehow provide for better results; just the opposite is true. As shown in Figure 9, as the size of the interval is reduced below a certain point, the probability that events happen when expected reduces as well. For example, the probability that a dropped initial SYN will arrive back at the server during