Implementation and Evaluation of Transparent Fault-Tolerant Web Service with Kernel-Level Support




Proceedings of the IEEE International Conference on Computer Communications and Networks, Miami, Florida, pp. 63-68, October 2002.

Navid Aghdaie and Yuval Tamir
Concurrent Systems Laboratory
UCLA Computer Science Department
Los Angeles, California 90095
{navid,tamir}@cs.ucla.edu

Abstract

Most of the techniques used for increasing the availability of web services do not provide fault tolerance for requests being processed at the time of server failure. Other schemes require deterministic servers or changes to the web client. These limitations are unacceptable for many current and future applications of the Web. We have developed an efficient implementation of a client-transparent mechanism for providing fault-tolerant web service that does not have the limitations mentioned above. The scheme is based on a hot standby backup server that maintains logs of requests and replies. The implementation includes modifications to the Linux kernel and to the Apache web server, using their respective module mechanisms. We describe the implementation and present an evaluation of the impact of the backup scheme in terms of throughput, latency, and CPU processing cycles overhead.

I. INTRODUCTION

Web servers are increasingly used for critical applications where outages or erroneous operation are unacceptable. In most cases critical services are provided using a three-tier architecture, consisting of: client web browsers, one or more replicated front-end servers (e.g. Apache), and one or more back-end servers (e.g. a database). HTTP over TCP/IP is the predominant protocol used for communication between clients and the web server. The front-end web server is the mediator between the clients and the back-end server.

Fault tolerance techniques are often used to increase the reliability and availability of Internet services. Web servers are often stateless: they do not maintain state information from one client request to the next. Hence, most existing web server fault tolerance schemes simply detect failures and route future requests to backup servers.
Examples of such fault tolerance techniques include the use of specialized routers and load balancers [4, 5, 12, 14] and data replication [6, 28]. These methods are unable to recover in-progress requests since, while the web server is stateless between transactions, it does maintain important state from the arrival of the first packet of a request to the transmission of the last packet of the reply. With the schemes mentioned above, the client never receives complete replies to the in-progress requests and has no way to determine whether or not a requested operation has been performed [1, 15, 16] (see Figure 1).

Figure 1: If the web server fails before sending the client reply (step 4), the client cannot determine whether the failure was before or after the web server communication with the back-end (steps 2, 3). [Diagram: Client, Web Server, Back-end; message steps 1 to 4.]

Some recent work does address the need for handling in-progress transactions. Client-aware solutions such as [16, 23, 26] require modifications to the clients to achieve their goals. Since many versions of the client software, the web browser, are widely distributed and they are typically developed independently of the web service, it is critical that any fault tolerance scheme used be transparent to the client. Schemes for transparent server replication [3, 7, 18, 25] sometimes require deterministic servers for reply generation or do not recover requests whose processing was in progress at the time of failure. We discuss some of these solutions in more detail in Sections II and V.

We have previously developed a scheme for client-transparent fault-tolerant web service that overcomes the disadvantages of existing schemes [1]. The scheme is based on logging of HTTP requests and replies to a hot standby backup server. Our original implementation was based on user-level proxies, required non-standard features of the Solaris raw socket interface, and was never integrated with a real web server. That implementation did not require any kernel modifications but incurred high processing overhead. The contribution of this paper is a more efficient implementation of the scheme on Linux based on kernel modifications and its integration with the Apache web server using Apache's module mechanism.
The small modifications to the kernel are used to provide client-transparent multicast of requests to a primary server and a backup server as well as the ability to continue transmission of a reply to the client despite server failure. Our implementation is based on off-the-shelf hardware (PC, router) and software (Linux, Apache). We rely on the standard reliability features of TCP and do not make any changes to the protocol or its implementation.

In Section II we present the architecture of our scheme and key design choices. Section III discusses our implementation based on kernel and web server modules. A detailed analysis of the performance results including throughput, latency, and consumed processing cycles is presented in Section IV. Related work is discussed in Section V.

II. TRANSPARENT FAULT-TOLERANT WEB SERVICE

In order to provide client-transparent fault-tolerant web service, a fault-free client must receive a valid reply for every request that is viewed by the client as having been delivered. Both the request and the reply may consist of multiple TCP packets. Once a request TCP packet has been acknowledged to the client, it must not be lost. All reply TCP packets sent to the client must form consistent, correct replies to prior requests.

We assume that only a single server host at a time may fail. We further assume that hosts are fail-stop [24]. Hence, host failure is detected using standard techniques, such as periodic heartbeats. Techniques for dealing with failure modes other than fail-stop are important but are beyond the scope of this paper. We also assume that the local area network connecting the two servers as well as the Internet connection between the client and the server LAN will not suffer any permanent faults. The primary and backup hosts are connected on the same IP subnet. In practice, the reliability of the network connection to that subnet can be enhanced using multiple routers running protocols such as the Virtual Router Redundancy Protocol [19]. This can prevent the local LAN router from being a critical single point of failure.

In order to achieve the fault tolerance goals, active replication of the servers may be used, where every client request is processed by both servers. While this approach will have the best fail-over time, it suffers from several drawbacks. First, this approach has a high cost in terms of processing power, as every client request is effectively processed twice. A second drawback is that this approach only works for deterministic servers. If the servers generate replies non-deterministically, the backup may not have an identical copy of a reply and thus it cannot always continue the transmission of a reply should the primary fail in the midst of sending a reply.

An alternative approach is based on logging. Specifically, request packets are acknowledged only after they are stored redundantly (logged) so that they can be obtained even after a failure of a server host [1, 3].
Since the server may be non-deterministic, none of the packets of a reply can be sent to the client unless the entire reply is safely stored (logged) so that its transmission can proceed despite a failure of a server host [1]. The logging of requests can be done at the level of TCP packets [3] or at the level of HTTP requests [1]. If request logging is done at the level of HTTP requests, the requests can be matched with logged replies so that a request will never be reprocessed following failure if the reply has already been logged [1]. This is critical in order to ensure that for each request only one reply will reach the client. If request logging is done strictly at the level of TCP packets [3], it is possible for a request to be replayed to a spare server following failure despite the fact that a reply has already been sent to the client. Since the spare server may generate a different reply, two different replies for the same request may reach the client, clearly violating the requirement for transparent fault tolerance.

We have previously proposed [1] implementing transparent fault-tolerant web service using a hot standby backup server that logs HTTP requests and replies but does not actually process requests unless the primary server fails. The error control mechanisms of TCP are used to provide reliable multicast of client requests to the primary and backup. All client request packets are logged at the backup before arriving at the primary and the primary reliably forwards a copy of the reply to the backup before sending it to the client. Upon failure of the primary, the backup seamlessly takes over receiving partially received requests and transmitting logged replies. The backup processes logged requests for which no reply has been logged and any new requests.

Since our scheme is client-transparent, clients communicate with a single server address (the advertised address) and are unaware of server replication [1]. The backup server receives all the packets sent to the advertised address and forwards a copy to the primary server. For client transparency, the source addresses of all packets received by the client must be the advertised address.
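
The takeover behavior described in this section can be illustrated with a small Python sketch. It is not the authors' code: the `ConnectionLog` record, the `Action` names, and the `takeover_action` function are hypothetical names introduced here, and the sketch assumes that the backup tracks, per client connection, whether a complete HTTP request has been logged and whether a reply from the primary has been logged.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP_RECEIVING = "keep receiving the partially received request"
    PROCESS_REQUEST = "process the logged request on the backup"
    SEND_LOGGED_REPLY = "transmit the logged reply to the client"


@dataclass
class ConnectionLog:
    """Hypothetical per-connection state kept by the backup (an assumption)."""
    request_complete: bool = False        # a full HTTP request has been logged
    logged_reply: Optional[bytes] = None  # reply forwarded by the primary, if any


def takeover_action(log: ConnectionLog) -> Action:
    """Decide what the backup does for one connection after the primary fails."""
    if log.logged_reply is not None:
        # A reply was logged: never reprocess the request, just send the reply.
        return Action.SEND_LOGGED_REPLY
    if log.request_complete:
        # The request was logged but no reply exists yet: process it now.
        return Action.PROCESS_REQUEST
    # Only part of the request has arrived: continue receiving it.
    return Action.KEEP_RECEIVING
```

Checking for a logged reply first mirrors the requirement that a request is never reprocessed once its reply has been logged, so that at most one reply per request can reach the client.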
Hence, when the primary sends packets to the clients, it spoofs the source address, using the service's advertised address instead of its own as the source address. The primary logs replies by sending them to the backup over a reliable (TCP) connection and waiting for an acknowledgment before sending them to the client.

This paper uses the same basic scheme but the focus here is on the design and evaluation of a more efficient implementation based on kernel modifications.

III. IMPLEMENTATION

There are many different ways to implement the scheme described in Section II. As mentioned earlier, we have previously done this based on user-level proxies, without any kernel modifications [1]. A proxy-based implementation is simpler and potentially more portable than an implementation that requires kernel modification but it incurs higher performance overhead (Section IV). It is also possible to implement the scheme entirely in the kernel in order to minimize the overhead [22]. However, it is generally desirable to minimize the complexity of the kernel [8, 17]. Furthermore, the more modular approach described in this paper makes it easier to port the implementation to other kernels or other web servers.

Our current implementation consists of a combination of kernel modifications and modifications to the user-level web server (Figure 2). TCP/IP packet operations are performed in the kernel and the HTTP message operations are performed in the web servers.

Figure 2: Implementation: replication using a combination of kernel and web server modules. Message paths are shown. [Diagram: client, primary host, and backup host; each host has a kernel, kernel module, server, and server module; paths for incoming messages, outgoing messages, and the HTTP reply.]

We have not implemented the back-end portion of the three-tier structure. This can be done as a mirror image of the front-end communication [1]. Furthermore, since the transparency of the fault tolerance scheme is not critical between the web server and back-end servers, simpler and less costly schemes are possible for this section. For example, the front-end servers may include a transaction ID with each request to the back-end. If a request is retransmitted, it will include the transaction ID and the back-end can use that to avoid performing the transaction multiple times [20].

A. The Kernel Module

The kernel module implements the client-transparent atomic multicast mechanism between the client and the primary/backup server pair. In addition, it facilitates the transmission of outgoing messages from the server pair to the client such that the backup can continue the transmission seamlessly if the primary fails.

The public address of the service known to clients is mapped to the backup server, so the backup will receive the client packets. After an incoming packet goes through the standard kernel operations such as checksum checking, and just before the TCP state change operations are performed, the backup's kernel module forwards a copy of the packet to the primary. The backup's kernel then continues the standard processing of the packet, as does the primary's kernel with the forwarded packet.

Outgoing packets to the client are sent by the primary. Such packets must be presented to the client with the service public address as the source address. Hence, the primary's kernel module changes the source address of outgoing packets to the public address of the service. On the backup, the kernel processes the outgoing packet and updates the kernel's TCP state, but the kernel module intercepts and drops the packet when it reaches the device queue. TCP acknowledgments for outgoing packets are, of course, incoming packets and they are multicast to the primary and backup as above.

The key to our multicast implementation is that when the primary receives a packet, it is assured that the backup has an identical copy of the packet. The backup forwards a packet only after the packet passes through the kernel code where a packet may be dropped due to a detected error (e.g., checksum) or heavy load. If a forwarded packet is lost while en route to the primary, the client does not receive an acknowledgment and thus retransmits the packet.
This is because only the primary's TCP acknowledgments reach the client. TCP acknowledgments generated by the backup are dropped by the backup's kernel module.

B. The Server Module

The server module is used to handle the parts of the scheme that deal with messages at the HTTP level. The Apache module acts as a handler [27] and generates the replies that are sent to the clients. It is composed of worker, mux, and demux processes.

Figure 3: Server Structure: The mux/demux processes are used to reliably transmit a copy of the replies to the backup before they are sent to clients. The server module implements these processes and the necessary changes to the standard worker processes. [Diagram: on the primary, worker processes, a mux process, and a demux process; on the backup, worker processes and a demux process; reply and ack messages flow between the hosts, and each host connects to its kernel.]

1) Worker Processes: A standard Apache web server consists of several processes handling client requests. We refer to these standard processes as worker processes. In addition to the standard handling of requests, in our scheme the worker processes also communicate with the mux/demux processes described in the next subsection.

The primary worker processes receive the client request, perform parsing and other standard operations, and then generate the reply. Other than a few new bookkeeping operations, these operations are exactly what is done in a standard web server. After generating the reply, instead of sending the reply directly to the client, the primary worker processes pass the generated reply to the primary mux process so that it can be sent to the backup. The primary worker process then waits for an indication from the primary demux process that an acknowledgment has been received from the backup, signaling that it can now send the reply to the client.

The backup worker processes perform the standard operations for receiving a request, but do not generate the reply. Upon receiving a request and performing the standard operations, the worker process just waits for a reply from the backup demux process. This is the reply that is produced by a primary worker process for the same client request.
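
The ordering constraint on the primary worker (log the reply at the backup first, send to the client only after the acknowledgment) can be sketched in Python as follows. This is an illustration under stated assumptions, not the Apache module itself: it uses threads and `threading.Event` in place of the separate processes and their signaling, and the `ReplyGate` class, its method names, and the timeout parameter are hypothetical.

```python
import threading
from typing import Callable, Dict, Tuple

ConnId = Tuple[str, int]  # (client IP address, client TCP port)


class ReplyGate:
    """Blocks a primary worker until the backup acknowledges its reply."""

    def __init__(self, send_to_backup: Callable[[ConnId, bytes], None]) -> None:
        self._send_to_backup = send_to_backup  # stands in for the mux process
        self._pending: Dict[ConnId, threading.Event] = {}
        self._lock = threading.Lock()

    def log_reply(self, conn: ConnId, reply: bytes, timeout: float = 5.0) -> bool:
        """Forward the reply to the backup, then wait for its acknowledgment.

        Returns True if the ack arrived (the reply may now go to the client),
        or False if the wait timed out.
        """
        acked = threading.Event()
        with self._lock:
            self._pending[conn] = acked  # register before sending to avoid a race
        self._send_to_backup(conn, reply)
        arrived = acked.wait(timeout)
        with self._lock:
            self._pending.pop(conn, None)
        return arrived

    def on_ack(self, conn: ConnId) -> None:
        """Called when the backup's ack arrives (stands in for the demux process)."""
        with self._lock:
            acked = self._pending.get(conn)
        if acked is not None:
            acked.set()
```

A worker would call `log_reply` after generating a reply and transmit to the client only when it returns `True`. Registering the pending entry before forwarding the reply means that an acknowledgment arriving immediately is not missed.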
2) Mux/Demux Processes: The mux/demux processes ensure that a copy of the reply generated by the primary is sent to and received by the backup before the transmission of the reply to the client starts. This allows for the backup to seamlessly take over for the primary in the event of a failure, even if the replies are generated non-deterministically. The mux/demux processes communicate with each other over a TCP connection, and use semaphores and shared memory to communicate with worker processes on the same host (Figure 3). A connection identifier (the client's IP address and TCP port number) is sent along with the replies and acknowledgments so that the demux process on the remote host can identify the worker process with the matching request.
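
To make the role of the connection identifier concrete, the following Python sketch frames a reply or acknowledgment together with the client's IP address and TCP port. The wire format is an assumption made for illustration: the text only states that the identifier accompanies replies and acknowledgments, so the header layout, the `MSG_REPLY`/`MSG_ACK` constants, and the `encode`/`decode` helpers are hypothetical, and IPv4 addresses are assumed.

```python
import socket
import struct
from typing import Tuple

# Hypothetical header, network byte order: message type (1 byte),
# client IPv4 address (4 bytes), client TCP port (2 bytes),
# payload length (4 bytes).
_HEADER = struct.Struct("!B4sHI")
MSG_REPLY, MSG_ACK = 1, 2


def encode(msg_type: int, client_ip: str, client_port: int,
           payload: bytes = b"") -> bytes:
    """Build one frame for the TCP connection between mux and demux."""
    header = _HEADER.pack(msg_type, socket.inet_aton(client_ip),
                          client_port, len(payload))
    return header + payload


def decode(frame: bytes) -> Tuple[int, Tuple[str, int], bytes]:
    """Return (message type, connection identifier, payload) for one frame."""
    msg_type, raw_ip, port, length = _HEADER.unpack_from(frame)
    payload = frame[_HEADER.size:_HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return msg_type, (socket.inet_ntoa(raw_ip), port), payload
```

The receiving demux process could use the decoded `(IP address, port)` pair as a lookup key to find the worker that holds the matching request.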

IV. PERFORMANCE EVALUATION

The evaluation of the scheme was done on 350 MHz Intel Pentium II PCs interconnected by a 100 Mb/sec switched network based on a Cisco 6509 switch. The servers were running our modified Linux 2.4.2 kernel and the Apache 1.3.23 web server with logging turned on and with our kernel and server modules installed. We used custom clients similar to those of the Wisconsin Proxy Benchmark [2] for our measurements. The clients continuously generate one outstanding HTTP request at a time with no think time. For each experiment, the requests were for files of a specific size as presented in our results. Internet traffic studies [13, 10] indicate that most web replies are less than 10-15 kbytes in size.

Measurements were conducted on at least three system configurations: unreplicated, simplex, and duplex. The unreplicated system is the standard system with no kernel or web server modifications. The simplex system includes the kernel and server modifications but there is only one server, i.e., incoming packets are not really multicast and outgoing packets are not sent to a backup before transmission to the client. The extra overhead of simplex relative to unreplicated is due mainly to the packet header manipulations and bookkeeping in the kernel module. The duplex system is the full implementation of the scheme.

Figure 4: Average latency (msec) observed by a client for different reply sizes and system modes. The Reply Overhead line depicts the latency caused by replication of the reply in duplex mode. [Plot: latency (msec) versus reply size (kbytes); curves: Duplex, Reply Overhead, Simplex, Unreplicated.]

A. Latency

Figure 4 shows the average latency on an unloaded server and network from the transmission of a request by the client to the receipt of the corresponding reply by the client. There is only a single client on the network and this client has a maximum of one outstanding request. The results show that the latency overhead relative to the unreplicated system increases with increasing reply size. This is due to processing of more reply packets.
The difference between the Reply Overhead line and the Unreplicated line is the time to transmit the reply from the primary to the backup and receive an acknowledgment at the primary. This time accounts for most of the duplex overhead. Note that these measurements exaggerate the relative overhead that would impact a real system since: 1) the client is on the same local network as the server, and 2) the requests are for (cached) static files. In practice, taking into account server processing and Internet communication delays, server response times of hundreds of milliseconds are common. The absolute overhead time introduced by our scheme remains the same regardless of server response times and therefore our implementation overhead is only a small fraction of the overall response time seen by clients.

B. Throughput

Figure 5 shows the peak throughput of a single pair of server hosts for different reply sizes. The throughputs of unreplicated and simplex (in Mbytes/sec) increase until the network becomes the bottleneck. However, the duplex mode throughput peaks at less than half of that amount. This is due to the fact that on the primary, the sending of the reply to the backup by the server module and the sending of the reply to the clients (Figure 2) occur over the same physical link. Hence, the throughput to the clients is reduced by half in duplex mode. To avoid this bottleneck, the transmission of the replies from the primary to the backup can be performed on a separate dedicated link. A high-speed Myrinet [9] LAN was available to us and was used for this purpose in measurements denoted by duplex-mi. These measurements show a significant throughput improvement over the duplex results, as a throughput of about twice that of duplex mode with a single network interface is achieved.

C. Processing Overhead

Table 1 shows the CPU cycles used by the servers to receive one request and generate a reply. These measurements were done using the processor's performance monitoring counters [21]. For each configuration the table presents the kernel-level, user-level, and total cycles used.
The cpu% column shows the CPU utilization at peak throughput, and indicates that the system becomes CPU bound as the reply size decreases. This explains the throughput results, where lower throughputs (in Mbytes/sec) were reached with smaller replies.

Based on Table 1, the duplex server (primary and backup combined) can require more than four times (for the 50KB reply) as many cycles to handle a request compared with the unreplicated server. However, as noted in the previous subsection, these measurements are for replies generated by reading cached static files. In practice, for likely applications of this technology (dynamic content), replies are likely to be smaller and require significantly more processing. Hence, the actual relative processing overhead can be expected to be much lower than the factor of 4 shown in the table.

Figure 5: System throughput (in requests and Mbytes per second) for different message sizes (kbytes) and system modes. The duplex-mi line denotes a setting with multiple network interfaces for each server; one interface is used only for reply replication. [Two plots: requests per second and Mbytes per second versus reply size (kbytes); curves: Unreplicated, Simplex, Duplex-mi, Duplex.]

TABLE 1: Breakdown of used CPU cycles (in thousands). The cpu% column indicates CPU utilization during peak throughput.

                        1 kbyte reply              10 kbyte reply             50 kbyte reply
System Mode             user  kernel  total  cpu%  user  kernel  total  cpu%  user  kernel  total  cpu%
Duplex (primary)         190     337    527   100   193     587    780    77   224    1548   1772    53
Duplex (backup)          147     330    477    91   158     615    773    76   185    1790   1958    58
Duplex-mi (primary)      192     353    545   100   198     544    742    85   225    1283   1508    85
Duplex-mi (backup)       147     355    502    93   152     545    697    80   169    1124   1293    72
Simplex                  186     250    436   100   191     365    556    99   208     871   1079    70
Unreplicated             165     230    395   100   166     342    508    99   178     730    908    60

TABLE 2: User-level versus kernel support: CPU cycles (in thousands) for processing a request that generates a 1 kbyte reply.

Implementation          Primary  Backup  Total
User-level Proxies         1860    1370   3230
Kernel/Server Modules       337     330    667

D. Comparison with User-Level Implementation

As mentioned earlier, our original implementation of this fault tolerance scheme was based on user-level proxies, without any kernel modifications [1]. Table 2 shows a comparison of the processing overhead of the user-level proxy approach with the implementation presented in this paper. This comparison is not perfectly accurate. While both schemes were implemented on the same hardware, the user-level proxy approach runs under the Solaris operating system and could not be easily ported to Linux due to a difference in the semantics of raw sockets. In addition, the server programs are different although they do similar processing. However, the difference of almost a factor of 5 is clearly due mostly to the difference in the implementation of the scheme, not to OS differences. The large overhead of the proxy approach is caused by the extraneous system calls and message copying that are necessary for moving the messages between the two levels of proxies and the server.

V. RELATED WORK

Early work in this field, such as Round Robin DNS [11] and DNS aliasing methods, focused on detecting a fault and routing future requests to available servers.
Centralized schemes, such as the Magic Router [4] and Cisco Local Director [12], require request packets to travel through a central router where they are routed to the desired server. Typically the router detects server failures and does not route packets to servers that have failed. The central router is a single point of failure and a performance bottleneck since all packets must travel through it. Distributed Packet Rewriting [7] avoids having a single entry point by allowing the servers to send messages directly to clients and by implementing some of the router logic in the servers so that they can forward the requests to different servers. None of these schemes support recovering requests that were being processed when the failure occurred, nor do they deal with non-deterministic and non-idempotent requests.

There are various server replication schemes that are not client transparent. Most still do not provide recovery of requests that were partially processed. Frolund and Guerraoui [16] do recover such requests. However, the client must retransmit the request to multiple servers upon failure detection and must be aware of the address of all instances of replicated servers. A consensus agreement protocol is also required for the implementation of their write-once registers, which could be costly, although it allows recovery from non fail-stop failures. Our kernel module can be seen as an alternative implementation of the write-once registers which also provides client transparency. Zhao et al [29] describe a CORBA-based infrastructure for replication in three-tier systems which deals with the same issues, but again is not client-transparent.

The work by Snoeren et al [26] is another example of a solution that is not transparent to the client. A transport layer protocol with connection migration capabilities, such as SCTP or TCP with proposed extensions, is used along with a session state synchronization mechanism between servers to achieve connection-level failover. The requirement to use a specialized transport layer protocol at the client is obviously not transparent to the client.

HydraNet-FT [25] uses a scheme that is similar to ours. It is client-transparent and can recover partially processed requests.
The HydraNet-FT scheme was designed to deal with server replicas that are geographically distributed. As a result, it must use specialized routers ("redirectors") to get packets to their destinations. These redirectors introduce a single point of failure similar to the Magic Router scheme. Our scheme is based on the ability to place all server replicas on the same subnet [1]. As a result, we can use off-the-shelf routers and multiple routers can be connected to the same subnet and configured to work together to avoid a single point of failure. Since HydraNet-FT uses active replication, it can only be used with deterministic servers while our standby backup scheme does not have this limitation.

Alvisi et al implemented FT-TCP [3], a kernel level TCP wrapper that transparently masks server failures from clients. While this scheme and its implementation are similar to ours, there are important differences. Instead of our hot standby spare approach, a logger running on a separate processor is used. If used for web service fault tolerance, FT-TCP requires deterministic servers (see Section II) and significantly longer recovery times. In addition, they did not evaluate their scheme in the context of web servers.

VI. CONCLUSION

We have proposed a client-transparent fault tolerance scheme for web services that correctly handles all client requests in spite of a web server failure. Our scheme is compatible with existing three-tier architectures and can work with non-deterministic and non-idempotent servers. We have implemented the scheme using a combination of Linux kernel modifications and modifications to the Apache web server. We have shown that this implementation involves significantly lower overhead than a strictly user-level proxy-based implementation of the same scheme. Our evaluation of the response time (latency) and processing overhead shows that the scheme does introduce significant overhead compared to a standard server with no fault tolerance features. However, this result only holds if generating the reply requires almost no processing. In practice, for the target application of this scheme, replies are often small and are dynamically generated (requiring significant processing). For such workloads, our results imply low relative overheads in terms of both latency and processing cycles. We have also shown that in order to achieve maximum throughput it is critical to have a dedicated network connection between the primary and backup.

REFERENCES

[1] N. Aghdaie and Y. Tamir, "Client-Transparent Fault-Tolerant Web Service," Proceedings of the 20th IEEE International Performance, Computing, and Communications Conference, Phoenix, Arizona, pp. 209-216 (April 2001).
[2] J. Almeida and P. Cao, "Wisconsin Proxy Benchmark," Technical Report 1373, Computer Sciences Dept, Univ. of Wisconsin-Madison (April 1998).
[3] L. Alvisi, T. C. Bressoud, A. El-Khashab, K. Marzullo, and D. Zagorodnov, "Wrapping Server-Side TCP to Mask Connection Failures," Proceedings of IEEE INFOCOM, Anchorage, Alaska, pp. 329-337 (April 2001).
[4] E. Anderson, D. Patterson, and E. Brewer, "The Magicrouter, an Application of Fast Packet Interposing," Class Report, UC Berkeley, http://www.cs.berkeley.edu/~eanders/projects/magicrouter/ (May 1996).
[5] D. Andresen, T. Yang, V. Holmedahl, and O. H. Ibarra, "SWEB: Towards a Scalable World Wide Web Server on Multicomputers," Proceedings of the 10th International Parallel Processing Symposium, Honolulu, Hawaii, pp. 850-856 (April 1996).
[6] S. M. Baker and B. Moon, "Distributed Cooperative Web Servers," The Eighth International World Wide Web Conference, Toronto, Canada, pp. 1215-1229 (May 1999).
[7] A. Bestavros, M. Crovella, J. Liu, and D. Martin, "Distributed Packet Rewriting and its Application to Scalable Server Architectures," Proceedings of the International Conference on Network Protocols, Austin, Texas, pp. 290-297 (October 1998).
[8] D. L. Black, D. B. Golub, D. P. Julin, R. F. Rashid, and R. P. Draves, "Microkernel Operating System Architecture and Mach," Proceedings of the USENIX Workshop on Micro-Kernels and Other Kernel Architectures, Berkeley, CA, pp. 11-30 (April 1992).
[9] N. J. Boden, D. Cohen, R. E. Felderman, A. E. Kulawik, C. L. Seitz, J. N. Seizovic, and W.-K. Su, "Myrinet: A Gigabit-per-Second Local Area Network," IEEE Micro 15(1), pp. 29-36 (February 1995).
[10] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web Caching and Zipf-like Distributions: Evidence and Implications," Proceedings of IEEE INFOCOM, New York, New York (March 1999).
[11] T. Brisco, "DNS Support for Load Balancing," IETF RFC 1794 (April 1995).
[12] Cisco Systems Inc, "Scaling the Internet Web Servers," Cisco Systems White Paper, http://www.ieng.com/warp/public/cc/pd/cxsr/400/tech/scale_wp.htm.
[13] C. Cunha, A. Bestavros, and M. Crovella, "Characteristics of World Wide Web Client-based Traces," Technical Report TR-95-010, Boston University, CS Dept, Boston, MA 02215 (April 1995).
[14] D. M. Dias, W. Kish, R. Mukherjee, and R. Tewari, "A scalable and highly available web server," Proceedings of IEEE COMPCON '96, San Jose, California, pp. 85-92 (1996).
[15] S. Frolund and R. Guerraoui, "CORBA Fault-Tolerance: why it does not add up," Proceedings of the IEEE Workshop on Future Trends of Distributed Systems (December 1999).
[16] S. Frolund and R. Guerraoui, "Implementing e-Transactions with Asynchronous Replication," IEEE International Conference on Dependable Systems and Networks, New York, New York, pp. 449-458 (June 2000).
[17] D. Golub, R. Dean, A. Forin, and R. Rashid, "Unix as an Application Program," Proceedings of Summer USENIX, pp. 87-96 (June 1990).
[18] C. T. Karamanolis and J. N. Magee, "Configurable Highly Available Distributed Services," Proceedings of the 14th IEEE Symposium on Reliable Distributed Systems, Bad Neuenahr, Germany, pp. 118-127 (September 1995).
[19] S. Knight, D. Weaver, D. Whipple, R. Hinden, D. Mitzel, P. Hunt, P. Higginson, M. Shand, and A. Lindem, "Virtual Router Redundancy Protocol," RFC 2338, IETF (April 1998).
[20] Oracle Inc, "Oracle8i Distributed Database Systems - Release 8.1.5," Oracle Documentation Library (1999).
[21] M. Pettersson, "Linux x86 Performance-Monitoring Counters Driver," http://www.csd.uu.se/~mikpe/linux/perfctr/.
[22] Red Hat Inc, "TUX Web Server," http://www.redhat.com/docs/manuals/tux/.
[23] M. Sayal, Y. Breitbart, P. Scheuermann, and R. Vingralek, "Selection Algorithms for Replicated Web Servers," Performance Evaluation Review - Workshop on Internet Server Performance, Madison, Wisconsin, pp. 44-50 (June 1998).
[24] F. B. Schneider, "Byzantine Generals in Action: Implementing Fail-Stop Processors," ACM Transactions on Computer Systems 2(2), pp. 145-154 (May 1984).
[25] G. Shenoy, S. K. Satapati, and R. Bettati, "HydraNet-FT: Network Support for Dependable Services," Proceedings of the 20th IEEE International Conference on Distributed Computing Systems, Taipei, Taiwan, pp. 699-706 (April 2000).
[26] A. C. Snoeren, D. G. Andersen, and H. Balakrishnan, "Fine-Grained Failover Using Connection Migration," Proceedings of the 3rd USENIX Symposium on Internet Technologies and Systems, San Francisco, California (March 2001).
[27] L. Stein and D. MacEachern, "Writing Apache Modules with Perl and C," O'Reilly and Associates (March 1999).
[28] R. Vingralek, Y. Breitbart, M. Sayal, and P. Scheuermann, "Web++: A System For Fast and Reliable Web Service," Proceedings of the USENIX Annual Technical Conference, Sydney, Australia, pp. 171-184 (June 1999).
[29] W. Zhao, L. E. Moser, and P. M. Melliar-Smith, "Increasing the Reliability of Three-Tier Applications," Proceedings of the 12th International Symposium on Software Reliability Engineering, Hong Kong, pp. 138-147 (November 2001).