Simple Failure Resilient Load Balancing




Martin Suchara, Dahai Xu, Robert Doverspike, David Johnson and Jennifer Rexford

Computer Science Department, Princeton University, NJ 08544, {msuchara, jrex}@cs.princeton.edu
AT&T Labs Research, Florham Park, NJ 07932, {dahaixu, rdd, dsj}@research.att.com

Abstract: To enable reliable data delivery and balance load in the presence of failures, we propose a new mechanism that combines path protection and traffic engineering. The key benefit of our solution is its simplicity, allowing for fast recovery while imposing minimal requirements on the routers. To provide resilience against every failure scenario from a known set, we advocate using a fixed set of parallel end-to-end paths for each traffic demand. Upon detecting a path failure, the ingress router uses a local rule to rebalance the outgoing traffic on the remaining available paths. We describe several candidate rebalancing algorithms, and analyze their performance. Although calculating the optimal set of paths and the path-splitting parameters for each router is NP-hard, our extensive simulations on a tier-1 IP backbone demonstrate that our easy-to-calculate heuristics suffice to achieve nearly optimal load balancing. We believe that a simple-to-implement solution with a fast recovery time, such as ours, will appeal to Internet Service Providers as well as the operators of data centers and enterprise networks.

I. INTRODUCTION

Failure recovery mechanisms [1] are used to ensure uninterrupted data delivery in the presence of link or router failures. Failure recovery is important for backbone network operators who strive to meet the availability requirements of Service Level Agreements (SLAs), as well as for the operators of large data centers, where failures of individual components are quite common. Failure recovery is a challenging problem because of the need for balanced load after rerouting the traffic affected by the failure. In addition, prompt failure handling and rerouting of the affected traffic to alternate paths often results in significant path restructuring, which is another major challenge faced by network operators who use the existing failure-recovery mechanisms.

Our work proposes a fast failure-recovery mechanism that offers nearly optimal load balancing while using a static set of paths. We exploit multipath routing between each pair of ingress and egress routers. Such end-to-end routing has several benefits. First, these routes do not need to change when a failure is detected, which saves time, reduces overhead, and improves path stability. Second, end-to-end load balancing spreads the traffic in the network more effectively than local rerouting. Finally, it enables faster recovery and lower protocol overhead than conventional link-state routing protocols, like OSPF, which not only rely on flooding link-state advertisements and recomputing shortest paths, but also suffer from transient forwarding loops.

Another benefit of our solution is its simplicity, with most of the functionality incorporated in the network-management software rather than the network elements. The management software is responsible for selecting the end-to-end paths and calculating the path-splitting parameters for each router. Our design has a minimalist control plane, used only for failure detection, which leads naturally to a simpler network where smart, expensive routers are replaced with cheaper switches with a limited feature set. While our design enables the simplification of the network elements, our solution is also readily deployable using technologies available in existing routers. Multi-Protocol Label Switching (MPLS) [2] is particularly suitable because ingress routers encapsulate packets with labels and direct them over pre-established Label-Switched Paths (LSPs). This enables flexible routing when multiple LSPs are established between each ingress-egress router pair.
Techniques for splitting traffic over the multiple LSPs, with relatively flexible splitting ratios, are already supported by the major router vendors [3], [4]. Our solution, then, could be viewed as a particular application of MPLS, where the network-management software computes the LSPs, instructs the ingress routers to establish the paths (say, using RSVP), and disables any dynamic recomputation of alternate paths when primary paths fail.

In our architecture, the network-management software solves an offline optimization problem to compute the paths and the splitting ratios, given the traffic demands, the network topology, and known failure scenarios. The objective is to minimize congestion over all the failure scenarios, weighted by their likelihood or importance. The congestion for a particular failure case is determined as a sum of the congestion penalties associated with each link. Since calculating the optimal paths and splitting ratios is NP-hard, we propose several effective heuristics that vary in the functionality they expect from the network elements. The heuristics start with a common first step that solves a simple multicommodity flow problem to compute a small set of end-to-end paths. The heuristics differ in the second step, which optimizes the (static) router configuration that determines how traffic is split over multiple paths.

We evaluate our solution experimentally in the context of MPLS on the IP backbone of a tier-1 ISP. We demonstrate that our heuristics are nearly optimal in balancing traffic load across a wide range of failure scenarios. We also show that the number of paths connecting each ingress-egress router pair is small, which is important for a scalable solution. Our simulations achieve a high degree of accuracy by utilizing the real topology, link capacities, link delays, traffic matrices, and Shared Risk Link Groups (SRLGs) [5] (indicating which links are likely to fail together) of the tier-1 ISP.

Our approach differs from previously published work. Most of the literature considers either path protection or traffic engineering [4] in isolation. MPLS path-protection mechanisms aim to reduce the failure-recovery time by establishing backup paths that carry traffic when the primary LSP fails. Local path protection [6] reserves a backup path connecting the ends of individual links in the network. While enabling fast recovery, local path protection cannot fully exploit the available path diversity, leading to suboptimal load balancing. Global path protection [7] allows better load balancing through the use of end-to-end backup LSPs, but the recovery is slow¹. Our approach addresses the optimality issues by utilizing dynamic traffic splitting across multiple diverse LSPs, and its recovery is faster than global path protection due to its simplicity.

¹ Calculation of optimal backup LSPs and the switchover is so involved that ISPs are currently considering deployment of a hybrid scheme that applies local and global path protection in succession.

The paper is organized as follows. Following a brief overview of our proposed architecture in Section II, Section III formulates the optimization problem and Section IV presents several heuristic solutions. Next, Section V evaluates the heuristics on realistic topology and traffic data, and Section VI reviews related work. The paper concludes in Section VII with a discussion of future research directions. NP-completeness proofs appear in the Appendix.

II. LOAD BALANCING OVER MULTIPLE STATIC PATHS

Our network architecture is motivated by the need to: (i) make network management easier and enable the use of simpler, cheaper routers, (ii) balance load before, during, and after failures to make efficient use of network resources, and (iii) detect and respond to failures quickly to ensure uninterrupted service. The resulting design places most of the functionality in a management system that performs optimization in an offline fashion, as depicted in Figure 1. The routers simply detect failed paths and automatically redistribute traffic on the remaining paths based on their static configuration. This simplifies network management, reduces router cost, and removes dynamic state from the routers. In this section, we discuss how ingress routers split traffic over multiple paths and learn about path failures, and how the management system pre-computes the paths and splitting ratios.

Fig. 1. The management system calculates a fixed set of paths and splitting ratios, based on the topology, traffic demands, and potential failures (shared risks, link cuts). The ingress routers learn about path failures and split traffic over the working paths.

A. Flexible Load Balancing Over Pre-established Paths

Our architecture uses multiple paths between each ingress-egress router pair in the network. Using pre-established end-to-end paths allows fast failure recovery, as the ingress router can shift the load away from the failed paths, avoiding dynamic path recalculation. Using multiple paths also allows the ingress router to balance the load in the network, which helps to reduce congestion. In our architecture, the ingress router has a simple static configuration that determines the traffic-splitting ratios among the available paths, while intermediate routers merely forward packets over pre-established paths. As a result, our router is a simple device that does not need to collect congestion feedback, participate in a routing protocol, interact with the management system upon failure detection, or solve any computationally difficult problems.

In the context of MPLS, flexible traffic splitting is already supported by both major router vendors [3], [4]. The routers can be configured to hash packets based on port and address information in the header into several groups and forward each group on a separate path. This provides path splitting with relatively fine granularity (e.g., at the 1/16th level), while ensuring that packets belonging to the same flow traverse the same path.
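To make the hash-based splitting concrete, here is a minimal sketch of how an ingress could map flows to a weighted set of paths using hash buckets. This is an illustration only, not vendor behavior: the 16-bucket granularity matches the 1/16th figure above, but the function name `pick_path`, the MD5 hash, and the rounding scheme are our assumptions.

```python
import hashlib

def pick_path(flow_key, paths, weights, buckets=16):
    """Assign a flow to a path by hashing its header fields into one
    of `buckets` groups, with groups allotted to paths roughly in
    proportion to the configured splitting ratios."""
    shares = [round(w * buckets) for w in weights]
    # Repair rounding drift so exactly `buckets` groups are assigned.
    while sum(shares) > buckets:
        shares[shares.index(max(shares))] -= 1
    while sum(shares) < buckets:
        shares[shares.index(min(shares))] += 1
    owner = [p for p, n in zip(paths, shares) for _ in range(n)]
    h = int(hashlib.md5(flow_key.encode()).hexdigest(), 16)
    return owner[h % buckets]

# Example: three LSPs with splitting ratios 0.5 / 0.25 / 0.25.
print(pick_path("10.0.0.1:80->10.0.1.2:1234",
                ["lsp0", "lsp1", "lsp2"], [0.5, 0.25, 0.25]))
```

Because the path choice depends only on the hashed header fields, all packets of a flow keep traversing the same path, which avoids packet reordering within a flow.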
In a data center, the end-host servers could encapsulate the packets, as suggested in [8], and choose encapsulation headers that split the traffic over the multiple paths with the desired splitting ratios. This further reduces the complexity of the network elements, and also enables finer-grain traffic splitting than today's routers provide.

B. Path-level Failure Detection and Notification

The ingress routers use a path-level failure-detection mechanism to avoid sending traffic on failed paths. This mechanism could be implemented, e.g., using Bidirectional Forwarding Detection (BFD) [9]. BFD establishes sessions between the ingress-egress router pairs to monitor each of the paths. BFD piggybacks on existing traffic and obviates the need to send hello messages. A major advantage of this approach is that the ingress router receives a faster failure notification than would be possible using a routing protocol's own local keepalive mechanisms. Another advantage is that the packets are handled by the hardware interfaces and do not use the router CPU time. Although the ingress router does not learn which link failed, knowledge of end-to-end path failures is sufficient to avoid using the failed paths. In fact, since our design does not require the routers to be aware of the topology, no control protocol is needed to exchange topology information².

² A backward-compatible realization of our architecture could leverage finer-grain topology information. For example, MPLS-capable routers can be configured to learn about link failures from the interior gateway protocol (e.g., OSPF). If no alternate routes are specified for the affected path(s), the routers simply renormalize the outgoing traffic on the remaining available paths.
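The renormalization rule mentioned in the footnote is simple enough to sketch in a few lines of Python (the helper name and data layout are ours, not the paper's):

```python
def renormalize(alpha, failed):
    """Rebalance configured splitting ratios (path -> fraction) over
    the paths that path-level monitoring still reports as up."""
    live = {p: a for p, a in alpha.items() if p not in failed}
    total = sum(live.values())
    if total == 0:
        raise RuntimeError("all paths to this egress have failed")
    return {p: a / total for p, a in live.items()}

# Ratios 0.5/0.3/0.2: after lsp1 fails, its 0.3 share is spread over
# lsp0 and lsp2 in proportion to their ratios (0.5/0.7 and 0.2/0.7).
print(renormalize({"lsp0": 0.5, "lsp1": 0.3, "lsp2": 0.2}, {"lsp1"}))
```

The same local rule reappears in Section IV-C as state-independent splitting.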

C. Offline Route Optimization in the Management System

Given the static network topology, shared-risk information (i.e., sets of links with a shared vulnerability), and the traffic matrix (i.e., the volume of exchanged traffic between each ingress-egress router pair), the management system calculates multiple diverse paths such that at least one of them works for each failure scenario; this is possible as long as no failure partitions the ingress-egress router pair. Moreover, the paths can be chosen so that they allow load balancing in the network. These two goals are complementary, as both require path diversity. After computing the paths and associated traffic-splitting parameters, the management system installs them either by populating forwarding tables in the routers or by configuring the ingress routers to signal the paths using a protocol like RSVP.

The management system has direct access to accurate information about the network topology and anticipated traffic demands. The operator can easily provide the management system with a list of potential or planned failures; correlated link failures can be determined by considering sets of links that share a common vulnerability [5]. Many failures in ISP backbones are planned in advance, or involve a single link, and most of these failures are short-lived [10]. Our solution allows the network to continue directing traffic over the working paths, without incurring any protocol overhead to withdraw or recompute paths; instead, the failed paths remain in the forwarding tables, ready to be used upon recovery. Since the network configuration is completely static, the management system can calculate paths and splitting parameters offline, and change them only in response to significant traffic shifts or the planned long-term addition or removal of equipment.

III. NETWORK MODEL AND OPTIMIZATION OBJECTIVE

The network-management system solves an offline optimization problem to select the paths and splitting ratios for each ingress-egress pair. The exact formulation of the optimization problem depends on how the network elements represent and use the splitting ratios. In this section, we present the common aspects of the problem formulations. First, we describe how we model the network topology, traffic demands, failure scenarios, and end-to-end paths. Then, we introduce the objective that the management system tries to optimize. Section IV presents the remaining details of the optimization problems, along with our algorithms for solving them.

A. Topology, Shared Risks, Traffic Demands, and Paths

As shown in Figure 1, the management system has three main inputs:

Fixed topology: The topology is represented by a graph G(V, E) with a set of vertices V and directed edges E. The capacity of edge e ∈ E is denoted by c_e, and the propagation delay on the edge is y_e.

Shared risks: The shared risks are denoted by the set S, where each s ∈ S consists of a set of edges that may fail together. For example, a router failure is represented by the set of its incident links, a fiber cut is represented by all links in the affected fiber bundle, and the failure-free case is represented by the empty set. For simplicity, we assume that all demands remain connected for each failure; alternatively, a demand can be omitted for each failure case that disconnects it.

Traffic demands: Finally, each traffic demand d ∈ D is represented by a triple (u_d, v_d, h_d), where u_d ∈ V is the traffic source (ingress router), v_d ∈ V is the destination (egress router), and h_d is the flow requirement (measured traffic).
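As a concrete illustration, the three inputs just described could be represented as plainly as in the following sketch; the toy three-node topology and all names here are assumptions for the example, not data from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Demand:
    ingress: str    # u_d
    egress: str     # v_d
    volume: float   # h_d, measured traffic

# Directed edges with capacities c_e and propagation delays y_e.
capacity = {("A", "B"): 10.0, ("B", "C"): 10.0, ("A", "C"): 2.5}
delay    = {("A", "B"): 4.0,  ("B", "C"): 5.0,  ("A", "C"): 12.0}

# Shared risks S: each element is a set of edges that fail together;
# the empty set represents the failure-free case.
shared_risks = [
    frozenset(),                            # no failure
    frozenset({("A", "B")}),                # a single fiber cut
    frozenset({("A", "B"), ("A", "C")}),    # e.g., node A's line cards
]

demands = [Demand("A", "C", 3.2)]
```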
The management system's outputs are a set of paths P_d for each demand d and the splitting ratios for each path. Optimizing these outputs must consider the effect of each failure state on the paths available for demand d. Traffic splitting by ingress router u_d depends only on which paths have failed, not on which failure scenario has occurred; in fact, multiple failure scenarios may affect the same subset of paths in P_d.

To reason about the handling of a particular demand d, we consider a set O_d of observable failure states, where each observable state o ∈ O_d corresponds to a particular P_d^o ⊆ P_d representing the available paths. For ease of expression, we let the function s_d(·) map s to the failure state observable by node u_d when the network is in failure state s ∈ S. The amount of flow assigned to path p in observable failure state o ∈ O_d is f_p^o. The total flow on edge e in failure state s is l_e^s, and the flow on edge e corresponding to demand d is l_{e,d}^s.
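The mapping s_d(·) from network failure states to observable states is easy to make concrete; the following illustrative helper (our naming, not code from the paper) shows why two different network failures can be indistinguishable at the ingress.

```python
def observable_state(paths, failed_edges):
    """s_d: map a network failure state (set of failed edges) to what
    ingress u_d can observe, namely which of its paths still work."""
    return frozenset(p for p in paths
                     if not any(e in failed_edges for e in p))

# P_d for one demand: a two-hop path via B and the direct path.
P_d = [
    (("A", "B"), ("B", "C")),
    (("A", "C"),),
]
s1 = {("A", "B")}                # failure state: link A-B cut
s2 = {("A", "B"), ("B", "C")}    # different failure: both links cut
# Both failures leave only the direct path standing, so the ingress
# observes the same state o = s_d(s1) = s_d(s2).
print(observable_state(P_d, s1) == observable_state(P_d, s2))  # True
```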

TABLE I: SUMMARY OF NOTATION

  Variable    Description
  G(V, E)     network with vertices V and directed edges E
  c_e         capacity of edge e ∈ E
  y_e         propagation delay on edge e ∈ E
  S           family of network failure states
  s           network failure state (set of failed links)
  w_s         weight of network failure state s ∈ S
  D           set of demands
  u_d         source of demand d ∈ D
  v_d         destination of demand d ∈ D
  h_d         flow requirement of demand d ∈ D
  P_d         paths available to demand d ∈ D
  α_p         fraction of the demand assigned to path p
  O_d         family of observable failure states for node u_d
  s_d(s)      state observable by u_d in network failure state s ∈ S
  P_d^o       paths available to u_d in observable failure state o ∈ O_d
  f_p^s       flow on path p in network failure state s
  f_p^o       flow on path p in observable failure state o
  l_e^s       total flow on edge e in network failure state s
  l_{e,d}^s   flow of demand d on edge e in network failure state s

B. Minimizing Congestion Over the Failure States

The management system's goal is to compute paths and splitting ratios that minimize congestion over the range of possible failure states. A common traffic-engineering objective [11] is to minimize Σ_{e∈E} Φ(l_e/c_e), where l_e is the load on edge e and c_e is its capacity. Φ(·) could be a convex function of link load [11], to penalize the most congested links while still accounting for load on the remaining links. To place more emphasis on the common failure scenarios, each failure state s can be associated with a weight w_s. To minimize congestion across the failure scenarios, the final objective function is

    obj(l_{e_1}^{s_1}/c_{e_1}, ...) = Σ_{s∈S} w_s Σ_{e∈E} Φ(l_e^s/c_e).    (1)

Minimizing this objective function is the goal in each of our optimization problems in Section IV. However, the constraints that complete the problem formulations differ depending on the functionality placed in the underlying routers.

IV. OPTIMIZING THE PATHS AND SPLITTING RATIOS

The optimization problem the management system solves depends on the capabilities of the underlying routers. On one extreme, the network could support an optimal configuration of paths and splitting ratios for every network failure scenario s ∈ S. While not scalable in practice, the solution to this optimization problem serves as a performance baseline and as a way to compute a suitable set of paths P_d for each demand d. A more practical alternative is to have each ingress router u_d store splitting ratios for every observable failure o ∈ O_d. After observing the path failure(s), router u_d would switch to the new splitting configuration for the remaining paths. An even simpler alternative is to have a single splitting configuration that is used across all failures. In this approach, router u_d simply renormalizes the splitting percentages for the active paths. In this section, we present the management system's algorithms for each of these three scenarios. Since several of the optimization problems are NP-hard (as proven in the Appendix), we use heuristics that (as shown in Section V) achieve nearly optimal performance in practice.

A. Optimal Solution: Per Network Failure State

The ideal solution would compute the optimal paths and splitting ratios separately for each failure state. To avoid introducing explicit variables for exponentially many paths, we formulate the problem in terms of the amount of flow l_{e,d}^s from demand d traversing edge e in failure state s. The optimal edge loads are obtained by solving the following linear program:

    min  obj(l_{e_1}^{s_1}/c_{e_1}, ...)
    s.t. l_e^s = Σ_{d∈D} l_{e,d}^s                                      ∀e, s
         0 = Σ_{i:e=(i,j)} l_{e,d}^s − Σ_{i:e=(j,i)} l_{e,d}^s          ∀d, s, j ≠ u_d, v_d
         h_d = Σ_{i:e=(u_d,i)} l_{e,d}^s − Σ_{i:e=(i,u_d)} l_{e,d}^s    ∀d, s
         0 ≤ l_{e,d}^s                                                  ∀d, s, e    (2)

where l_e^s and l_{e,d}^s are variables. The first constraint defines the load on edge e, the second constraint ensures flow conservation, the third constraint ensures that the demands are met, and the last constraint guarantees flow non-negativity. An optimal solution can be found in polynomial time using conventional techniques for solving multicommodity flow problems.

After obtaining the optimal flow on each edge for all the failure scenarios, we use a standard decomposition algorithm to determine the corresponding paths P_d and the flows f_p^s on each of them. The decomposition starts with a set P_d that is empty. New unique paths are added to the set by performing the following decomposition for each failure state s. First, annotate each edge e with the value l_{e,d}^s. Remove all edges that have 0 value. Then, find a path connecting u_d and v_d. If multiple such paths exist, we use the path p with the smallest propagation delay. Although we could choose any of the paths from u_d to v_d, our goal is to obtain as short paths as possible. Add this path p to the set P_d and assign to it flow f_p^s equal to the smallest value of the edges on path p. Reduce the values of these edges accordingly. Continue in this fashion, removing edges with zero value and finding new paths, until there are no remaining edges in the graph. Note that we can show by induction that this process completely partitions the flow l_{e,d}^s into paths. The decomposition yields at most |E| paths for each network failure state because the value of at least one edge becomes 0 whenever a new path is found. Hence the total size of the set P_d is at most |E| |S|.
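The decomposition lends itself to a direct implementation. The sketch below is illustrative Python under our own naming; it assumes the input flow satisfies flow conservation, so every remaining positive edge lies on some u_d-to-v_d path, and it repeatedly peels off the delay-shortest path at its bottleneck value.

```python
import heapq

def shortest_path(flow, src, dst, delay):
    """Dijkstra over the edges that still carry flow, weighted by
    propagation delay y_e; returns the path as a list of edges."""
    adj = {}
    for (u, v) in flow:
        adj.setdefault(u, []).append(v)
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v in adj.get(u, ()):
            nd = d + delay[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return path[::-1]

def decompose(flow, src, dst, delay):
    """Peel the per-demand edge flows l_{e,d}^s of one failure state
    into (path, amount) pairs, preferring low-delay paths."""
    flow = {e: f for e, f in flow.items() if f > 0}
    paths = []
    while flow:
        p = shortest_path(flow, src, dst, delay)
        amount = min(flow[e] for e in p)   # bottleneck value
        paths.append((tuple(p), amount))
        for e in p:                        # subtract and prune edges
            flow[e] -= amount
            if flow[e] <= 1e-12:
                del flow[e]
    return paths

# Example: 2 units from A to C, split over A-B-C and the direct edge.
flow  = {("A", "B"): 1.0, ("B", "C"): 1.0, ("A", "C"): 1.0}
delay = {("A", "B"): 4.0, ("B", "C"): 5.0, ("A", "C"): 12.0}
print(decompose(flow, "A", "C", delay))
```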
It is difficult to obtain a solution with a tighter bound: we prove in the Appendix that it is NP-hard to solve problem (2) when the number of allowed paths is bounded by a constant J.

The optimal solution solves the multicommodity flow problem, computes the resulting paths, and for each failure scenario s ∈ S assigns flow f_p^s to path p ∈ P_d. However, this solution is not feasible in practice, because of the burden it imposes on the underlying routers. Each ingress router would need to store a splitting configuration for each failure scenario. The number of failure states can be quite large, especially when failure scenarios could involve multiple links. After a failure, the ingress router would need to learn which link(s) failed, identify the associated failure scenario, and switch to the appropriate splitting configuration. This adds considerable complexity to the network elements. Yet, the optimal solution is still interesting, for two reasons. First, the solution provides an upper bound on the performance of the more practical schemes, enabling us to judge how effective they are. Second, the optimal paths and splitting ratios are a useful building block in computing the network configuration in our practical solutions.

B. State-Dependent Splitting: Per Observable Failure

To reduce the complexity of the network elements, each ingress router u_d could have a set of splitting ratios for each observable failure state o ∈ O_d. Since the path-splitting ratios depend on which paths in P_d have failed, the ingress router must store splitting ratios for min(|S|, 2^|P_d|) scenarios; fortunately, the number of paths in P_d is typically small in practice. When the network performs such state-dependent splitting, the management system's goal is to find a set of paths P_d for each demand and the flows f_p^o on these paths in all observable states O_d. If the paths P_d are known and fixed, the problem can be formulated as a linear program:

    min  obj(l_{e_1}^{s_1}/c_{e_1}, ...)
    s.t. l_e^s = Σ_{d∈D} Σ_{p∈P_d^o, e∈p} f_p^o    ∀e, s, o = s_d(s)
         h_d = Σ_{p∈P_d^o} f_p^o                   ∀d, o ∈ O_d
         0 ≤ f_p^o                                 ∀d, o ∈ O_d, p ∈ P_d^o    (3)

where l_e^s and f_p^o are variables. The first constraint defines the load on edge e, the second constraint guarantees that the demand d is satisfied in all observable failure states, and the last constraint ensures non-negativity of the flows assigned to the paths. The solution of the optimization problem (3) can be found in polynomial time.

The problem becomes NP-hard if the sets of paths {P_d} are not known in advance. In fact, as we show in the Appendix, it is NP-hard even to tell if two paths that allow an ingress router to distinguish two network failure states can be constructed. Therefore, it is NP-hard to construct the optimal set of paths for all our formulations that assume the sources do not have information about the network failure state. Therefore, we propose a simple heuristic to find the paths: we use the paths that are found by the decomposition of the optimal solution (2). This approach guarantees that the paths are sufficiently diverse to ensure traffic delivery in all failure states. Moreover, since the paths allow optimal load balancing for the optimal solution (2), they are also likely to enable good load balancing for the optimization problem (3).

C. State-Independent Splitting: Across All Failure Scenarios

To further simplify the network elements, each ingress router could have a single configuration of splitting ratios that is used under any combination of path failures. Each path p is associated with a splitting fraction α_p. When one or more paths fail, the ingress router u_d renormalizes the splitting parameters for the working paths to compute the fraction of traffic to direct to each of these paths. If the network elements implement such state-independent splitting, and the paths P_d are known and fixed, the management system needs to solve the following non-convex optimization problem:

    min  obj(l_{e_1}^{s_1}/c_{e_1}, ...)
    s.t. f_p^o = (α_p / Σ_{q∈P_d^o} α_q) h_d       ∀d, o ∈ O_d, p ∈ P_d^o
         l_e^s = Σ_{d∈D} Σ_{p∈P_d^o, e∈p} f_p^o    ∀e, s, o = s_d(s)
         h_d = Σ_{p∈P_d^o} f_p^o                   ∀d, o ∈ O_d
         0 ≤ f_p^o                                 ∀d, o ∈ O_d, p ∈ P_d^o    (4)

where l_e^s, f_p^o and α_p are variables. The first constraint ensures that the flow assigned to every available path p is proportional to α_p. The other three constraints are the same as in (3). Unfortunately, no standard optimization techniques allow us to compute an optimal solution efficiently, even when the paths P_d are fixed. Therefore, we have to rely on heuristics to find both the candidate paths P_d and the splitting ratios α_p. To find the set of candidate paths P_d, we again use the optimal paths obtained by decomposing (2). To find the splitting ratios, we mimic the behavior of the optimal solution as closely as possible: we find the splitting ratios for all paths p by letting α_p = Σ_{s∈S} w_s f_p^s / h_d, where f_p^s is the flow assigned by the optimal solution to path p in network failure state s. Since Σ_{s∈S} w_s = 1, the calculated ratios are the weighted averages of the splitting ratios used by the optimal solution (2).
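The weighted-average heuristic for the state-independent ratios reduces to one formula; a sketch under the paper's definitions (the function name and the input layout are ours):

```python
def splitting_ratios(opt_flow, weights, h_d):
    """Section IV-C heuristic: alpha_p = sum_s w_s * f_p^s / h_d, the
    failure-weighted average of the optimal per-state ratios.
    `opt_flow` maps failure state s -> {path: f_p^s} from solving (2)."""
    alpha = {}
    for s, per_path in opt_flow.items():
        for p, f in per_path.items():
            alpha[p] = alpha.get(p, 0.0) + weights[s] * f / h_d
    return alpha

# Two failure states weighted 0.5 each, demand h_d = 4 units:
opt_flow = {"none": {"p1": 3.0, "p2": 1.0},
            "cut":  {"p2": 4.0}}
print(splitting_ratios(opt_flow, {"none": 0.5, "cut": 0.5}, 4.0))
# -> {'p1': 0.375, 'p2': 0.625}; ratios sum to 1 since sum_s w_s = 1
```

At run time the ingress combines these α_p with the renormalization rule sketched in Section II-B whenever paths fail.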

V. EXPERIMENTAL EVALUATION

To evaluate the algorithms described in the previous sections, we wrote a simulator in C++ that calls the CPLEX linear program solver in AMPL and solves the optimization problems (2) and (3). We compare our two heuristics to the optimal solution, a simple equal splitting configuration, and OSPF with the link weights set using state-of-the-art optimization techniques. Finally, we show that our two heuristics do not require many paths and only slightly increase end-to-end propagation delay.

A. Experimental Setup

Our simulations use a variety of synthetic topologies, the Abilene topology, as well as the city-level IP backbone topology of a tier-1 ISP with a set of failures provided by the network operator. The parameters of the topologies we used are summarized in Table II.

Synthetic topologies: the synthetic topologies include 2-level hierarchical graphs, purely random graphs, and Waxman graphs.
The 2-level hierarchical graphs are produced using the generator GT-ITM [12]; for the random graphs the probability of two nodes being connected is constant, and the probability of having an edge between two nodes in the Waxman graphs decays exponentially with the distance between the nodes. These topologies also appear in [13].

Abilene topology: the topology of the Abilene network and a measured traffic matrix are used. We use the true edge capacities of 10 Gbps.

Tier-1 IP backbone: the city-level IP backbone of a tier-1 ISP is used. In our simulations, we use the real link capacities and measured traffic demands. We also obtained the link propagation delays.

TABLE II: SYNTHETIC AND REALISTIC NETWORK TOPOLOGIES

  Name      Topology       Nodes   Edges   Demands
  hier50a   hierarchical    50      148     2,450
  hier50b   hierarchical    50      212     2,450
  rand50    random          50      228     2,450
  rand50a   random          50      245     2,450
  rand100   random         100      403     9,900
  wax50     Waxman          50      169     2,450
  wax50a    Waxman          50      230     2,450
  abilene   backbone        11       28       253
  tier-1    backbone        50      180       625

The collection of network failures S for the synthetic topologies and the Abilene network contains the single edge failures and the no-failure case.

Two experiments with different collections of failures are performed on the tier-1 IP backbone. In the first experiment, single edge failures are used. In the second experiment, the collection of failures also contains Shared Risk Link Groups (SRLGs), i.e., link failures that occur simultaneously. The SRLGs were obtained from the network operator's database, which contains 954 failures, with the largest failure affecting 20 links simultaneously. The failure weight w_s was set to 0.5 for the no-failure case, and the other failure weights were set equal so that the sum of all the weights is 1.

The sets of demands D in the Abilene network and the tier-1 backbone were obtained by sampling Netflow data measured on Nov. 15, 2005 and May 22, 2009, respectively. For the synthetic topologies, we chose the same traffic demands as in [13]. To simulate the algorithms in environments with increasing congestion, we repeat all experiments several times while uniformly increasing the traffic demands. For the synthetic topologies we start with the original demands and scale them up to twice the original values. As the average link utilization in Abilene and the tier-1 topology is lower than in the synthetic topologies, we scale the demands in these realistic topologies up to three times the original values.

In our experiments we use the piecewise-linear penalty function defined by Φ(0) = 0 and its derivative:

    Φ'(l) = 1      for 0 ≤ l < 0.333
            3      for 0.333 ≤ l < 0.667
            10     for 0.667 ≤ l < 0.9
            70     for 0.9 ≤ l < 1
            500    for 1 ≤ l < 1.1
            5000   for 1.1 ≤ l < ∞

This penalty function was introduced in [11]. The function can be viewed as modeling retransmission delays caused by packet losses. The cost is small for low utilization, increases progressively as the utilization approaches 100%, and explodes above 110%.
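For reference, Φ can be recovered by integrating the slopes above, and objective (1) then follows directly; the sketch below is ours, intended only to make the definitions executable, not part of the paper's simulator.

```python
# Lower breakpoints and slopes of the piecewise-linear Phi' above.
SLOPES = [(0.0, 1), (0.333, 3), (0.667, 10), (0.9, 70), (1.0, 500), (1.1, 5000)]

def phi(u):
    """Phi(u) with Phi(0) = 0, obtained by integrating the slopes."""
    cost = 0.0
    for i, (lo, slope) in enumerate(SLOPES):
        hi = SLOPES[i + 1][0] if i + 1 < len(SLOPES) else float("inf")
        if u <= lo:
            break
        cost += slope * (min(u, hi) - lo)
    return cost

def objective(loads, capacity, weights):
    """Objective (1): sum_s w_s * sum_e Phi(l_e^s / c_e).
    `loads` maps failure state s -> {edge: l_e^s}."""
    return sum(weights[s] * sum(phi(l / capacity[e])
                                for e, l in per_edge.items())
               for s, per_edge in loads.items())

# Toy check: one edge at 50% and 95% utilization in two failure
# states weighted 0.5 each (the weights sum to 1, as in the text).
capacity = {("A", "B"): 10.0}
loads = {"none": {("A", "B"): 5.0}, "cut": {("A", "B"): 9.5}}
print(objective(loads, capacity, {"none": 0.5, "cut": 0.5}))
```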
Our simulations calculate the objective values of the optimal solution, state-independent and state-dependent splitting, and equal splitting. Equal splitting is a variant of state-independent splitting that splits the flow evenly over the available paths. We also calculate the objective achieved by the shortest path routing of OSPF with optimal link weights. These link weights were calculated using the state-of-the-art optimizations described in [13], and these optimizations take into consideration the set of failure states S. To demonstrate that our solution does not increase the propagation delay significantly, we also calculate the average propagation delay weighted by the load on the routes in the tier-1 IP backbone.

Our simulations were performed using CPLEX version 11.2 on a 1.5 GHz Intel Itanium 2 processor. Solving the linear program (2) for a single failure case in the tier-1 topology takes 4 seconds, and solving the optimization (3) takes about 16 minutes. A tier-1 network operator could perform all the calculations required to obtain an optimal set of paths and router configurations for the entire city-level network topology in less than 2 hours.

Fig. 2. (Objective value vs. network traffic; curves for OSPF, equal splitting, state-independent splitting, state-dependent splitting, and the global optimum.) From top to bottom: the traffic engineering objective in the hierarchical topology hier50a, the tier-1 topology with single edge failures, and the tier-1 topology with SRLGs, respectively. The performance of the optimal solution and state-dependent splitting is nearly identical.

B. Performance Evaluation

Avoiding congestion and packet losses during planned and unplanned failures is the central goal of traffic engineering. Our traffic engineering objective measures congestion across all the considered failure cases. The objective as a function of the scaled-up demands is depicted in Figure 2. The results, which were obtained on the hierarchical and tier-1 topologies, are representative; we made similar observations for all the other topologies.

In Figure 2, the performance of state-dependent splitting and the optimal solution is virtually indistinguishable in all cases. State-independent splitting is less sophisticated and does not allow custom load balancing ratios for distinct failures, and therefore its performance is worse compared to the optimum. However, its performance compares well with that of OSPF. The benefit of state-independent splitting is that it uses the same set of diverse paths as the optimal solution. It is not surprising that the simple equal splitting algorithm achieves the worst performance. We observe that OSPF achieves a somewhat worse performance than state-independent and state-dependent splitting as the load increases. We made this observation despite the fact that we obtained a custom set of OSPF link weights for each network load we evaluated. A possible explanation is that OSPF routing, in which each router splits the load evenly among the smallest-weight paths, does not allow much flexibility in choosing diverse routes and does not allow uneven splitting ratios.

Fig. 3. (CDF of the number of paths per demand.) The number of paths used in the various topologies on the left, and in the tier-1 topology with SRLGs under increasing traffic on the right. The cumulative distribution functions show that the number of paths is almost independent of the traffic load in the network, but is larger for bigger, more well-connected topologies.

Solutions with few paths are preferred as they decrease the number of tunnels that have to be managed, and reduce the size of the router configuration. However, a sufficient number of paths must be available to each commodity to avoid failures and to reduce congestion. We observe that the number of paths used by our algorithms is small. We record the number of paths used by each demand, and plot the distributions in Figure 3. Not surprisingly, the number of paths is greater for larger and more diverse topologies. 92% of the demands in the hierarchical topology use 7 or fewer paths, and fewer than 10 paths are needed in the tier-1 backbone topology for almost all demands. Further, Figure 3 shows that the number of paths only increases slightly as we scale up the amount of traffic in the network. This small increase is caused by shifting some traffic to longer paths as the short paths become congested.

TABLE III: PROPAGATION DELAY (AVERAGE AND STANDARD DEVIATION) IN THE TIER-1 BACKBONE NETWORK

                           Single edge failures      SRLG failures
  Algorithm                avg (ms)    stdev         avg (ms)    stdev
  Optimum                  30.99       0.23          31.03       0.22
  State dep. splitting     30.86       0.21          30.96       0.17
  State indep. splitting   31.00       0.23          31.11       0.22
  Equal splitting          33.82       0.22          39.70       0.69
  OSPF (optimized)         30.70       0.54          30.71       0.50
  OSPF (current)           28.45       0             28.49       0

Minimizing the delay experienced by the users is one of the important goals of network operators. Therefore, we calculated the average propagation delay of all the evaluated algorithms. These results, which exclude congestion delays, are summarized in Table III. We observe that the delay of OSPF with optimized link weights, state-dependent splitting, and state-independent splitting is almost identical, at around 31 ms. These values would satisfy the 37 ms requirement specified in the SLAs of the tier-1 network. Moreover, we demonstrate that these values are not significantly higher than those experienced by the network users today. We repeated our simulations on the tier-1 topology using the real OSPF weights which are used by the network operator.
These weights are chosen to provide a tradeoff between traffic engineering and shortest-delay routing, and resulted in average delays of 28.45 and 28.49 ms for the two tier-1 failure sets.

In sum, we observe that the objective value of state-dependent splitting very closely tracks the optimal objective. For this reason, this solution is our favorite. Although state-independent splitting has a somewhat worse performance, especially as the network load increases beyond current levels, it could be attractive due to its simplicity.

VI. RELATED WORK

Most of the related work considers either failure recovery or traffic engineering alone. Traffic engineering without failure recovery in the context of MPLS is studied in [14]-[18]. [14] utilizes traffic splitting to minimize end-to-end delay and loss rate; however, an algorithm for optimal path selection is not provided. [15] and [16] minimize the maximum link utilization while satisfying the requested traffic demands. [17] and [18] avoid network congestion by adaptively balancing the load among multiple paths based on measurement and analysis of path congestion.

Local and global path protection in MPLS has been a fruitful area of research. In local protection the backup path takes the shortest path that avoids the outage location, from a point of local repair to the tail-end router or to the merge point with the primary path. The IETF RFC 4090 [6] focuses on defining signaling extensions to establish the backup paths, but leaves the issues of bandwidth reservation and optimal route selection open. In [1] the shortest path that avoids the failure is used, and [19] and [20] attempt to find an optimal backup path with the goal of reducing network overbuild. While these proposals achieve certain success in reducing the network overbuild, local protection is necessarily less effective at reducing overbuild than global protection because it does not allow proper load balancing on end-to-end paths.

Global path protection in MPLS allows rerouting on end-to-end paths, as is outlined in IETF RFC 3469 [7]. Work that describes how to manage restoration bandwidth and select optimal paths includes [21], [22] and [23]. While our solution also uses global protection to reroute around failures, the biggest difference is that most of the related work distinguishes primary and backup paths and only uses a backup path when the primary path fails. In contrast, our solution balances the load across many paths even before failures occur. The only attempts to integrate failure recovery and load balancing across multiple paths either only use alternate paths when primary routes do not work [24], or they require explicit congestion feedback from the network and do not provide algorithms to find the optimal set of paths [25], [26].

Computational complexity results for the optimization problems related to failure recovery are of great interest both to network algorithm designers and to the theory community. NP-completeness of optimization problems with failure recovery has been studied, e.g., in [27] and [28].

VII. CONCLUSION

In this paper we propose a mechanism that combines path protection and traffic engineering to enable reliable data delivery in the presence of link failures. We formalize the problem by providing several optimization-theoretic formulations that differ in the capabilities they require of the network routers. For each of the formulations, we present algorithms and heuristics that allow the network operator to find a set of optimal end-to-end paths and load balancing rules. Our extensive simulations on the IP backbone of a tier-1 ISP and on a range of synthetic topologies demonstrate the attractive properties of our solution. First, state-dependent splitting achieves load balancing performance close to the theoretical optimum, while state-independent splitting often offers comparable performance and a very simple setup. Second, using our solution does not significantly increase propagation delay compared to the shortest path routing of OSPF. We are currently extending our simulations to include a range of measured traffic matrices, and to evaluate the solution on a realistic data-center topology. In addition to the failure resilience and favorable traffic engineering properties which we demonstrate, our architecture has the potential to simplify router design and reduce operation costs for ISPs as well as operators of data centers and enterprise networks.

REFERENCES

[1] J.-P. Vasseur, M. Pickavet, and P. Demeester, Network Recovery: Protection and Restoration of Optical, SONET-SDH, IP, and MPLS, pp. 397-422. San Francisco, CA: Morgan Kaufmann Publishers Inc., 2004.
[2] E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol label switching architecture," 2001. IETF RFC 3031.
[3] "JUNOS: MPLS fast reroute solutions, network operations guide," 2007.
[4] E. Osborne and A. Simha, Traffic Engineering with MPLS. Indianapolis, IN: Cisco Press, 2002.
[5] I. P. Kaminow and T. L. Koch, The Optical Fiber Telecommunications IIIA. New York: Academic Press, 1997.
[6] P. Pan, G. Swallow, and A. Atlas, "Fast reroute extensions to RSVP-TE for LSP tunnels," 2005. IETF RFC 4090.
[7] V. Sharma and F. Hellstrand, "Framework for multi-protocol label switching (MPLS)-based recovery," 2003. IETF RFC 3469.
[8] A. Greenberg et al., "VL2: A scalable and flexible data center network," in Proceedings of ACM SIGCOMM, 2009. To appear.
[9] D. Katz and D. Ward, "Bidirectional forwarding detection." IETF Internet Draft, February 2009.
[10] A. Markopoulou, G. Iannaccone, S. Bhattacharyya, C.-N. Chuah, Y. Ganjali, and C. Diot, "Characterization of failures in an operational IP backbone network," IEEE/ACM Trans. Netw., vol. 16, no. 4, pp. 749-762, 2008.
[11] B. Fortz and M. Thorup, "Increasing Internet capacity using local search," Computational Optimization and Applications, vol. 29, no. 1, pp. 13-48, 2004.
[12] E. W. Zegura, "GT-ITM: Georgia Tech internetwork topology models (software)," 1996.
[13] B. Fortz and M. Thorup, "Optimizing OSPF/IS-IS weights in a changing world," IEEE Journal on Selected Areas in Communications, vol. 20, pp. 756-767, May 2002.
[14] E. Dinan, D. Awduche, and B. Jabbari, "Analytical framework for dynamic traffic partitioning in MPLS networks," in IEEE International Conference on Communications, vol. 3, pp. 1604-1608, 2000.
[15] Y. Seok, Y. Lee, Y. Choi, and C. Kim, "Dynamic constrained multipath routing for MPLS networks," in International Conference on Computer Communications and Networks, pp. 348-353, 2001.
[16] Y. Lee, Y. Seok, Y. Choi, and C. Kim, "A constrained multipath traffic engineering scheme for MPLS networks," in IEEE International Conference on Communications, vol. 4, pp. 2431-2436, 2002.
[17] A. Elwalid, C. Jin, S. Low, and I. Widjaja, "MATE: MPLS adaptive traffic engineering," in Proceedings of INFOCOM, vol. 3, pp. 1300-1309, 2001.
[18] J. Wang, S. Patek, H. Wang, and J. Liebeherr, "Traffic engineering with AIMD in MPLS networks," in IEEE International Workshop on Protocols for High Speed Networks, pp. 192-210, 2002.
[19] H. Saito and M. Yoshida, "An optimal recovery LSP assignment scheme for MPLS fast reroute," in International Telecommunication Network Strategy and Planning Symposium (Networks), pp. 229-234, 2002.
[20] D. Wang and G. Li, "Efficient distributed bandwidth management for MPLS fast reroute," IEEE/ACM Trans. Netw., vol. 16, no. 2, pp. 486-495, 2008.

[21] M. Kodialam and T. V. Lakshman, "Dynamic routing of restorable bandwidth-guaranteed tunnels using aggregated network resource usage information," IEEE/ACM Trans. Netw., vol. 11, no. 3, pp. 399-410, 2003.
[22] G. Li, D. Wang, C. Kalmanek, and R. Doverspike, "Efficient distributed restoration path selection for shared mesh restoration," IEEE/ACM Trans. Netw., vol. 11, no. 5, pp. 761-771, 2003.
[23] Y. Liu, D. Tipper, and P. Siripongwutikorn, "Approximating optimal spare capacity allocation by successive survivable routing," IEEE/ACM Trans. Netw., vol. 13, no. 1, pp. 198-211, 2005.
[24] H. Saito, Y. Miyao, and M. Yoshida, "Traffic engineering using multiple multipoint-to-point LSPs," in Proceedings of INFOCOM, vol. 2, pp. 894-901, 2000.
[25] B. A. Movsichoff, C. M. Lagoa, and H. Che, "End-to-end optimal algorithms for integrated QoS, traffic engineering, and failure recovery," IEEE/ACM Trans. Netw., vol. 15, pp. 813-823, Nov. 2007.
[26] C. M. Lagoa, H. Che, and B. A. Movsichoff, "Adaptive control algorithms for decentralized optimal traffic engineering in the Internet," IEEE/ACM Trans. Netw., vol. 12, no. 3, pp. 415-428, 2004.
[27] A. Tomaszewski, M. Pioro, and M. Zotkiewicz, "On the complexity of resilient network design," Networks, 2009 (in press).
[28] D. Coudert, P. Datta, S. Perennes, H. Rivano, and M.-E. Voge, "Shared risk resource group: Complexity and approximability issues," Parallel Processing Letters, vol. 17, no. 2, pp. 169-184, 2007.
[29] S. Fortune, J. Hopcroft, and J. Wyllie, "The directed subgraph homeomorphism problem," Theor. Comput. Sci., vol. 10, no. 2, pp. 111-121, 1980.

APPENDIX

In this Appendix, we show that two problems are NP-hard:

FAILURE STATE DISTINGUISHING
INSTANCE: A directed graph G = (V, E), source and destination vertices u, v ∈ V, and two sets s1, s2 ⊆ E.
QUESTION: Is there a simple directed path P from u to v that contains edges from one and only one of the sets s1 and s2?

BOUNDED PATH LOAD BALANCING
INSTANCE: A directed graph G = (V, E) with a positive rational capacity c_e for each edge e ∈ E, a collection S of subsets s ⊆ E of failure states with a rational weight w_s for each s ∈ S, a set of triples (u_d, v_d, h_d), 1 ≤ d ≤ k, corresponding to demands, where h_d units of demand d need to be sent from source vertex u_d ∈ V to destination vertex v_d ∈ V, an integer bound J on the number of paths that can be used between any source-destination pair, a piecewise-linear increasing cost function Φ(l) mapping edge loads l to rationals, and an overall cost bound B.
QUESTION: Are there J (or fewer) paths between each source-destination pair such that the given demands can be partitioned between the paths in such a way that the total cost (the sum of Φ(l) over all edges and weighted failure states, as described in the text) is B or less?

To prove that a problem X is NP-hard, we must show that for some known NP-hard problem Y, any instance y of Y can be transformed into an instance x of X in polynomial time, with the property that the answer for y is yes if and only if the answer for x is yes. Both our problems can be proved NP-hard by transformation from the following problem, proved NP-hard by Fortune, Hopcroft, and Wyllie [29].

DISJOINT DIRECTED PATHS
INSTANCE: A directed graph G(V, E) and distinguished vertices u1, v1, u2, v2 ∈ V.
QUESTION: Are there directed paths P1 from u1 to v1 and P2 from u2 to v2 such that P1 and P2 are vertex-disjoint?

Theorem 1: The FAILURE STATE DISTINGUISHING problem is NP-hard.

Proof. Suppose we are given an instance G = (V, E), u1, v1, u2, v2 of DISJOINT DIRECTED PATHS. Our constructed instance of FAILURE STATE DISTINGUISHING consists of the graph G' = (V, E'), where E' = E ∪ {(v1, u2)}, with u = u1, v = v2, s1 = ∅, and s2 = {(v1, u2)}. Given this choice of s1 and s2, a simple directed path from u to v that distinguishes the two states must contain the edge (v1, u2). We claim that such a path exists if and only if there are vertex-disjoint directed paths P1 from u1 to v1 and P2 from u2 to v2. Suppose a distinguishing path P exists. Then it must consist of three segments: a path P1 from u = u1 to v1, the edge (v1, u2), and then a path P2 from u2 to v = v2. Since it is a simple path, P1 and P2 must be vertex-disjoint. Conversely, if vertex-disjoint paths P1 from u1 to v1 and P2 from u2 to v2 exist, then the path P that concatenates P1, followed by (v1, u2), followed by P2 is our desired distinguishing path.

Theorem 2: The BOUNDED PATH LOAD BALANCING problem is NP-hard even if there are only two commodities (k = 2), only one path is allowed for each (J = 1), and there is only one failure state.

Proof. For this result we use the variant of DISJOINT DIRECTED PATHS in which we ask for edge-disjoint rather than vertex-disjoint paths. The NP-hardness of this variant is easy to prove, using a construction in which each vertex x of G is replaced by a pair of new vertices in_x and out_x, and each edge (x, y) is replaced by the edge (out_x, in_y). Suppose we are given an instance G = (V, E), u1, v1, u2, v2 of the edge-disjoint variant of DISJOINT DIRECTED PATHS. Our constructed instance of BOUNDED PATH LOAD BALANCING is based on the same graph, with each edge e given capacity c_e = 1, with the single failure state s = ∅ (i.e., the state with no failures), with w_s = 1, and with the demands represented by the triples (u1, v1, 1) and (u2, v2, 1). The cost function Φ has derivative Φ'(l) = 1 for 0 ≤ l ≤ 1, and Φ'(l) = |E| for l > 1. Our target overall cost bound is B = |E|. Note that if the desired disjoint paths exist, then we can use P1 to send the required unit of traffic from u1 to v1, and P2 to send the required unit of traffic from u2 to v2. Since the paths are edge-disjoint, no edge will carry more than one unit of traffic, the cost per edge used will be 1, and the total number of edges used can be at most |E|. Thus the specified cost bound B = |E| can be met. On the other hand, if no such pair of paths exists, then we must choose paths P1 and P2 that share at least one edge, which will carry two units of flow, for an overall cost of at least |E| + 1, just for that edge. Thus if there is a solution with cost |E| or less, the desired disjoint paths must exist. It is not difficult to see that adding more paths, failure states, or commodities cannot make the problem easier. Note, however, that this does not imply that the problem for the precise cost function Φ presented in the text is NP-hard. It does, however, mean that, assuming P ≠ NP, any efficient algorithm for that Φ would have to exploit the particular features of that function.
We claim that uch a path exit if and nly if there are vertex-dijint directed path P 1 frm u 1 t v 1 and P 2 frm u 2 t v 2. Suppe a ditinguihing path P exit. Then it mut cnit f f three egment: a path P 1 frm u = u 1 t v 1, the edge (v 1, u 2 ), and then a path P 2 frm u 2 t v = v 2. Since it i a imple path, P 1 and P 2 mut be vertex-dijint. Cnverely, if vertex-dijint path P 1 frm u 1 t v 1 and P 2 frm u 2 t v 2 exit, then the path P that cncatenate P 1 fllwed by (v 1, u 2 ) fllwed by P 2 i ur deired ditinguihing path. Therem 2: The BOUNDED PATH LOAD BALANCING prblem i NP-hard even if there are nly tw cmmditie (k = 2), nly ne path i allwed fr each (J = 1), and there i nly ne failure tate. Prf. Fr thi reult we ue the variant f DISJOINT DIRECTED PATHS in which we ak fr edge-dijint rather than vertex-dijint path. The NP-hardne f thi variant i eay t prve, uing a cntructin in which each vertex x f G i replaced by a pair f new vertice in x and ut x, and each edge (x, y) i replaced by the edge (ut x, in y ). Suppe we are given an intance G = (V, E), u 1, v 1, u 2, v 2 f the edge-dijint variant f DISJOINT DIRECTED PATHS. Our cntructed intance f BOUNDED PATH LOAD BALANC- ING i baed n the ame graph, with each edge e given capacity c e = 1, with the ingle failure tate = φ (i.e., the tate with n failure), with w = 1, and with demand repreented by the triple (u 1, v 1, 1) and (u 2, v 2, 1). The ct functin Φ ha derivative Φ (l) = 1, 0 l 1, and Φ (l) = E, l > 1. Our target verall ct bund i B = E. Nte that if the deired dijint path exit, then we can ue P 1 t end the required unit f traffic frm u 1 t v 1, and P 2 t end the required unit f traffic frm u 2 t v 2. Since the path are edge-dijint, n edge will carry mre than ne unit f traffic, the ct per edge ued will be 1, and the ttal number f edge ued can be at mt E. Thu the pecified ct bund B = E can be met. On the ther hand, if n uch pair f path exit, then we mut che path P 1 and P 2 that hare at leat ne edge, which will carry tw unit f flw, fr an verall ct f at leat E + 1, jut fr that edge. Thu if there i a lutin with ct E r le, the deired dijint path mut exit. It i nt difficult t ee that adding mre path, failure tate, r cmmditie cannt make the prblem eaier. Nte, hwever, that thi de nt imply that the prblem fr the precie ct functin Φ preented in the text i NP-hard. It de, hwever, mean that, auming P NP, any efficient algrithm fr that Φ wuld have t explit the particular feature f that functin.