Towards Realizing a Low Cost and Highly Available Datacenter Power Infrastructure




Sriram Govindan, Di Wang, Lydia Chen, Anand Sivasubramaniam, and Bhuvan Urgaonkar
The Pennsylvania State University. IBM Research Zurich
{sgovinda,diw5108}@cse.psu.edu, yic@zurich.ibm.com, {anand,bhuvan}@cse.psu.edu

Abstract. Realizing highly available datacenter power infrastructure is an extremely expensive proposition with costs more than doubling as we move from three 9's (Tier-1) to six 9's (Tier-4) of availability. Existing approaches only consider the cost/availability trade-off for a restricted set of power infrastructure configurations, relying mainly on component redundancy. A number of additional knobs such as centralized vs. distributed component placement and power-feed interconnect topology also exist, whose impact has only been studied in limited forms. In this paper, we develop detailed datacenter availability models using Continuous-time Markov Chains and Reliability Block Diagrams to quantify the cost-availability trade-off offered by these power infrastructure knobs.

1. INTRODUCTION AND MOTIVATION

It is now widely recognized that power consumption of datacenters is a serious and growing problem from the cost, scalability and eco-footprint viewpoints. EPA has projected the electricity cost of powering the nation's datacenters at $7.4 billion for 2011. Over and beyond the electricity consumption, power also plays a dominant role in capital expenditures for provisioning the infrastructure to sustain the peak draw by the datacenter. For instance, provisioning the power infrastructure for a 10 MW datacenter costs around $150 Million [3, 6]; in fact, amortizing this monthly overshadows the electricity bill. A root cause for this high cost in the power infrastructure is the necessity of providing redundancy in case of any failures in order to ensure uninterrupted operation of the IT equipment (IT equipment includes servers, storage and network devices).
Table 1 illustrates the power infrastructure cost (shown on a per rack basis) increase as we progressively move from a basic Tier-1 datacenter (with little/no redundancy) to a highly redundant Tier-4 datacenter, where the cost more than doubles (source: [2]). The goal of this paper is to understand and analyze the cost ramifications of power infrastructure availability, and use this understanding to answer the question "can we attain the availability of a higher Tier datacenter at a substantially lower cost?" Before getting to the IT equipment, power flows through several infrastructural components each serving a different purpose. These components include sources (e.g. diesel generators), batteries/UPS units, and transformers. The traditional approach of replicating these expensive components to provide high availability amplifies the cost substantially. It is not clear whether this is the most cost-effective way of realizing the required availability target. Instead, it is important to be able to systematically define and analyze different power infrastructure configurations to quantify the cost-availability trade-offs.

Tier #    Availability    Cost/Rack
Tier-1    0.999200        $18000
Tier-2    0.999300        $24000
Tier-3    0.999989        $30000
Tier-4    0.999999        $42000

Table 1: Datacenter power infrastructure cost more than doubles while transitioning from a low availability datacenter to a highly available datacenter.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. HotPower '11, October 23, 2011, Cascais, Portugal. Copyright 2011 ACM 978-1-4503-0981-3/11/10...$5.00.

Apart from the conventional approach of introducing redundancy in the components, there are 3 main mechanisms/knobs for configuring the power infrastructure, each of which has an impact on the cost, complexity and resulting availability.
The first consideration is the issue of where in the power infrastructure hierarchy to place each of these components. For instance, most existing power infrastructure uses centralized UPS units, while there are datacenters (such as those at Google [5]) which choose to place UPS units at each server instead. The second consideration is related to the capacity of these components: a few of high capacity, or many with lower capacity? Finally, the connectivity between successive stages of the hierarchy can also have a crucial effect on cost-availability trade-off (e.g. [9]). Further, each of these knobs is not independent, and can interact in complex ways with the others, having a further consequence on availability. What are the power infrastructure parameters that impact availability? By how much? At what cost? Can we come up with a power infrastructure blueprint to meet the availability targets?

Figure 1: (a) Centralized UPS with 1+1 redundancy. PDUs are connected to their rack clusters using the one-one power-feed topology. (b) Distributed UPS at PDU level. UPS to rack cluster is connected using the wrapped topology.
[Both panels show: Utility, Diesel Generator, ATS, UPS, PDU, rack cluster.]

It is very important to be able to quantify the availability of a datacenter leveraging these knobs to meet the requirements in a cost-effective manner. To our knowledge, there has been no prior work on developing a set of tools to comprehensively evaluate this design space quantitatively, even though there have been a few isolated studies pointing out reliability issues with specific configurations [2, 8, 9]. In this paper, we develop detailed Markov-chain and Reliability Block Diagram (RBD) based availability models to systematically evaluate the cost-availability trade-off involved in constructing a datacenter power infrastructure using this rich set of knobs.

2. MODELING THE AVAILABILITY OF POWER INFRASTRUCTURE

2.1 Datacenter Power Hierarchy

As shown in Figure 1(a), power enters the datacenter through a utility substation which serves as its primary power source. Datacenters also employ a Diesel Generator unit (DG) which acts as a secondary backup power source upon a utility failure. An Automatic Transfer Switch (ATS) is employed to automatically switch between these two sources. Upon a utility failure, it takes about 10-20 seconds (the startup time) for the DG to get activated, before it can supply power. Uninterrupted Power Supply (UPS) units are typically employed to bridge this time gap between utility failure and DG activation.
UPS batteries typically have a runtime (reserve charge level) of about 10 minutes to power the datacenter. Datacenters typically employ double-conversion UPSes, which have zero transfer-time (unlike standby UPS) to batteries upon a utility failure. Since the UPS units are always involved in the double-conversion process (even when they are not used to power the datacenter), their failure will render the whole datacenter unavailable. Power from the UPS units is fed to several Power Distribution Units (PDUs) which route power to several racks that host IT equipment. We refer to the set of racks associated with a given PDU as a rack cluster in Figure 1. The power infrastructure with all these components is often viewed as a hierarchy of different levels, e.g., in Figure 1(a), utility and DG form the topmost level, ATS forms the next level, the 2 UPS units form the third level, the 4 PDUs form the fourth level and finally, the rack clusters form the last level. Failure of one or more of these power infrastructure components may result in power unavailability to the IT equipment. A number of factors impact the availability of power infrastructure components, which we discuss below.

Component Redundancy: Redundancy in the power infrastructure components is typically incorporated to tolerate one or more failures. Let N denote the number of components that are required at a particular level of the power hierarchy to sustain the overall datacenter power load. Then N+M denotes the redundancy configuration where M component failures can be tolerated out of a total of N+M components.

UPS Placement: Placement of UPS units in the power hierarchy can have a significant impact on datacenter availability. Centralized UPS units are connected using a parallel bus and are placed above the PDUs as shown in Figure 1(a). In this paper, we also consider a variety of distributed UPS placements. Figure 1(b) shows a PDU-level distributed UPS placement. Similarly, rack-level and server-level UPS placements employ UPS units at each rack and each server, respectively.
Power-feed topology: The connectivity configuration between components placed in two consecutive levels of the power hierarchy impacts availability. In general, dense connectivity, accompanied by large component capacity, results in improved availability. In this paper, we consider four power-feed topologies: (i) one-one (between PDU and rack cluster in Figure 1(a)), (ii) wrapped (between UPS and rack cluster in Figure 1(b)), (iii) serpentine, and (iv) fully-connected. We refer the reader to [9] for more details about these topologies.

2.2 Markov Availability Model

The key non-trivial aspect of power infrastructure availability modeling arises from certain idiosyncrasies of the interactions between utility, DG, and UPS. Apart from the steady-state failures associated with power infrastructure components (see Table 2), the UPS has a second form of failure which may happen when its battery becomes completely discharged. This can happen if the UPS gets completely discharged when powering IT in the following two scenarios: (i) both utility and DG have failed or (ii) utility has failed and DG is in the process of starting up. It can be seen that these special failure scenarios related to UPS discharge are conditional on the utility being unavailable and the DG either failing or taking too long to start up. Additionally, the amount of time the UPS can power IT under these scenarios depends on the amount of charge in its battery, indicated by the battery runtime (in minutes). This suggests that a modeling technique to capture the impact of these interactions on overall availability should remember utility/DG failures and UPS charge level. Continuous-time Markov Chains (MC) fit this requirement and have been used in some existing research [2]. We consider a continuous-time MC-based model for a power infrastructure with one utility, one DG unit, and N identical UPS units b_1, ..., b_N (b for battery). We assume exponential failure and recovery processes for utility, DG, and UPS units, with a failure rate and a recovery rate for each (f_u and f_b denote the utility and UPS failure rates; see Table 2). We also assume exponentially distributed rates for DG startup (s_d), for battery discharging, and for battery charging.

Figure 2: Continuous-time Markov Chain captures the interaction between Utility, DG and UPS units. Shown is the transition between (i+1) active UPS units and i active UPS units. The failure and recovery rates of UPS units are presented only for the states in the top row for clarity, but those transitions exist in all the lower rows as well.
[State diagram: states are labeled u,d,(n,c); rows correspond to (u,d) = (1,2), (0,2), (0,1), (0,0), (1,0) and columns to UPS states (i+1,1), (i+1,0), (i,1), (i,0).]

Component    Failure rate (per hour)    Recovery rate (per hour)
Utility      3.89E-03                   30.48
DG           1.03E-04                   0.25
UPS          3.64E-05                   0.12
ATS          9.79E-06                   0.17
PDU          1.80E-06                   0.016

Table 2: Failure rates (f_x) and recovery rates of power infrastructure components. The rates presented indicate failure/recovery events per hour.
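As a rough illustration of what the exponential failure/recovery assumption implies for a single component viewed in isolation, the sketch below treats each component as a two-state (up/down) chain and evaluates the standard steady-state availability, that is, the recovery rate divided by the sum of the failure and recovery rates, using the Table 2 values. The names `RATES_PER_HOUR`, `steady_state_availability` and `isolated_availabilities` are chosen here for illustration. The sketch deliberately ignores the utility/DG/UPS interactions and the battery discharge that the full Markov model is meant to capture.

```python
# Illustrative names; the numbers are the per-hour rates listed in Table 2.
RATES_PER_HOUR = {
    # component: (failure rate, recovery rate)
    "Utility": (3.89e-03, 30.48),
    "DG": (1.03e-04, 0.25),
    "UPS": (3.64e-05, 0.12),
    "ATS": (9.79e-06, 0.17),
    "PDU": (1.80e-06, 0.016),
}

HOURS_PER_YEAR = 8760


def steady_state_availability(failure_rate: float, recovery_rate: float) -> float:
    """Long-run fraction of time a two-state (up/down) component is up."""
    return recovery_rate / (failure_rate + recovery_rate)


def isolated_availabilities() -> dict:
    """Availability of each component, assuming it fails and recovers on its own."""
    return {
        name: steady_state_availability(f, r)
        for name, (f, r) in RATES_PER_HOUR.items()
    }


if __name__ == "__main__":
    for name, a in isolated_availabilities().items():
        downtime = (1.0 - a) * HOURS_PER_YEAR
        print(f"{name:8s} availability={a:.6f}  expected downtime={downtime:.2f} h/yr")
```

These per-component figures are only building blocks: the conditional UPS discharge failure described above cannot be expressed by multiplying them together, which is the motivation for the chain in Figure 2.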
The states within our MC are 3-tuples of the form {u,d,b}, with u, d, b representing states of the utility, DG, and UPS units, respectively. u ∈ {0,1}: 0 means utility failed and 1 means utility available and powering. d ∈ {0,1,2}: 0 means DG failed, 1 means DG available and powering, and 2 means DG available but not powering. Finally, b ∈ {(n, 0/1)} denotes the state of the UPS units and their batteries: 0 ≤ n ≤ N denotes the number of available UPS units, while 0 and 1 represent whether these n UPS units are fully discharged or charged, respectively. For clarity, we present our Markov states only for two battery charge states, either fully charged or fully discharged, though our actual model considers discrete charge states (one state for every one minute of battery runtime). Figure 2 presents the transitions among states corresponding to (i + 1) and i available UPS units (0 ≤ i ≤ N − 1). States in a given column all possess the same number of available UPS units and battery charge (with utility and DG states varying), while states within a given row all possess the same utility and DG states (with UPS states varying). Consequently, transitions among states within a given column capture failure/recovery of utility and DG, whereas those among states within a given row capture events pertaining to failure/recovery and charge/discharge of UPS units. The transitions between the states are self-explanatory and details are omitted for space. We combine the availability of Utility-DG-UPS units obtained using the above Markov model with that of the PDU and ATS units using simple Reliability Block Diagrams (RBDs). We obtain failure and recovery rates of the power infrastructure components from the IEEE Gold Book [4] and present those in Table 2.

3. EVALUATION

We consider a 4MW datacenter with 32 PDUs, 256 racks and 8192 servers for our evaluation. We vary the number and capacity of UPS units depending on their placement within the power hierarchy. We also vary the capacity of UPS and PDU units depending on the over-provisioning capacity associated with the power-feed topology [9].
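One common reading of the RBD step mentioned above is that blocks which must all be up at the same time are placed in series, so that their availabilities multiply (assuming independent failures). The sketch below applies that series rule to one ATS followed by the 32 PDUs of the evaluated datacenter, with per-component availabilities derived from the Table 2 rates under a two-state assumption. It illustrates the series rule only and is not the full model used in the paper; `series_availability`, `ats_and_pdus_in_series` and the constants are names introduced here.

```python
import math

# Per-hour (failure rate, recovery rate) pairs taken from Table 2.
ATS_RATES = (9.79e-06, 0.17)
PDU_RATES = (1.80e-06, 0.016)
NUM_PDUS = 32  # PDUs in the evaluated 4MW datacenter


def availability(failure_rate: float, recovery_rate: float) -> float:
    """Steady-state availability of an isolated two-state component."""
    return recovery_rate / (failure_rate + recovery_rate)


def series_availability(block_availabilities) -> float:
    """Series RBD: every block must be up, so availabilities multiply."""
    return math.prod(block_availabilities)


def ats_and_pdus_in_series(num_pdus: int = NUM_PDUS) -> float:
    """Availability of one ATS followed by num_pdus PDUs that are all required."""
    blocks = [availability(*ATS_RATES)] + [availability(*PDU_RATES)] * num_pdus
    return series_availability(blocks)


if __name__ == "__main__":
    print(f"ATS + {NUM_PDUS} PDUs in series: {ats_and_pdus_in_series():.6f}")
```

With these inputs the series block evaluates to roughly 0.996, so requiring every PDU to be up already limits this part of the hierarchy to two leading nines.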
Only select UPS models allow for higher capacity than 512KW; those that exist offer only 1MW, and their cost numbers are not known for us to make a useful comparison [1]. Consequently, we assume any larger UPS capacity to be obtained using multiple 512KW UPS units connected to the same parallel bus. Since the UPS and PDU subsystem constitutes a significant portion of overall power infrastructure costs [2, 7], we only consider the cost of these two components for our evaluation (cost numbers obtained from the APC website are presented in Figure 3). We use the prevalent notation of representing availability as the number of leading nines, e.g., 0.999193 would simply be referred to as three 9's of availability.
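The leading-nines notation can be made concrete with a small helper. The sketch below counts the 9 digits that immediately follow the decimal point; it works on the decimal text of the number rather than on a logarithm so that binary floating-point rounding (for example on 0.999) does not shift the count. The function name `leading_nines` is an illustrative choice.

```python
from decimal import Decimal


def leading_nines(availability) -> int:
    """Count the 9s that directly follow the decimal point of an availability."""
    value = Decimal(str(availability))
    if not (Decimal(0) <= value < Decimal(1)):
        raise ValueError("availability must be in the range [0, 1)")
    # Fixed-point text such as "0.999193"; keep only the fractional digits.
    digits = format(value, "f").partition(".")[2]
    return len(digits) - len(digits.lstrip("9"))


if __name__ == "__main__":
    for a in (0.999193, 0.999200, 0.999999, 0.5):
        print(a, "->", leading_nines(a), "nines")
```

Applied to the example in the text, `leading_nines(0.999193)` returns 3.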

Figure 3: Cost of UPS and PDU units for different capacities that we explore (y-axis is in log scale). Cost per unit capacity ($/W) is much lower for server-level UPS (0.5 KW) units compared to centralized UPSes (512 KW).
[Plot: UPS cost ($) and PDU cost ($) versus Capacity (KW).]

Configuration                Cost (Million $)    # of 9's of availability
Cent. N, 1-1 PDU             3.42                2
Cent. N+1, 1-1 PDU           3.72                2
Cent. N+1, wrapped PDU       4.04                5
Cent. N+2, wrapped PDU       4.34                6
Cent. 2N, wrapped PDU        5.82                6

Table 3: Cost and availability of different centralized UPS configurations. While the $ per 9 for scaling from two 9's to five 9's is just $100000, the incremental cost for scaling from five 9's to six 9's becomes $300000.

3.1 Availability/Cost for Centralized UPS

In this section, we discuss the cost-availability trade-offs associated with different centralized UPS configurations by varying two knobs. The first knob is the number of UPS units connected to the parallel bus. The second knob is the power-feed topology (we only need to consider the topology connecting PDUs to rack clusters since the centralized UPS units are connected to PDUs via a parallel bus). Table 3 presents our results. The configuration shown as "Cent. N, 1-1 PDU" represents Tier-1 datacenters [10] (see footnote 1) with no redundancy in its UPS/PDU levels. Its UPS level consists of 8 units of 512KW each for a total of 4MW and the topology connecting PDU and rack cluster is one-one. This configuration offers only two 9's of availability, since it requires all 32 PDUs and 8 UPS units to be available. Using our first knob of adding redundancy at the UPS level, we obtain the configuration "Cent. N+1, 1-1 PDU" with (8+1) UPS units of 512KW each. The availability is still bottlenecked by the PDU level with just two nines since it requires all 32 PDUs to be available. Our second knob helps address this. Using the wrapped topology between PDU and rack cluster (indicated as "Cent. N+1, wrapped PDU" in Table 3), the availability increases from two to five 9's. This configuration corresponds to that employed in many of today's Tier-2 data centers. Next, we consider how the availability can be improved beyond five 9's. For this, we investigate the effect of using the first knob to increase UPS redundancy to N+2 and 2N, while keeping the wrapped topology at PDU level. We find that both N+2 and 2N achieve six 9's. The table also suggests any redundancy beyond N+2 becomes unnecessary. It is interesting to note that while increasing availability from (Tier-1) two 9's to (Tier-2) five 9's incurs only a small incremental cost ($100000 per 9 between two and five 9's), further improvements involve significant investments ($300000 between five and six 9's). We also find that providing dense interconnect at the PDU level (serpentine and fully-connected) does not result in any further improvement in availability and therefore we assume wrapped at the PDU-level throughout this section.

Key insights: (i) Wrapped PDU suffices (achieves five to six 9's); (ii) N+1 is good enough for centralized UPS (five 9's at a small additional cost).

Footnote 1: Note that the availability numbers we report for the different Tier configurations are specific to our datacenter size and can vary widely across different datacenter sizes.

3.2 Availability/Cost for Distributed UPS

Configuration                Cost (Million $)    # of 9's of availability
Cent. Tier-2                 4.04                5
Dist. Server-level           2.57                0
Dist. 2N Server-level        3.8                 3
Dist. Rack-level             4.40                1
Dist. PDU-level              4.50                2
Hybrid                       2.68                6

Table 4: Cost and availability for different distributed UPS placement configurations. Our hybrid scheme that employs a combination of server-level UPSes and three extra UPSes per rack achieves the availability of six 9's at 33% lower cost than centralized UPS.

[Diagram labels: wrapped PDU topology with PDU 1 and PDU 2; rack-level transfer switch (allows servers to tolerate one PDU failure); rack-level UPS as secondary power source; server-level UPS as primary power source; redundant server power feeds; server switches to the secondary power source upon failure of the server-level UPS.]
Figure 4: Illustration of our hybrid UPS placement. It has one UPS per server and a rack-level UPS module, which is shown to have three UPS units connected to a parallel bus.
This hybrid configuration can tolerate failure of at most three server-level UPSes within each rack.

In this section, we study availability offered by distributed UPS placements and compare it with that of the centralized N+1 (Tier-2) configuration discussed above, denoted as "Cent. Tier-2" in Table 4. We see that as we move from centralized to PDU-level to rack-level to server-level, the availability decreases. This is due to the increase in the number of UPS components, 8+1 (centralized), 32 (PDU-level), 256 (rack-level), and 8192 (server-level), with an accompanying increase in the probability of at least one UPS unit failing. This is evident for the configuration with 8192 UPS units (labeled "Dist. Server-level") which has only zero 9's, due to a relatively high probability (about 0.9) of at least one UPS unit being down at a given time. The configuration with 2 UPSes per server (labeled "Dist. 2N Server-level") increases the availability to three 9's. Table 4 shows that distributed server-level UPS placement (even with 2N redundancy) is much cheaper (36% lower cost) than Tier-2 centralized UPS but has poor availability. The table also shows that PDU-level and rack-level UPS placements are undesirable from both cost and availability dimensions. We also find that incorporating power-feed connectivity (wrapped and others) at the UPS level, though it increases availability, does so with significant cost additions, making it less likely to be adopted in practice.

Based on the insights gained from the above analysis, we now propose hybrid UPS placement schemes that combine the high availability offered by centralized UPS placement with the cheaper cost offered by server-level UPS placement. These hybrid schemes, in addition to the UPS unit at each server (as in "Dist. Server-level"), add one or more UPS units per rack with capacity same as that of server-level UPSes. We find that placing three such additional UPS units per rack exceeds the availability of the "Cent. Tier-2" configuration (see Table 4). Our proposed hybrid configuration is shown in Figure 4. Each server has one of its dual power feeds connected to its local UPS unit, and the other connected to rack-level UPS units through a parallel bus. During normal operation, each server draws power through its local UPS. Upon a failure of its local UPS, a server starts to draw power from the rack-level UPS units. Both the server-level and rack-level UPS units are connected to both the PDUs (wrapped PDU shown) through a rack transfer switch. Other desirable levels of availability may then be obtained by varying the number of these redundant rack-level UPS units.
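The "about 0.9" probability quoted for the server-level configuration can be checked with a back-of-the-envelope calculation, and the same arithmetic hints at why a few rack-level spares help so much. The sketch below is a simplification made here, not the paper's model: it assumes independent UPS units in steady state with the Table 2 UPS rates, counts only UPS unit outages (no utility, DG, battery, ATS or PDU effects), takes 32 servers per rack from 8192/256, and treats the three rack-level units as interchangeable spares so that a rack is a "32 needed out of 35" group. All function and constant names are illustrative.

```python
import math

# Table 2 UPS rates (events per hour) and the evaluated datacenter size.
UPS_FAILURE_RATE = 3.64e-05
UPS_RECOVERY_RATE = 0.12
NUM_SERVERS = 8192
NUM_RACKS = 256
SERVERS_PER_RACK = NUM_SERVERS // NUM_RACKS  # 32
SPARES_PER_RACK = 3  # extra rack-level UPS units in the hybrid scheme


def ups_unavailability() -> float:
    """Steady-state probability that one isolated UPS unit is down."""
    return UPS_FAILURE_RATE / (UPS_FAILURE_RATE + UPS_RECOVERY_RATE)


def prob_any_down(n: int, q: float) -> float:
    """P(at least one of n independent units is down) = 1 - (1 - q)^n."""
    return -math.expm1(n * math.log1p(-q))


def prob_more_than_m_down(n: int, m: int, q: float) -> float:
    """Binomial tail: P(more than m of n independent units are down)."""
    return sum(
        math.comb(n, j) * q**j * (1.0 - q) ** (n - j) for j in range(m + 1, n + 1)
    )


def hybrid_unavailability() -> float:
    """P(some rack has more UPS units down than its spares can cover)."""
    q = ups_unavailability()
    per_rack = prob_more_than_m_down(
        SERVERS_PER_RACK + SPARES_PER_RACK, SPARES_PER_RACK, q
    )
    return prob_any_down(NUM_RACKS, per_rack)


if __name__ == "__main__":
    q = ups_unavailability()
    print(f"P(at least one of {NUM_SERVERS} UPS units down): {prob_any_down(NUM_SERVERS, q):.3f}")
    print(f"P(some rack exceeds {SPARES_PER_RACK} UPS outages): {hybrid_unavailability():.2e}")
```

Under these assumptions the first quantity comes out a little above 0.9, in line with the figure quoted above, while the second is on the order of 1e-7. The second number reflects UPS unit outages only and should not be read as the overall availability of the hybrid design.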
Key insights: Hybrid schemes that combine the high availability of centralized UPS and lower cost of distributed server-level UPS allow significantly better availability/cost trade-offs than existing configurations.

4. DISCUSSION AND FUTURE WORK

We have developed a systematic Markov/RBD based availability model to evaluate the cost-availability trade-off associated with different datacenter power hierarchy configurations. We have shown that a hybrid UPS technique combining server-level UPS with rack-level UPS units can achieve availability as high as current centralized UPS placement at just two-thirds of its cost. There are several interesting directions for future work.

Most existing work on datacenter availability focuses solely on how likely a datacenter's power infrastructure is to support its entire IT equipment. But in general, different power infrastructure-induced failure scenarios can render varying fractions of the overall IT equipment unavailable. For example, while failure of a centralized UPS unit may result in unavailability of the entire IT equipment, failure of distributed server/rack/PDU-level UPS units results only in partial IT equipment failure. In fact, datacenters willing to tolerate a few IT equipment failures can be constructed with much lower cost. Using our availability model, we find that server-level UPS placement ("Dist. Server-level"), which is much cheaper than conventional centralized UPS placement, achieves 6 nines of availability for handling 99% of the IT load (compare it with 0 nines at 100% IT load in Table 4). Interesting workload placement strategies can be developed to leverage such fractional IT availability: (i) PDU failures in Figure 1(b) need not necessarily result in IT unavailability since the distributed UPS units can be leveraged to migrate (live migration takes only 1-2 minutes) the workload to the active PDUs. (ii) Load balancers can be tuned to direct client requests to the active part of the power infrastructure. Capturing such effects to have a workload and migration policy aware availability model is an important future research direction.
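The contrast between 100% and 99% of the IT load noted above can be illustrated with a binomial tail over the 8192 server-level UPS units. The sketch below is a simplified reading made here, not the paper's model: it assumes independent units in steady state with the Table 2 UPS rates and counts only UPS unit outages, leaving out the utility, DG, battery, ATS and PDU behaviour that the full model includes. `allowed_down` and `prob_more_than_k_down` are illustrative names.

```python
import math

# Table 2 UPS rates (events per hour) and the number of server-level UPS units.
UPS_FAILURE_RATE = 3.64e-05
UPS_RECOVERY_RATE = 0.12
NUM_SERVERS = 8192


def allowed_down(fraction_required: float, n: int = NUM_SERVERS) -> int:
    """How many of n servers may be unpowered while still meeting the fraction."""
    return n - math.ceil(fraction_required * n)


def prob_more_than_k_down(n: int, k: int, q: float) -> float:
    """P(more than k of n independent units are down), each down with prob. q.

    The binomial pmf is advanced with a recurrence, starting from
    P(0 down) = (1 - q)^n, so no large binomial coefficients are formed.
    """
    pmf = (1.0 - q) ** n
    ratio = q / (1.0 - q)
    tail = 0.0
    for j in range(n + 1):
        if j > k:
            tail += pmf
        if j < n:
            pmf *= (n - j) / (j + 1) * ratio
    return tail


if __name__ == "__main__":
    q = UPS_FAILURE_RATE / (UPS_FAILURE_RATE + UPS_RECOVERY_RATE)
    for fraction in (1.0, 0.99):
        k = allowed_down(fraction)
        miss = prob_more_than_k_down(NUM_SERVERS, k, q)
        print(f"load fraction {fraction:.2f}: up to {k} servers may be down, P(miss) = {miss:.3e}")
```

In this simplified view, requiring every server to be powered fails most of the time, whereas losing more than 1% of the servers to UPS unit outages alone is extremely unlikely.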
We would also like to study in detail the feasibility of distributed and hybrid server/rack-level UPS placements, especially on raised-floor real estate and associated cooling inefficiencies, as part of our future work.

5. REFERENCES

[1] APC. http://www.apc.com/products/.
[2] APC White paper 75: Comparing UPS System Design Configurations, 2008.
[3] L. A. Barroso and U. Holzle. The Datacenter as a Computer: Design of Warehouse-Scale Machines. Morgan and Claypool Publishers, 2009.
[4] Gold Book, IEEE Recommended Practice for the Design of Reliable Industrial and Commercial Power Systems, 1998.
[5] Google Servers-level Batteries. news.cnet.com/8301-1001_3-10209580-92.html.
[6] J. Hamilton. Internet-scale Service Infrastructure Efficiency, ISCA Keynote, 2009.
[7] Liebert White paper: Choosing The Right UPS for Small and Midsize Datacenters, 2004.
[8] M. Marwah, P. Maciel, A. Shah, R. Sharma, T. Christian, V. Almeida, C. Araújo, E. Souza, G. Callou, B. Silva, S. Galdino, and J. Pires. Quantifying the sustainability impact of data center availability. SIGMETRICS Perform. Eval. Rev., 2010.
[9] S. Pelley, D. Meisner, P. Zandevakili, T. F. Wenisch, and J. Underwood. Power Routing: Dynamic Power Provisioning in the Data Center. In Proceedings of the Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2010.
[10] K. G. B. W. P. Turner and J. H. Seader. Tier Classifications Define Site Infrastructure Performance, 2008.