A SOFTWARE RELIABILITY MODEL FOR CLOUD-BASED SOFTWARE REJUVENATION USING DYNAMIC FAULT TREES




International Journal of Software Engineering and Knowledge Engineering
World Scientific Publishing Company

JEAN RAHME and HAIPING XU
Computer and Information Science Department, University of Massachusetts Dartmouth, North Dartmouth, MA 02747, USA
jrahme@umassd.edu, hxu@umassd.edu

Received 8 August 5
Revised 8 October 5
Accepted Day Month Year

Correctly measuring the reliability and availability of a cloud-based system is critical for evaluating its system performance. Due to the promised high reliability of physical facilities provided for cloud services, software faults have become one of the major factors for the failures of cloud-based systems. In this paper, we focus on the software aging phenomenon, where system performance may be progressively degraded due to exhaustion of system resources, fragmentation and accumulation of errors. We use a proactive technique, called software rejuvenation, to counteract the software aging problem. The dynamic fault tree (DFT) formalism is adopted to model the system reliability before and during a software rejuvenation process in an aging cloud-based system. A novel analytical approach is presented to derive the reliability function of a cloud-based Hot SPare (HSP) gate, which is further verified using Continuous Time Markov Chains (CTMC) for its correctness. We use a case study of a cloud-based system to illustrate the validity of our approach. Based on the reliability analytical results, we show how cost-effective software rejuvenation schedules can be created to keep the system reliability consistently above a predefined critical level.

Keywords: Software aging; software rejuvenation; reliability analysis; dynamic fault tree (DFT); hot spare (HSP) gate; Markov chains; scheduling.

1. Introduction

Due to recent advances in cloud computing technologies, cloud services have been used in many different areas such as traffic control, real-time sensor networks, healthcare, and mobile cloud computing. Cloud service providers have tried to deliver products with high quality of service (QoS), which provide users with fault-tolerant hardware and reliable software platforms for deploying cloud-based applications [][]. However, cloud outages are still very common due to component failures, which can quite negatively affect the revenue of cloud-based systems.
Previous research on the reliability of computer-based systems has focused on hardware reliability and availability; consequently, hardware fault tolerance and fault management are well understood and developed [3]. With the promised high reliability and availability of physical facilities, including the hardware facilities and their associated redundancy mechanisms, software faults have now become one of the major factors of failures in a cloud-based system. Since software reliability is considered one of the weakest points in system reliability, software fault tolerance and failure forecasting require more attention than hardware fault tolerance in modern computer systems [4, 5]. This work is motivated to deal with the software faults in cloud computing in order to assure high reliability and availability of cloud-based software systems.

In many safety-critical computer-based systems, failures of the software systems may lead to unrecoverable losses such as human life [6]. Such systems are required to be perfectly reliable and never fail, based on the discipline of fault-tolerant and reliable computing. Reliability and availability are two common ways to express system fault tolerance in industry. A reliable computer-based system typically has high availability if unreliability is the major cause of unavailability. In this paper, we focus on analyzing the reliability of cloud-based systems for software fault tolerance in software reliability engineering (SRE).

Traditional SRE has been based on analysis of software defects and bugs such as Bohrbugs or Heisenbugs without considering software aging related bugs [4]. Bohrbugs are mainly design defects that can be eliminated by debugging or adopting design diversity, while Heisenbugs are defined as faults that stop causing failures when one attempts to isolate them. The concept of the software aging phenomenon was introduced in the mid-1990s; it explains that the system resources used by the software degrade gradually as a function of time [7]. Software aging starts to show up due to multiple factors such as memory bloating, memory leaks, unterminated threads, data corruption, unreleased file-locks, fragmentation in storage space, and accumulation of round-off errors when running a piece of software. It has considerably changed the SRE field of study, and has become a major factor for the reliability of fully tested and deployed software systems. To deal with the software aging problem and to assure software fault tolerance, the software rejuvenation process has been introduced as a proactive approach to counteracting software aging and maintaining a reliable software system [8].

Corresponding author: Dr. Haiping Xu, Associate Professor, Computer and Information Science Department, University of Massachusetts Dartmouth, Email: hxu@umassd.edu.
Software rejuvenation involves actions such as stopping the running software occasionally and cleaning its internal state, e.g., garbage collection, flushing operating system kernel tables, and reinitializing internal data structures. The simplest way to perform software rejuvenation is to restart the software component that causes the aging problem, or to reboot the whole system. Due to the ever-growing cloud computing technology and its vast market, the workload of cloud-based systems has increased dramatically. A heavy workload of a cloud-based system will inevitably lead to more software aging problems.

In this paper, we propose to use cloud-based spare components as major software components in a computer-based system to enhance its system reliability, and introduce an analytical-based approach to developing rejuvenation schedules for cloud-based systems in order to maintain their high system reliability and ensure a zero-downtime rejuvenation process. Dynamic Fault Trees (DFT) are adopted to model the reliability of a cloud-based system, and a novel analytical approach is presented to derive the reliability function of a major type of dynamic gate in DFT models, called the Hot SPare (HSP) gate. The analytical approach is then formally verified using a Continuous Time Markov Chains (CTMC) model to ensure its correctness. As the CTMC approach has the intrinsic limitation of only supporting components with constant failure rates, to the best of our knowledge, our proposed analytical approach is the first formal way to correctly derive the reliability function of an HSP gate without such a limitation.

To demonstrate the practical usage of our approach in evaluating the system reliability of a cloud-based system, we assume a reliability threshold for the system under consideration. When the threshold is reached, the software rejuvenation process is triggered, and the reliability of the cloud-based system is boosted to its initial state. Our case study shows that software rejuvenation scheduling based on the reliability analysis of a cloud-based system can significantly enhance its system reliability and availability.

This work extends our previously proposed approach to producing a reliability-based software rejuvenation schedule for cloud-based systems [9]. In our previous work, we used CTMC to derive the reliability function of an HSP gate for cloud-based systems. To overcome the limitation of the CTMC approach, in this paper, we present a new analytical approach, which is more general and intuitive, and may potentially support software components with non-constant failure rates in our future research.

The rest of the paper is organized as follows. Section 2 discusses previous work related to our research. Section 3 presents a motivating example for rejuvenation of cloud-based components. Section 4 describes how to model and analyze the reliability of cloud-based systems using DFT. Section 5 presents a case study to demonstrate the validity of our approach, and Section 6 concludes the paper and mentions future work.

2. Related Work

In 1995, researchers introduced the so-called software rejuvenation technique to deal with aging-related software faults [8]. This technique, in contrast to reactive approaches with actions taken only after a software failure, is considered a proactive approach that preemptively restarts the aging application and cleans software aging related bugs [, ]. Previous studies on software aging and software rejuvenation for predicting a rejuvenation schedule can be classified into two categories, namely analytical-based and measurement-based approaches [].
In an analytical-based approach, a failure distribution is assumed for software faults related to the software aging phenomenon, and software rejuvenation is executed at a fixed interval based on the analytical results of the system reliability and availability. Several analytic models have been proposed to determine the optimal time for rejuvenation. Bobbio et al. proposed a fine-grained software degradation model for optimal rejuvenation policies []. Based on the assumption that the current degradation level of the system can be identified, they presented two different strategies to determine whether and when to rejuvenate. Vaidyanathan et al. presented an analytical model of a software system using inspection-based software rejuvenation [3]. In their proposed approach, they showed that inspection-based maintenance was advantageous in many cases over non-inspection based maintenance. Dohi et al. introduced a modified stochastic model to estimate the software rejuvenation schedule [4]. The proposed model is based on semi-Markov processes, which can maximize the system availability. Koutras and Platis applied the software rejuvenation technique to cluster systems in order to achieve high availability [5]. In their approach, software rejuvenation is carried out when software deployed on a node starts to experience degradation; thus an unscheduled reboot may be avoided.

Although the above approaches introduce various models for software rejuvenation, they are not intended to address complex system component behaviors and interactions, such as dynamic relationships between software components, including the sparing relationship and functional dependency. Different from the existing analytical-based approaches, we focus on the dynamic behaviors of software components in the context of cloud-based systems. We adopt the sparing relationship as an example to demonstrate how dynamic relationships of software components in a cloud-based system can be modeled and analyzed using DFT.

On the other hand, a measurement-based approach applies statistical analysis to the measured data of resource usage and degradation that may lead to the software aging problem. In a measurement-based approach, a monitoring program is used to continuously collect the system performance data and analyze them in order to estimate the system degradation level. When exhaustion reaches a critical level, the software rejuvenation process is triggered. Machida et al. used the Mann-Kendall test to detect software aging from traces of computer system metrics [6]. They tested for the existence of monotonic trends in time series, which are often considered an indication of software aging. Grottke et al. studied the resource usage in a web server subject to an artificial workload [7]. They applied non-parametric statistical methods to detect and estimate trends in the data sets for predicting future resource usage and software aging issues. Guo et al. proposed a software aging trend prediction method based on user intention [8]. The approach can be used to predict the trend of software aging based on the quantity of user requests to software components while the system is functioning. The existing measurement-based approaches are feasible ways to detect software aging problems in real-world computer-based systems, but they typically need to process large amounts of system data. Thus, they are not as efficient as analytical-based approaches.
However, measurement-based approaches do provide useful insights about dynamic system behaviors and failure distributions related to software aging. As such, our research is complementary to the existing research efforts on measurement-based software rejuvenation techniques that investigate the relationships between software metrics and software aging related software faults using statistical analysis [9].

Other related work attempted to address the software aging issues in virtualized datacenters. Machida et al. proposed a Petri net based availability model for virtualized systems with time-based rejuvenation for virtual machines []. They compared three techniques in terms of steady-state availability, and suggested the optimal combination of rejuvenation trigger intervals for each rejuvenation technique using a gradient search method. Thein et al. proposed an analytical approach that models availability for application servers []. Based on the availability model, they presented possible combinations of virtualization, high availability clusters and software rejuvenation. However, the above approaches are not explicitly based on software reliability analysis. In contrast, our approach analyzes system reliability using DFT models, and can generate rejuvenation schedules that explicitly satisfy the predefined reliability and availability requirements of a cloud-based system.

3. Rejuvenation of Cloud-Based Components

Virtualization technology has been well adopted in cloud computing. It allows one to share a machine's physical resources among multiple virtual environments, called virtual machines (VM). As shown in Fig. 1, a VM is not bound to the hardware directly; rather it is bound to generic drivers that are created by a virtual machine manager (VMM) or a hypervisor. Since a VM can be easily created and destroyed, it is particularly useful in a disaster recovery process of a cloud-based system. In this paper, a cloud-based system is referred to as a software system that consists of multiple VMs, where each VM is considered a software component within the system.

Fig. 1. An example of a reliable cloud-based system with spare software components

As a proactive fault management technique, software rejuvenation has been used to refresh system internal states and prevent the occurrence of software failures due to software aging. As mentioned before, a simple way to perform software rejuvenation is a system reboot, e.g., to restart a VM or all VMs in a cloud-based system. The basic idea of our approach is to create a new instance of a VM that replaces the one to be rejuvenated. Since the newly deployed VM instance has not yet been affected by the software aging phenomenon, the reliability of the software component, after being replaced, is boosted back to its initial condition. To achieve high fault tolerance and reliability, we further adopt the software redundancy technique using two different types of software standby spares, namely Cold SPare (CSP) and HSP.

In the context of cloud computing, cold standby means that a software component is available as an image of a VM, rather than an active VM instance. Data between a primary component and the spare one is regularly mirrored based on a specified schedule, e.g., multiple times a day. Since a CSP is not up and running and does not take any workload, its reliability equals 1 with a constant failure rate of 0.
Since a CSP can be started very quickly, the recovery time using a CSP typically takes just a few minutes to no more than two hours. Note that a software-defined CSP is quite different from a hardware-based CSP in terms of its cost and efficiency. The cost of a software-defined CSP is its storage and very little CPU time for data mirroring, while a hardware-based CSP is a physical device that must be available all the time in order to assure fast failover [3]. Furthermore, a software-defined CSP can be started very quickly, but a hardware-based CSP typically requires manual configuration and adjustments in the event of partial or total failure.

On the other hand, an HSP in the context of cloud computing is a hot standby VM instance. This means that the software component serving as an HSP must be installed and deployed, and must be instantly available when the primary component fails. Although an HSP is deployed and running along with the primary component, it typically does not take any workload for processing user requests. To ensure fault tolerance, critical data of an HSP is mirrored in near real time (e.g., in the range of µs) from the primary VM instance. This generally provides a recovery time of a few seconds in case of a failure. Similar to a CSP, a software-defined HSP also has much lower cost and works more efficiently than a hardware-based HSP.

In our system design, each critical primary component must be equipped with at least one HSP and one CSP in order to maintain the needed reliability. However, when calculating the system reliability, we only need to consider the primary component and its HSP, but not its CSP, as the CSP is not functioning. A CSP is considered for reliability analysis only when it becomes a primary component or a hot spare one. In the following, for simplicity, we denote a primary VM instance/component as P, which is active and has a full workload, an HSP as H, which is active but does not take any workload, and a CSP as C, which is inactive and not functioning at all.

In our approach, a rejuvenation schedule of a cloud-based system is created based on its reliability modeling and the analytical results. When the reliability of a system component or the whole system reaches a predefined threshold, the rejuvenation process is triggered. We assume the rejuvenation process takes about 30 minutes, which is typically sufficient for starting a CSP and transferring all requests to the new VM. As a simple example illustrated in Fig. 1, suppose we have two instances, a primary component P and a hot standby one H, which are deployed on two different physical machines.
The two physical machines usually belong to two different zones, denoted as Zone 1 and Zone 2 in Fig. 1, so a power/network outage in one zone will not affect the availability of the other one []. To rejuvenate the whole system, we can start two CSPs C1 and C2, denoted as P' and H' in Fig. 1, to replace P and H, respectively. Although in Fig. 1, P' and H' are deployed on the same physical machines where P and H are deployed, respectively, in reality this is not necessary and both P' and H' can be deployed on any physical servers. Once the spare components P' and H' are up and running, P' serves as a new primary component and starts to process new user requests, while H' serves as a new HSP, which is kept alive but does not take any workload. Meanwhile, we allow 30 minutes in total for the old components P and H to finish processing their existing requests. After 30 minutes, we shut down and delete the components P and H, which shall have been successfully replaced by P' and H' after the rejuvenation process completes. Finally, two new CSPs C1 and C2 are created and made ready for the next round of a rejuvenation process. Note that in our rejuvenation strategy, we have chosen to shut down instances P and H rather than restart and reuse them. This is because, different from a physical machine, a VM can be easily created and deployed; thus deploying new instances P' and H' is much more efficient than restarting and reusing P and H.

During the rejuvenation procedure, we need to consider two scenarios. One scenario is to rejuvenate the major software components all together. In this case, we replicate the whole system when the system reliability reaches its threshold. We call this scenario a system-specific rejuvenation. The second scenario is a component-specific one, where each time we only rejuvenate the critical components, whose reliability is typically the lowest when the system reliability reaches its reliability threshold. As we can see from the case study presented in Section 5, the component-specific rejuvenation would normally be more cost-effective than the system-specific approach.

4. Modeling and Analysis Using DFT

In this section, we first briefly introduce DFT, and then we show how to use DFT to model and analyze the reliability of a cloud-based system subject to software rejuvenation. To simplify matters, we assume that the time-to-failure for each software component (i.e., a VM) has a probability density function (pdf) that is exponentially distributed; in other words, all VMs have constant failure rates.

4.1. HSP Gates for Cloud-Based Systems

The fault tree modeling technique was introduced in 1962 at Bell Telephone Labs. It provides a conceptual modeling approach to representing system level reliability in terms of interactions between component reliabilities [3]. Fault tree analysis (FTA) is by far the most commonly used technique for risk and reliability analysis, where the system failure is described in terms of the failure of its components. Standard fault trees are combinatorial models and are built using static gates (e.g., AND-gate, OR-gate, and K/M-gate) and basic events. As combinatorial models can only capture the combination of events without considering the order of occurrence of their failures, they are usually inadequate to model today's complex dynamic systems [3, 4].
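DFT augments the standard combinatorial gates of a regular fault tree, and introduces three novel modeling capabilities, namely spare component management and allocation, functional dependency, and failure sequence dependency [5]. The modeling capabilities are realized using three main dynamic gates: the spare gate, the functional dependency gate, and the priority-AND gate. The work done in this paper uses the dynamic spare gates, in particular the HSP and CSP gates. Note that a spare gate has one primary input and one or more alternate inputs (i.e., the spares). The primary input is initially powered on, and when it fails, it is replaced by an alternate input. The spare gate fails when the primary and all the alternate inputs fail. Figure 2 shows an HSP gate with one primary component denoted as $P$ and one hot spare component denoted as $H$. The HSP gate fails when both of the two components $P$ and $H$ fail.

Fig. 2. An HSP gate with one primary component and one hot spare component

Suppose the constant failure rates of components $P$ and $H$ are $\lambda_P$ and $\lambda_H$, respectively. Since $H$ does not take any workload when $P$ is functioning, its failure rate $\lambda_H$ is typically lower than $\lambda_P$. When $P$ fails, $H$ takes over $P$'s workload, and behaves as a primary component. $H$ now has a higher constant failure rate than $\lambda_H$ due to the software aging phenomenon with full workload. For this reason, we call the spare component, after its role transition, $H^*$, with failure rate $\lambda_{H^*}$. Note that $\lambda_{H^*}$ and $\lambda_P$ do not have to be equal because $H$ and $P$ may have different configurations.

There are two scenarios when the HSP gate fails. In the first scenario, $P$ fails before $H^*$ fails. This case is illustrated as Case 1 in Fig. 3, where $P$ fails at $t_1$ and $H^*$ fails at $t_2$, with $t_1 < t_2$. In the second scenario, $H$ fails before $P$ fails. In this case, $H$ does not have a chance to behave as a primary component, and the failure of $P$ immediately leads to the failure of the HSP gate. This case is illustrated as Case 2 in Fig. 3, where $H$ fails at $t_1$ and $P$ fails at $t_2$, with $t_1 < t_2$.

Fig. 3. Two cases for the failure of an HSP gate (Case 1: $P$ fails before $H^*$; Case 2: $H$ fails before $P$)

We now derive the reliability function $R(t)$ of the HSP gate by considering the above two cases.

Case 1: $P$ fails before $H^*$ fails, denoted as $P \prec H^*$. In this case, it is guaranteed that $H$ does not fail during $(0, t_1]$. After $P$ fails, $H$ takes over the workload and becomes $H^*$. Intuitively, the distribution function $F_{P \prec H^*}(t)$ of the HSP gate, i.e., the probability that the HSP gate fails during $(0, t]$, can be calculated as in Eq. 1.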

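$$F_{P \prec H^*}(t) = \Pr(T \le t) = \int_0^t \int_{t_1}^{t} \lambda_P e^{-\lambda_P t_1}\, \lambda_{H^*} e^{-\lambda_{H^*} t_2}\, dt_2\, dt_1 \qquad (1)$$

However, Eq. 1 works only when $\lambda_{H^*} = \lambda_H$, i.e., the constant failure rate of $H$ does not change after it switches its role from a spare component to a primary one at time $t_1$. When $\lambda_{H^*} > \lambda_H$, as we can see from Fig. 4, the integration of the pdf of $H^*$ from 0 to $t_1$ does not give the correct unreliability of the component at time $t_1$, because it incorrectly assumes that the component behaves as $H^*$ starting from time 0. Since the component actually behaves as $H$ during $(0, t_1]$, the unreliability of $H^*$ at time $t_1$ equals the unreliability of $H$ at $t_1$ rather than the unreliability calculated by the integration of the pdf of $H^*$ from 0 to $t_1$. This requires us to calculate a new starting integration time $t_1'$ for $H^*$ such that the unreliability of $H^*$ at $t_1'$, represented by the shaded area under the pdf of $H^*$, is equal to the unreliability of $H$ at $t_1$, represented by the shaded area under the pdf of $H$. As the pdfs of $H$ and $H^*$ are $f_H(t) = \lambda_H e^{-\lambda_H t}$ and $f_{H^*}(t) = \lambda_{H^*} e^{-\lambda_{H^*} t}$, respectively, such a relationship between $t_1$ and $t_1'$ can be described as in Eq. 2.

$$\int_0^{t_1'} \lambda_{H^*} e^{-\lambda_{H^*} \tau}\, d\tau = \int_0^{t_1} \lambda_H e^{-\lambda_H \tau}\, d\tau \qquad (2)$$

Solving Eq. 2, we have $t_1' = (\lambda_H / \lambda_{H^*})\, t_1$. Since $H^*$ fails during a period of time $t - t_1$, the integration range for $H^*$ now becomes $[t_1',\, t_1' + t - t_1]$. Based on the above analysis, the probability that $P$ fails before $H^*$ fails can be calculated as in Eq. 3.

Fig. 4. The initial unreliability of $H^*$ when $P$ fails (i.e., the unreliability of $H$ at time $t_1$)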

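$$F_{P \prec H^*}(t) = \Pr(T \le t) = \int_0^t \int_{t_1'}^{t_1' + t - t_1} \lambda_P e^{-\lambda_P t_1}\, \lambda_{H^*} e^{-\lambda_{H^*} t_2}\, dt_2\, dt_1, \quad t_1' = \frac{\lambda_H}{\lambda_{H^*}}\, t_1 \qquad (3)$$

To simplify the integration range for $t_2$, we can substitute $u - t_1 + t_1'$ for variable $t_2$ in Eq. 3, and derive the distribution function of the HSP gate appearing in Case 1 as in Eq. 4.

$$F_{P \prec H^*}(t) = \Pr(T \le t) = \int_0^t \int_{t_1}^{t} \lambda_P e^{-\lambda_P t_1}\, \lambda_{H^*} e^{-\lambda_{H^*}(u - t_1 + t_1')}\, du\, dt_1 = \int_0^t \int_{t_1}^{t} \lambda_P \lambda_{H^*}\, e^{-(\lambda_P + \lambda_H - \lambda_{H^*}) t_1}\, e^{-\lambda_{H^*} u}\, du\, dt_1 \qquad (4)$$

Case 2: $H$ fails before $P$ fails, denoted as $H \prec P$. In this case, it is guaranteed that $P$ does not fail during $(0, t_1]$. The distribution function $F_{H \prec P}(t)$ of the HSP gate, i.e., the probability that the HSP gate fails during $(0, t]$, can be calculated as in Eq. 5.

$$F_{H \prec P}(t) = \Pr(T \le t) = \int_0^t \int_{t_1}^{t} f_H(t_1)\, f_P(t_2)\, dt_2\, dt_1 = \int_0^t \int_{t_1}^{t} \lambda_H e^{-\lambda_H t_1}\, \lambda_P e^{-\lambda_P t_2}\, dt_2\, dt_1 \qquad (5)$$

As the two cases are mutually exclusive, the unreliability of the HSP gate at time $t$ is the summation of the unreliability values of the two cases at time $t$. Thus, we derive the unreliability function $U(t)$ of the HSP gate as in Eq. 6.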

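$$U(t) = F_{P \prec H^*}(t) + F_{H \prec P}(t) \qquad (6)$$

Accordingly, the reliability function $R(t)$ of the HSP gate can be derived as in Eq. 7, which holds for $\lambda_P + \lambda_H \ne \lambda_{H^*}$.

$$R(t) = 1 - U(t) = e^{-\lambda_P t} + \frac{\lambda_P}{\lambda_P + \lambda_H - \lambda_{H^*}} \left( e^{-\lambda_{H^*} t} - e^{-(\lambda_P + \lambda_H) t} \right) \qquad (7)$$

It is worth noting that there is an obvious but subtle third case, where components $P$ and $H$ fail at exactly the same time, denoted as $P = H$. As the probability of failure associated with the event $[T = t]$ is 0, i.e., the probability that either $P$ or $H$ fails during $[t, t]$ is 0, the unreliability of the HSP gate in the third case must equal 0. This result can be easily derived as in Eq. 8, where $P$ fails at time $t_1$ during $(0, t]$, and $H$ fails at exactly the same time $t_1$ when $P$ fails.

$$F_{P = H}(t) = \Pr(T \le t) = \int_0^t \int_{t_1}^{t_1} \lambda_P e^{-\lambda_P t_1}\, \lambda_H e^{-\lambda_H t_2}\, dt_2\, dt_1 = 0 \qquad (8)$$

4.2. Verifying the Reliability Function Using CTMC

To formally verify the correctness of the reliability function $R(t)$ of the HSP gate derived in Section 4.1, we now use a CTMC model and solve its state equations. Figure 5 shows the CTMC model corresponding to the HSP gate given in Fig. 2. There are four states, 1 to 4, defined in the CTMC model, which are denoted as $PH$, $P$, $H^*$, and FAILURE, respectively. State 1 refers to the state in which both the primary component and the hot spare one are functioning. When the hot spare component or the primary one fails, the model enters State 2 or State 3, respectively. Note that we denote State 3 as $H^*$ instead of $H$ because in State 3, the hot spare component has a different failure rate from the one in State 1.

Fig. 5. The CTMC model of the HSP gate in Fig. 2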
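As a numerical cross-check of the analytical result of Section 4.1, the following minimal Python sketch evaluates the double integrals of Eq. 4 and Eq. 5 with a composite midpoint rule and compares the resulting reliability with the closed form of Eq. 7. The function names and the failure rates are hypothetical and chosen only for illustration; the midpoint rule is a convenience here, not part of the original approach.

```python
import math


def hsp_reliability(t, lam_p, lam_h, lam_hs):
    """Closed form of Eq. 7; assumes lam_p + lam_h != lam_hs."""
    k = lam_p + lam_h - lam_hs
    return math.exp(-lam_p * t) + (lam_p / k) * (
        math.exp(-lam_hs * t) - math.exp(-(lam_p + lam_h) * t)
    )


def midpoint(f, a, b, n=200):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def case1(t, lam_p, lam_h, lam_hs):
    """Eq. 4: P fails at t1, then H* fails at u in (t1, t]."""
    def outer(t1):
        t1_shift = lam_h / lam_hs * t1  # t1' obtained from Eq. 2

        def inner(u):
            return lam_hs * math.exp(-lam_hs * (u - t1 + t1_shift))

        return lam_p * math.exp(-lam_p * t1) * midpoint(inner, t1, t)

    return midpoint(outer, 0.0, t)


def case2(t, lam_p, lam_h):
    """Eq. 5: H fails at t1, then P fails at t2 in (t1, t]."""
    def outer(t1):
        def inner(t2):
            return lam_p * math.exp(-lam_p * t2)

        return lam_h * math.exp(-lam_h * t1) * midpoint(inner, t1, t)

    return midpoint(outer, 0.0, t)


if __name__ == "__main__":
    # Hypothetical rates (per unit time), not taken from the case study.
    lam_p, lam_h, lam_hs = 0.004, 0.0005, 0.004
    t = 100.0
    numeric = 1.0 - case1(t, lam_p, lam_h, lam_hs) - case2(t, lam_p, lam_h)
    print(numeric, hsp_reliability(t, lam_p, lam_h, lam_hs))
```

The two printed values should agree up to the quadrature error, which gives an independent check of Eq. 6 and Eq. 7 under the constant failure rate assumption.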

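Let $P_i(t)$ be the probability of the system being in state $i$ at time $t$, where $1 \le i \le 4$, and let $P_{ij}(dt) = \Pr[X(t + dt) = j \mid X(t) = i]$ be the incremental transition probability with random variable $X(t)$. The following matrix $[P_{ij}(dt)]$, where $1 \le i, j \le 4$, is the incremental one-step transition matrix [4] of the CTMC defined in Fig. 5.

$$[P_{ij}(dt)] = \begin{bmatrix} 1 - (\lambda_P + \lambda_H)\,dt & \lambda_H\,dt & \lambda_P\,dt & 0 \\ 0 & 1 - \lambda_P\,dt & 0 & \lambda_P\,dt \\ 0 & 0 & 1 - \lambda_{H^*}\,dt & \lambda_{H^*}\,dt \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (9)$$

The matrix $[P_{ij}(dt)]$, where $1 \le i, j \le 4$, is a stochastic matrix in which each row sums to 1. This matrix provides the probabilities for each state either remaining unchanged when $i = j$ or transiting to a different state when $i \ne j$ during the time interval $dt$. Given the initial probabilities of the states, the matrix can be used to describe the state transition process completely. From the matrix defined in Eq. 9, we can derive the following relations as in Eq. 10.1-10.4.

$$P_1(t + dt) = [1 - (\lambda_P + \lambda_H)\,dt]\, P_1(t) \qquad (10.1)$$
$$P_2(t + dt) = \lambda_H\,dt\, P_1(t) + (1 - \lambda_P\,dt)\, P_2(t) \qquad (10.2)$$
$$P_3(t + dt) = \lambda_P\,dt\, P_1(t) + (1 - \lambda_{H^*}\,dt)\, P_3(t) \qquad (10.3)$$
$$P_4(t + dt) = \lambda_P\,dt\, P_2(t) + \lambda_{H^*}\,dt\, P_3(t) + P_4(t) \qquad (10.4)$$

where the initial probabilities are defined by the probability of the system being at State 1. Thus we have $P_1(0) = 1$, and $P_2(0) = P_3(0) = P_4(0) = 0$. As $dt$ goes to 0, we derive a set of linear first-order differential equations as in Eq. 11.1-11.4, which are the state equations of the CTMC model.

$$P_1'(t) = \lim_{dt \to 0} \frac{P_1(t + dt) - P_1(t)}{dt} = -(\lambda_P + \lambda_H)\, P_1(t) \qquad (11.1)$$
$$P_2'(t) = \lim_{dt \to 0} \frac{P_2(t + dt) - P_2(t)}{dt} = \lambda_H P_1(t) - \lambda_P P_2(t) \qquad (11.2)$$
$$P_3'(t) = \lim_{dt \to 0} \frac{P_3(t + dt) - P_3(t)}{dt} = \lambda_P P_1(t) - \lambda_{H^*} P_3(t) \qquad (11.3)$$
$$P_4'(t) = \lim_{dt \to 0} \frac{P_4(t + dt) - P_4(t)}{dt} = \lambda_P P_2(t) + \lambda_{H^*} P_3(t) \qquad (11.4)$$

The state equations defined in Eq. 11.1-11.4 can be solved using the Laplace transformation, which transforms a linear first-order differential equation into a linear algebraic equation that is easy to solve. Let the Laplace transformation of $P_i(t)$ be $\tilde{P}_i(s)$ as defined in Eq. 12.1; the Laplace transformation of $P_i'(t)$ can then be derived as in Eq. 12.2.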

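$$\mathcal{L}\{P_i(t)\} = \int_0^\infty e^{-st} P_i(t)\, dt = \tilde{P}_i(s) \qquad (12.1)$$
$$\mathcal{L}\{P_i'(t)\} = \int_0^\infty e^{-st} P_i'(t)\, dt = s \tilde{P}_i(s) - P_i(0) \qquad (12.2)$$

Now applying the Laplace transformation defined in Eq. 12.1-12.2 to both sides of Eq. 11.1-11.4, we can derive Eq. 13.1-13.4.

$$s \tilde{P}_1(s) - P_1(0) = -(\lambda_P + \lambda_H)\, \tilde{P}_1(s) \qquad (13.1)$$
$$s \tilde{P}_2(s) - P_2(0) = \lambda_H \tilde{P}_1(s) - \lambda_P \tilde{P}_2(s) \qquad (13.2)$$
$$s \tilde{P}_3(s) - P_3(0) = \lambda_P \tilde{P}_1(s) - \lambda_{H^*} \tilde{P}_3(s) \qquad (13.3)$$
$$s \tilde{P}_4(s) - P_4(0) = \lambda_P \tilde{P}_2(s) + \lambda_{H^*} \tilde{P}_3(s) \qquad (13.4)$$

Substituting the initial probabilities $P_i(0)$, where $1 \le i \le 4$, into Eq. 13.1-13.4, we can solve for $\tilde{P}_1(s)$, $\tilde{P}_2(s)$ and $\tilde{P}_3(s)$. By further applying the inverse Laplace transformation to $\tilde{P}_1(s)$, $\tilde{P}_2(s)$ and $\tilde{P}_3(s)$, we can solve the original linear first-order differential equations in Eq. 11.1-11.3 as follows.

$$P_1(t) = e^{-(\lambda_P + \lambda_H) t}$$
$$P_2(t) = e^{-\lambda_P t} - e^{-(\lambda_P + \lambda_H) t}$$
$$P_3(t) = \frac{\lambda_P}{\lambda_P + \lambda_H - \lambda_{H^*}} \left( e^{-\lambda_{H^*} t} - e^{-(\lambda_P + \lambda_H) t} \right)$$

The reliability function $R(t)$ is the summation of $P_1(t)$, $P_2(t)$ and $P_3(t)$, which can be calculated as in Eq. 14.

$$R(t) = P_1(t) + P_2(t) + P_3(t) = e^{-\lambda_P t} + \frac{\lambda_P}{\lambda_P + \lambda_H - \lambda_{H^*}} \left( e^{-\lambda_{H^*} t} - e^{-(\lambda_P + \lambda_H) t} \right) \qquad (14)$$

It is easy to see that Eq. 14 gives exactly the same formula as the one defined in Eq. 7; thus, it verifies the correctness of our proposed analytical approach for calculating the reliability of the HSP gate at time $t$. Note that $P_4(t)$ is the probability that the system is in its FAILURE state at time $t$. Therefore, $P_4(t)$ actually defines the system unreliability function $U(t) = P_4(t) = 1 - R(t)$.

4.3. Modeling and Analysis Using DFT in Two Phases

To model and analyze the reliability of a cloud-based system with spare components, we consider two different phases. Phase 1 represents the pre-rejuvenation stage, where the reliability analysis is based on the failure rates of the primary components and their HSPs. CSPs are not considered in this phase because they cannot take over the system load instantly when both the primary and hot spare components fail. We model the system reliability using DFT, and then calculate its reliability based on the reliability function of HSP gates derived in Section 4.1.

Phase 2 is the software rejuvenation phase. When the predefined reliability threshold is reached, the software rejuvenation process is initiated, and the system enters this phase. As we have mentioned, there are two rejuvenation scenarios, namely the system-specific rejuvenation and the component-specific one. To illustrate the basic idea of calculating the system reliability in this phase, we use the first scenario as an example, where the whole system is rejuvenated. In this scenario, we start two CSPs $P'$ and $H'$ to replace $P$ and $H$, respectively. During the rejuvenation period, all four software components $P$, $H$, $P'$ and $H'$ coexist and are functioning. As shown in Fig. 6, the dynamic fault tree model is decomposed into 2 subtrees, $S_1$ and $S_2$, which are both HSP gates and are connected by an AND-gate. This is because the system fails only when both of the two HSP gates fail, and the failure of a single HSP gate during the rejuvenation phase will not lead to the failure of the whole system. Subtree $S_1$ consists of components $P$ and $H$ that are to be rejuvenated, while subtree $S_2$ consists of the newly deployed components $P'$ and $H'$, which are used to replace $P$ and $H$. As both $S_1$ and $S_2$ are defined as HSP gates, they can be computed using the same analysis technique as described in Phase 1.

Fig. 6. A DFT model with HSP gates (Phase 2)

Once we have the distribution functions of $S_1$ and $S_2$, the static gate, i.e., the AND-gate, can be easily solved using the sum-of-disjoint-products (SDP) method [3]. Specifically, to calculate the reliability of the whole system in this phase, we first calculate the unreliability functions $U_{S_1}(t)$ and $U_{S_2}(t)$ for $S_1$ and $S_2$, respectively. Then the reliability of the AND-gate can be calculated as in Eq. 15.

$$R(t) = 1 - U_{AND}(t) = 1 - U_{S_1}(t)\, U_{S_2}(t) \qquad (15)$$

In the following case study, we will consider both of the two scenarios during the rejuvenation process. Scenario 1 involves rejuvenation of the whole system; in this case, we need to replicate all major software components when the system reliability reaches the threshold. Scenario 2, on the other hand, is component-specific; thus we only rejuvenate the most critical components, whose reliability is the lowest when the system reliability reaches its threshold.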
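The Phase 2 calculation of Eq. 15 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and rates are hypothetical, and it assumes that the old HSP gate is evaluated at its total age while the newly deployed gate is evaluated at the time elapsed since its deployment.

```python
import math


def hsp_unreliability(t, lam_p, lam_h, lam_hs):
    """U(t) = 1 - R(t) of an HSP gate, with R(t) taken from Eq. 7.

    Assumes lam_p + lam_h != lam_hs.
    """
    k = lam_p + lam_h - lam_hs
    r = math.exp(-lam_p * t) + (lam_p / k) * (
        math.exp(-lam_hs * t) - math.exp(-(lam_p + lam_h) * t)
    )
    return 1.0 - r


def and_gate_reliability(u_s1, u_s2):
    """Eq. 15: the AND-gate fails only if both (independent) subtrees fail."""
    return 1.0 - u_s1 * u_s2


def phase2_reliability(age_old, elapsed, rates):
    """Reliability during rejuvenation.

    age_old: age of the old pair (P, H) when rejuvenation starts (assumption).
    elapsed: time since the new pair (P', H') was deployed (assumption).
    rates:   (lam_p, lam_h, lam_hs), shared by both pairs (same configuration).
    """
    u_old = hsp_unreliability(age_old + elapsed, *rates)
    u_new = hsp_unreliability(elapsed, *rates)
    return and_gate_reliability(u_old, u_new)


if __name__ == "__main__":
    rates = (0.004, 0.0005, 0.004)  # hypothetical rates per unit time
    for elapsed in (0.0, 0.1, 0.5):
        print(elapsed, phase2_reliability(100.0, elapsed, rates))
```

Because the freshly deployed pair starts with zero unreliability, the product in Eq. 15 stays very small during the short rejuvenation window, which is why the system reliability remains high in this phase.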

5. Case Study

A challenging task in cloud computing is to correctly measure the reliability of a cloud-based system and maintain its high reliability. In this case study, we show how to model and analyze the reliability of a cloud-based system using DFT, and then estimate an effective rejuvenation schedule that meets the high reliability requirement of the system. We consider a typical cloud-based system as shown in Fig. 7, which consists of an application server $A$ and a database server $B$. To enhance the system reliability, two hot spare components $A_H$ and $B_H$ are set up for $A$ and $B$, respectively, which are ready to take over the workload once the primary ones fail. Note that each of the servers is deployed in a different zone for fault-tolerance purposes [].

As a clarification for the reliability analysis in this case study, we view a VM with its OS, the server software and the deployed service as a single software component. In addition, we only consider the reliability of the servers within the box drawn with dashed lines, and assume the proxy server's reliability is ideal. Furthermore, we assume that the proxy server and the application server can monitor and detect failures of the application servers and the database servers, respectively.

Fig. 7. A cloud-based system with servers and their HSPs

To ensure a high reliability of the system, we set a reliability threshold of 0.99, and assume the constant failure rates of the servers to be $\lambda_A$ = .4, $\lambda_{A_H}$ = .5, $\lambda_B$ = .5, and $\lambda_{B_H}$ = .3. Note that the failure rates of the hot spare components are lower than those of their corresponding primary ones because the spare components do not take any workload when the primary ones are functioning. However, when a primary server fails, the associated hot spare component takes over the workload; in this case, its failure rate will increase accordingly. We assume the hot spare components have the same configurations as their associated primary ones; thus we have $\lambda_{A_H^*} = \lambda_A$ = .4 and $\lambda_{B_H^*} = \lambda_B$ = .5.

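This case study involves 8 software components that are split into two groups. The first group consists of the four servers shown in Fig. 7. The second group consists of four CSP components that are used to replace the servers in the first group during the rejuvenation process. We name the servers in the second group PA', HA', PB', and HB'. As the CSP components are undeployed VM images, their failure rates are 0. Once deployed, they will have the same failure rates as their corresponding software components due to the assumed same configurations.

Figure 8 shows the DFT model of the cloud-based system in Phase 1. Because the system fails when either the application servers or the database servers fail, the two HSP gates are connected by an OR-gate. The reliability function of the OR-gate can be derived as in Eq. (16).

Fig. 8. DFT model of the cloud-based system (Phase 1)

$$R = 1 - U_{OR} = 1 - \left(U_{S_1} + (1 - U_{S_1})\, U_{S_2}\right) \qquad (16)$$

where U_S1 and U_S2 are the unreliability functions of the subtrees S1 and S2, respectively. According to Eq. (7), U_S1 and U_S2 can be calculated as in Eq. (17) and Eq. (18), respectively. Here λ_PA and λ_PB denote the failure rates of the primary servers, and λ_HA and λ_HB those of the hot spares in standby. Note that Eq. (17)-(18) have been simplified due to the assumed configurations, where a hot spare that has taken over fails at the rate of its primary.

$$U_{S_1}(t) = 1 - R_{S_1}(t) = 1 - e^{-\lambda_{PA} t}\left[1 + \frac{\lambda_{PA}}{\lambda_{HA}}\left(1 - e^{-\lambda_{HA} t}\right)\right] \qquad (17)$$

$$U_{S_2}(t) = 1 - R_{S_2}(t) = 1 - e^{-\lambda_{PB} t}\left[1 + \frac{\lambda_{PB}}{\lambda_{HB}}\left(1 - e^{-\lambda_{HB} t}\right)\right] \qquad (18)$$

In Phase 2, we consider both of the scenarios mentioned at the end of Section 4.3, so that their impacts on system reliability as well as their consequent rejuvenation schedules can be compared. Figure 9 shows the DFT model of the cloud-based system in Phase 2 based on Scenario 1. For the same reason as in Phase 1, the system reliability can be calculated as in Eq. (19). According to Eq. (15), U_S3 and U_S4 can be calculated as in Eq. (20) and Eq. (21), respectively.

$$R = 1 - U_{OR} = 1 - \left(U_{S_3} + (1 - U_{S_3})\, U_{S_4}\right) \qquad (19)$$

$$U_{S_3} = U_{S_1} \cdot U_{S_1'} \qquad (20)$$

$$U_{S_4} = U_{S_2} \cdot U_{S_2'} \qquad (21)$$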
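As a quick check of the OR-gate expression, the sketch below evaluates the sum-of-disjoint-products form and confirms that, for two independent subtrees, it equals the product of the two subtree reliabilities. The function name and the input values are illustrative only.

```python
def or_gate_reliability(u_s1: float, u_s2: float) -> float:
    """Reliability of an OR-gate over two independent subtrees.

    Sum-of-disjoint-products form: the gate fails if S1 fails, or if S1
    survives and S2 fails.
    """
    u_or = u_s1 + (1.0 - u_s1) * u_s2
    return 1.0 - u_or


# Hypothetical subtree unreliabilities, chosen only for illustration.
u1, u2 = 0.004, 0.006
r = or_gate_reliability(u1, u2)

# For independent subtrees this must equal (1 - u1) * (1 - u2).
assert abs(r - (1.0 - u1) * (1.0 - u2)) < 1e-12
print(round(r, 6))  # 0.990024
```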

Note that in Eq. (20)-(21), U_S1, U_S1', U_S2 and U_S2' can be calculated in a similar way as in Eq. (17)-(18).

Fig. 9. DFT model of the cloud-based system in Phase 2 (Scenario 1)

The reliability analytical results for Scenario 1 are listed in Table 1. The table shows that the reliability threshold 0.99 is reached every 18 days. Hence, both the application and database servers are rejuvenated at the end of Phase 1. As Phase 2 has a 30-minute time duration, we calculate the system reliability at 5, 10, 20, and 30 minutes in Phase 2 to illustrate how system reliability may change during the rejuvenation process. From the table, we can see that the system reliability is kept very high during the transition. After 30 minutes, the newly deployed servers completely take over the system, and the servers to be rejuvenated are shut down. When this happens, the system returns to its initial state, and starts a new life cycle with a very high initial reliability. According to Table 1, we suggest that the system should be rejuvenated every 18 days in order to maintain the system reliability above the threshold.

By further looking into Table 1, we notice that when the system reliability reaches 0.99 after 18 days, the reliability of the database server subsystem is always lower than that of the application server subsystem. This suggests that we may first rejuvenate the most critical components with the lowest reliability (e.g., the database servers in this case study) without sacrificing the system reliability too much. Then we wait until the system reliability reaches the threshold again, and rejuvenate the application servers next, as they now become the most critical components. This is exactly what happens in the rejuvenation schedule of Scenario 2, where the application servers and the database servers are rejuvenated alternately. Figure 10 shows the DFT model of the cloud-based system in Phase 2 for one of the two cases in Scenario 2, where only the database servers are rejuvenated. In this case, the system reliability can be calculated as in Eq. (22), and U_S1 and U_S4 can be calculated in a similar way as in Eq. (17) and Eq. (21), respectively.
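The 18-day interval of Scenario 1 can be reproduced with a short script. This is a sketch under stated assumptions, not the authors' tool: it assumes the standard closed form for a hot spare gate with exponential lifetimes, in which the spare fails at its standby rate until it takes over and at the rate of the primary afterwards, and it assumes that the four failure rates of the case study are expressed per day. All function names are illustrative.

```python
import math


def hsp_reliability(t: float, lam_primary: float, lam_standby: float) -> float:
    """Reliability at time t of a primary with one hot spare (assumed closed form)."""
    return math.exp(-lam_primary * t) * (
        1.0 + (lam_primary / lam_standby) * (1.0 - math.exp(-lam_standby * t))
    )


def system_reliability(t: float) -> float:
    """Phase 1 system: application and database subsystems under an OR-gate."""
    r_app = hsp_reliability(t, 0.004, 0.0025)  # assumed per-day rates
    r_db = hsp_reliability(t, 0.005, 0.003)    # assumed per-day rates
    return r_app * r_db  # the system survives only if both subsystems survive


def last_day_above(threshold: float, horizon: int = 365) -> int:
    """Last whole day on which the system reliability is still >= threshold."""
    day = 0
    while day + 1 <= horizon and system_reliability(day + 1) >= threshold:
        day += 1
    return day


print(last_day_above(0.99))  # 18
```

Under these assumptions the system reliability is about 0.9901 on day 18 and drops below 0.99 on day 19, which agrees with the rejuvenation interval stated above.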

Table 1. System reliability with software rejuvenation (Scenario 1)

Phase  Time (Day)  App Servers    DB Servers     System
                   Reliability    Reliability    Reliability
1      1           .9999875       .99998         .99996756
1      5           .999686        .99957         .99994568
1      10          .998745        .99885         .996834333
1      18          .99644         .9944          .99778
2      18.0035     .99999999999   .99999999999   .99999999999
2      18.0069     .99999999999   .99999999999   .99999999999
2      18.0139     .99999999999   .99999999998   .99999999997
2      18.0208     .99999999998   .99999999994   .9999999999
...    ...         ...            ...            ...
1      73          .9999875       .99998         .99996756
1      77          .999686        .99957         .99994568
1      82          .998745        .99885         .996834333
1      90          .99644         .9944          .99778
2      90.0035     .99999999999   .99999999999   .99999999999
2      90.0069     .99999999999   .99999999999   .99999999999
2      90.0139     .99999999999   .99999999998   .99999999997
2      90.0208     .99999999998   .99999999994   .9999999999
1      91          .9999875       .99998         .99996756
1      95          .999686        .99957         .99994568
1      100         .998745        .99885         .996834333
1      108         .99644         .9944          .99778
2      108.0035    .99999999999   .99999999999   .99999999999
2      108.0069    .99999999999   .99999999999   .99999999999
2      108.0139    .99999999999   .99999999998   .99999999997
2      108.0208    .99999999998   .99999999994   .9999999999
1      109         .9999875       .99998         .99996756
1      113         .999686        .99957         .99994568
1      118         .998745        .99885         .996834333

Fig. 10. DFT model of the cloud-based system in Phase 2 (Scenario 2, rejuvenate database servers only)

$$R = 1 - U_{OR} = 1 - \left(U_{S_1} + (1 - U_{S_1})\, U_{S_4}\right) \qquad (22)$$

The system reliability for the other case in Scenario 2, where only the application servers are rejuvenated, can be calculated in a similar way. Table 2 shows the reliability analytical results for Scenario 2. At the end of each Phase 1, the server subsystem with its reliability marked by > is the one to be rejuvenated. For example, after 18 days, the database servers are rejuvenated, and after 27 days, the application servers are rejuvenated.

Table 2. System reliability with software rejuvenation (Scenario 2)

Phase  Time (Day)  App Servers     DB Servers      System
                   Reliability     Reliability     Reliability
1      1           .9999875        .99998          .99996756
1      5           .999686         .99957          .99994568
1      10          .998745         .99885          .996834333
1      18          .99644          > .9944         .99778
2      18.0035     .99645          .99999999999    .996449999
2      18.0069     .9964           .99999999999    .99649999
2      18.0139     .99638          .99999999998    .9963799998
2      18.0208     .99635          .99999999994    .9963499994
1      20          .9955           .999969         .99577465
1      25          .9955           .99949          .9968856
1      27          > .994          .99844          .99
2      27.0035     .99999999999    .99844          .9984499999
2      27.0069     .99999999999    .998439         .99843899999
2      27.0139     .99999999999    .998437         .99843699999
2      27.0208     .99999999998    .998435         .99843499998
1      30          .999884         .99765          .99749567
1      35          .9999           .99468          .9937478645
1      39          .9985           > .9994         .996464
2      39.0035     .9984           .99999999999    .998399999
2      39.0069     .998            .99999999999    .99899999
2      39.0139     .998            .99999999997    .9989999997
2      39.0208     .99899          .99999999993    .9989899993
...    ...         ...             ...             ...
1      85          .998486         .999969         .9984687
1      90          .996853         .99949          .9959597
1      95          .99467          .99765          .99955748
1      96          .9947           .99684          .99994669
1      97          > .99365        .9963           .99
2      97.0035     .99999999999    .9963           .996899999
2      97.0069     .99999999999    .99698          .996899999
2      97.0139     .99999999998    .99694          .99679799998
2      97.0208     .99999999996    .9969           .99679399996
1      100         .999884         .99468          .994588
1      105         .9999           > .9994         .9938465
2      105.0035    .9999           .99999999999    .99874399999
2      105.0069    .999895         .99999999999    .9987499999
2      105.0139    .99988          .99999999997    .99873999997
2      105.0208    .999868         .9999999999     .9987379999
1      110         .9979           .99957          .99747753
1      115         .99644          .99885          .9943657574
1      119         > .9947         .9963           .9953553
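The selection rule of Scenario 2 (rejuvenate the subsystem whose reliability is currently the lowest) can be sketched as follows. The closed-form hot spare reliability, the per-day unit of the failure rates, and all names used here are assumptions made for illustration; the sketch also ignores the 30-minute Phase 2 overlap and tracks only the age of each subsystem since its last rejuvenation.

```python
import math

# Assumed per-day failure rates: (primary rate, hot spare standby rate).
RATES = {"app": (0.004, 0.0025), "db": (0.005, 0.003)}


def hsp_reliability(age_days: float, lam_primary: float, lam_standby: float) -> float:
    """Reliability of a primary with one hot spare after age_days (assumed closed form)."""
    return math.exp(-lam_primary * age_days) * (
        1.0 + (lam_primary / lam_standby) * (1.0 - math.exp(-lam_standby * age_days))
    )


def most_critical(ages: dict) -> str:
    """Return the subsystem whose current reliability is the lowest."""
    return min(ages, key=lambda name: hsp_reliability(ages[name], *RATES[name]))


# Day 18: both subsystems have run for 18 days.
print(most_critical({"app": 18, "db": 18}))  # db
# Day 27: the database servers were rejuvenated 9 days earlier.
print(most_critical({"app": 27, "db": 9}))   # app
```

The two calls mirror the first two decisions in the text: the database servers are selected after 18 days, and the application servers after 27 days.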

We now illustrate the suggested rejuvenation schedules for both Scenario 1 and Scenario 2 in Fig. 11. In the figure, the start of a rejuvenation is indicated by a sudden increase of the system reliability. By comparing the two rejuvenation schedules, we can see that during 119 days, Scenario 1 has 6 rejuvenation processes that require us to rejuvenate both the application and the database servers. On the other hand, Scenario 2 has 9 rejuvenation processes that only require us to rejuvenate either the application servers or the database servers each time. It is easy to see that Scenario 2 results in less management of the servers in order to keep the system reliability above the 0.99 threshold during the 119 days. Suppose the rejuvenation of the application servers has the same cost as that of the database servers. Then Scenario 1 requires 2×6 = 12 subsystem rejuvenations while Scenario 2 requires 9, so by using the rejuvenation schedule defined in Scenario 2, the cost can be reduced by (2×6-9)/(2×6) = 25% compared with the rejuvenation schedule used in Scenario 1.

Fig. 11. Rejuvenation scheduling for the cloud-based system (Scenario 1 vs. Scenario 2)

6. Conclusions and Future Work

In this paper, we propose a reliability-based approach to establishing cost-effective software rejuvenation schedules for cloud-based systems. The system requires the usage of hot spare components during normal running time, and cold spare components during the rejuvenation process, in order to keep the system reliability above a predefined threshold. By modeling the reliability of a cloud-based system using DFT, we are able to derive the reliability functions for each software component as well as the whole system. We define two phases for the software rejuvenation, and discuss two scenarios of the rejuvenation process in Phase 2. The analytical results of our case study show that Scenario 2 is more cost-effective than Scenario 1. For future work, we will extend our current work to components with non-constant failure rates. We will adopt a measurement-based approach to collecting empirical data in

order to determine the pdf of the major software components, the reliability of which is affected by software aging. Software tools will be developed for modeling and analyzing the reliability of cloud-based systems, as well as deriving effective rejuvenation schedules. In addition, we will expand and apply our proposed approach in more complex cloud environments, such as cloud-based systems using Amazon Web Services (AWS). Comparative analysis of system performance will be conducted for our proposed approach as well as existing fault-tolerant strategies that improve the reliability of cloud applications [26]. Finally, we envision modeling and analyzing cloud-based systems with active standby spare components, which can share workload with the primary ones [27], as a future, and more ambitious research direction.

Acknowledgments

We thank all anonymous referees for the careful review of this paper, and the many useful comments and suggestions that greatly helped us to improve the presentation and the quality of the paper.

References

1. K. V. Vishwanath and N. Nagappan, Characterizing cloud computing hardware reliability, in Proc. of the ACM Symposium on Cloud Computing (SoCC), Indianapolis, IN, USA, June 10-11, 2010, pp. 193-204.
2. D. Fitch and H. Xu, A RAID-based secure and fault-tolerant model for cloud information storage, International Journal of Software Engineering and Knowledge Engineering (IJSEKE) 23(5) (2013) 627-654.
3. M. Rausand and A. Høyland, System Reliability Theory: Models, Statistical Methods, and Applications, Second Edition, Hoboken, New Jersey, USA, John Wiley & Sons, Inc., 2004.
4. H. Pham, System Software Reliability, Springer Series in Reliability Engineering, Springer-Verlag London, 2006.
5. A. Somani and N. Vaidya, Understanding fault tolerance and reliability, IEEE Computer 30(4) (1997) 45-50.
6. E. Marshall, Fatal error: how Patriot overlooked a Scud, Science 255(5050) (1992) 1347.
7. M. Grottke, R. Matias and K. S. Trivedi, The fundamentals of software aging, in Proc. of the International Workshop on Software Aging and Rejuvenation (WoSAR 2008), ISSRE, Seattle, WA, USA, November 11-14, 2008, pp. 1-6.
8. Y. Huang, C. Kintala, N. Kolettis and N. Fulton, Software rejuvenation: analysis, module and applications, in Proc. of the Twenty-Fifth International Symposium on Fault-Tolerant Computing (FTCS 95), Pasadena, CA, USA, June 27-30, 1995, pp. 381-390.
9. J. Rahme and H. Xu, Reliability-based software rejuvenation scheduling for cloud-based systems, in Proc. of the 27th International Conference on Software Engineering and Knowledge Engineering (SEKE 2015), Pittsburgh, USA, July 6-8, 2015, pp. 298-303.
10. V. Castelli, R. E. Harper, P. Heidelberger, et al., Proactive management of software aging, IBM Journal of Research and Development 45(2) (2001) 311-332.
11. L. Jiang and G. Xu, Modeling and analysis of software aging and software failure, Journal of Systems and Software 80(4) (2007) 590-595.
12. A. Bobbio, M. Sereno and C. Anglano, Fine grained software degradation models for optimal rejuvenation policies, Performance Evaluation 46(1) (2001) 45-62.

13. K. Vaidyanathan, D. Selvamuthu and K. S. Trivedi, Analysis of inspection-based preventive maintenance in operational software systems, in Proc. of the IEEE Symposium on Reliable Distributed Systems (SRDS), Suita, Japan, October 13-16, 2002, pp. 286-295.
14. T. Dohi, K. Goseva-Popstojanova and K. S. Trivedi, Statistical non-parametric algorithms to estimate the optimal software rejuvenation schedule, in Proc. of International Symposium on Dependable Computing, Los Angeles, CA, USA, December 2000, pp. 77-84.
15. V. P. Koutras and A. N. Platis, Applying software rejuvenation in a two node cluster system for high availability, in Proc. of the International Conference on Dependability of Computer Systems, Szklarska Poreba, May 25-27, 2006, pp. 175-182.
16. F. Machida, A. Andrzejak, R. Matias and E. Vicente, On the effectiveness of Mann-Kendall test for detection of software aging, in Proc. of the IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), Pasadena, CA, USA, November 4-7, 2013, pp. 269-274.
17. M. Grottke, L. Li, K. Vaidyanathan and K. S. Trivedi, Analysis of software aging in a web server, IEEE Trans. on Reliability 55(3) (2006) 411-420.
18. J. Guo, Y. Ju, Y. Wang and X. Li, The prediction of software aging trend based on user intention, in Proc. of the IEEE Youth Conference on Information Computing and Telecommunications (YC-ICT), Beijing, China, November 28-30, 2010, pp. 206-209.
19. D. Cotroneo, R. Natella and R. Pietrantuono, Is software aging related to software metrics? in Proc. of the IEEE Second International Workshop on Software Aging and Rejuvenation (WoSAR), San Jose, CA, USA, November 2, 2010, pp. 1-6.
20. F. Machida, D. Kim and K. Trivedi, Modeling and analysis of software rejuvenation in a server virtualized system, in Proc. of the IEEE Second International Workshop on Software Aging and Rejuvenation (WoSAR), San Jose, CA, USA, November 2, 2010, pp. 1-6.
21. T. Thein, S.-D. Chi and J. S. Park, Availability modeling and analysis on virtualized clustering with rejuvenation, International Journal of Computer Science and Network Security (IJCSNS) 8(9) (2008) 72-80.
22. J. Barr, A. Narin and J. Varia, Building fault-tolerant applications on AWS, Amazon Web Services (AWS), Amazon, October 2011, retrieved on July 5, 2015, from http://media.amazonwebservices.com/AWS_Building_Fault_Tolerant_Applications.pdf
23. H. Xu, L. Xing and R. Robidoux, DRBD: dynamic reliability block diagrams for system reliability modeling, International Journal of Computers and Applications (IJCA) 31(2) (2009) 132-141.
24. R. Robidoux, H. Xu, L. Xing and M. C. Zhou, Automated modeling of dynamic reliability block diagrams using colored Petri nets, IEEE Trans. on Systems, Man, and Cybernetics, Part A: Systems and Humans (SMC-A) 40(2) (2010) 337-351.
25. J. B. Dugan, S. J. Bavuso and M. A. Boyd, Dynamic fault-tree models for fault-tolerant computer systems, IEEE Trans. on Reliability 41(3) (1992) 363-377.
26. M. Lu and H. Yu, A fault tolerant strategy in hybrid cloud based on QPN performance model, in Proc. of the International Conference on the Information Science and Applications (ICISA), Pattaya, Thailand, June 24-26, 2013, pp. 1-7.
27. L. Huang and Q. Xu, Lifetime reliability for load-sharing redundant systems with arbitrary failure distributions, IEEE Trans. on Reliability 59(2) (2010) 319-330.