



IJCSNS International Journal of Computer Science and Network Security, VOL.14 No.7, July 2014

Mathematical Model of Data Backup and Recovery

Karel Burda

The Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czech Republic

Summary

Data backup and data recovery are already well-established disciplines within computer science. Despite this fact, however, there is currently no sufficiently general mathematical model for the quantification of backup strategies. In this paper, such a mathematical model is derived. The model introduced is based on the assumption that the total size of the data is constant. Using this model, we can quantitatively evaluate the properties of different types of backup strategies. For the full, differential and incremental backup strategies, formulae are derived for the calculation of the average total backup size and the average data recovery size in dependence on the probability of the change of data units. The quantitative relations obtained are discussed.

Key words: Data backup, data recovery, mathematical model.

1. Introduction

Data are one of the most valuable assets of persons and organizations. However, we can lose these data due to technical failures, human errors, etc. This is the reason why data are copied into backup data repositories (so-called data backup) in order to have the possibility of reconstructing these data when the data are lost in the main data repository (so-called data recovery). With regard to mathematical models of backup processes, A. Frisch states that studies performing numerical and quantitative modeling of backup processes are surprisingly rare (see [], p. 4). The models published so far allow only computations of full and incremental backups (e.g. [], p. 4) and the quantification of the Amanda backup scheme ([], p. 745-750 or [], p. 5-9). Z. Kurmas and L. Chervenak have published a comparison of some backup strategies [4]; however, their evaluation of backup strategies is based on simulations and therefore no analytic results are presented in that paper. Other research papers are oriented on backup frequency and timing (e.g. [5] or [6]) and also do not provide any suitable results for the evaluation of general backup strategies.
Therefore, we can state that there is currently no sufficiently general mathematical model for the quantification of data backup and recovery, despite the great importance of both of these procedures. This paper introduces a mathematical model of data backup which allows a quantitative comparison of different backup methods. In the next section, the basic terms are formulated and the process of data backup and recovery is formalized. The third section is dedicated to the schemes of data recovery and the fourth section deals with the atomic backup (backups based on snapshots of data units). In the fifth section, a mathematical model for the quantification of backups is derived and in the sixth section, this model is applied to different data backup methods and the results obtained are discussed.

2. Data backup formalization

The terminology of data backup is not unified and many authors use the basic terms in different and often conflicting ways [7]. Therefore, we first specify and formalize the basic terms. Data $D$ are a structure of symbols which represents some information. In the data repository, these data are arranged into certain elementary (i.e. further indivisible) sequences of symbols (e.g. sectors of the hard disk). We refer to an elementary sequence $d$ of symbols as a data unit and denote the $x$-th data unit as $d_x$. In the course of time, the data unit $d_x$ can correspond to different sequences of symbols, which we call versions of the data unit $d_x$. A data unit $d_x$ which does not represent any information is the so-called empty data unit. We will formally denote such a data unit as $d_x = \varnothing$. The model is based on the assumption that the total size of the data $D$ is constant and that the size of the metadata (overhead of data units) is negligible.

We have already stated that the symbol sequence of the data unit $d_x$ varies with time. From the backup viewpoint, the important events are changing the data units and taking the backups. To simplify the description, we assume that these events cannot occur simultaneously and that changing the data units and taking the backups are instantaneous, i.e. the duration of these events is equal to zero. When the data unit $d_x$ has been changed at the time $t_i$, we denote this version of the data unit as $d_x(t_i)$.
This version is valid in the time interval $\langle t_i, t_j)$, where $t_j$ is the instant of the next change of the data unit $d_x$. The quantity $d_x(t_i)$ can also represent a version of the data unit which is valid at the time $t_i$. In our model, we assume that if $j > i$, then $t_j > t_i$.

Manuscript received July 5, 2014. Manuscript revised July 20, 2014.

We will denote the data at the instant $t_i$ as $D(t_i)$ and assume that these data consist of $n$ data units $d_1(t_i)$ to $d_n(t_i)$. Then, we can formally represent the data as an array of $n$ data units:

$$D(t_i) = [d_1(t_i), d_2(t_i), \ldots, d_n(t_i)] = [d_x(t_i)]_{x=1}^{n}. \tag{1}$$

The basis of any arbitrary type of backup is the full backup. The full backup $F(t_i)$ is a record of all data units which were valid at the time $t_i$, i.e.:

$$F(t_i) = D(t_i). \tag{2}$$

Full backups are large, because they also contain data units which have remained unchanged since the previous backup. Therefore, these backups are combined with partial backups, which contain only data which have been changed compared to some previous backup. Let $D(t_i)$ be the data at the instant $t_i$ and let $D(t_j)$ be the data at the instant $t_j$. Denote by $b_x(t_i, t_j)$ the change of the data unit $d_x$ at the instant $t_j$ in contrast to the instant $t_i$. For this change, it holds that:

$$b_x(t_i, t_j) = \begin{cases} \varnothing, & \text{when } d_x \text{ has been deleted}, \\ 0, & \text{when } d_x(t_j) = d_x(t_i), \\ d_x(t_j), & \text{otherwise}. \end{cases} \tag{3}$$

The first line expresses the situation when the data unit $d_x$ has been deleted, the second line represents the situation when there has been no change in the data unit $d_x$, and the third line corresponds to the situation when the data unit $d_x$ has been either created or modified. Then, the partial backup $P(t_i, t_j)$ is the total of the changes of all data units within the interval $(t_i, t_j\rangle$:

$$P(t_i, t_j) = [b_x(t_i, t_j)]_{x=1}^{n}. \tag{4}$$

It generally holds that $b_x(t_i, t_j) \neq b_x(t_j, t_i)$ and therefore $P(t_i, t_j) \neq P(t_j, t_i)$. For example, when $D(t_1) = [d_1(t_1), d_2(t_1), d_3(t_1)]$ and $D(t_2) = [d_1(t_1), d_2(t_2), \varnothing]$, then $P(t_1, t_2) = [0, d_2(t_2), \varnothing]$ but $P(t_2, t_1) = [0, d_2(t_1), d_3(t_1)]$.

When we know the data $D(t_i)$ and the partial backup $P(t_i, t_j)$, then we can recover $D(t_j)$, i.e. the data at the instant $t_j$. Let us define an operation $\oplus$ by which the data $D(t_j)$ are recovered from the data $D(t_i)$ and the partial backup $P(t_i, t_j)$. We refer to this operation as a data revision. Formally, we define the data revision in the following way:

$$D(t_i) \oplus P(t_i, t_j) = D(t_j), \tag{5}$$

where the revision of each data unit is given by the formula:

$$d_x(t_j) = d_x(t_i) \oplus b_x(t_i, t_j) = \begin{cases} \varnothing, & \text{when } b_x(t_i, t_j) = \varnothing, \\ d_x(t_i), & \text{when } b_x(t_i, t_j) = 0, \\ b_x(t_i, t_j), & \text{otherwise}. \end{cases} \tag{6}$$

In the case of the first line, the backup contains the information that the data unit $d_x$ has been deleted in the course of the time interval $(t_i, t_j\rangle$. For this data unit, the result of data revision is an empty unit. In the case of the second line, the backup contains the information that there has been no change in the data unit $d_x$ in the given interval. For this data unit, the result of data revision is the version of the data unit at the instant $t_i$. The third line corresponds to the situation when the backup contains the changed version of the data unit $d_x$. In this case, the result of data revision is the changed version of the data unit.

When it holds for the data revision in (5) that $t_i < t_j$, then we will speak about forward data recovery. In this case, we obtain by data revision a newer version, $D(t_j)$, from the older version, $D(t_i)$. When it holds that $t_i > t_j$, then we will speak about backward data recovery. In this case, we obtain by data revision the older version, $D(t_j)$, from the newer version, $D(t_i)$. For our examples of backups, it holds that:

$$D(t_2) = D(t_1) \oplus P(t_1, t_2) = [d_1(t_1), d_2(t_1), d_3(t_1)] \oplus [0, d_2(t_2), \varnothing] = [d_1(t_1) \oplus 0,\; d_2(t_1) \oplus d_2(t_2),\; d_3(t_1) \oplus \varnothing] = [d_1(t_1), d_2(t_2), \varnothing]$$

and

$$D(t_1) = D(t_2) \oplus P(t_2, t_1) = [d_1(t_1), d_2(t_2), \varnothing] \oplus [0, d_2(t_1), d_3(t_1)] = [d_1(t_1) \oplus 0,\; d_2(t_2) \oplus d_2(t_1),\; \varnothing \oplus d_3(t_1)] = [d_1(t_1), d_2(t_1), d_3(t_1)].$$

When we do not need to distinguish between the full backup $F(t_j)$ and the partial backup $P(t_i, t_j)$, then we will generally denote these backups as $B(t_j)$.

We will classify partial backups into interval and atomic backups. An interval backup $I(t_i, t_j)$ is a partial backup $P(t_i, t_j)$ which contains the latest version of the data units that have been changed in the interval $(t_i, t_j\rangle$. This means that if any data unit is changed several times in the interval $(t_i, t_j\rangle$, then the given interval backup will contain only the version of the data unit which was valid at the instant $t_j$. From this, it follows that we can recover the state of data only for the instants of taking backups. This type of backup is historically the oldest and these backups have a smaller size. The atomic backup is a modern method which has been made possible by the so-called snapshot technique (e.g. [8]). With this technique, when a new version of a data unit is written, the older version of the given data unit (the so-called snapshot) is retained in the repository.
In the case of atomic backup, all snapshots are backed up and therefore we can recover the data into the state at an arbitrary instant. A drawback of this type of backup is the large size of the aggregate of all snapshots. Before we explain the atomic backups in more detail, we will first discuss the data recovery schemes.
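The partial backup of (3) and (4) and the data revision of (5) and (6) can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the paper: the names `EMPTY`, `UNCHANGED`, `partial_backup` and `revise` are our own, data unit versions are assumed to be represented by non-empty strings, and the empty data unit and the "no change" mark are assumed to map to `None` and `0`.

```python
# Illustrative stand-ins (our own choice) for the two special symbols of Eq. (3).
EMPTY = None      # the empty data unit, i.e. "the data unit has been deleted"
UNCHANGED = 0     # "no change in this data unit"


def partial_backup(old, new):
    """Per-unit changes of `new` relative to `old`, following Eqs. (3) and (4)."""
    backup = []
    for d_old, d_new in zip(old, new):
        if d_new == d_old:
            backup.append(UNCHANGED)   # second line of Eq. (3)
        elif d_new is EMPTY:
            backup.append(EMPTY)       # first line of Eq. (3): unit deleted
        else:
            backup.append(d_new)       # third line of Eq. (3): created or modified
    return backup


def revise(data, backup):
    """Data revision of Eqs. (5) and (6), applied unit by unit."""
    result = []
    for d, b in zip(data, backup):
        if b is EMPTY:
            result.append(EMPTY)       # unit was deleted
        elif b == UNCHANGED:
            result.append(d)           # keep the reference version
        else:
            result.append(b)           # take the changed version
    return result


# The example from the text: D(t1), D(t2) and the two partial backups.
D1 = ["d1(t1)", "d2(t1)", "d3(t1)"]
D2 = ["d1(t1)", "d2(t2)", EMPTY]
P12 = partial_backup(D1, D2)   # forward:  [0, "d2(t2)", None]
P21 = partial_backup(D2, D1)   # backward: [0, "d2(t1)", "d3(t1)"]
print(revise(D1, P12) == D2, revise(D2, P21) == D1)
```

The sketch also makes the asymmetry $P(t_1, t_2) \neq P(t_2, t_1)$ visible: the forward backup records a deletion, while the backward backup has to carry the deleted version itself.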

3. Data recovery schemes

The scheme of data recovery defines the procedure of data reconstruction from the backups acquired. In this paper, we will express data recovery schemes by a graph. The nodes $i$ and $j$ of such a graph represent the backups $B(t_i)$ and $B(t_j)$ and the oriented edge $(i, j)$ expresses that we must first recover the data from the backup $B(t_i)$ and subsequently revise these data by using the data from the backup $B(t_j)$. We will call this relation between backups a backup succession. In this case, we will call the backup $B(t_i)$ a reference backup and the backup $B(t_j)$ a subsequent backup.

An example of the recovery scheme for five consecutive backups $B(t_1)$ to $B(t_5)$ is given in Fig. 1. The backup $B(t_1)$, i.e. node 1, is a full backup which contains all data at the instant $t_1$. Simultaneously, this backup is the reference backup for the partial backups $B(t_2)$ and $B(t_4)$. These backups are reference backups for the partial backups $B(t_3)$ and $B(t_5)$. The interval backups $B(t_2)$, $B(t_3)$ and $B(t_5)$ contain changes of data units compared to the nearest previous backup. The interval backup $B(t_4)$ contains the latest version of the data units which were changed within the interval $(t_1, t_4\rangle$.

Fig. 1: An example of a recovery scheme.

From Fig. 1, it is apparent that if we want to recover data at the instant $t_3$, we must first write the full backup $B(t_1)$ into the main repository, then we must revise these data by the backup $B(t_2)$, and finally, we must revise the data obtained by using the backup $B(t_3)$. Now, we can explain the recovery schemes more precisely. In this paper, we focus on the milestone, differential and incremental recovery schemes because these schemes are the most widely used. The above recovery schemes are associated with the full, differential and incremental backup strategies [9]. For our example of five backups, the aforesaid schemes are illustrated in Fig. 2 to Fig. 4.

Fig. 2 illustrates the milestone recovery scheme. This scheme consists of full backups (the so-called milestones) only, i.e. $B(t_i) = F(t_i)$. A drawback of this recovery scheme is the large size of backups (a large capacity of backup repositories is required), but the data recovery is very simple. The data $D(t_i)$ are recovered simply by writing the full backup $F(t_i)$ into the main repository, i.e.:

$$D(t_i) = F(t_i). \tag{7}$$

Fig. 2: An example of the milestone recovery scheme.

Other schemes are based on a combination of the full backup and partial backups. We will refer to these schemes as reference recovery schemes. The differential and incremental recovery schemes are the most frequently used. The differential recovery scheme is illustrated in Fig. 3. In this type of recovery scheme, the first backup $B(t_1)$ is the full backup and all the other backups are interval backups. The reference backup for each interval backup is the first backup. Formally, we can express this scheme as $B(t_1) = F(t_1)$ and $B(t_i) = I(t_1, t_i)$, where $i = 2$ to $5$. The advantage over the previous scheme is the smaller size of the aggregate of backups, but the recovery of $D(t_i)$ for $i > 1$ is more complicated. In this case, we first write the full backup $F(t_1)$ into the main repository and then we revise these data by the backup $I(t_1, t_i)$. Formally, we can express the data recovery according to the differential recovery scheme as:

$$D(t_i) = \begin{cases} F(t_1), & \text{for } i = 1, \\ F(t_1) \oplus I(t_1, t_i), & \text{otherwise}. \end{cases} \tag{8}$$

Fig. 3: An example of the differential recovery scheme.

Fig. 4 illustrates the incremental recovery scheme. In this type of recovery scheme, the first backup $B(t_1)$ is again the full backup and all the other backups are interval backups. However, the reference backup for each interval backup is the nearest previous backup. Formally, we can express this scheme as $B(t_1) = F(t_1)$ and $B(t_i) = I(t_{i-1}, t_i)$, where $i = 2$ to $5$. The advantage over the two previous schemes is the smaller size of the aggregate of backups; however, the recovery of $D(t_i)$ for $i > 1$ is the most complicated on average. This is because data from the backup $B(t_1)$ must be sequentially revised by all backups

$B(t_2)$ to $B(t_i)$. Formally, we can express the data recovery according to the incremental recovery scheme as:

$$D(t_i) = \begin{cases} F(t_1), & \text{for } i = 1, \\ F(t_1) \oplus I(t_1, t_2) \oplus \ldots \oplus I(t_{i-1}, t_i), & \text{otherwise}. \end{cases} \tag{9}$$

Fig. 4: An example of the incremental recovery scheme.

All the reference schemes described are forward data recovery schemes, i.e. we can recover the data $D(t_i)$ from the initial full backup $F(t_1)$, where $i > 1$. By analogy, there are also backward data recovery schemes. In these cases, the reference full backup contains the newest data and the partial backups contain older versions of data units. If need be, we can return to a certain former state of the data. The representatives of backup systems with backward data recovery are some backup systems with data mirroring. Data mirroring (e.g. [10]) is based on the fact that each change of any data unit in the main repository is practically instantly executed in the backup repository too. Therefore, the data in both repositories are identical and the backup repository can be used in the case of the failure of the main repository as a full-blown replacement of the impaired repository. When we back up older versions of data units, then from the valid data $D(t_j)$, we can restore data at the instant $t_i$, where $t_i < t_j$.

4. Atomic backup

Now, we can return to the description of the atomic backup. In the case of atomic backup, when a new version of the data unit is written, the older version of this data unit (the so-called snapshot) is retained in the repository. A backup program looks up and backs up snapshots which have not been backed up yet. By using the time data which are kept about each snapshot, we can arrange the versions of all data units according to their validity in time. This sequence allows us to recover data at an arbitrary instant. From the backup viewpoint, we can consider each snapshot as an individual backup. Then we can consider the sequence of snapshots in time as a sequence of individual backups which are organized in the incremental recovery scheme. Each atomic backup always contains a single version of a single data unit only.
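The idea that time-stamped single-unit backups form an incremental chain which can be replayed up to an arbitrary instant can be sketched as follows. This is a minimal illustration under our own assumptions: the function name `recover_at`, the tuple layout `(time, unit_index, version)` and the sample values are hypothetical and are not taken from the paper.

```python
def recover_at(full_backup, atomic_backups, t):
    """Replay atomic backups, ordered by time, onto a full backup up to instant t.

    full_backup    -- list of data unit versions valid at the first instant
    atomic_backups -- iterable of (time, unit_index, version); each entry
                      carries a single version of a single data unit
    t              -- instant for which the data should be recovered
    """
    data = list(full_backup)                      # do not modify the backup itself
    for time, index, version in sorted(atomic_backups, key=lambda a: a[0]):
        if time > t:
            break                                 # later changes are not applied
        data[index] = version                     # revision by one atomic backup
    return data


# Hypothetical sample: three data units and three later changes.
full = ["a0", "b0", "c0"]
changes = [(2.0, 1, "b1"), (3.5, 2, "c1"), (5.0, 1, "b2")]
print(recover_at(full, changes, 4.0))
```

Because every change is kept, any instant between the full backup and the latest change can be chosen for `t`; the price is that the list of atomic backups grows with every single change.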
When the data unit $d_y$ is changed at the instant $t_i$, then we can formally express the corresponding atomic backup $A(t_{i-1}, t_i)$ as:

$$A(t_{i-1}, t_i) = [b_x(t_{i-1}, t_i)]_{x=1}^{n}, \tag{10}$$

where

$$b_x(t_{i-1}, t_i) = \begin{cases} d_y(t_i), & \text{for } x = y, \\ 0, & \text{otherwise}. \end{cases} \tag{11}$$

Of course, the operation of data revision also holds for atomic backups. We will illustrate the atomic backup on a scenario according to Table 1 and Fig. 5.

Table 1: A scenario for the illustration of the atomic backup.

| $i$ | $d_1(t_i)$ | $d_2(t_i)$ | $d_3(t_i)$ | Backups $B$ |
|---|---|---|---|---|
| 1 | $d_1(t_1)$ | $d_2(t_1)$ | $d_3(t_1)$ | $B(t_1) = F(t_1) = [d_1(t_1), d_2(t_1), d_3(t_1)]$ |
| 2 | $d_1(t_1)$ | $d_2(t_2)$ | $d_3(t_1)$ | $B(t_2) = A(t_1, t_2) = [0, d_2(t_2), 0]$ |
| 3 | $d_1(t_1)$ | $d_2(t_2)$ | $d_3(t_3)$ | $B(t_3) = A(t_2, t_3) = [0, 0, d_3(t_3)]$ |
| 4 | $d_1(t_1)$ | $d_2(t_4)$ | $d_3(t_3)$ | $B(t_4) = A(t_3, t_4) = [0, d_2(t_4), 0]$ |
| 5 | $d_1(t_1)$ | $d_2(t_4)$ | $d_3(t_3)$ | |

Fig. 5: An example of the atomic backup.

In this scenario, the data consist of three data units $d_x$, where $x \in \{1, 2, 3\}$. At the instant $t_1$, three data units $d_1(t_1)$, $d_2(t_1)$ and $d_3(t_1)$ were placed into the repository. The full backup $F(t_1)$ was simultaneously created from these versions of data units, i.e. $B(t_1) = F(t_1) = [d_1(t_1), d_2(t_1), d_3(t_1)]$ (see Fig. 5). At the instant $t_2$, the data unit $d_2$ was changed from the version $d_2(t_1)$ to the version $d_2(t_2)$. At the instant $t_3$, the data unit $d_3$ was changed from the version $d_3(t_1)$ to the version $d_3(t_3)$, while at the instant $t_4$, the data unit $d_2$ was changed from the version $d_2(t_2)$ to the version $d_2(t_4)$. Let us suppose that a backup program is activated at the instant $t_5$. At this instant, the valid data units are $d_1(t_1)$, $d_2(t_4)$ and $d_3(t_3)$. Also situated in the repository are the snapshots $d_2(t_1)$, $d_3(t_1)$ and $d_2(t_2)$. According to the time data in the metadata of snapshots,

the backup program can arrange these snapshots in time. On this basis, the backup program creates the atomic backups $B(t_2) = A(t_1, t_2) = [0, d_2(t_2), 0]$, $B(t_3) = A(t_2, t_3) = [0, 0, d_3(t_3)]$, and $B(t_4) = A(t_3, t_4) = [0, d_2(t_4), 0]$. The snapshots can then be deleted and the backups $B(t_1)$, $B(t_2)$, $B(t_3)$, and $B(t_4)$ can be utilized to recover the data at an arbitrary instant in the interval $\langle t_1, t_5)$. As regards the last line of Table 1, it should be noted that there was no data change at the instant $t_5$ and therefore no atomic backup exists at this moment.

If the aggregate of atomic backups were overly large, then we could transform the atomic backups into a single interval backup. In this way, we can reduce the total size of backups because the interval backup contains only the latest version of each changed data unit. At the same time, however, we lose the ability to recover data at an arbitrary instant. We can derive the interval backup from atomic backups by using the data revision as follows:

$$I(t_i, t_{i+k}) = A(t_i, t_{i+1}) \oplus A(t_{i+1}, t_{i+2}) \oplus \ldots \oplus A(t_{i+k-1}, t_{i+k}). \tag{12}$$

For the above example, we can write that:

$$I(t_1, t_4) = A(t_1, t_2) \oplus A(t_2, t_3) \oplus A(t_3, t_4) = [0, d_2(t_2), 0] \oplus [0, 0, d_3(t_3)] \oplus [0, d_2(t_4), 0] = [0 \oplus 0 \oplus 0,\; d_2(t_2) \oplus 0 \oplus d_2(t_4),\; 0 \oplus d_3(t_3) \oplus 0] = [0, d_2(t_4), d_3(t_3)].$$

We see that our atomic backups contain three non-empty data units but the interval backup consists of two non-empty data units only. We have reduced the size of backups, but now we can only recover the data at the instant $t_4$.

When we want to have the possibility of recovering data from the past, then we can use a combination of data mirroring and snapshotting. The principle is the following. When the data unit $d_x$ in the main repository is changed from the version $d_x(t_{i-1})$ to the version $d_x(t_i)$, then this change is also executed in the backup repository and the older version $d_x(t_{i-1})$ is backed up as the atomic backup $A(t_i, t_{i-1})$. Atomic backups arranged according to time enable us to recover any past data. For example, when we need to recover the data $D(t_j)$, then we execute this recovery as:

$$D(t_j) = D(t_i) \oplus A(t_i, t_{i-1}) \oplus A(t_{i-1}, t_{i-2}) \oplus \ldots \oplus A(t_{j+1}, t_j). \tag{13}$$

An example of the recovery scheme which is based on the combination of the data mirroring and snapshotting is in Fig.
6 and Table 2. This example corresponds to the scenario from Table 1. For this recovery scheme, the valid data $D(t_i)$ of the backup repository are the reference full backup, i.e. $F(t_i) = D(t_i)$. At the instant $t_5$, it holds that $F(t_5) = [d_1(t_1), d_2(t_4), d_3(t_3)]$. This state is simultaneously the state at the instant of the latest data change, i.e. at the instant $t_4$. Therefore, we can write that $F(t_5) = D(t_5) = D(t_4) = B(t_4)$. The backup $F(t_5) = B(t_4)$ is the initial reference backup for the incremental recovery scheme with subsequent atomic backups $A(t_4, t_3) = [0, d_2(t_2), 0]$, $A(t_3, t_2) = [0, 0, d_3(t_1)]$, and $A(t_2, t_1) = [0, d_2(t_1), 0]$. Let us note that this scheme is very similar to the atomic incremental scheme in Fig. 5. The difference between these schemes is their orientation in time. In Fig. 5, we see the forward incremental scheme while Fig. 6 illustrates the backward incremental scheme. We point out that the backward recovery schemes need not be incremental atomic schemes only. As we mentioned above, we can derive arbitrary interval backups from atomic backups by using (12). This enables us to construct arbitrary backward interval schemes.

Table 2: A scenario illustrating the backward atomic backup.

| $i$ | $d_1(t_i)$ | $d_2(t_i)$ | $d_3(t_i)$ | Backups $B$ |
|---|---|---|---|---|
| 1 | $d_1(t_1)$ | $d_2(t_1)$ | $d_3(t_1)$ | |
| 2 | $d_1(t_1)$ | $d_2(t_2)$ | $d_3(t_1)$ | $B(t_1) = A(t_2, t_1) = [0, d_2(t_1), 0]$ |
| 3 | $d_1(t_1)$ | $d_2(t_2)$ | $d_3(t_3)$ | $B(t_2) = A(t_3, t_2) = [0, 0, d_3(t_1)]$ |
| 4 | $d_1(t_1)$ | $d_2(t_4)$ | $d_3(t_3)$ | $B(t_3) = A(t_4, t_3) = [0, d_2(t_2), 0]$ |
| 5 | $d_1(t_1)$ | $d_2(t_4)$ | $d_3(t_3)$ | $D(t_5) = B(t_4) = [d_1(t_1), d_2(t_4), d_3(t_3)]$ |

Fig. 6: An example of the backward atomic backup.

Now, we summarize the terminology introduced in this paper. We will classify the backups as follows: full, partial, interval, atomic. We will sort the recovery schemes as follows: milestone, reference, forward, backward. We will classify the reference recovery schemes as follows: differential, incremental,

combined. In the next section, we introduce a mathematical model for a quantitative evaluation of backups and recovery schemes.

5. Mathematical model

In this section, we deal with a mathematical model which enables us to quantify the size of different backups. This enables us to evaluate the properties of different data recovery schemes. We derive the model for the forward recovery schemes, but the formulas obtained are also valid for the backward recovery schemes. In the model, we assume that all data units $d_x$ consist of the same number of symbols. We will call this number of symbols the size of the data unit and denote it by $|d|$. We pessimistically suppose that there are no empty data units in the main repository, i.e. all $n$ data units always contain certain data. Then, for the total amount of the data $D$ stored in the main repository, it holds that $|D| = n \cdot |d|$. The same formula evidently holds for the size of the full backup:

$$|F(t_i)| = |D|. \tag{14}$$

Now, let us determine the size of partial backups. Let us suppose that the $(i-1)$-th change of the data unit $d_x$ occurred at the instant $t_{i-1}$ and the $i$-th change of this data unit occurred at the instant $t_i$. Denote the time between these consecutive changes by $u_i$, i.e. $u_i = (t_i - t_{i-1})$. In the model, the times $u_i$ are represented by the random variable $U$ whose average value is $\Delta$. We assume that this average value is the same for all data units. Then, the variable $\lambda = 1/\Delta$ represents the rate of changes of each data unit. We further assume that the probability distribution of the random variable $U$ is an exponential distribution. The cumulative distribution function $G$ of this distribution is:

$$G(u) = \Pr(U \le u) = 1 - e^{-\lambda u}. \tag{15}$$

Now, we will represent the interval backups mathematically. To this purpose, we will use the quantities in Fig. 7. In the figure, the instants $t_1$, $t_2$, $t_3$, and $t_4$ are indicated. The instants $t_1$ and $t_4$ are instants at which a certain data unit was changed. The instants $t_2$ and $t_3$ are instants at which the data backup was executed. We note that these backups need not be adjacent. The quantity $\tau = (t_3 - t_2)$ is the time between these backups, the variable $u = (t_4 - t_1)$ is the time between two consecutive changes of the data unit (i.e. a realization of the random variable $U$), and the quantity $r = (t_2 - t_1)$ is the time between the change of the data unit and the following backup. Now, we are interested in the probability $Q(\tau)$, which is the probability that when the data unit has not been changed again until the first backup (i.e. until $t_2$), then this data unit is not changed until the instant of the next observed backup (i.e. until $t_3 = t_2 + \tau$). We can formally express this probability as a conditional probability:

$$Q(\tau) = \Pr(U > r + \tau \mid U > r). \tag{16}$$

Fig. 7: Quantities for the derivation of the mathematical model.

The exponential distribution is a distribution without memory and therefore it holds that (e.g. [], p. 40):

$$\Pr(U > r + \tau \mid U > r) = \Pr(U > \tau). \tag{17}$$

It follows from (17) that when the data unit is not changed in the time interval $r$ (the condition $U > r$), then the probability that this data unit does not change in the time interval $(r + \tau)$ is the same as the probability that this change does not occur in the time interval $\tau$. Then we have:

$$Q(\tau) = \Pr(U > r + \tau \mid U > r) = \Pr(U > \tau) = 1 - \Pr(U \le \tau) = 1 - G(\tau) = e^{-\lambda \tau}. \tag{18}$$

The complementary probability $P(\tau) = 1 - Q(\tau)$ is, of course, the probability that the data unit is changed in the time interval $\tau$. The data $D$ consist of $n$ data units and therefore the interval backup $I(t - \tau, t)$ on average consists of $n \cdot P(\tau)$ changed data units. Then the average size $|I(t - \tau, t)|$ of this backup is:

$$|I(t - \tau, t)| = |d| \cdot n \cdot P(\tau) = |D| \cdot (1 - e^{-\lambda \tau}). \tag{19}$$

This formula enables computing the average size of each subsequent interval backup $B(t) = I(t - \tau, t)$, which is spaced the time interval $\tau$ from its reference backup $B(t - \tau)$. Let us assume that backups are executed periodically with the time interval $T$. Then it holds that the time interval between two different backups is $\tau = k \cdot T$, where $k = 1, 2, 3, \ldots$ Now, let us define the quantity $q$:

$$q = Q(T) = e^{-\lambda T}. \tag{20}$$

The quantity $q$ is the probability that the data unit does not change in the time interval $T$. The quantity $p = 1 - q$ is the probability that the data unit changes in the time interval $T$. For this quantity, it holds that:

$$p = 1 - q = 1 - e^{-\lambda T}. \tag{21}$$

Then the average size of the interval backup $I(t - k \cdot T, t)$ is:

$$|I(t - k \cdot T, t)| = |D| \cdot (1 - e^{-\lambda k T}) = |D| \cdot (1 - q^k). \tag{22}$$
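As a numerical illustration of (20) to (22), the following sketch evaluates $q$, $p$ and the average interval backup size for sample parameters. The function names and the sample values (mean time between changes of 10 days, daily backups, total size 1000 units) are our own assumptions chosen only for illustration.

```python
import math


def no_change_probability(rate, period):
    """q = exp(-lambda * T), Eq. (20): probability of no change within one period."""
    return math.exp(-rate * period)


def interval_backup_size(total_size, q, k):
    """Average size |D| * (1 - q**k) of an interval backup spanning k periods, Eq. (22)."""
    return total_size * (1.0 - q ** k)


# Assumed sample values: each data unit changes on average once per 10 days,
# backups are taken daily, and the total data size is 1000 (in arbitrary units).
mean_time_between_changes = 10.0            # Delta, in days
rate = 1.0 / mean_time_between_changes      # lambda = 1 / Delta
period = 1.0                                # T, in days
total_size = 1000.0                         # |D|

q = no_change_probability(rate, period)
p = 1.0 - q                                 # Eq. (21)
daily = interval_backup_size(total_size, q, 1)
weekly = interval_backup_size(total_size, q, 7)
print(f"q = {q:.4f}, p = {p:.4f}, daily = {daily:.1f}, weekly = {weekly:.1f}")
```

With these assumed values, a backup referring to the previous day holds roughly 9.5 % of $|D|$, while one referring to a backup taken a week earlier holds roughly half of $|D|$, which is noticeably less than seven daily backups together because repeated changes of the same data unit are stored only once.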

This formula enables computing the average size of each subsequent interval backup $B(t) = I(t - k \cdot T, t)$, which is spaced the time interval $k \cdot T$ from its reference backup $B(t - k \cdot T)$.

Now, we determine the total average size $S$ of the aggregate of atomic backups from the time interval $T$. We have already introduced the notation according to which the rate of changes of each data unit is $\lambda$ and the number of data units is $n$. Then, the total number of changes is $N = n \cdot \lambda \cdot T$. Each such change is recorded in one atomic backup of $|d|$ symbols in size. Therefore, the total average size $|S(t - T, t)|$ of the aggregate of atomic backups from the time interval $T$ is:

$$|S(t - T, t)| = |d| \cdot n \cdot \lambda \cdot T = |D| \cdot \lambda \cdot T = |D| \cdot \ln \frac{1}{1 - p}. \tag{23}$$

We note that the inversion of equation (21), i.e. $\lambda T = -\ln(1 - p)$, was used to express $|S(t - T, t)|$ in the last term. By using formulas (14), (22), and (23), we can quantify the average size of different types of backup. Now, we can quantitatively analyze the different types of recovery scheme.

6. Discussion

We will illustrate the utilization of our model on the recovery schemes in Figs 1 to 4. The milestone scheme is in Fig. 2, the differential scheme in Fig. 3, the incremental scheme in Fig. 4, and the combined scheme in Fig. 1. We note that the schemes introduced are comparable since all these schemes consist of $M = 5$ backups. We will restrict ourselves to forward schemes only, but the results obtained, of course, also hold for the equivalent backward recovery schemes. With all interval recovery schemes, we assume that the backups are executed with the period $T$. The size of the data $D(t_i)$ and the backups $B(t_i)$ will be denoted $|D(t_i)|$ and $|B(t_i)|$, respectively.

For a comparison of the schemes, we will utilize two parameters. The first parameter is the average total backup size $C$. We compute the value of this parameter by summing the average sizes of all backups in the given scheme, i.e.:

$$C = \sum_{i=1}^{M} |B(t_i)|. \tag{24}$$

The parameter $C$ represents the storage space demands of the given recovery scheme and therefore we try to minimize its value. A higher value of the average total backup size means that the backup repository must have a larger storage capacity and therefore a higher cost too.
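The parameter $C$ of (24) can be evaluated numerically for any of the schemes in Figs 1 to 4 once each scheme is written down as a map from a backup to its reference backup. The sketch below is our own illustration: the dictionary names and the helper functions are hypothetical, the total data size is normalized to 1, and the backups are assumed to be taken at equally spaced instants $t_i = i \cdot T$, so that backup $i$ with reference backup $j$ spans $k = i - j$ periods in (22).

```python
def backup_sizes(reference, total_size, q):
    """Average size of every backup in a scheme.

    reference -- dict: backup index -> index of its reference backup,
                 or None when the backup is a full backup
    """
    sizes = {}
    for i, ref in reference.items():
        if ref is None:
            sizes[i] = total_size                           # full backup, Eq. (14)
        else:
            sizes[i] = total_size * (1.0 - q ** (i - ref))  # interval backup, Eq. (22)
    return sizes


def total_backup_size(reference, total_size, q):
    """Average total backup size C of Eq. (24)."""
    return sum(backup_sizes(reference, total_size, q).values())


# The four schemes with M = 5 backups, written as reference maps.
MILESTONE = {1: None, 2: None, 3: None, 4: None, 5: None}   # Fig. 2
DIFFERENTIAL = {1: None, 2: 1, 3: 1, 4: 1, 5: 1}            # Fig. 3
INCREMENTAL = {1: None, 2: 1, 3: 2, 4: 3, 5: 4}             # Fig. 4
COMBINED = {1: None, 2: 1, 3: 2, 4: 1, 5: 4}                # Fig. 1

q = 0.5   # assumed probability that a data unit does not change within T
for name, scheme in [("milestone", MILESTONE), ("differential", DIFFERENTIAL),
                     ("incremental", INCREMENTAL), ("combined", COMBINED)]:
    print(name, total_backup_size(scheme, 1.0, q))
```

Writing the schemes as reference maps keeps the computation independent of the particular scheme, so other combinations of full and interval backups can be compared by changing only the dictionary.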
The average recovery time is the next criterion for evaluating the recovery schemes. If we assume a constant rate of writing data from the backup repository into the main repository, then the average recovery time depends on the average size of the backups that we must write into the main repository in order to recover the required data $D(t_i)$. However, to recover different data, i.e. $D(t_1)$ to $D(t_M)$, we must write into the repository different backups of different sizes. For example, in the combined recovery scheme, we need to write only the full backup $B(t_1)$ to recover the data $D(t_1)$. However, to recover the data $D(t_5)$, we need to write into the repository the backups $B(t_1)$, $B(t_4)$, and $B(t_5)$. We will call the set of backups needed for recovering the data $D(t_i)$ the recovery data and denote their average size by $R_i$.

Now, let us return to the recovery schemes. We recall that the node $i$ represents the backup at the instant $t_i$. We will call the node of a full backup the root. Let us denote by $V_i$ the set of nodes in the path from the root to the node $i$. In our example, the path for the node 1 is given by the node itself and therefore $V_1 = \{1\}$. In the case of the node 5, the path is given by the sequence of nodes 1-4-5 and therefore $V_5 = \{1, 4, 5\}$. Then, for $R_i$ the following holds:

$$R_i = \sum_{k \in V_i} \overline{B}(t_k). \tag{25}$$

For example, in the combined scheme, $R_5 = \overline{B}(t_1) + \overline{B}(t_4) + \overline{B}(t_5)$. By using (25), we can compute the size of the data for recovering any state of $D(t_i)$. We will call the average value of all quantities $R_1$ to $R_M$ the average data recovery size and denote it by $R$. Formally, we can write:

$$R = \frac{1}{M} \sum_{i=1}^{M} R_i. \tag{26}$$

The average recovery time is directly proportional to the quantity $R$ and therefore we try to minimize the value of $R$. In such a case, the data recovery will be the fastest on average.

Now, we derive the formulas for $C$ and $R$ for all the schemes considered. In the case of the milestone scheme, the average total backup size equals the sum of the average sizes of all $M = 5$ full backups, i.e.:

$$C = \sum_{i=1}^{M} \overline{F}(t_i) = D \cdot M. \tag{27}$$

The average sizes of the recovery data $R_i$ are identical for all $D(t_i)$, namely $R_i = D$. Therefore, for the average data recovery size $R$ simply holds:

$$R = D. \tag{28}$$

In the case of the differential recovery scheme, we know that $B(t_1) = F(t_1)$ and $B(t_i) = I(t_1, t_i)$, where $i = 2$ to $M$. Then, for the sizes of the single backups, the following is valid:

$$\overline{B}(t_i) = \begin{cases} \overline{F}(t_1) = D, & i = 1, \\ \overline{I}(t_1, t_i) = D \cdot (1 - q^{\,i-1}), & i = 2, 3, \dots, M. \end{cases} \tag{29}$$
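The path sum for the recovery data and its mean over all backups can be evaluated mechanically for any scheme once the reference of each backup is known. The sketch below is an illustration under stated assumptions, not the author's code: the dictionary `reference` (a hypothetical name) maps every node to its reference node, with `None` for the full backup at the root, and an interval backup spaced k periods from its reference is assumed to have the average size D·(1 - q^k). It is demonstrated on the differential scheme with M = 5.

```python
def path_to_root(node, reference):
    """Return the nodes on the path from the root (full backup) to `node`, in path order."""
    path = []
    while node is not None:
        path.append(node)
        node = reference[node]
    return path[::-1]


def backup_sizes(reference, data_size, q):
    """Average size of each backup.

    Assumption: D for the root, D * (1 - q**k) for a backup k periods after its reference.
    """
    sizes = {}
    for node, ref in reference.items():
        if ref is None:
            sizes[node] = data_size
        else:
            sizes[node] = data_size * (1.0 - q ** (node - ref))
    return sizes


def scheme_parameters(reference, data_size, q):
    """Return (C, R): the average total backup size and the average data recovery size."""
    sizes = backup_sizes(reference, data_size, q)
    total = sum(sizes.values())
    recovery = [sum(sizes[k] for k in path_to_root(i, reference)) for i in reference]
    return total, sum(recovery) / len(recovery)


# Differential scheme with M = 5: every later backup refers to the full backup 1.
differential = {1: None, 2: 1, 3: 1, 4: 1, 5: 1}
C, R = scheme_parameters(differential, data_size=1.0, q=0.5)
print(f"C = {C:.4f} D, R = {R:.4f} D")
```

For q = 0.5 and D = 1 the differential scheme gives C = 4.0625 and R = 1.6125. Other schemes only need a different `reference` mapping, so the same two functions cover the milestone, incremental, and combined cases.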

By substituting these quantities into (24), (25), and (26), we finally obtain:

$$C = D \cdot \left( M - \frac{q - q^{M}}{1 - q} \right) \tag{30}$$

and

$$R = \frac{D}{M} \cdot \left( 2M - 1 - \frac{q - q^{M}}{1 - q} \right). \tag{31}$$

In the case of the incremental recovery scheme (see Fig. 4), we know that $B(t_1) = F(t_1)$ and $B(t_i) = I(t_{i-1}, t_i)$, where $i = 2$ to $M$. Then, for the sizes of the single backups, the following is valid:

$$\overline{B}(t_i) = \begin{cases} \overline{F}(t_1) = D, & i = 1, \\ \overline{I}(t_{i-1}, t_i) = D \cdot (1 - q), & i = 2, 3, \dots, M. \end{cases} \tag{32}$$

By substituting these quantities into (24), (25), and (26), we finally obtain:

$$C = D \cdot [M - (M - 1) \cdot q] \tag{33}$$

and

$$R = \frac{D}{2} \cdot [1 + M - (M - 1) \cdot q]. \tag{34}$$

For comparison, we additionally derive the quantities $C$ and $R$ for the combined recovery scheme. The first backup is full, i.e. $\overline{B}(t_1) = D$. The second, third and fifth backups are spaced one time interval $T$ from their reference backups and therefore we can write that $\overline{B}(t_2) = \overline{B}(t_3) = \overline{B}(t_5) = D \cdot (1 - q)$. The fourth backup is spaced the time interval $3 \cdot T$ from its reference backup and therefore $\overline{B}(t_4) = D \cdot (1 - q^3)$. By substituting these quantities into (24), (25), and (26), we obtain:

$$C = D \cdot (5 - 3q - q^3) \tag{35}$$

and

$$R = \frac{D}{5} \cdot (11 - 4q - 2q^3). \tag{36}$$

To compare $C$ and $R$ for the schemes considered, we use the functions $C = f(p)$ and $R = g(p)$, where $p = 1 - q$ is the probability that the data unit is changed in the time interval $T$.

The graphs of the functions $C = f(p)$, in multiples of the quantity $D$, are shown in Fig. 8 for all the schemes considered. In the case of the milestone recovery scheme, the value of $C$ is the constant $5 \cdot D$, or generally $M \cdot D$. Now, let us analyze the reference recovery schemes. For $p = 0$, we can see that $C = D$ for all reference schemes. This is due to the fact that there are no changes in the data units. All subsequent backups are empty and the value $C = D$ is given by the first (i.e. full) backup. The second common extreme of all reference schemes is the value of $C$ when the rate of data unit changes $\lambda \to \infty$. In this case, all data units are always changed within the interval $T$ and therefore the probability $p = 1$. Then, the total average size $C$ of the backups equals the value $M \cdot D$, i.e. the storage space demands of all reference schemes are identical with those of the milestone scheme.
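The closed-form expressions for C and R can be collected into a few small functions and checked at the two extremes discussed above. This is a sketch with hypothetical function names; it takes q = 1 - p, and for the differential scheme it evaluates the geometric sum q + q^2 + ... + q^(M-1) directly instead of the fraction (q - q^M)/(1 - q), which is an implementation choice made here to avoid a division by zero at p = 0.

```python
def milestone(D, M, p):
    """Milestone scheme: every backup is full, so C = D * M and R = D."""
    return D * M, D


def differential(D, M, p):
    """Differential scheme: C = D * (M - g), R = D * (2M - 1 - g) / M."""
    q = 1.0 - p
    g = sum(q ** j for j in range(1, M))  # equals (q - q**M) / (1 - q) for q != 1
    return D * (M - g), D * (2 * M - 1 - g) / M


def incremental(D, M, p):
    """Incremental scheme: C = D * (M - (M-1) q), R = D * (1 + M - (M-1) q) / 2."""
    q = 1.0 - p
    return D * (M - (M - 1) * q), D * (1 + M - (M - 1) * q) / 2


def combined(D, p):
    """Combined scheme with M = 5: C = D * (5 - 3q - q^3), R = D * (11 - 4q - 2q^3) / 5."""
    q = 1.0 - p
    return D * (5 - 3 * q - q ** 3), D * (11 - 4 * q - 2 * q ** 3) / 5


if __name__ == "__main__":
    D, M = 1.0, 5
    for p in (0.0, 0.5, 1.0):
        rows = {
            "milestone": milestone(D, M, p),
            "differential": differential(D, M, p),
            "combined": combined(D, p),
            "incremental": incremental(D, M, p),
        }
        for name, (c, r) in rows.items():
            print(f"p={p:.1f}  {name:12s}  C={c:.4f}  R={r:.4f}")
```

At p = 0 every reference scheme returns C = D, and at p = 1 every scheme returns C = M·D, which matches the two common extremes described in the text.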
[Plot: C in multiples of D versus p; curves for the milestone, differential, combined, and incremental schemes.]
Fig. 8: The dependence of the average total backup size C on the probability p for different recovery schemes.

When we compare all schemes, we can see that there are two extremes. The first extreme is the milestone recovery scheme. For all $p \in (0, 1)$, this scheme has the highest demands on the storage capacity of the backup repository. The second extreme is the incremental recovery scheme where, by contrast, these storage space demands are the lowest. Then, we can take the storage space demands of the milestone recovery scheme as an upper bound:

$$C_{\max} = D \cdot M \tag{37}$$

and the storage space demands of the incremental recovery scheme as a lower bound:

$$C_{\min} = D \cdot [1 + (M - 1) \cdot p]. \tag{38}$$

The storage space demands of the other schemes lie between these bounds.

Fig. 9 illustrates the functions $R = g(p)$, i.e. the dependence of the average data recovery size $R$ on the probability $p$. We can see that the lower bound of $R$ is given by the milestone recovery scheme:

$$R_{\min} = D. \tag{39}$$

By contrast, the upper bound of $R$ is given by the incremental recovery scheme:

$$R_{\max} = D \cdot \left( 1 + \frac{M - 1}{2} \cdot p \right). \tag{40}$$

From the two figures, we can see that the milestone and incremental recovery schemes are boundary cases for both the quantity $C$ and the quantity $R$. The milestone recovery scheme is the worst scheme from the viewpoint of storage space demands, but the best scheme from the viewpoint of the recovery time. In the case of the incremental recovery scheme, the exact opposite holds. Then, we can take the differential and combined recovery schemes as a compromise between the two extremes above.

[Plot: R in multiples of D versus p; curves for the incremental, milestone, combined, and differential schemes.]
Fig. 9: The dependence of the average data recovery size R on the probability p for different recovery schemes.

The last point in this discussion is a comparison of the sizes of the interval and atomic backups. From the formula for the interval backup, we know that the average size $\overline{I}$ of the interval backup within the time interval $T$ is:

$$\overline{I}(t_i - T, t_i) = D \cdot (1 - q) = D \cdot p. \tag{41}$$

From (23), we know that the total average size $\overline{S}$ of all atomic backups from the time interval $T$ is:

$$\overline{S}(t_i - T, t_i) = D \cdot \ln\frac{1}{1 - p}. \tag{42}$$

Fig. 10 illustrates the dependence of $\overline{S}$ and $\overline{I}$ on the probability $p$. We can easily prove that $\overline{S}(t_i - T, t_i) \,/\, \overline{I}(t_i - T, t_i) \ge 1$, i.e. the average size $\overline{I}$ of the interval backup within the time interval $T$ is not greater than the total average size $\overline{S}$ of all atomic backups within the same time interval $T$. This is because the aggregate of atomic backups contains all versions of the data units and not only the latest versions.

[Plot: S and I in multiples of D versus p; curves I and S.]
Fig. 10: Dependence of S and I on the probability p.

From the figure, we can see that the differences between $\overline{S}$ and $\overline{I}$ are not significant for small values of $p$. However, these differences are definitely significant for greater values of $p$. For example, the ratio $\overline{S} / \overline{I}$ equals approximately 1.4 for $p = 0.5$. Then, in the case of the atomic backup, we need a backup repository with a capacity which is 40 percent greater than in the case of the interval backup. In the case of $p \approx 0.8$, the ratio $\overline{S} / \overline{I}$ equals 2, i.e. we need a backup repository with a capacity which is two times greater than in the case of the interval backup. The advantage of the atomic backup, i.e. the possibility of recovering the data at any time instant, is paid for by increased requirements for the storage capacity of the backup repository.

7. Conclusion

In the paper, the terminology and mathematical apparatus for data backup and recovery are extended. The core of the paper is a mathematical model of the data backup and recovery.
The mathematical model enables us to compute the average size of an arbitrary backup from the probability $p$ that the data unit is changed in the time interval $T$. It enables us to determine the average total backup size $C$ (eq. 24) for any recovery scheme and also the average data recovery size $R$ (eq. 26). The model proposed allows us to compare different recovery schemes (e.g. Figs. 8 and 9) and extends the theory of the data backup and recovery. The model also allows us to compare interval and atomic backups (see Fig. 10).

The model introduced is based on the assumptions that the probability $p$ is the same for all data units and that the data unit change is an event which has no influence on the changes of other data units, i.e. these changes are mutually independent events. Further assumptions are that the total size of the data is constant and that no data unit is empty. The assumptions introduced are not general and therefore our next goal is to create a more general model. In any case, however, the model described is suitable at least for theoretical purposes.

Karel Burda received the M.S. and PhD. degrees in Electrical Engineering from the Liptovsky Mikulas Military Academy in 1981 and 1988, respectively. During 1988-2004, he was a lecturer in two military academies. At present, he works at Brno University of Technology. His current research interests include the security of information systems and cryptology.