Optimization Model of Reliable Data Storage in Cloud Environment Using Genetic Algorithm

International Journal of Grid Distribution Computing, pp.175-190
http://dx.doi.org/10.14257/ijgdc.2014.7.6.14

Feng Lu 1,2,3, Haitao Wu 1,3, Xiaochun Lu 1,3 and Xiyang Lu 4
1 National Time Service Center, Chinese Academy of Sciences, 3 East Shuyuan Road, Xi'an 710600, China
2 University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China
3 Key Laboratory of Precision Navigation and Timing Technology, Chinese Academy of Sciences, 3 East Shuyuan Road, Xi'an 710600, China
4 Xidian University, 2 South Taibai Road, Xi'an 710071, China
elkhood@163.com

Abstract

Massive data storage is one of the great challenges for cloud computing services, and reliable storage of sensitive data directly affects the quality of the storage service. In this paper, based on an analysis of the data storage process in a cloud environment, the cost of massive data storage is considered to comprise data storage price, data migration and communication, and storage reliability is considered to consist of data transmission reliability and hardware dependability. A multi-objective optimization model for reliable massive storage is proposed, in which storage cost and reliability are the objectives. Then, a genetic algorithm for solving the model is designed. Finally, experimental results indicate that the proposed model is valid and effective.

Keywords: Cloud storage, multi-objective optimization model, genetic algorithm, reliable storage

1. Introduction

The development of an information-based society entails that more and more resources are being digitized, causing endless growth of data storage capacity and thus a substantial increase in storage costs. Moreover, different applications require different storage capacity, yet the storage space assigned to these applications is often not fully utilized. The service provider faces a tradeoff between the rapid growth of information resources and the control of costs. On one hand, service providers not only produce enormous amounts of important data themselves, but also require massive information resources.
On the other hand, a large amount of storage equipment and manpower is needed to store the information resources. Therefore, new data storage devices required by service providers should include features such as storage virtualization [1], dynamically extensible storage capacity [2, 3] and reliable data storage.

Cloud storage is a new concept derived from cloud computing. It refers to a system that assembles many small and large storage devices of the same type in the grid, coordinated by application software, to jointly provide data storage and business access functions externally, using cluster applications, grid technology or a Distributed File System (DFS). Compared with traditional storage devices, cloud storage is not only hardware but also a complex system comprising network devices, storage devices, servers, application software, public access interfaces, the Access Network (AN), client programs, etc. Each part provides data storage and business access through application software, with the storage device as the core. Strictly speaking, cloud storage is a service rather than a storage device. A cloud storage provider may offer personalized storage services according to clients' demands, such as storage space, network bandwidth, data safety and disaster recovery performance.

Currently, cloud storage is a field in which hi-tech enterprises compete; many of them have launched their own cloud data storage products, including Cloud Drive by Amazon, Live Mesh by Microsoft and Docs by Google. Cloud storage has become a trend for future storage development. With the development of cloud storage technology, applications that combine search techniques with cloud storage should be improved with respect to safety, reliability, data access, etc.

ISSN: 2005-4262 IJGDC
Copyright © 2014 SERSC

2. Problem Analysis

2.1. Risk Analysis of Data Resources in Cloud Storage Environment

According to literature [4], Gartner indicates that cloud computing faces seven security risks: privileged user access, regulatory compliance, data location, data segregation, recovery, investigative support, and long-term viability. Among these risks, data disaster recovery is the most important issue to be considered by cloud storage providers and clients. Enterprises hand over their sensitive and important commercial data to the cloud service provider. Loss of the data will therefore not only bring the enterprises fatal disaster but also cause legal disputes between the enterprises and their clients. For example, in a banking system, client data is vital for both the bank and its clients; unrecoverable data loss will result in incalculable loss for both.
Therefore, for cloud storage providers, guaranteeing the safety, reliability and recoverability of data is their own business objective. Moreover, when important clients are provided with storage service, legal clauses are signed with them so that they will be compensated if data loss occurs. Guaranteeing data disaster recoverability is therefore of significant importance for cloud storage providers.

2.2. DC (Data Center)

Wikipedia defines a data center as a complex set of facilities which includes not only computer systems and associated components such as telecommunications and storage systems, but also redundant data communications connections, environmental controls and various security devices. In literature [5], Google describes datacenters as multi-function buildings where multiple servers and communication gear are co-located because of their common environmental requirements and physical security needs, and for ease of maintenance, rather than just a collection of co-located servers.

1) Server: Compared with a PC, a server has more reliable continuous operating capability, more powerful storage and network communication capability, faster failure recovery and easier extension. Data backup functions are also required by applications that are sensitive to data. In reality, servers are located in geographically different locations, so the influence of the geographical distance between servers should be considered when storing data. In practical applications, servers are usually connected by the Internet or routers. Considering each server as a node and the connections as edges, the connection of servers can be shown as a graph, more specifically a complete graph.

2) Storage capability: Hardware is vital for data storage and directly affects the DC's data storage capability and storage safety. Nowadays, various types of hard disks can be found on the market, each with its own characteristics. The hard disk interface is the connecting component between the hard disk and the host system, and transfers data between the HDD (Hard Disk Drive) cache and host memory. Different hard disk interfaces determine the speed of the connection between hard disk and computer. In the whole system, the quality of the hard disk interface directly affects program running speed and system performance. In the DC, the future of enterprise disk arrays lies in the mixed use of SSD (Solid State Disk) and HDD. For the sake of cost control, both should be taken into consideration: HDD can provide massive storage space, while SSD can provide high performance. Generally, for hard disks used in server configurations on the market, price is directly proportional to stability and data read/write speed.

3) Storage reliability: With the development of cloud computing, in recent years more and more individuals and enterprises have chosen to migrate their businesses to large DCs to reduce their own operating costs. The value of data is embodied in its context as well as in the potential benefit it brings to business development; hence data is becoming more important day by day, and users demand higher data storage reliability. Massive data storage presents a great many challenges. At the DC level, a massive data storage management system is highly complex. At the storage equipment level, numerous cheap storage devices are used by the DC to control cost, so data loss caused by storage device failure tends to occur easily. For example, data service interruption accidents at Google and Amazon made users concerned about DC storage reliability.
Therefore, how to improve the storage reliability of large-scale data in the DC has become a hot research topic in recent years.

3. Model Establishment

Since cloud storage is a burgeoning service, users are highly concerned about data storage reliability and safety, in addition to the price of the service. To some extent, the data submitted by users may involve their privacy or interests. Users may therefore choose the storage service that best meets their requirements by weighing storage price against reliability. Hence, when building the mathematical model, storage service cost and storage reliability are both considered as objectives, and a multi-objective optimization model of reliable storage for cloud storage is proposed.

3.1. Assumptions

Assume that the server cluster owned by a certain DC is $S = \{s_1, s_2, \dots, s_N\}$ and that the servers are located at geographically different places. The matrix $(d(s_i, s_j))_{N \times N}$ represents the spatial distance between two servers, where $d(s_i, s_j) \ge 0$ for $i, j \in \{1, 2, \dots, N\}$, and $d(s_i, s_j) = 0$ when $i = j$.

Assume that the data set submitted to the DC by users is $F = \{f_1, f_2, \dots, f_M\}$, where each $f_i$ is an indivisible data block with its own safety level $L(f_i)$, $i = 1, \dots, M$.
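The assumptions above can be sketched as a small data model. This is only an illustration: the names `Server`, `DataBlock` and `distance_matrix`, the planar coordinates, and the use of Euclidean distance for $d(s_i, s_j)$ are all assumptions made here, since the paper does not say how distance is measured.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Server:
    name: str
    x: float  # hypothetical planar coordinates, for illustration only
    y: float


@dataclass(frozen=True)
class DataBlock:
    name: str
    size_gb: float
    safety_level: int  # L(f_i)


def distance_matrix(servers):
    """Build the N x N matrix d(s_i, s_j); the diagonal is zero by construction."""
    return [[hypot(a.x - b.x, a.y - b.y) for b in servers] for a in servers]


# Made-up example values, not taken from the paper.
servers = [Server("s1", 0, 0), Server("s2", 3, 4), Server("s3", 6, 8)]
blocks = [DataBlock("f1", 40.0, 2), DataBlock("f2", 80.0, 1)]
d = distance_matrix(servers)
```

With this layout, `d[i][j]` plays the role of $d(s_i, s_j)$ and `blocks[i].safety_level` the role of $L(f_i)$.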

Different prices are charged by each server for storing data with different safety levels, mainly because data files with different safety levels have different numbers of backup files. The number and location of backups are decided according to the safety level specified for the data file. Therefore, when providing storage service, data files submitted by users may be migrated among servers in the DC as required. During migration, an overhead cost arises from communication and migration, along with some reliability problems. Reliability in migration is mainly related to connection stability and the distance among servers. For convenient model description, the following parameters are introduced:

Data file storage identification: if the data file $f_i$ is stored in the type-$k$ hard disk of server $s_j$, then mark the matrix entry $Y_{ij} = k$; otherwise $Y_{ij} = 0$.

Sign function: $\mathrm{sgn}()$ is defined as

$$\mathrm{sgn}(y) = \begin{cases} 1 & \text{if } y > 0 \\ 0 & \text{else} \end{cases}$$

Function $X(f_i, s_j)$ represents the storage space of data file $f_i$ in server $s_j$.

Intrinsic reliability $R(s_j)$ owned by the server $s_j$ itself.

Reliability $TR(s_i, s_j)$ of data transmission among servers, where $TR(s_i, s_j) \in [0, 1]$, and $TR(s_i, s_j)$ is inversely proportional to $d(s_i, s_j)$.

There are $K$ types of hard disk in the DC; the currently available capacity of the various hard disks in server $s_j$ is denoted $A_1(s_j), A_2(s_j), \dots, A_K(s_j)$, and the $h$-th type of hard disk has its own reliability $P(h)$, $h = 1, 2, \dots, K$.

Data storage reliability, which is the product of transmission reliability, hardware reliability and server reliability during data transmission; the formula is:

$$S_r(f_i, R(s_j), X(f_i, s_j), TR(s_j, s_B)) = \sum_{j=1}^{N} R(s_j)\, X(f_i, s_j)\, TR(s_j, s_B)\, P(Y_{ij}) \tag{1}$$

3.2. Storage Cost

Different prices are charged by each server for providing storage service to data at different safety levels, denoted $C(s_j, L(f_i))$. In addition, data files with different safety levels have different numbers of backup files; higher safety levels correspond to more backups.
Here, the safety level is made equivalent to the number of data file backups, and these backups are stored in servers at different geographic locations, which complies with the practice of data storage in cloud computing. Hence our assumption is reasonable. The storage price of file $f_i$ is:

$$C_c(f_i) = \sum_{j=1}^{N} C(s_j, L(f_i))\, X(f_i, s_j) \tag{2}$$
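As a minimal sketch of Equation (2), the per-file storage price can be computed as a price-weighted sum over servers. The function name `storage_cost`, the nested price table and all numbers below are hypothetical and serve only to show the arithmetic.

```python
def storage_cost(price, level, placement):
    """C_c(f_i) = sum over j of C(s_j, L(f_i)) * X(f_i, s_j).

    price[j][level] -- hypothetical unit price charged by server j for this safety level
    placement[j]    -- X(f_i, s_j), the space the file occupies on server j
    """
    return sum(price[j][level] * placement[j] for j in range(len(placement)))


# Made-up prices per safety level for three servers.
price = [{1: 0.10, 2: 0.15}, {1: 0.12, 2: 0.20}, {1: 0.08, 2: 0.18}]
# A level-2 file kept on servers 0 and 2 (10 units each), nothing on server 1.
placement = [10.0, 0.0, 10.0]
cost = storage_cost(price, 2, placement)
```

Servers that hold no copy contribute nothing to the sum, so only the two servers holding the file are charged in this example.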

The storage price of the data file $F$ submitted by the user is:

$$C_c(F) = \sum_{i=1}^{M} \sum_{j=1}^{N} C(s_j, L(f_i))\, X(f_i, s_j) \tag{3}$$

3.3. Migration Cost

Since servers are located in different geographic locations, different migration distances occur when a data block is migrated from one server to another. Hence, migration cost is related to the migration distance and the size of the data file. The migration cost of data block $f_k$ migrated from server $s_i$ to server $s_j$ is expressed as:

$$C_{mg}(f_k, s_i, s_j) = C_m\, X(f_k, s_i)\, d(s_i, s_j) \tag{4}$$

where $C_m$ is the migration cost parameter. Hence, the migration cost of data file $F$ migrated from server $s_i$ to another server $s_j$ is:

$$\sum_{k=1}^{M} C_{mg}(f_k, s_i, s_j) = \sum_{k=1}^{M} C_m\, X(f_k, s_i)\, d(s_i, s_j) \tag{5}$$

3.4. Communication Cost

Data transmission among servers is mainly reflected in communication flow due to the different geographic locations of the servers; hence, the distance between source server ($s_i$) and host server ($s_j$) is another factor influencing communication cost, expressed as:

$$C_{com}(f_k) = \sum_{i=1}^{N} \sum_{j=1}^{N} \left[ X(f_k, s_i)\, W(s_i, s_j)\, d(s_i, s_j) \right] \tag{6}$$

where $W(s_i, s_j)$ represents the communication cost of servers. The whole communication cost of data file $F$ after storage is:

$$C_{com}(F) = \sum_{k=1}^{M} \sum_{i=1}^{N} \sum_{j=1,\, j \ne i}^{N} \left[ X(f_k, s_i)\, W(s_i, s_j)\, d(s_i, s_j) \right] \tag{7}$$

3.5. Total Cost for Storage

The total storage cost of data file $F$ is the sum of the storage fee, migration cost and communication cost of each data block.

$$\begin{aligned} C(F, S) &= C_c(F) + \sum_{k=1}^{M} C_{mg}(f_k, s_i, s_j) + C_{com}(F) \\ &= \sum_{i=1}^{M} \sum_{j=1}^{N} C(s_j, L(f_i))\, X(f_i, s_j) + \sum_{k=1}^{M} C_m\, X(f_k, s_i)\, d(s_i, s_j) \\ &\quad + \sum_{k=1}^{M} \sum_{i=1}^{N} \sum_{j=1,\, j \ne i}^{N} \left[ X(f_k, s_i)\, W(s_i, s_j)\, d(s_i, s_j) \right] \end{aligned} \tag{8}$$

During migration, all servers in $S$ are required to be traversed, after which the minimum migration cost is obtained.

3.6. Reliability of Data Storage

$R(s_j)$ is the intrinsic reliability owned by server $s_j$, and $TR(s_i, s_j)$ is the reliability of data transmission among servers, where $TR(s_i, s_i) = 1$, $TR(s_i, s_j) \in [0, 1)$ for $i \ne j$, and $TR(s_i, s_j)$ is inversely proportional to $d(s_i, s_j)$. When $TR(s_i, s_j) = 0$, data cannot be migrated between servers $s_i$ and $s_j$. The $h$-th type of hard disk has intrinsic reliability, denoted $P(h)$, $h = 1, 2, \dots, K$. Data storage reliability, which is the product of transmission reliability, hardware reliability and server reliability during data transmission, has the following formula:

$$S_r(f_i, R(s_j), X(f_i, s_j), TR(s_j, s_B)) = \sum_{j=1}^{N} R(s_j)\, X(f_i, s_j)\, TR(s_j, s_B)\, P(Y_{ij}) \tag{9}$$

4. Mathematical Model

4.1. Objective Analysis

1) Cost calculation: when data file $F$ is submitted by the user from a certain server $s_B$, the total cost of data storage, which is the sum of storage cost, communication cost and migration cost, is calculated by the DC. The objective is to obtain the optimal storage solution that minimizes total storage cost.

2) Storage reliability: after data is submitted by the user, the DC needs to reach maximum reliability, which mainly includes transmission reliability and device dependability during storage.

4.2. Constraints

1) Storage constraint: the total data size in each server is no more than the available capacity of the hard disk;

2) Off-site storage of backup data: a data file and its backups should not be stored in servers at the same location;

3) Reliability constraint: the intrinsic reliability of the hard disk or server is no less than the safety factor of the data block.

4.3. Multi-objective Optimization Model

By aggregating each index analyzed above, the multi-objective optimization model of reliable storage in a cloud computing environment is obtained:

$$\begin{aligned} \min\ & C(F, S) = \sum_{i=1}^{M} \sum_{j=1}^{N} C(s_j, L(f_i))\, X(f_i, s_j) + \sum_{k=1}^{M} C_m\, X(f_k, s_i)\, d(s_i, s_j) \\ & \qquad\qquad + \sum_{k=1}^{M} \sum_{i=1}^{N} \sum_{j=1,\, j \ne i}^{N} \left[ X(f_k, s_i)\, W(s_i, s_j)\, d(s_i, s_j) \right] \\ \max\ & SR(F, S) = \sum_{i=1}^{M} S_r(f_i, R(s_j), X(f_i, s_j), TR(s_j, s_B)) = \sum_{i=1}^{M} \sum_{j=1}^{N} R(s_j)\, X(f_i, s_j)\, TR(s_j, s_B)\, P(Y_{ij}) \\ \text{s.t.}\ & L(f_i) \le R(Y_{ij}), \quad i = 1, 2, \dots, M \\ & f_i \le A_{Y_{ij}}(s_j), \quad j = 1, 2, \dots, N \\ & \sum_{j=1}^{N} \mathrm{sgn}(Y_{ij}) = L(f_i), \quad i = 1, 2, \dots, M \\ & R(s_j) \in [0, 1), \quad j = 1, 2, \dots, N \\ & P(h) \in [0, 1), \quad h = 1, 2, \dots, K \end{aligned} \tag{10}$$

Since the objective functions in the above model contain both a maximized and a minimized objective, it is not convenient to optimize both with the algorithm. For a more convenient solution of the above multi-objective optimization problem, we equivalently transform it into a multi-objective minimization problem:

$$\begin{aligned} \min\ & C(F, S) \\ \min\ & SR'(F, S) = 1 - SR(F, S) \\ \text{s.t.}\ & L(f_i) \le R(Y_{ij}), \quad i = 1, 2, \dots, M \\ & f_i \le A_{Y_{ij}}(s_j), \quad j = 1, 2, \dots, N \\ & \sum_{j=1}^{N} \mathrm{sgn}(Y_{ij}) = L(f_i), \quad i = 1, 2, \dots, M \\ & R(s_j) \in [0, 1), \quad j = 1, 2, \dots, N \\ & P(h) \in [0, 1), \quad h = 1, 2, \dots, K \end{aligned} \tag{11}$$

After the equivalent transformation, the smaller $SR'(F, S)$ is, the higher the reliability. Hence, in the following numerical experiment, problem (11) is validated and solved.

5. GA for Solving Multi-objective Optimization

Currently, there are three types of methods for solving multi-objective optimization: aggregating multiple objectives into a single objective, non-Pareto methods and Pareto-based methods. Although aggregation into a single objective is simple in design and efficient in operation, only one efficient solution can be obtained at the end. Multiple efficient solutions can be obtained by non-Pareto methods; however, these solutions are all concentrated at the endpoints of the efficient frontier, and too many non-inferior solutions are lost. Pareto-based optimization methods usually map the multi-objective values directly to the fitness function, and the efficient solution set is searched by comparing the dominance relation of the function values. Thus a series of non-inferior solutions can be obtained, which makes this an effective method. Due to the implicit parallelism, randomness and high robustness of GA, it is most often applied in Pareto-based optimization. A series of classic algorithms have been proposed and successfully applied.

5.1. Flow of GA

According to the GA concept, a simple GA flow can be shown as in Figure 1.

5.2. Solution Algorithm

In this paper, the well-known multi-objective genetic algorithm NSGA-II is used for solving
multi-objective optimization of reliable storage. According to the features of the model in this paper, the parameters of NSGA-II are set as follows:

Population size: $N = 100$.

Crossover: the simplex crossover operator is used for individual crossover, with crossover probability $p_c = 0.95$. A high crossover probability is used to raise the updating speed of the population, thereby increasing the convergence rate of the algorithm.

Mutation: the Gauss mutation operator is used for individual mutation, with mutation probability $p_m = 0.9$. Since the problem is a constrained multi-objective optimization, a feasible solution is difficult to obtain. Hence, the mutation probability is set to 0.9 to improve the search performance of the algorithm.

Selection strategy: the roulette selection strategy is used to select individuals for crossover or mutation, and the elitism strategy is used to preserve the next generation of the population.

Stop condition: when the algorithm reaches 400 iterations, or the solution found does not change in 30 iterations, the algorithm stops running and outputs the solution found.

[Figure 1. Flow Chart of GA. Boxes: start; algorithm initialization; generate initial population; compute individual fitness value; meet ending requirements?; crossover operation; mutation operation; select next generation of population; determine optimal solution; output optimal solution; end.]

The literature [6] has theoretically proved that using the elitism strategy and Gauss mutation guarantees the GA's convergence to the optimal solution with probability 1. Therefore, we adopt the two strategies to guarantee convergence of the algorithm.

6. Simulation Experiment

6.1. Experiment Setup

For the experiments, we adopt two cloud computing service environments, namely EC2 (from Amazon) and GoGrid, to validate the correctness and availability of the proposed model. The storage type configurations and storage prices used are obtained from data openly published by the two companies; the cloud storage prices of Amazon EC2 are classified as US-east, US-west and EU-west prices. Since storing a data file of a higher safety level requires a specific storage device and an encryption algorithm of the corresponding safety level, it demands more CPU computing time, resulting in a higher storage price. The detailed data is shown in Tables 1-5.

Table 1. Amazon EC2 Storage Instance and Price

Name         CPUs   Memory (GB)   Price ($/hour)
                                  US-east   US-west   EU-west
m1.small     1      1.7           0.085     0.095     0.095
m1.large     4      7.5           0.34      0.38      0.38
m1.xlarge    8      15            0.68      0.76      0.76
c1.medium    5      1.7           0.17      0.19      0.19
c1.xlarge    20     7             0.68      0.76      0.76
m2.xlarge    6.5    7             0.50      0.57      0.57
m2.2xlarge   13     34.2          1.00      1.14      1.14
m2.4xlarge   26     68.4          2.00      2.28      2.28

Table 2. GoGrid Storage Type and Price

Name       CPUs   Memory (GB)   Price ($/hour), EU-west
X-Small    0.5    0.5           0.095
Small      1      1             0.19
Medium     2      2             0.38
Large      4      4             0.76
X-Large    8      8             1.52
XX-Large   16     16            3.04

Table 3. Price of EC2 and GoGrid Network Data Transmission

Cloud Service   Price ($/GB)
                Data Input   Data Output
EC2             0.10         0.15
GoGrid          0.00         0.29
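To make the prices in Table 3 concrete, the following sketch computes what moving a file into or out of each service would cost. The helper `transfer_cost` and the dictionary layout are hypothetical; the per-GB prices are those of Table 3, and 120 GB is the data file size used in the experiments.

```python
# Per-GB transfer prices copied from Table 3 (in dollars).
TRANSFER_PRICE = {
    "EC2": {"in": 0.10, "out": 0.15},
    "GoGrid": {"in": 0.00, "out": 0.29},
}


def transfer_cost(provider, size_gb, direction):
    """Price of moving size_gb into ("in") or out of ("out") a provider."""
    return TRANSFER_PRICE[provider][direction] * size_gb


ec2_upload = transfer_cost("EC2", 120, "in")
ec2_download = transfer_cost("EC2", 120, "out")
gogrid_upload = transfer_cost("GoGrid", 120, "in")
gogrid_download = transfer_cost("GoGrid", 120, "out")
```

Under these prices, uploading is free on GoGrid but downloading costs nearly twice as much per GB as on EC2, which is one reason the direction of a migration matters for cost.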

Table 4. Available Bandwidth from UA to Cloud Service Provider

Cloud Service    Average Bandwidth (MB/s)   Transmission Time/GB (s)   Stdev (%)
EC2 us-east-1    1.54                       665                        40.87
EC2 us-west-1    1.19                       864                        42.43
EC2 eu-west-1    24.16                      42                         23.58
GoGrid EU West   5.91                       173                        32.06

Table 5. Private Storage Types

Name     CPUs   Memory (GB)
Small    4      2
Large    8      8
Xlarge   16     16

6.2. Analysis of Experimental Results

The experimental data file is stored both as one integral data file and divided into a group of sub data files. When the data file is treated as one integral file, the storage solution is to store all data in one and the same server. When it is treated as a group of sub data files, they can be stored in different servers. This setting helps to observe the effect of different types of data file storage solutions in a cloud storage environment on storage cost and reliability. The experimental results are shown in Figures 3-5, where each point represents the storage cost and reliability of an optimal storage solution obtained using the multi-objective optimization model for storage reliability proposed in this paper.

Firstly, in order to further analyze the effect of the migration frequency of data files during storage on storage reliability, related experiments are performed without considering storage cost. In these experiments, sub data files with higher safety levels make up a higher proportion. To improve the precision of the analysis, multiple experiments are performed to record the average migration frequency of all sub data files and the reliability of the corresponding solution; the result is shown in Figure 2. According to Figure 2, the storage reliability of data files decreases sharply with increasing average migration frequency. During storage, the reliability of the data to be stored is mainly influenced by the stability of network transmission. When providing storage service, if all data is stored in one server without any data file transfer, the most reliable storage can be achieved for all data files.
However, the cost of the storage service is highest in this case. To reduce storage cost, data files may be transferred, but that obviously reduces the storage reliability of the data files, as is evident in Figure 2. In a cloud environment, longer and more frequent data file transfers result in lower reliability for the whole data storage service.
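The downward trend can be illustrated with a simple multiplicative picture. The sketch below assumes, purely for illustration, that each transfer succeeds independently with the transmission reliability of its link, so that overall reliability is the product of server, disk and per-transfer link reliabilities; the paper does not state this exact model, and the function name and all numbers are hypothetical.

```python
from math import prod


def reliability_after_transfers(server_r, disk_r, link_rs):
    """Product of server, disk and per-transfer link reliabilities.

    link_rs holds one transmission reliability per transfer; an empty
    list means the data never leaves the server it was submitted to.
    """
    return server_r * disk_r * prod(link_rs)


# Made-up reliabilities: the same server and disk, with 0 to 3 transfers.
no_move = reliability_after_transfers(0.99, 0.98, [])
one_move = reliability_after_transfers(0.99, 0.98, [0.95])
three_moves = reliability_after_transfers(0.99, 0.98, [0.95, 0.95, 0.95])
```

Because every factor is below 1, each additional transfer can only lower the product, which matches the observation that keeping all data on one server without transfers gives the most reliable storage.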

Figure 2. Effect of Total Migration Frequency of Data Files on Storage Reliability

In the next phase of the experiment, all sub files of the experimental data files are stored as one integral file in order to observe the effect of sub data files on the storage solution. Two conditions are tested: an equal distribution of sub data files across the various safety levels, and a higher proportion of files at higher safety levels than at lower safety levels. The results are shown in Figure 3 and Figure 4. According to both figures, when the proportion of sub data files at higher safety levels increases, storage cost also increases, and the storage solutions produced by the model are evidently fewer than when sub data files at the various safety levels are equally distributed. This complies with practical application, thereby validating the correctness of the model.

Figure 3. Integral Storage of Data Files (data files at various safety levels are distributed equally in proportion)

Figure 4. Integral Storage of Data Files (proportion of data files at higher safety levels is higher than those at lower safety levels)

In the next phase, each sub file of the experimental data is stored individually, and the effect of the proportion of sub files at different safety levels on the storage solution is observed. The two conditions mentioned in the previous paragraph are also considered here. The results are shown in Figure 5 and Figure 6. According to the results, when sub files at the various safety levels are distributed in equal proportion, more uniformly distributed storage solutions can be obtained using the model proposed in this paper. In the output storage solutions, storage reliability is distributed moderately uniformly between 0.63 and 0.95, which is beneficial for providing clients with multiple representative solutions. When the proportion of sub files at higher safety levels is higher, the number of storage solutions and the uniformity and universality of their distribution clearly decrease. Comparing Figure 5 and Figure 6, the change in storage cost across the storage solutions shown in Figure 6 is obviously greater than that shown in Figure 5. This is because different servers or storage devices charge different storage prices for files at higher safety levels, so a larger data file at a higher safety level results in a dramatic change in storage cost. Figure 6 shows that when reliability is between 0.83 and 0.95, there are storage solutions without great change in storage cost. The quick increase in the migration frequency of data files in these solutions results in a decrease in reliability during migration.
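Each point in these figures is a trade-off between cost and reliability, and a Pareto-based method keeps only the points that no other point beats on both objectives. The sketch below shows one straightforward way to filter such non-dominated solutions; the function names and the candidate values are hypothetical and are not read off the figures, and this quadratic filter is a plain illustration rather than the sorting procedure used inside NSGA-II.

```python
def dominates(a, b):
    """True if solution a is at least as good as b on both objectives and
    strictly better on one. Solutions are (cost, reliability) pairs:
    lower cost is better, higher reliability is better."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost in dollars, reliability) candidates.
candidates = [(5500, 0.60), (7000, 0.80), (7500, 0.75), (9500, 0.95), (9800, 0.90)]
front = pareto_front(candidates)
```

Here `(7500, 0.75)` is dropped because `(7000, 0.80)` is both cheaper and more reliable, and `(9800, 0.90)` is dropped for the same reason relative to `(9500, 0.95)`; the remaining points are the ones a client would actually choose among.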

Internatonal Journal of Grd Dstrbuton Computng Fgure 5. Respectve Storage of Sub Data Fles (data fles at varous safety levels are dstrbuted equally n proporton) Fgure 6. Respectve Storage of Sub Data Fles (the proporton of data fles at hgher safety levels s hgher than that at lower safety levels) Fnally, comprehensve analyss s made by comparng Fgures 3, 4, 5 and 6. Among sub fles at varous safety levels of expermental data fles dstrbuted wth varous proportons, the storage cost of storng sub fles of same relablty separately s notceably less than storng the sub fles as one ntegral fle. Accordng to Fgure 3, f the expermental data fles are stored as one ntegral fle, the soluton of storage cost wth relablty greater than 0.5 obtaned by the model n ths paper s between 10~16 thousand dollars. Storage relablty and storage cost correspondng to dfferent storage solutons vary sgnfcantly. Storage cost ncreases sharply wth ncreased storage relablty. Ths s due to the fact that when ntegrally storng 120 GB-szed data fles, one frst needs to fnd a server wth avalable storage space of over 120 GB. oreover, storage prces of servers n dfferent locaton are dfferent. graton from the server where data was ntally submtted, to a dfferent target server wll not only produce hgher mgraton cost and communcaton cost, but also reduce Copyrght c 2014 SERSC 187

reliability of data storage. According to Figure 5, if the sub data files of the experimental data files are stored individually, the optimal storage cost obtained by the model in this paper is between 5500 and 9500 dollars, the storage reliability is between 0.60 and 0.95, the distribution is relatively uniform, and storage cost and reliability are relatively stable. Compared with Figure 3, storing the sub files of the data files separately clearly reduces storage cost, and, provided that storage reliability is guaranteed, the proposed model finds more storage solutions. Hence, individual storage of data files at various safety levels not only reduces the cost of data storage while guaranteeing storage reliability, but also yields more storage solutions, which helps service providers offer more diversified storage solutions and so improve the attractiveness of their storage services.

7. Conclusion

By analyzing data storage information in the cloud environment, a multi-objective optimization model for reliable storage is built. For data files with safety requirements stored by users, the model considers both data storage cost, including storage price, migration cost and communication cost, and data reliability, including transmission reliability during the storage process and storage device reliability after the data is stored. In order to validate the correctness and availability of the model, a multi-objective GA is designed under the framework of the NSGA-II algorithm to solve it. In the experiments, computations are carried out on several storage scenarios constructed from parameters published by existing commercial cloud storage services. The experimental results validate the correctness and availability of the model, and show that the model can provide multiple storage solutions for users so that storage resources in the DC can be effectively utilized.
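The trade-off between the two objectives summarized above can be illustrated with a small sketch. It is not the GA used in this paper; it only shows the Pareto dominance relation that NSGA-II style algorithms rely on to keep several non-dominated storage solutions. The `Solution` class, its field names, the additive cost, the multiplicative reliability and all numeric values are illustrative assumptions, not data from the experiments.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Solution:
    """One candidate storage plan (hypothetical fields and values)."""
    name: str
    storage_price: float
    migration_cost: float
    communication_cost: float
    transmission_reliability: float
    device_reliability: float

    @property
    def cost(self) -> float:
        # Assumption: total cost is the plain sum of the three cost terms.
        return self.storage_price + self.migration_cost + self.communication_cost

    @property
    def reliability(self) -> float:
        # Assumption: the two reliability factors are independent and multiply.
        return self.transmission_reliability * self.device_reliability


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.cost <= b.cost and a.reliability >= b.reliability
    better = a.cost < b.cost or a.reliability > b.reliability
    return no_worse and better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep the solutions that no other solution dominates (input order)."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions)]


# Made-up candidates: minimize cost, maximize reliability.
candidates = [
    Solution("A", 5000, 300, 200, 0.90, 0.70),
    Solution("B", 8000, 1000, 500, 0.98, 0.97),
    Solution("C", 9000, 1000, 500, 0.90, 0.90),  # costlier and less reliable than B
    Solution("D", 6000, 500, 500, 0.90, 0.90),
]

front = pareto_front(candidates)
for s in front:
    print(s.name, s.cost, round(s.reliability, 4))
```

Under these assumed values, candidate C is dropped because B is both cheaper and more reliable, while A, B and D remain as alternative trade-offs that could be offered to a user.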
Acknowledgement

This paper is a revised and expanded version of a paper entitled "Genetic Algorithm Based Optimization Model for Reliable Data Storage in Cloud Environment" presented at CST 2014, Indonesia, June 19-22, 2014.

References

[1] S. Aameek, K. Madhukar and M. Dushmanta, "Server-storage Virtualization: Integration and Load Balancing in Data Centers", Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, (2008) November 15-21, Texas, USA.
[2] C. Gregory, K. Idit and G. Rachid, "Reliable distributed storage", IEEE Computer Society, vol. 42, no. 4, (2009).
[3] Y. H. Deng, F. Wang and L. N. He, "Dynamic and Scalable Storage Management Architecture for Grid Oriented Storage Devices", Parallel Computing, vol. 34, no. 6, (2008).
[4] J. Heiser and M. Nicolett, "Assessing the Security Risks of Cloud Computing", http://www.gartner.com, (2008).
[5] B. Luiz and H. Urs, "The Datacenter as a Computer", Morgan & Claypool Publishers, California (2009).
[6] L. Fan, "Research on Efficient and Intelligent Algorithms for Two Classes of Complex Optimization Problem", Xidian University, (2012).
[7] S. Ghemawat, H. Gobioff and S. T. Leung, "The Google File System", Proceedings of the 19th ACM Symposium on Operating Systems Principles, (2003) October 19-22, New York, USA.
[8] A. Alves, C. Viegas and P. Nejdl, "A Distributed Tabling Algorithm for Rule Based Policy Systems", IEEE Computer Society, vol. 15, no. 5, (2004).

[9] C. Thomas, "The Basics of Reliable Distributed Storage Networks", IEEE Computer Society, vol. 6, no. 3, (2004).
[10] S. Ahmad, B. Ahmad, S. M. Saqib and R. M. Khattak, "Trust Model: Cloud's Provider and Cloud's User", International Journal of Advanced Science and Technology, vol. 44, (2012).
[11] S. Gurumurthi, "Architecting Storage for the Cloud Computing Era", IEEE Computer Society, vol. 34, no. 6, (2009).
[12] P. Buxmann, T. Hess and S. Lehmann, "Software as a Service", Business and Economics, vol. 50, no. 6, (2008).

Authors

Feng Liu received his master's degree in Measuring and Testing Technologies and Instruments from the Graduate School of Chinese Academy of Sciences, China, in 2008. He is currently pursuing a PhD in Astrometry and Celestial Mechanics at the University of Chinese Academy of Sciences, China. His current research interests include the applications of cloud computing and parallel computing in GNSS.

Haitao Wu received his PhD in Astrometry and Celestial Mechanics from the Graduate School of Chinese Academy of Sciences, China, in 2002. He is now a research professor at the National Time Service Center (NTSC), Chinese Academy of Sciences. His current research interests focus on the overall technology and applications of satellite navigation systems.

Xiaochun Lu received her PhD in Astrometry and Celestial Mechanics from the Graduate School of Chinese Academy of Sciences, China, in 2004. She is now a research professor at the National Time Service Center (NTSC), Chinese Academy of Sciences. Her current research interests focus on satellite navigation signal design and assessment.

Xiyang Liu received his PhD in Circuits and Systems from Xidian University, China, in 2007. He is now a professor in the School of Software of Xidian University. His current research interests focus on distributed computing and software testing.
