Genetic Algorithm Based Optimization Model for Reliable Data Storage in Cloud Environment



Similar documents
Optimization Model of Reliable Data Storage in Cloud Environment Using Genetic Algorithm

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

Introduction CONTENT. - Whitepaper -

Traffic State Estimation in the Traffic Management Center of Berlin

A Design Method of High-availability and Low-optical-loss Optical Aggregation Network Architecture

A New Task Scheduling Algorithm Based on Improved Genetic Algorithm

Study on Model of Risks Assessment of Standard Operation in Rural Power Network

Watermark-based Provable Data Possession for Multimedia File in Cloud Storage

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Alternative Way to Measure Private Equity Performance

Improved SVM in Cloud Computing Information Mining

Pricing Model of Cloud Computing Service with Partial Multihoming

Forecasting the Direction and Strength of Stock Market Movement

M3S MULTIMEDIA MOBILITY MANAGEMENT AND LOAD BALANCING IN WIRELESS BROADCAST NETWORKS

Performance Analysis of Energy Consumption of Smartphone Running Mobile Hotspot Application

Can Auto Liability Insurance Purchases Signal Risk Attitude?

An Optimal Model for Priority based Service Scheduling Policy for Cloud Computing Environment

NEURO-FUZZY INFERENCE SYSTEM FOR E-COMMERCE WEBSITE EVALUATION

Research of Network System Reconfigurable Model Based on the Finite State Automation

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Optimization of network mesh topologies and link capacities for congestion relief

IWFMS: An Internal Workflow Management System/Optimizer for Hadoop

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications

A Load-Balancing Algorithm for Cluster-based Multi-core Web Servers

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1

RESEARCH ON DUAL-SHAKER SINE VIBRATION CONTROL. Yaoqi FENG 1, Hanping QIU 1. China Academy of Space Technology (CAST)

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS

A Secure Password-Authenticated Key Agreement Using Smart Cards

Cloud-based Social Application Deployment using Local Processing and Global Distribution

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

LIFETIME INCOME OPTIONS

Politecnico di Torino. Porto Institutional Repository

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35, , ,200,000 60, ,000

A DATA MINING APPLICATION IN A STUDENT DATABASE

Multi-sensor Data Fusion for Cyber Security Situation Awareness

VRT012 User s guide V0.1. Address: Žirmūnų g. 27, Vilnius LT-09105, Phone: (370-5) , Fax: (370-5) , info@teltonika.

Daily Mood Assessment based on Mobile Phone Sensing

Effective Network Defense Strategies against Malicious Attacks with Various Defense Mechanisms under Quality of Service Constraints

IMPACT ANALYSIS OF A CELLULAR PHONE

Mathematical Framework for A Novel Database Replication Algorithm

An RFID Distance Bounding Protocol

Intra-year Cash Flow Patterns: A Simple Solution for an Unnecessary Appraisal Error

A Dynamic Energy-Efficiency Mechanism for Data Center Networks

PAS: A Packet Accounting System to Limit the Effects of DoS & DDoS. Debish Fesehaye & Klara Naherstedt University of Illinois-Urbana Champaign

Project Networks With Mixed-Time Constraints

Complex Service Provisioning in Collaborative Cloud Markets

Efficient Bandwidth Management in Broadband Wireless Access Systems Using CAC-based Dynamic Pricing

Application of Multi-Agents for Fault Detection and Reconfiguration of Power Distribution Systems

Network Aware Load-Balancing via Parallel VM Migration for Data Centers

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School

J. Parallel Distrib. Comput. Environment-conscious scheduling of HPC applications on distributed Cloud-oriented data centers

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

A Self-Organized, Fault-Tolerant and Scalable Replication Scheme for Cloud Storage

Load Balancing By Max-Min Algorithm in Private Cloud Environment

Fair Virtual Bandwidth Allocation Model in Virtual Data Centers

What is Candidate Sampling

Performance Analysis and Coding Strategy of ECOC SVMs

J. Parallel Distrib. Comput.

P2P/ Grid-based Overlay Architecture to Support VoIP Services in Large Scale IP Networks

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT

QOS DISTRIBUTION MONITORING FOR PERFORMANCE MANAGEMENT IN MULTIMEDIA NETWORKS

AN EFFICIENT GROUP AUTHENTICATION FOR GROUP COMMUNICATIONS

Course outline. Financial Time Series Analysis. Overview. Data analysis. Predictive signal. Trading strategy

CONTENTS Introduction... 3

Ensuring Data Storage Security in Cloud Computing

Network Security Situation Evaluation Method for Distributed Denial of Service

Data Mining from the Information Systems: Performance Indicators at Masaryk University in Brno

A Study on Secure Data Storage Strategy in Cloud Computing

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

LITERATURE REVIEW: VARIOUS PRIORITY BASED TASK SCHEDULING ALGORITHMS IN CLOUD COMPUTING

ACKNOWLEDGEMENTS. Core Operational Guidelines for Telehealth Services Involving Provider-Patient Interactions

Calculating the high frequency transmission line parameters of power cables

Overview of monitoring and evaluation

DBA-VM: Dynamic Bandwidth Allocator for Virtual Machines

An MILP model for planning of batch plants operating in a campaign-mode

A High-confidence Cyber-Physical Alarm System: Design and Implementation

Efficient Project Portfolio as a tool for Enterprise Risk Management

EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu

iavenue iavenue i i i iavenue iavenue iavenue

Hosted Voice Self Service Installation Guide

GR-303 Solution For Access Gateways

Hosting Virtual Machines on Distributed Datacenters

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

Transcription:

Advanced Scence and Technology Letters, pp.74-79 http://dx.do.org/10.14257/astl.2014.50.12 Genetc Algorthm Based Optmzaton Model for Relable Data Storage n Cloud Envronment Feng Lu 1,2,3, Hatao Wu 1,3, Xaochun Lu 1,3, Xyang Lu 4, Le Fan 4 1 atonal Tme Servce Center, Chnese Academy of Scences, 3 East Shuyuan Road, 710600 X an, Chna 2 Unversty of Chnese Academy of Scences, 19A Yuquan Road, 100049 Beng, Chna 3 Key Laboratory of Precson avgaton and Tmng Technology, Chnese Academy of Scences, 3 East Shuyuan Road, 710600 X an, Chna 4 Xdan Unversty, 2 South Taba Road, 710071 X an, Chna elkhood@163.com Abstract. Massve data storage s one of the great challenges for cloud computng servce, and relable storage of senstve data drectly affects qualty of storage servce. In ths paper, based on analyss of data storage process n cloud envronment, the cost of massve data storage s consdered to be comprsed of data storage prce, data mgraton and communcaton; and the storage relablty conssts of data transmsson relablty and hardware dependablty. A mult-obectve optmzaton model for relable massve storage s proposed, n whch storage cost and relablty are the obectves. Then, a genetc algorthm for solvng the model s desgned. Fnally, expermental results ndcate that the proposed model s postve and effectve. Keywords: Cloud storage, mult-obectve optmzaton model, genetc algorthm, relable storage 1 Introducton The development of nformaton-based socety entals that more and more resources are beng dgtzed, causng endless growth of data resource storage capacty, and thus resultng n substantal ncrease of storage costs. Moreover, dfferent applcatons requre dfferent storage capacty. However, storage space assgned to these applcatons s often not fully utlzed. The servce provder s facng the tradeoff between rapd growth of nformaton resources and control of the costs. On one hand, servce provders not only produce enormous mportant data by themselves, but also requre massve nformaton resources. On the other hand, large amount of storage equpment and manpower are needed to store the nformaton resources. Therefore, new data storage devces requred by servce provder should nclude features such as storage vrtualzaton [1], dynamcally extensble storage capacty [2][3] and relable data storage. Cloud storage s a new concept derved from cloud computng, referrng to a system whch assembles plentful dfferent small and large storage devces of same ISS: 2287-1233 ASTL Copyrght 2014 SERSC

Advanced Scence and Technology Letters type n the grd by cooperatng applcaton software to ontly and externally provde functons of data storage and busness access usng cluster applcaton, grd technology or Dstrbuted Fle System (DFS). Compared wth tradtonal storage devces, cloud storage s not only hardware but also a complex system comprsed of network devce, storage devce, server, applcaton software, publc access nterface, Access etwork (A), clent program, etc. Each part provdes data storage and busness access through applcaton software wth the storage devce as the core. Strctly speakng, cloud storage s a servce rather than a storage devce. Cloud storage provder may provde personalzed storage servces accordng to clents demands, such as storage space, network bandwdth, data safety, dsaster recovery performance, etc. Currently, cloud storage s a feld n whch h-tech enterprses compete; many htech enterprses have launched ther own cloud data storage products ncludng Cloud Drve by Amazon, Lve Mesh by Mcrosoft, Does by Google, etc. Cloud storage has become a tendency for future storage development. Wth development of cloud storage technology, applcatons that combne technques of all knds of searches and applcatons pertanng to cloud storage should be mproved from the vew of safety, relablty, data access, etc. 2 Problem Analyss 2.1 Rsk Analyss of Data Resources n Cloud Storage Envronment Accordng to lterature [4], Gartner ndcates that seven securty rsks are faced by cloud computng: prvleged user access, regulatory complance, data locaton, data segregaton, recovery, nvestgatve support, and long-term vablty. Among dfferent safety rsks, data dsaster recovery s the most mportant ssue whch needs to be consdered by cloud storage provders and clents. Enterprses hand over ther senstve and mportant commercal data to cloud servce provder. Therefore, loss of the data wll not only brng enterprses fatal dsaster but also cause legal dspute between enterprses and ther clents. For example, n bankng system, clent data s vtal for both the bank and ts clents; Unrecoverable data loss wll result n ncalculable loss for the bank and ts clents. Therefore, for cloud storage provders, t s ther own busness obectve to guarantee the safety, relablty and recoverablty of data. evertheless, when provdng some mportant clents wth storage servce, legal clauses are sgned wth the clents, so that f data loss occurs, clents wll be compensated. Therefore, guarantee of data dsaster recoverablty s of sgnfcant mportance for cloud storage provders. 2.2 DC (Data Center) Wkpeda defnes a data center as a complex set of faclty whch ncludes not only computer systems and assocated components such as telecommuncatons and Copyrght 2014 SERSC 75

Advanced Scence and Technology Letters storage systems, but also redundant data communcatons connectons, envronmental controls and varous securty devces. In lterature [5], Google llustrates datacenters as buldngs wth mult functons, where multple servers and communcaton gear are collocated because of ther common envronmental requrements and physcal securty needs, and ease of mantenance, other than ust a collecton of co-located servers. 1) Server: Compared wth PC, server has more relable contnuous operatng capablty, more powerful storage and network communcaton capablty, faster falure recovery capablty and has easer extenson space. Data backup functons are also requred by applcatons whch are senstve to data. In realty, servers are located n geographcally dfferent locatons. Therefore the nfluence caused by geographcal dstance between servers should be consdered when storng the data. In practcal applcatons, servers are usually connected by nternet or routers. Consderng each server as node, the connectons as sdes, connecton of servers can be shown as a graph, more specfcally a complete graph. 2) Storage capablty: Hardware s vtal for data storage, whch drectly affects DC s data storage capablty and storage safety. owadays, varous types of hard dsks can be found n the market, each havng ts own characterstcs. Hard dsk nterface s the connectng component between hardware and host system, and functons as the data transmsson devce between HDD (Hard Dsk Drve) cachng and host memory. Dfferent hard dsk nterfaces determne speed of connecton between hard dsk and computer. In the whole system, qualty of hard dsk nterface drectly affects program runnng speed and system performance level. In DC, the future of enterprse-based dsk array les n mxed use of SSD (Sold State Dsk) and HDD. For the sake of cost control, both should be taken nto consderaton. HDD can provde massve storage space, whle SSD can provde hgh performance. Generally, for hard dsks used for server confguraton crculated n the market, ts prce s drectly proportonal to ts stablty and data read/wrte speed; 3 Model Establshment Snce cloud storage s a burgeonng servce, the users are hghly concerned about data storage relablty and safety, n addton to the prce of the servce. To some extent, the data submtted by users may nvolve ther prvacy or nterest. Therefore, more sutable storage servce whch meets ther requrements may be chosen by users regardng storage prce vs. relablty. Hence, when buldng the mathematc model, storage servce cost and storage relablty are both consdered as obectves, and mult-obectve optmzaton model of relable storage of cloud storage s proposed. 3.1 Assumptons Assumng server cluster owned by certan DC s S { s, s,, s } 1 2, the servers are allocated at dfferent places wth geographcal dfference among them. The matrx ( d ( s, s )) represents spatal dstance between two servers, 76 Copyrght 2014 SERSC

Advanced Scence and Technology Letters where d ( s, s ) 0, 1 ~ and, {1, 2,, },, then d ( s, s ) 0. Assumng data set submtted to DC by users s F { f, f,, f }, where f s ndvsble data 1 2 M block, each data block has ts own safety level L ( f ), 1 ~ M. Dfferent prces are charged for each server storng data wth dfferent safety level manly because data fles wth dfferent safety level have dfference backup fle numbers. The number and locaton of backups are decded accordng to the safety level of data fle specfed. Therefore, when provdng storage servce, data fles submtted by users may be mgrated among servers n DC accordng to requrement. Thus, durng mgraton, an overhead cost arses from communcaton and mgraton, along wth some problems related to relablty. Relablty n mgraton s manly related to connecton stablty and dstance among servers. For convenent model descrpton, the followng parameters are ntroduced: Data fle storage dentfcaton: f the data fle f s stored n the k type hard dsk of server s, then mark the matrx Y Sgn functon: s g n () s defned as 1 0 ( ) { f y sg n y 0 e lse k,ory 0 Functon X ( f, s ) represents the storage space of data fle f n server Intrnsc relablty R ( s ) owned by the server s tself Relablty T R ( s, s ) of data transmsson among servers, where (, ) [0,1] T R s s, and T R ( s, s ) s nversely proportonal to d ( s, s ) There are K types of hard dsk n DC, current avalable capacty of varous hard dsks n server s can be denoted as A ( s ), A ( s ),, A ( s ) 1 2 K, and the h type of hard dsk has relablty of ts own: P ( h ), h 1, 2,, K Data storage relablty, whch s the product of transmsson relablty, hardware relablty and server relablty durng data transmsson; the formula s: S R ( f, R ( s ), X ( f, s ), T R ( s, s )) ( ) (, ) (, ) ( ) B R s X f s T R s s P Y B 1 s (1) 3.2 Storage Cost Dfferent prces are charged for each server provdng storage servce to data at dfferent safety levels, denoted as C ( s L ( f )). Plus, data fles wth dfferent safety s levels have dfferent backup fle numbers; hgher safety levels correspond to more backups. Here, safety level s made equvalent to the amount of data fle backups, and these backups are stored n servers at dfferent geographc locatons whch comples wth our practce of data storage n cloud computng. Hence our assumpton s reasonable. The storage prce of fle f s: Copyrght 2014 SERSC 77

Advanced Scence and Technology Letters C ( f ) (, ( )) (, ) c C s L f X f s 1 The storage prce of data fle F submtted by user s: M C ( F ) C ( s, L ( f )) X ( f, s ) c 1 1 (2) (3) 3.3 Mgraton Cost Snce servers are located n dfferent geographc locatons, dfferent mgraton dstance occurs durng mgraton of data block from one server to another. Hence, mgraton cost s related to mgraton dstance and sze of data fle. Therefore, the mgraton cost of data block f k mgrated from the server s to the server s s expressed as: C ( f, s, s ) C X ( f, s ) d ( s, s ) (4) Where, C s parameter of mgraton cost. m m g k m k Hence, the mgraton cost of data fle F mgrated from server server s: M C ( f, s, s ) C X ( f, s ) d ( s, s ) k 1 k 1 M m g k m k s to another (5) 3.4 Communcaton Cost Data transmsson among servers manly embodes n communcaton flow due to dfferent geographc locatons of servers, hence, dstance between source server ( s ) and host server ( s ) s another factor nfluencng communcaton cost, expressed as: C ( f ) [ X ( f s ) W ( s s ) d ( s s )] co m k k 1 1 Where, W ( s, s ) represents communcaton cost of servers. The whole communcaton cost of data fle F after storage s: M C ( F ) [ X ( f, s ) W ( s, s ) d ( s, s )] co m k k 1 1,1 1 (6) (7) 4 Concluson By analyzng data storage nformaton n cloud envronment, a mult-obectve optmzaton model for relable storage s bult. In vew of data fles wth safety requrement stored by users, n the model, both data storage cost ncludng storage prce, mgraton cost and communcaton cost and data relablty ncludng 78 Copyrght 2014 SERSC

Advanced Scence and Technology Letters transmsson relablty n storage process and storage devce relablty after data storage are consdered. In order to valdate correctness and avalablty of the model, a mult-obectve GA s desgned under the framework of algorthm SGA-II for model soluton. In the experments, computatons are carred out through constructng several storage stuatons by usng parameters publshed by exstng commercal cloud storage servces. The expermental results valdate correctness and avalablty of the model, and show that the model can provde multple storage solutons for users so that storage resources n DC can be effectvely utlzed. References 1. Aameek, S., Madhukar, K., Dushmanta, M.: Server-storage Vrtualzaton: Integraton and Load Balancng n Data Centers[C]. In: Proceedng of the 2008 ACM/IEEE Conference on Supercomputng (2008) 1-12 2. Gregory, C., Idt, K,, Rachd, G.: Relable dstrbuted storage[j]. IEEE Computer Socety, Vol. 42, o. 4 (2009) 60-67 3. Deng, Y.H., Wang, F., Helan,.: Dynamc and Scalable Storage Management Archtecture for Grd Orented Storage Devces [J]. Parallel Computng, Vol. 34, o. 6 (2008) 17-31 4. Heser, J., colett, M.: Assessng the Securty Rsks of Cloud Computng. http://www.gartner.com (2008) 5. Luz, B., Urs, H.: The Datacenter as a Computer. Morgan & Claypool Publshers, Calforna (2009) 6. Fan, L.: Research on Effcent and Intellgent Algorthms for Two Classes of Complex Optmzaton Problem. Xdan Unversty (2012) 7. Ghemawat, S., Goboff, H., Leung, S.T.: The Google Fle System[C]. In: Proceedngs of the 19th ACM Symposum on Operatng Systems Prncples, ew York (2003) 32-47 8. Wang, C., Wang, Q., Ku, R., Lou, W.J.: Ensurng Data Storage Securty n Cloud Computng[C]. In: 17th IEEE Internatonal Workshop on Qualty of Servce (2009) 1-9 9. Alves, A., Vegas, C., edl, P.: A Dstrbuted Tablng Algorthm for Rule Based Polcy Systems. IEEE Computer Socety, Vol. 15, o. 5 (2004) 123-132 10. Thomas, C.: The Bascs of Relable Dstrbuted Storage etworks [J]. IEEE Computer Socety, Vol. 6, o. 3 (2004) 16-24 11. Yang, J.F., Chen, Z.B.: Analyss and Research of Cloud Computng System Instance[C]. In: Proceedngs of the 2010 Second Internatonal Conference on Future etworks (2010) 88-92 12. Hayes, B.: Cloud Computng [J]. Communcatons of the ACM, Vol. 51, o. 7 (2008) 30-34 13. Gurumurth, S.: Archtectng Storage for the Cloud Computng Era [J]. IEEE Computer Socety, Vol. 34, o. 6 (2009) 124-135 14. Wu, J.Y., Png, L.D., Ge, X.P.: Cloud Storage as the Infrastructure of Cloud Computng[C]. In: Proceedngs of the 2010 Internatonal Conference on Intellgent Computng and Cogntve Informatcs (2010) 380-383 15. Buxmann, P., Hess, T., Lehmann, S.: Software as a Servce. Busness and Economcs, Vol. 50, o. 6 (2008) 500-503 Copyrght 2014 SERSC 79