A* Algorithm Based Optimization for Cloud Storage




International Journal of Digital Content Technology and its Applications, Volume 4, Number 8, November 2010

A* Algorithm Based Optimization for Cloud Storage

Ren Xun-Yi (1*), Ma Xiao-Dong (2)
1* College of Computer, Nanjing University of Posts and Telecommunications, renxy@njupt.edu.cn
2 Jiangsu YiTong High-Tech Co., Ltd.
doi:10.4156/jdcta.vol4.issue8.23

Abstract

Cloud storage provides users with storage space and with user-friendly, timely access to data, and it is the foundation of all kinds of cloud applications. However, there is a lack of in-depth study on how to optimize cloud storage to improve data access performance. In this paper, we give a mathematical description of cloud storage optimization and treat it as an objective optimization problem, which we solve with our proposed optimized A* algorithm; as a result, data is distributed to appropriate nodes with the best efficiency. The experimental results demonstrate that the algorithm is feasible in reducing MakeSpan and that the optimization produces a storage strategy consistent with real conditions. Finally, the effect of the time limit on the A* algorithm is investigated experimentally.

Keywords: A* Algorithm, Cloud Computing, Cloud Storage, Optimization.

1. Introduction

Cloud computing is a new distributed computing mode following grid computing and pervasive computing. Its aim is to build a virtual infrastructure that provides users with remote computing and storage capacity [1-3]. Since 2006, a number of successful cloud facilities have appeared, such as Amazon's Elastic Compute Cloud [4], IBM's Blue Cloud [5], Nimbus [6], OpenNEbula [7], and Google's Google App Engine [9]. Cloud storage is a kind of cloud computing. It provides space for data storage together with user-friendly, timely access to that data; examples include Amazon's Simple Storage Service (S3) and the Google File System [8].

The greatest advantage of cloud storage is that it enables users to access data at any time. In a cloud system, the storage management system automatically analyzes the user's requirements and locates and transforms data, which greatly benefits users. However, this places high demands on the cloud management system itself. For example, a service failure occurred in the Simple Storage Service (S3) in July 2008 and lasted eight hours, causing great losses to online companies relying on S3. The cause of the failure was that the S3 system could not effectively route users' requests to the appropriate physical storage servers. Therefore, cloud storage must be optimized to ensure the efficiency of data storage and access. Cloud storage is itself an objective optimization problem, just like grid resource scheduling, and intelligent algorithms can solve such optimization problems efficiently. In this paper, we propose using the A* algorithm [10][11] to optimize cloud storage. Our proposed method can effectively distribute data to the appropriate cloud nodes.

2. Problem Description

Given a storage cloud environment, N users need to store the data $D = (D_1, D_2, \ldots, D_N)$ in a reasonable manner on the M nodes $C = (C_1, C_2, \ldots, C_M)$. We must consider one key factor: the cloud should respond efficiently to users' overall access. For a user, the cost of storing data $D_i$ on node $C_j$ includes two parts: one is the time cost of transferring $D_i$ to $C_j$, and the other is the local storage price of $C_j$, namely:

$$\mathrm{StorageCosts} = \mathrm{trans}(D_i, \mathrm{user}, C_j) + s(D_i, C_j) \tag{1}$$

The total access cost of the N users is the cost of transmitting data from the cloud nodes to all users:

$$\mathrm{AccessCosts} = \sum_{i=1}^{N} \mathrm{trans}(D, C_j, \mathrm{user}_i) \tag{2}$$

The purpose of optimization is to minimize the total cost of storage and access, which is expressed as:

$$\mathrm{TotalC} = \min \sum_{k=1}^{M} \left( \alpha \cdot \mathrm{AccessCosts}_k + \beta \cdot \mathrm{StorageCosts}_k \right) \tag{3}$$

where $\alpha$ and $\beta$ are weighting factors and $\alpha + \beta = 1$. The storage constraint of the optimization problem is formalized as follows:

$$\mathrm{AvailStorage} \geq \mathrm{SizeofReqData} \tag{4}$$

Formula (4) means that the available space of the current node must be no smaller than the size of the data requested to be stored. Other constraints, such as load balancing, could of course be added, but to simplify the problem we consider only the storage constraint here. Thus, the cloud storage problem amounts to optimizing the objective (3) subject to the condition (4). At present, there are many heuristic methods for solving optimization problems, such as genetic algorithms, Tabu search, and simulated annealing. The A* algorithm is a globally optimal search algorithm, and in this paper we apply it to cloud storage optimization.

3. A* Algorithm Optimization for Cloud Storage

In the A* algorithm, the cost of a solution is evaluated by the following formula:

$$EV(s) = D(s) + V(s) \tag{5}$$

where $D(s)$ is the cost already paid and $V(s)$ is the potential cost. Let $V(s)^*$ denote the optimal potential cost; then the actual cost is:

$$EV(s) = D(s) + V(s)^* \tag{6}$$

Since $V(s)^*$ is unknown, the least upper bound of $V(s)$, written $\mathrm{Lub}(V(s))$, is used in place of $V(s)^*$. Provided that

$$\mathrm{Lub}(V(s)) \leq V(s)^* \tag{7}$$

we have $D(s) + \mathrm{Lub}(V(s)) \leq D(s) + V(s)^*$. Evaluating the current solution with $D(s) + \mathrm{Lub}(V(s))$ in place of $D(s) + V(s)^*$ therefore never discards the optimal solution, regardless of whether $D(s)$ is good or bad. We apply the A* algorithm to cloud storage optimization to obtain the globally optimal solution.

We combine the Min-Min algorithm with A* to obtain an approximate initial solution for the A* search; our optimization algorithm is shown in Figure 1.
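The objective (3) and the storage constraint (4) can be made concrete with a short sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names `total_cost` and `is_feasible`, the per-item cost matrices `access_cost[i][j]` and `storage_cost[i][j]` (cost of placing data item i on node j), and the list-based `assignment` encoding are all hypothetical choices made for the example.

```python
def total_cost(assignment, access_cost, storage_cost, alpha=0.5, beta=0.5):
    """Weighted objective in the spirit of formula (3) for one complete assignment.

    assignment[i] is the index of the node that stores data item i.
    access_cost[i][j] / storage_cost[i][j] are assumed, precomputed costs
    of placing item i on node j (hypothetical inputs for this sketch).
    """
    return sum(
        alpha * access_cost[i][j] + beta * storage_cost[i][j]
        for i, j in enumerate(assignment)
    )


def is_feasible(assignment, sizes, capacity):
    """Storage constraint in the spirit of formula (4).

    Every node must have at least as much available space as the total
    size of the data items assigned to it.
    """
    used = [0.0] * len(capacity)
    for i, j in enumerate(assignment):
        used[j] += sizes[i]
    return all(u <= c for u, c in zip(used, capacity))


if __name__ == "__main__":
    access = [[4, 2], [3, 5]]    # two data items, two nodes
    storage = [[1, 1], [2, 2]]
    plan = [1, 0]                # item 0 on node 1, item 1 on node 0
    print(total_cost(plan, access, storage))
    print(is_feasible(plan, sizes=[10, 20], capacity=[20, 10]))
```

Summing over data items, as done here, is equivalent to summing over nodes as in (3) once each item's cost is attributed to the node that holds it.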

A* based Storage Optimization Algorithm
 1: BestCost ← Get_Minmin()
 2: h ← Get_height(T)   // get height of tree
 3: n ← 0
 4: while (T < Set(Time)) and (ε > Set(ε))
 5:   if n == h then
 6:     UPDATE-Bestcost(Bestcost, Currentcost);
 7:   else
 8:     { DS ← Get_Cost1;
 9:       VS ← Get_Cost2; }
10:   if (DS + VS < Bestcost)
11:     Select_next_site;
13:   if current level is travel finished
15:     n ← n + 1;
16: end while
Figure 1. A* based Storage Optimization Algorithm

The A* algorithm is inefficient if it starts from a randomly generated initial solution. Therefore, in line 1 of the algorithm, the Min-Min algorithm is used to generate an approximately optimal solution. Min-Min repeatedly assigns a data item to the cloud node with the minimal storage cost and then removes the occupied node from the node list. Because Min-Min considers only the locally best assignment at each step, blindly removing storage nodes may exclude the optimal solution, so the global optimum may not be reached. To avoid this problem, we obtain the height of the tree in line 2 and, when generating the initial tree, allow any of the N data items to be stored on any of the M cloud nodes. This yields N * M allocations, so the initial tree has N layers and each layer has M branches; the blind removal of optimal storage nodes is thereby avoided. In the while loop (lines 4-16), the tree is traversed. First, we determine whether the node satisfies the constraint, that is, formula (4); after that, formula (3) is used to calculate the total cost. If the current node is a leaf node, BestCost is updated: if CurrentCost < BestCost, BestCost is replaced with CurrentCost, which keeps the algorithm moving toward the optimal solution (lines 5-6).

If the current node is in a middle layer, the allocated cost DS (line 8) is calculated along the path from the root node to the current node, and the likely total cost is then compared with BestCost to decide whether to prune. The algorithm thus chooses among solutions by calculating only the minimum cost of the next level, rather than expanding every path from the root node to the leaf nodes, which achieves rapid pruning.

4. Experimental Results

Our experiment optimizes storage based on real download data from the Chinese internet. We regard China's internet as a large storage cloud and the provincial capitals as its main cloud nodes; in this environment, data is distributed to the most appropriate cloud nodes so as to minimize users' total download time. Two kinds of experimental data are involved: the transmission time and the number of visitors. We estimate the transmission time from a combination of the bandwidth and the railway links between Chinese provincial capitals. The number of visitors is estimated from historical download counts on several well-known software websites. The share of downloads for each software package is then calculated from these two kinds of data and apportioned among the provinces by ratio, which gives us scheduling information for each software package, including user location, software name, and software size. To simplify, we set α = 1.0 and β = 0. The performance of the various algorithms is shown in Figure 2, and the time consumed by the algorithms themselves is shown in Figure 3.
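The overall shape of the search in Figure 1 (a Min-Min seed, a depth-first walk over an N-layer tree with M branches per layer, and pruning against the best known cost) can be sketched as follows. This is a minimal illustration under several assumptions rather than the authors' code: the names `min_min_seed` and `bounded_search` are hypothetical, `cost[i][j]` is an assumed precomputed cost of placing item i on node j, the seed reduces a node's remaining capacity instead of removing the node outright, the optimistic estimate of the remaining cost is the sum of each unplaced item's cheapest node (ignoring capacity, so it never overestimates), and only the time limit from the loop condition is modeled, not the ε criterion.

```python
import time


def min_min_seed(cost, sizes, capacity):
    """Greedy seed: repeatedly place the (item, node) pair with the lowest cost."""
    free = list(capacity)
    assignment = [None] * len(cost)
    pending = set(range(len(cost)))
    while pending:
        choices = [(cost[i][j], i, j)
                   for i in pending
                   for j in range(len(free)) if free[j] >= sizes[i]]
        if not choices:
            return None  # no feasible greedy placement
        _, i, j = min(choices)
        assignment[i] = j
        free[j] -= sizes[i]
        pending.remove(i)
    return assignment


def bounded_search(cost, sizes, capacity, time_limit=1.0):
    """Depth-first search over placements with bound-based pruning."""
    n = len(cost)
    best = {"cost": float("inf"), "assignment": None}
    seed = min_min_seed(cost, sizes, capacity)
    if seed is not None:
        best["cost"] = sum(cost[i][j] for i, j in enumerate(seed))
        best["assignment"] = seed

    # remaining[i]: optimistic cost of placing items i..n-1 (capacity ignored).
    remaining = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        remaining[i] = remaining[i + 1] + min(cost[i])

    deadline = time.monotonic() + time_limit
    free = list(capacity)
    partial = []

    def visit(level, paid):
        if time.monotonic() > deadline:
            return  # time limit reached: keep the best solution found so far
        if level == n:  # leaf: a complete assignment
            if paid < best["cost"]:
                best["cost"], best["assignment"] = paid, list(partial)
            return
        for j in sorted(range(len(free)), key=lambda k: cost[level][k]):
            if free[j] < sizes[level]:
                continue  # storage constraint violated
            new_paid = paid + cost[level][j]
            if new_paid + remaining[level + 1] >= best["cost"]:
                continue  # prune: cannot beat the best known cost
            free[j] -= sizes[level]
            partial.append(j)
            visit(level + 1, new_paid)
            partial.pop()
            free[j] += sizes[level]

    visit(0, 0.0)
    return best["assignment"], best["cost"]


if __name__ == "__main__":
    cost = [[1, 5], [2, 1], [3, 1]]
    print(bounded_search(cost, sizes=[1, 1, 1], capacity=[2, 1]))
```

In the small example, the greedy seed fills the cheaper node early and ends at a higher total cost, and the bounded search then improves on it, which is the failure mode of a purely greedy assignment that the text describes.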

Figure 2. Access time of various optimization algorithms (x-axis: Number of Tasks; y-axis: Total Access Time (ms); series: Random, Maxmin, Minmin, GA, GSA, Tabu, MCT, SA, A*)

Figure 3. Algorithm consumption time (x-axis: Number of Tasks; y-axis: Algorithm Execution Time (s); series: Random, Maxmin, Minmin, GA, GSA, Tabu, MCT, SA, A*)

Figure 2 shows that the Random and MaxMin algorithms perform worst: Random does not consider access performance at all, while MaxMin gives priority to long tasks with maximum access time and so cannot achieve the lowest overall cost. The Tabu algorithm performs poorly, and Figure 3 shows that its execution time is also large. The MCT algorithm assigns each task to a random node with the minimum expected completion time (wait time plus execution time), which results in a very large increase in access cost. The Min-Min algorithm works well when the number of tasks is small, and Figure 3 shows that its execution time is always very small; but when the number of tasks is greater than 6, the access cost increases rapidly. The SA algorithm not only achieves a good access cost but also runs fast. The genetic algorithm depends on the mutation and crossover probabilities, and with default values it performs worse than simulated annealing. In this experiment, we set the population size to 5, the crossover probability to .25, and the mutation probability to .1; its performance is second only to that of A*. Figure 2 shows that, however the number of tasks increases, the access cost optimized by the A* algorithm is always the lowest; with the algorithm execution time set to 3 seconds, Figure 3 shows that its execution time is also the smallest among the algorithms compared.

After storage optimization, the local and remote access times are as shown in Figure 4.

Figure 4. Local and remote access time by province (x-axis: Province; series: Local, Remote; provinces shown include Beijing, Liaoning, Heilongjiang, Anhui, Shanghai, Jiangxi, Hebei, Hubei, Guangdong, Shaanxi, Qinghai, Sichuan, Yunnan, Neimenggu)

Figure 4 shows that the optimization places a higher percentage of the data in Beijing, Guangdong, Shanghai, Tianjin and similar regions. This is consistent with the actual situation, because in general the more Internet users a province has and the greater its network bandwidth, the more copies of the data should be stored there. Jiangsu Province has the largest number of remote accesses, followed by Guangdong, Henan, Sichuan, Shandong, Zhejiang and others. We conclude that the amount of remote access is related not only to local access but also to geographic location. Moreover, the optimized data placement enables data sharing among multiple provinces.

Furthermore, in order to set reasonable parameters for the A* algorithm, we consider the impact of the upper execution time limit on the algorithm. The time limit is set to 1 second, 2 seconds, 3 seconds, 4 seconds, and 5 seconds, and the average access costs for different numbers of tasks are obtained over repeated optimization runs, as shown in Figure 5. As can be seen from Figure 5, setting the time limit to 5 seconds eliminates the randomness, makes the algorithm more stable, and gives the best optimization effect of all the time limits tested.

Figure 5. Upper execution time limit on A* algorithm (x-axis: Number of Tasks; series: 1s, 2s, 3s, 4s, 5s)

5. Conclusion and Future Work

Cloud computing has been the most popular form of distributed computing in recent years, and cloud storage plays a key role in cloud applications. In this paper, we give a description of the cloud storage problem and use the A* algorithm to optimize cloud storage.

The experimental results show that the proposed method can improve data access efficiency and has strong practical value. Future work is to build a small-scale cloud storage environment and apply this method in actual storage environments.

6. Acknowledgement

Supported by the National Natural Science Foundation of China (6173188), the China Postdoctoral Science Foundation (21471355), and the Talent Project of Nanjing University of Posts and Telecommunications (NY286).

7. References

[1] A. Weiss. Computing in the Clouds [J]. netWorker, 2007, 11(4): 16-25.
[2] Rajkumar Buyya, Chee Shin Yeo, Srikumar Venugopal, et al. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility [J]. Future Generation Computer Systems, 2009, 25: 599-616.
[3] Twenty experts define cloud computing [URL]. http://cloudcomputing.sys-con.com/read/612375_p.htm (18.7.8)
[4] Amazon Inc. Amazon Web Services EC2 site [URL]. http://aws.amazon.com/ec2, 2008.
[5] IBM Blue Cloud project [URL]. http://www-3.ibm.com/press/us/en/pressrelease/22613.wss/, accessed June 2008.
[6] Nimbus Project [URL]. http://workspace.globus.org/clouds/nimbus.html/, 2008.
[7] OpenNEbula Project [URL]. http://www.opennebula.org/, accessed Apr. 2008.
[8] S. Ghemawat, H. Gobioff, and S. Leung. The Google File System [C]. In Proceedings of the 19th ACM Symposium on Operating Systems Principles, pages 29-43, 2003.
[9] GoogleApp [URL]. http://appengine.google.com/, accessed June 2008.
[10] Message Passing Interface Forum. MPI: A message passing interface standard [C]. University of Tennessee, Tech Rep, 1994. P94~23.
[11] K. Chow and B. Liu. On mapping signal processing algorithms to a heterogeneous multiprocessor system [C]. In: 1991 International Conference on Acoustics, Speech, and Signal Processing (ICASSP'91), Vol. 3: 1585~1588.