Presentation of Multi Level Data Replication Distributed Decision Making Strategy for High Priority Tasks in Real Time Data Grids




Naghmeh Esmaieli (Esmaily.naghmeh@gmail.com), Mahdi Jafari (Ser_jafari@yahoo.com), Mehran Mohsen Zadeh (mohsenzadeh@srbiau.ac.ir)
Azad Islamic University, Science & Research Branch of Tehran, Iran

Abstract

Real time data grids are data grids in which jobs must be completed within a definite time period. If a job takes longer than this period, it cannot be completed by its deadline. Data replication is used in such grids to provide prompt access to data so that programs can be executed within the specified deadline. In data replication, the parameters assessed for replication, the selection of the file to be replicated, the number of replicas and the location of new copies are important challenges that should be addressed. This paper introduces a new dynamic data replication algorithm called Distributed Data Replication (DDR), which addresses these challenges and meets real time requirements by making distributed replication decisions at three levels. Simulation results showed that, compared to other algorithms, the DDR algorithm decreased the average job execution time and server traffic, and fewer jobs failed.

Keywords - Real time data grids, Dynamic data replication, Distributed decision

I. INTRODUCTION

In real time data grids, there are time limitations on the execution of some jobs. In other words, some jobs have higher priorities. Considering the critical nature of real time data grid applications, the time to respond to jobs should be reduced [1]. Limited bandwidth and access delay are the main difficulties of transferring a file from one site to another. Data replication is a common method to facilitate access to data and to reduce bandwidth consumption and access delay by providing one or more replicas on other sites [3].
Since data replication in real time data grids leads to prompt data access and execution of programs within the specified time period, it also has a positive effect on response time [9, 17, 18]. There are two replication methods, static and dynamic [3]. In the static method, the replication policy is specified from the beginning on a fixed basis; in fact, it is considered a part of the system configuration. The policy does not change when the topology of the data grid network or the pattern of requests changes. Therefore, system performance may decrease significantly and the resources may not be used suitably. In contrast, in dynamic replication [1, 8, 10, 12, 15, 16, 21, 22, 23, 24] the data grid automatically generates replicas where necessary and puts them in suitable places, so that access is optimized without increasing costs [8, 10, 12, 15, 16]. The DDR algorithm explained in this paper is a dynamic algorithm that uses the advantages of dynamic replication.

This paper is organized as follows: Related work is described in Section II, and the distributed decision algorithm for data replication is explained in detail in Section III. In Section IV, the presented algorithm is evaluated through simulation in a real time data grid. Conclusions and future work are presented in Section V.

II. RELATED WORK

In the data replication methods of [8, 19], data is selected based on different types of locality. There are three types of locality: temporal locality, geographical locality and spatial locality. In temporal locality, a file that has been accessed is likely to be accessed again in the future. In geographical locality, a file accessed by one node is likely to be accessed by neighboring nodes as well. In spatial locality, if a file is accessed, the files near it are likely to be accessed in the future.
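As an illustration of how temporal locality can drive file selection, the following minimal Python sketch ranks files by how often they were accessed, on the assumption that frequently accessed files will be requested again. The function name `rank_replica_candidates`, the `(site, file_name)` log format and the `top_k` parameter are hypothetical and chosen only for this example; they are not taken from any of the cited algorithms.

```python
from collections import Counter


def rank_replica_candidates(access_log, top_k=2):
    """Return the top_k most frequently accessed file names.

    access_log is an iterable of (site, file_name) tuples, oldest first.
    This is a temporal-locality heuristic: files accessed often in the
    past are assumed likely to be accessed again.
    """
    counts = Counter(file_name for _site, file_name in access_log)
    # most_common() sorts by count, highest first; equal counts keep
    # the order in which the files were first seen.
    return [name for name, _count in counts.most_common(top_k)]


# Hypothetical access history collected from three sites.
log = [
    ("s1", "a.dat"), ("s2", "b.dat"), ("s1", "a.dat"),
    ("s3", "c.dat"), ("s2", "a.dat"), ("s3", "b.dat"),
]
print(rank_replica_candidates(log))  # ['a.dat', 'b.dat']
```

A geographical-locality variant could, under the same assumptions, group the counts by site and place the replica near the sites that issued the most requests.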
A data replication method can be chosen according to the access pattern of a file. Of course, a data replication method that performs well in one environment may perform badly in another. Most of the work in the field of data replication has concentrated on temporal locality [4].

Replication algorithms can also be classified into two main groups: pull based [7, 10] and push based algorithms [12, 13]. In a pull model, a site without the requested file in its local storage would decide when to replicate it to itself, and where from. On the other hand, in a push model, a site having a particular file in its local storage would decide when to replicate it, and where to [1]. In [11], a comprehensive data management scheme keeps the number of accesses to the files. In [7], pulling is used in another algorithm called distributed file replication. In [1], some dynamic algorithms without real time requirements, such as Cascading, Caching and Fast Spread, are presented. In [6, 8], several algorithms such as Cascading-Enhanced, Caching and Cascading and Caching-PP are described to improve real time efficiency in the data grid. In [4], the MAC (Minimum Access Cost) algorithm concentrates only on reading data in a structure-free hierarchical environment. This algorithm decides to copy a file based on the average response time of the file and the frequency of accesses. In [2, 20]


necessity from different aspects. Simulation results showed that decision making at three different levels, as compared to the LALW algorithm in which the decision is made at one level, meets real time requirements by providing timely and suitable replicas and by placing them in suitable locations, and also maintains the load balance of the grid. Moreover, considering the application of real time data grids, it reduces job response time, the number of failed jobs and server traffic. Considering the importance of data replication and of decreasing the replication time overhead in real time data grids, reducing replication time is recommended for future work.

REFERENCES

[1] Atakan Dogan, A study on performance of dynamic file replication algorithms for real-time file access in data grids, Future Generation Computer Systems 25 (8) (2009) 829-839.
[2] Ruay-Shiung Chang, Hui-Ping Chang, Yun-Ting Wang, A Dynamic Weighted Data Replication Strategy in Data Grids, in: Proceeding of the IEEE, Department of Computer Science and Information Engineering, National Dong Hwa University, Shoufeng, Hualien 974, Taiwan, 2008.
[3] A. Abdurrab, T. Xie, FIRE: A File Reunion Based Data Replication Strategy for Data Grids, in: 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, 2010.
[4] Xin Sun, Jun Zheng, Qiongxin Liu, Yushu Liu, Dynamic Data Replication Based on Access Cost in Distributed Systems, in: Proceeding of the IEEE, Fourth International Conference on Computer Sciences and Convergence Information Technology, China, 2009.
[5] J. Zhang, B.S. Lee, X. Tang, C.K. Yeo, A model to predict the optimal performance of the hierarchical data grid, Future Generation Computer Systems 26 (1) (2010) 1-11.
[6] Leyli Mohammad Khanli, Ayaz Isazadeh, Tahmuras N. Shishavan, PHFS: A dynamic replication method, to decrease access latency in the multi-tier data grid, Future Generation Computer Systems 27 (3) (2011) 233-244.
[7] H. Lamehamedi, Z. Shentu, B. Szymanski, E. Deelman, Simulation of dynamic data replication strategies in data grids, in: Heterogeneous Computing Workshop, 2003, p. 100b.
[8] K. Ranganathan, I. Foster, Identifying dynamic replication strategies for high performance data grids, in: Proceedings of the Second International Workshop on Grid Computing, Denver, CO, November 2001, pp. 75-86.
[9] The European data grid project. http://eu-datagrid.web.cern.ch/eudatagrid/.
[10] D.G. Cameron, A.P. Millar, C. Nicholson, R. Carvajal-Schiaffino, F. Zini, K. Stockinger, Analysis of scheduling and replica optimisation strategies for data grids using OptorSim, Journal of Grid Computing 2 (1) (2004) 57-69.
[11] M. Tang, B.S. Lee, C.K. Yeo, X.Y. Tang, Dynamic replication algorithm for the multi tier data grid, Future Generation Computer Systems 21 (5) (2005) 775-790.
[12] K. Ranganathan, I. Foster, Simulation studies of computation and data scheduling algorithms for data grids, Journal of Grid Computing 1 (1) (2003) 63-74.
[13] M.R. Rahman, K. Barker, R. Alhajj, Replica placement design with static optimality and dynamic maintainability, in: IEEE Int'l Symposium on Cluster Computing and the Grid, 2006.
[14] U. Cibej, B. Slivnik, B. Robic, The complexity of static data replication in data grids, Parallel Computing 31 (8) (2005) 900-912.
[15] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke, The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets, Journal of Network and Computer Applications 23 (2000) 187-200.
[16] M. Tang, B.S. Lee, X. Tang, C.K. Yeo, The impact of data replication on job scheduling performance in the data grid, Future Generation Computer Systems 22 (3) (2006) 254-268.
[17] Y. Shi, A. Shortridge, J. Bartholic, Grid computing for real-time distributed collaborative geoprocessing, in: Symposium on Geospatial Theory, Processing and Applications, 2002.
[18] M.K. Madi, S. Hassan, Dynamic replication algorithm in data grid: survey, in: Proceeding of International Conference on Network Applications, Protocols and Services, 22 November 2008.
[19] R.Shiung Chang, Ning Yuan Huang, Jih Sheng Chang, A predictive algorithm for replication optimization in data grid, in: Proceeding of ICS 2006, Taiyuan, Taiwan, 2006, pp. 199-204.
[20] R.S. Chang, H.P. Chang, A dynamic data replication strategy using access weights in data grids, The Journal of Supercomputing 45 (3) (2008) 277-295.
[21] H. Lamehamedi, Z. Shentu, B. Szymanski, E. Deelman, Simulation of dynamic data replication strategies in data grids, in: Proceeding of 12th Heterogeneous Computing Workshop, Nice, France, April 2003.
[22] H. Lamehamedi, B.K. Szymanski, Decentralized data management framework for data grids, Future Generation Computer Systems 23 (1) (2007) 109-115.
[23] X. Dong, J. Li, Z. Wu, D. Zhang, J. Xu, On dynamic replication strategies in data service grids, in: Proceeding of 11th IEEE Symposium on Object Oriented Real-Time Distributed Computing, ISORC, 2008.
[24] S.M. Park, J.H. Kim, Y.B. Ko, W.S. Yoon, Dynamic data grid replication strategy based on internet hierarchy, in: Proceeding of Second International Workshop on Grid and Cooperative Computing, GCC 2003.
[25] D.G. Cameron, R. Carvajal-Schiaffino, A.P. Millar, C. Nicholson, K. Stockinger, F. Zini, Evaluating scheduling and replica optimisation strategies in OptorSim, in: Proc. 4th Int'l Workshop on Grid Computing, 2003, pp. 52-59.