Index Terms : Load rebalance, distributed file systems, clouds, movement cost, load imbalance, chunk.
|
|
- Gary Chandler
- 8 years ago
- Views:
Transcription
1 Load Rebalancing for Distributed File Systems in Clouds. Smita Salunkhe, S. S. Sannakki Department of Computer Science and Engineering KLS Gogte Institute of Technology, Belgaum, Karnataka, India Affiliated to Visvesvaraya technological University A B S T R A C T Distributed file systems are key building blocks for cloud computing applications based on the MapReduce programming paradigm. In such file systems, nodes simultaneously serve computing and storage functions; a file is partitioned into a number of chunks allocated in distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. However, in a cloud computing environment, failure is the norm, and nodes may be upgraded, replaced, and added in the system. Files can also be dynamically created, deleted, and appended. This results in load imbalance in a distributed file system; that is, the file chunks are not distributed as uniformly as possible among the nodes. Emerging distributed file systems in production systems strongly depend on a central node for chunk reallocation. This dependence is clearly inadequate in a largescale, failure-prone environment because the central load balancer is put under considerable workload that is linearly scaled with the system size, and may thus become the performance bottleneck and the single point of failure. In this paper, a fully distributed load rebalancing algorithm is presented to cope with the load imbalance problem. Our algorithm is compared against a centralized approach in a production system and a competing distributed solution presented in the literature. The simulation results indicate that our proposal is comparable with the existing centralized approach and considerably outperforms the prior distributed algorithm in terms of load imbalance factor, movement cost, and algorithmic overhead. The performance of our proposal implemented in the Hadoop distributed file system is further investigated in a cluster environment. Index Terms : Load rebalance, distributed file systems, clouds, movement cost, load imbalance, chunk. I. INTRODUCTION CLOUD Computing is a compelling technology. In clouds, clients can dynamically allocate their resources on-demand without sophisticated deployment and management of resources. Key enabling technologies for clouds include the MapReduce programming paradigm, distributed file systems, virtualization, and so forth. These techniques emphasize scalability, so clouds can be large in scale, and comprising entities can arbitrarily fail and join while maintaining system reliability. Distributed file systems are key building blocks for cloud computing applications based on the MapReduce programming paradigm. In such file systems, nodes simultaneously serve computing and storage functions; a file is partitioned into a number of chunks allocated in distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. For example, consider a word count application that counts the number of distinct words and the frequency of each unique word in a large file. In such an application, a cloud partitions the file into a large number of disjointed and fixed-size pieces (or file chunks) and assigns them to different cloud storage , IJAFRC and NCRTIT 2015 All Rights Reserved
2 nodes (i.e., chunkservers). Each storage node (or node for short) then calculates the frequency of each unique word by scanning and parsing its local file chunks. In such a distributed file system, the load of a node is typically proportional to the number of file chunks the node possesses. Because the files in a cloud can be arbitrarily created, deleted, and appended, and nodes can be upgraded, replaced and added in the file system, the file chunks are not distributed as uniformly as possible among the nodes. Load balance among storage nodes is a critical function in clouds. In a load-balanced cloud, the resources can be well utilized and provisioned, maximizing the performance of MapReduce-based applications. In this paper, we are interested in studying the load rebalancing problem in distributed file systems specialized for large-scale, dynamic and data-intensive clouds. (The terms rebalance and balance are interchangeable used.) Such a large-scale cloud has hundreds or thousands of nodes (and may reach tens of thousands in the future). Our objective is to allocate the chunks of files as uniformly as possible among the nodes such that no node manages an excessive number of chunks. Additionally, we aim to reduce network traffic (or movement cost) caused by rebalancing the loads of nodes as much as possible to maximize the network bandwidth available to normal applications. Moreover, as failure is the norm, nodes are newly added to sustain the overall system performance resulting in the heterogeneity of nodes. Exploiting capable nodes to improve the system performance is, thus, demanded. II. LOAD REBALANCING ALGORITHM The load rebalancing problem in distributed file systems specialized for large-scale, dynamic and dataintensive clouds to allocate the chunks of files as uniformly as possible among the nodes of chunks and aims to reduce network traffic (or movement cost ). We consider a large-scale distributed file system consisting of a set of chunkservers V in a cloud, where the cardinality of V is V =n. Typically, n can be 1,000, 10,000, or more. In the system, a number of files are stored in the n chunkservers. First, let us denote the set of files as F. Each file f F is partitioned into a number of disjointed, fixed- size chunks denoted by C f. For example, each chunk has the same size, 64 bytes, in Hadoop HDFS [4]. Second, the load of a chunkserver is proportional to the number of chunks hosted by the server [3]. Third, node failure is the norm in such a distributed system, and the chunkservers may be upgraded, replaced and added in the system. Finally, the files in F may be arbitrarily created, deleted, and appended. The net effect results in file chunks not being uniformly distributed to the chunkservers. Our objective in the current study is to design a load rebalancing algorithm to reallocate file chunks such that the chunks can be distributed to the system as uniform uniformly as possible while reducing the movement cost as much as possible. Here, the movement cost is defined as the number of chunks migrated to balance the loads of the chunkservers. Let A be the ideal number of chunks that any chunkserver i V is required to manage in a system-wide load-balanced state, A. Necessity of Load Rebalancing Eliminates the dependence on central nodes. Operates in a distributed manner in which nodes perform their load-balancing tasks independently without synchronization or global knowledge regarding the system. Reduce algorithmic overhead introduced to the DHTs as much as possible , IJAFRC and NCRTIT 2015 All Rights Reserved
3 III. SYSTEM DESIGN System Architecture shows a small-scale cluster environment consisting of a single, dedicated name node and 25 data nodes. Specifically, the name node is equipped with Intel Core 2 processor and 1 Gbytes RAM. As the number of file chunks in our experimental environment is small, the RAM size of the name node is sufficient to cache the entire name node process and the metadata information, including the directories and the locations of file chunks. In proposed work, experimental environment, a number of clients are established to issue requests to the name node. The requests include commands to create directories with randomly designated names, to remove directories arbitrarily chosen, etc Name Node Data node Data node Load Balancer Data node Network Clients Figure 1 System Architecture A. Load Balancer Fig. 2 Shows load balancer, A file is partitioned into a number of chunks allocated in distinct nodes so that MapReduce tasks can be performed in parallel over the nodes. For example, consider a word count application that counts the number of distinct words and the frequency of each unique word in a large file. In such an application, a cloud partitions the file into a large number of disjointed and fixed-size pieces (or file chunks) and assigns them to different cloud storage nodes (i.e., chunkservers) , IJAFRC and NCRTIT 2015 All Rights Reserved
4 Flowchart: Annexure 1. IV. CONCLUSION A novel load-balancing algorithm to deal with the load rebalancing problem in large-scale, dynamic, and distributed file systems. Proposed System will balance the loads of nodes and reduce the demanded movement cost as much as possible, while taking advantage of physical network locality and node heterogeneity. Proposal is comparable to the centralized algorithm in the Hadoop HDFS production system. Proposed Load-balancing algorithm exhibits a fast convergence rate. Our proposal is comparable to the centralized algorithm in the Hadoop HDFS production system and dramatically out- performs the competing distributed algorithm in terms of load imbalance factor, movement cost, and algorithmic overhead. Particularly, our load-balancing algorithm exhibits a fast convergence rate. V. REFERENCES [1] Hung-Chang Hsiao, Member, IEEE Computer Society, Hsueh-Yi Chung, Haiying Shen, Member, IEEE, and Yu-Chang Chao, Load Rebalancing for Distributed File Systems in Clouds, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 24, NO. 5, MAY , IJAFRC and NCRTIT 2015 All Rights Reserved
5 [2] S. Ghemawat, H. Gobioff, and S.-T. Leung, The Google File System, Proc. 19th ACM Symp. Operating Systems Principles (SOSP 03),pp , Oct [3] Hadoop Distributed File System, hdfs/, [4] VMware, [5] Xen, [6] Apache Hadoop, [7] Hadoop Distributed File System Rebalancing Blocks, [8 ] K. McKusick and S. Quinlan, GFS: Evolution on Fast-Forward, Comm. ACM,vol. 53, no. 3, pp , Jan [9] HDFS Federation, [10] I. Stoica, R. Morris, D. Liben-Nowell, D.R. Karger, M.F. Kaashoek,F. Dabek, and H. Balakrishnan, Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications, , IJAFRC and NCRTIT 2015 All Rights Reserved
6 Annexure , IJAFRC and NCRTIT 2015 All Rights Reserved
SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE SYSTEM IN CLOUD
International Journal of Advances in Applied Science and Engineering (IJAEAS) ISSN (P): 2348-1811; ISSN (E): 2348-182X Vol-1, Iss.-3, JUNE 2014, 54-58 IIST SOLVING LOAD REBALANCING FOR DISTRIBUTED FILE
More informationDistributed file system in cloud based on load rebalancing algorithm
Distributed file system in cloud based on load rebalancing algorithm B.Mamatha(M.Tech) Computer Science & Engineering Boga.mamatha@gmail.com K Sandeep(M.Tech) Assistant Professor PRRM Engineering College
More informationSurvey on Load Rebalancing for Distributed File System in Cloud
Survey on Load Rebalancing for Distributed File System in Cloud Prof. Pranalini S. Ketkar Ankita Bhimrao Patkure IT Department, DCOER, PG Scholar, Computer Department DCOER, Pune University Pune university
More informationLoad Re-Balancing for Distributed File. System with Replication Strategies in Cloud
Contemporary Engineering Sciences, Vol. 8, 2015, no. 10, 447-451 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ces.2015.5263 Load Re-Balancing for Distributed File System with Replication Strategies
More informationSecured Load Rebalancing for Distributed Files System in Cloud
Secured Load Rebalancing for Distributed Files System in Cloud Jayesh D. Kamble 1, Y. B. Gurav 2 1 II nd Year ME, Department of Computer Engineering, PVPIT, Savitribai Phule Pune University, Pune, India
More informationWater-Filling: A Novel Approach of Load Rebalancing for File Systems in Cloud
Water-Filling: A Novel Approach of Load Rebalancing for File Systems in Cloud Divya Diwakar Department of Information Technology SATI College, Vidisha (M.P.), India Sushil Chaturvedi Department of Information
More informationLoad Rebalancing for File System in Public Cloud Roopa R.L 1, Jyothi Patil 2
Load Rebalancing for File System in Public Cloud Roopa R.L 1, Jyothi Patil 2 1 PDA College of Engineering, Gulbarga, Karnataka, India rlrooparl@gmail.com 2 PDA College of Engineering, Gulbarga, Karnataka,
More informationIMPACT OF DISTRIBUTED SYSTEMS IN MANAGING CLOUD APPLICATION
INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND SCIENCE IMPACT OF DISTRIBUTED SYSTEMS IN MANAGING CLOUD APPLICATION N.Vijaya Sunder Sagar 1, M.Dileep Kumar 2, M.Nagesh 3, Lunavath Gandhi
More informationBalancing the Load to Reduce Network Traffic in Private Cloud
Balancing the Load to Reduce Network Traffic in Private Cloud A.Shenbaga Bharatha Priya 1, J.Ganesh 2 M-TECH (IT) Final Year, Department of IT, Dr.Sivanthi Aditanar College of Engineering, Tiruchendur,
More informationBALANCING BLOCKS FOR DISTRIBUTED FILE SYSTEMS IN CLOUD
BALANCING BLOCKS FOR DISTRIBUTED FILE SYSTEMS IN CLOUD Harika Pratibha Kovvuri 1, Chinabusi Koppula 2 1. M.Tech Scholar, Department of CSE, Kaushik College of Engineering, Visakhapatnam, AP, India. 2.
More informationEnhance Load Rebalance Algorithm for Distributed File Systems in Clouds
Enhance Load Rebalance Algorithm for Distributed File Systems in Clouds Kokilavani.K, Department Of Pervasive Computing Technology, Kings College Of Engineering, Punalkulam, Tamil nadu Abstract This paper
More informationCloud Based Adaptive Overlapped Data Chained Declustering
Cloud Based Adaptive Overlapped Data Chained Declustering Vidya G. Shitole 1, Prof. N. P. Karlekar 2 Student, M.E. 2nd year, Computer Engineering, SIT Lonavala, University of Pune, Maharashtra, India 1
More informationHow To Balance In Cloud Computing
A Review on Load Balancing Algorithms in Cloud Hareesh M J Dept. of CSE, RSET, Kochi hareeshmjoseph@ gmail.com John P Martin Dept. of CSE, RSET, Kochi johnpm12@gmail.com Yedhu Sastri Dept. of IT, RSET,
More informationAn Efficient Distributed Load Balancing For DHT-Based P2P Systems
An Efficient Distributed Load Balancing For DHT-Based P2P Systems Chahita Taank 1, Rajesh Bharati 2 1 PG Student, 2 Professor, Computer Engineering Dept DYPIET, PUNE. Abstract- In a distributed system
More informationSecure and Privacy-Preserving Distributed File Systems on Load Rebalancing in Cloud Computing
Secure and Privacy-Preserving Distributed File Systems on Load Rebalancing in Cloud Computing A Sumanth 1, Bhaludra Raveendranadh Singh 2, Moligi Sangeetha 3 1 A Sumanth, pursuing M.Tech (CSE) from Visvesvaraya
More informationLONG TERM EVOLUTION WITH 5G USING MAPREDUCING TASK FOR DISTRIBUTED FILE SYSTEMS IN CLOUD
LONG TERM EVOLUTION WITH 5G USING MAPREDUCING TASK FOR DISTRIBUTED FILE SYSTEMS IN CLOUD 1 MSSoundarya, 2 GSiva Kumar Assistant Professor, Department of CSE Gnanamani College of Engineering ABSTRACT -
More informationDesign and Implementation of Performance Guaranteed Symmetric Load Balancing Algorithm
Design and Implementation of Performance Guaranteed Symmetric Load Balancing Algorithm Shaik Nagoor Meeravali #1, R. Daniel *2, CH. Srinivasa Reddy #3 # M.Tech, Department of Information Technology, Vignan's
More informationDistributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms
Distributed File System 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributed File System Don t move data to workers move workers to the data! Store data on the local disks of nodes
More informationAdaptive Overlapped Data Chained Declustering In Distributed File Systems
Adaptive Overlapped Data Chained Declustering In Distributed File Systems Vidya G. Shitole 1, Prof.Nandkishor Karlekar 2 Student, M.E. 2 nd year, Computer Engineering, Sinhgad Institute of Technology,
More informationDistributed File Systems
Distributed File Systems Mauro Fruet University of Trento - Italy 2011/12/19 Mauro Fruet (UniTN) Distributed File Systems 2011/12/19 1 / 39 Outline 1 Distributed File Systems 2 The Google File System (GFS)
More informationComparative analysis of Google File System and Hadoop Distributed File System
Comparative analysis of Google File System and Hadoop Distributed File System R.Vijayakumari, R.Kirankumar, K.Gangadhara Rao Dept. of Computer Science, Krishna University, Machilipatnam, India, vijayakumari28@gmail.com
More informationSurvey on Load Rebalancing For Distributed File System in Cloud with Security
Survey on Load Rebalancing For Distributed File System in Cloud with Security Jayesh D. Kamble, Prof. Y.B.Gurav IInd Year ME, Department of Computer Engineering, PVPIT, Pune, India Associate Professor&
More informationSimple Load Rebalancing For Distributed Hash Tables In Cloud
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727 Volume 13, Issue 2 (Jul. - Aug. 2013), PP 60-65 Simple Load Rebalancing For Distributed Hash Tables In Cloud Ch. Mounika
More informationLOAD BALANCING WITH PARTIAL KNOWLEDGE OF SYSTEM
LOAD BALANCING WITH PARTIAL KNOWLEDGE OF SYSTEM IN PEER TO PEER NETWORKS R. Vijayalakshmi and S. Muthu Kumarasamy Dept. of Computer Science & Engineering, S.A. Engineering College Anna University, Chennai,
More informationR.Tamilarasi #1, G.Kesavaraj *2
ENHANCING SECURE MULTI USER ACCESS IN CLOUD ENVIRONMENT BY LOAD BALANCING RTamilarasi #1, GKesavaraj *2 #1 Mphil, Research Scholar, Vivekananda Arts and Science College for women *2 Assistant professor,department
More informationDESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVIRONMENT
DESIGN ARCHITECTURE-BASED ON WEB SERVER AND APPLICATION CLUSTER IN CLOUD ENVIRONMENT Gita Shah 1, Annappa 2 and K. C. Shet 3 1,2,3 Department of Computer Science & Engineering, National Institute of Technology,
More informationEfficient Metadata Management for Cloud Computing applications
Efficient Metadata Management for Cloud Computing applications Abhishek Verma Shivaram Venkataraman Matthew Caesar Roy Campbell {verma7, venkata4, caesar, rhc} @illinois.edu University of Illinois at Urbana-Champaign
More informationInternational journal of Engineering Research-Online A Peer Reviewed International Journal Articles available online http://www.ijoer.
RESEARCH ARTICLE ISSN: 2321-7758 GLOBAL LOAD DISTRIBUTION USING SKIP GRAPH, BATON AND CHORD J.K.JEEVITHA, B.KARTHIKA* Information Technology,PSNA College of Engineering & Technology, Dindigul, India Article
More informationDistributed File Systems
Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.
More informationLecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl
Big Data Processing, 2014/15 Lecture 5: GFS & HDFS!! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind
More informationInternational Journal of Advance Research in Computer Science and Management Studies
Volume 2, Issue 8, August 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
More informationVaralakshmi.T #1, Arul Murugan.R #2 # Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam
A Survey on P2P File Sharing Systems Using Proximity-aware interest Clustering Varalakshmi.T #1, Arul Murugan.R #2 # Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam
More informationThe Google File System
The Google File System Motivations of NFS NFS (Network File System) Allow to access files in other systems as local files Actually a network protocol (initially only one server) Simple and fast server
More informationComparative analysis of mapreduce job by keeping data constant and varying cluster size technique
Comparative analysis of mapreduce job by keeping data constant and varying cluster size technique Mahesh Maurya a, Sunita Mahajan b * a Research Scholar, JJT University, MPSTME, Mumbai, India,maheshkmaurya@yahoo.co.in
More informationA Load Balancing Model Using a New Algorithm in Cloud Computing For Distributed File with Hybrid Security
A Load Balancing Model Using a New Algorithm in Cloud Computing For Distributed File with Hybrid Security Dhafar Hamed Abd Department of Computer Science, Al Maaref University College. Abstract: A fully
More informationData-Intensive Computing with Map-Reduce and Hadoop
Data-Intensive Computing with Map-Reduce and Hadoop Shamil Humbetov Department of Computer Engineering Qafqaz University Baku, Azerbaijan humbetov@gmail.com Abstract Every day, we create 2.5 quintillion
More informationR.K.Uskenbayeva 1, А.А. Kuandykov 2, Zh.B.Kalpeyeva 3, D.K.Kozhamzharova 4, N.K.Mukhazhanov 5
Distributed data processing in heterogeneous cloud environments R.K.Uskenbayeva 1, А.А. Kuandykov 2, Zh.B.Kalpeyeva 3, D.K.Kozhamzharova 4, N.K.Mukhazhanov 5 1 uskenbaevar@gmail.com, 2 abu.kuandykov@gmail.com,
More informationComputing Load Aware and Long-View Load Balancing for Cluster Storage Systems
215 IEEE International Conference on Big Data (Big Data) Computing Load Aware and Long-View Load Balancing for Cluster Storage Systems Guoxin Liu and Haiying Shen and Haoyu Wang Department of Electrical
More informationSnapshots in Hadoop Distributed File System
Snapshots in Hadoop Distributed File System Sameer Agarwal UC Berkeley Dhruba Borthakur Facebook Inc. Ion Stoica UC Berkeley Abstract The ability to take snapshots is an essential functionality of any
More informationHDFS Space Consolidation
HDFS Space Consolidation Aastha Mehta*,1,2, Deepti Banka*,1,2, Kartheek Muthyala*,1,2, Priya Sehgal 1, Ajay Bakre 1 *Student Authors 1 Advanced Technology Group, NetApp Inc., Bangalore, India 2 Birla Institute
More informationBig Data and Hadoop with components like Flume, Pig, Hive and Jaql
Abstract- Today data is increasing in volume, variety and velocity. To manage this data, we have to use databases with massively parallel software running on tens, hundreds, or more than thousands of servers.
More informationThe Google File System
The Google File System By Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (Presented at SOSP 2003) Introduction Google search engine. Applications process lots of data. Need good file system. Solution:
More informationDistributed File Systems
Distributed File Systems Alemnew Sheferaw Asrese University of Trento - Italy December 12, 2012 Acknowledgement: Mauro Fruet Alemnew S. Asrese (UniTN) Distributed File Systems 2012/12/12 1 / 55 Outline
More informationDESIGN AN EFFICIENT BIG DATA ANALYTIC ARCHITECTURE FOR RETRIEVAL OF DATA BASED ON
DESIGN AN EFFICIENT BIG DATA ANALYTIC ARCHITECTURE FOR RETRIEVAL OF DATA BASED ON WEB SERVER IN CLOUD ENVIRONMENT Gita Shah 1, Annappa 2 and K. C. Shet 3 1,2,3 Department of Computer Science & Engineering,
More informationCLOUDDMSS: CLOUD-BASED DISTRIBUTED MULTIMEDIA STREAMING SERVICE SYSTEM FOR HETEROGENEOUS DEVICES
CLOUDDMSS: CLOUD-BASED DISTRIBUTED MULTIMEDIA STREAMING SERVICE SYSTEM FOR HETEROGENEOUS DEVICES 1 MYOUNGJIN KIM, 2 CUI YUN, 3 SEUNGHO HAN, 4 HANKU LEE 1,2,3,4 Department of Internet & Multimedia Engineering,
More informationInternational Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 349 Load Balancing Heterogeneous Request in DHT-based P2P Systems Mrs. Yogita A. Dalvi Dr. R. Shankar Mr. Atesh
More informationA Study on Workload Imbalance Issues in Data Intensive Distributed Computing
A Study on Workload Imbalance Issues in Data Intensive Distributed Computing Sven Groot 1, Kazuo Goda 1, and Masaru Kitsuregawa 1 University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan Abstract.
More informationRedistribution of Load in Cloud Using Improved Distributed Load Balancing Algorithm with Security
ISSN (Online) : 2319-8753 ISSN (Print) : 2347-6710 International Journal of Innovative Research in Science, Engineering and Technology Volume 3, Special Issue 3, March 2014 2014 International Conference
More informationLoad Balancing in Structured Overlay Networks. Tallat M. Shafaat tallat(@)kth.se
Load Balancing in Structured Overlay Networks Tallat M. Shafaat tallat(@)kth.se Overview Background The problem : load imbalance Causes of load imbalance Solutions But first, some slides from previous
More informationAn Efficient Deduplication File System for Virtual Machine in Cloud
An Efficient Deduplication File System for Virtual Machine in Cloud Bhuvaneshwari D M.E. computer science and engineering IndraGanesan college of Engineering,Trichy. Abstract Virtualization is widely deployed
More informationComparison of Distributed File System for Cloud With Respect to Healthcare Domain
Comparison of Distributed File System for Cloud With Respect to Healthcare Domain 1 Ms. Khyati Joshi, 2 Prof. ChintanVarnagar 1 Post Graduate Student, 2 Assistant Professor, Computer Engineering Department,
More informationDistributed Metadata Management Scheme in HDFS
International Journal of Scientific and Research Publications, Volume 3, Issue 5, May 2013 1 Distributed Metadata Management Scheme in HDFS Mrudula Varade *, Vimla Jethani ** * Department of Computer Engineering,
More informationLifetime Management of Cache Memory using Hadoop Snehal Deshmukh 1 Computer, PGMCOE, Wagholi, Pune, India
Volume 3, Issue 1, January 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online at: www.ijarcsms.com ISSN:
More informationReducer Load Balancing and Lazy Initialization in Map Reduce Environment S.Mohanapriya, P.Natesan
Reducer Load Balancing and Lazy Initialization in Map Reduce Environment S.Mohanapriya, P.Natesan Abstract Big Data is revolutionizing 21st-century with increasingly huge amounts of data to store and be
More informationFault Tolerance in Hadoop for Work Migration
1 Fault Tolerance in Hadoop for Work Migration Shivaraman Janakiraman Indiana University Bloomington ABSTRACT Hadoop is a framework that runs applications on large clusters which are built on numerous
More informationLoad Balancing in Structured P2P Systems
1 Load Balancing in Structured P2P Systems Ananth Rao Karthik Lakshminarayanan Sonesh Surana Richard Karp Ion Stoica ananthar, karthik, sonesh, karp, istoica @cs.berkeley.edu Abstract Most P2P systems
More informationBig Fast Data Hadoop acceleration with Flash. June 2013
Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional
More informationScala Storage Scale-Out Clustered Storage White Paper
White Paper Scala Storage Scale-Out Clustered Storage White Paper Chapter 1 Introduction... 3 Capacity - Explosive Growth of Unstructured Data... 3 Performance - Cluster Computing... 3 Chapter 2 Current
More informationThe Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform
The Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform Fong-Hao Liu, Ya-Ruei Liou, Hsiang-Fu Lo, Ko-Chin Chang, and Wei-Tsong Lee Abstract Virtualization platform solutions
More informationCloud Computing at Google. Architecture
Cloud Computing at Google Google File System Web Systems and Algorithms Google Chris Brooks Department of Computer Science University of San Francisco Google has developed a layered system to handle webscale
More informationLarge Scale Distributed File System Survey
Large Scale Distributed File System Survey Yuduo Zhou Indiana University Bloomington yuduo@indiana.edu ABSTRACT Cloud computing, one type of distributed systems, is becoming very popular. It has demonstrated
More informationA Brief Analysis on Architecture and Reliability of Cloud Based Data Storage
Volume 2, No.4, July August 2013 International Journal of Information Systems and Computer Sciences ISSN 2319 7595 Tejaswini S L Jayanthy et al., Available International Online Journal at http://warse.org/pdfs/ijiscs03242013.pdf
More informationRevoScaleR Speed and Scalability
EXECUTIVE WHITE PAPER RevoScaleR Speed and Scalability By Lee Edlefsen Ph.D., Chief Scientist, Revolution Analytics Abstract RevoScaleR, the Big Data predictive analytics library included with Revolution
More informationHadoop Technology for Flow Analysis of the Internet Traffic
Hadoop Technology for Flow Analysis of the Internet Traffic Rakshitha Kiran P PG Scholar, Dept. of C.S, Shree Devi Institute of Technology, Mangalore, Karnataka, India ABSTRACT: Flow analysis of the internet
More informationHadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela
Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance
More informationSCHEDULING IN CLOUD COMPUTING
SCHEDULING IN CLOUD COMPUTING Lipsa Tripathy, Rasmi Ranjan Patra CSA,CPGS,OUAT,Bhubaneswar,Odisha Abstract Cloud computing is an emerging technology. It process huge amount of data so scheduling mechanism
More informationLog Mining Based on Hadoop s Map and Reduce Technique
Log Mining Based on Hadoop s Map and Reduce Technique ABSTRACT: Anuja Pandit Department of Computer Science, anujapandit25@gmail.com Amruta Deshpande Department of Computer Science, amrutadeshpande1991@gmail.com
More informationAn Architecture for Video Surveillance Service based on P2P and Cloud Computing
An Architecture for Video Surveillance Service based on P2P and Cloud Computing Yu-Sheng Wu, Yue-Shan Chang, Tong-Ying Juang, Jing-Shyang Yen speaker: 饒 展 榕 Outline INTRODUCTION BACKGROUND AND RELATED
More informationMapReduce and Hadoop Distributed File System
MapReduce and Hadoop Distributed File System 1 B. RAMAMURTHY Contact: Dr. Bina Ramamurthy CSE Department University at Buffalo (SUNY) bina@buffalo.edu http://www.cse.buffalo.edu/faculty/bina Partially
More informationAnalysis and Optimization of Massive Data Processing on High Performance Computing Architecture
Analysis and Optimization of Massive Data Processing on High Performance Computing Architecture He Huang, Shanshan Li, Xiaodong Yi, Feng Zhang, Xiangke Liao and Pan Dong School of Computer Science National
More informationHDFS scalability: the limits to growth
Konstantin V. Shvachko HDFS scalability: the limits to growth Konstantin V. Shvachko is a principal software engineer at Yahoo!, where he develops HDFS. He specializes in efficient data structures and
More informationStorage Systems Autumn 2009. Chapter 6: Distributed Hash Tables and their Applications André Brinkmann
Storage Systems Autumn 2009 Chapter 6: Distributed Hash Tables and their Applications André Brinkmann Scaling RAID architectures Using traditional RAID architecture does not scale Adding news disk implies
More informationResearch on Job Scheduling Algorithm in Hadoop
Journal of Computational Information Systems 7: 6 () 5769-5775 Available at http://www.jofcis.com Research on Job Scheduling Algorithm in Hadoop Yang XIA, Lei WANG, Qiang ZHAO, Gongxuan ZHANG School of
More informationGoogle File System. Web and scalability
Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might
More informationA Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems*
A Content-Based Load Balancing Algorithm for Metadata Servers in Cluster File Systems* Junho Jang, Saeyoung Han, Sungyong Park, and Jihoon Yang Department of Computer Science and Interdisciplinary Program
More informationHadoop & its Usage at Facebook
Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System dhruba@apache.org Presented at the Storage Developer Conference, Santa Clara September 15, 2009 Outline Introduction
More informationEvaluating HDFS I/O Performance on Virtualized Systems
Evaluating HDFS I/O Performance on Virtualized Systems Xin Tang xtang@cs.wisc.edu University of Wisconsin-Madison Department of Computer Sciences Abstract Hadoop as a Service (HaaS) has received increasing
More informationScalable Multiple NameNodes Hadoop Cloud Storage System
Vol.8, No.1 (2015), pp.105-110 http://dx.doi.org/10.14257/ijdta.2015.8.1.12 Scalable Multiple NameNodes Hadoop Cloud Storage System Kun Bi 1 and Dezhi Han 1,2 1 College of Information Engineering, Shanghai
More informationEnhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications
Enhancing Dataset Processing in Hadoop YARN Performance for Big Data Applications Ahmed Abdulhakim Al-Absi, Dae-Ki Kang and Myong-Jong Kim Abstract In Hadoop MapReduce distributed file system, as the input
More informationComparision of k-means and k-medoids Clustering Algorithms for Big Data Using MapReduce Techniques
Comparision of k-means and k-medoids Clustering Algorithms for Big Data Using MapReduce Techniques Subhashree K 1, Prakash P S 2 1 Student, Kongu Engineering College, Perundurai, Erode 2 Assistant Professor,
More informationA REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM
A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information
More informationLoad Balancing in Structured Peer to Peer Systems
Load Balancing in Structured Peer to Peer Systems DR.K.P.KALIYAMURTHIE 1, D.PARAMESWARI 2 Professor and Head, Dept. of IT, Bharath University, Chennai-600 073 1 Asst. Prof. (SG), Dept. of Computer Applications,
More informationLoad Balancing in Structured Peer to Peer Systems
Load Balancing in Structured Peer to Peer Systems Dr.K.P.Kaliyamurthie 1, D.Parameswari 2 1.Professor and Head, Dept. of IT, Bharath University, Chennai-600 073. 2.Asst. Prof.(SG), Dept. of Computer Applications,
More informationMINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT
MINIMIZING STORAGE COST IN CLOUD COMPUTING ENVIRONMENT 1 SARIKA K B, 2 S SUBASREE 1 Department of Computer Science, Nehru College of Engineering and Research Centre, Thrissur, Kerala 2 Professor and Head,
More informationHadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
More informationPerformance Issues of Heterogeneous Hadoop Clusters in Cloud Computing
Performance Issues of Heterogeneous Hadoop Clusters in Cloud Computing 1 B.Thirumala Rao, 2 N.V.Sridevi, 3 V.Krishna Reddy, 4 L.S.S.Reddy Department of Computer Science and Engineering, Lakireddy Bali
More informationText Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,
More informationA PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM
A PROXIMITY-AWARE INTEREST-CLUSTERED P2P FILE SHARING SYSTEM Dr.S. DHANALAKSHMI 1, R. ANUPRIYA 2 1 Prof & Head, 2 Research Scholar Computer Science and Applications, Vivekanandha College of Arts and Sciences
More informationAccelerating and Simplifying Apache
Accelerating and Simplifying Apache Hadoop with Panasas ActiveStor White paper NOvember 2012 1.888.PANASAS www.panasas.com Executive Overview The technology requirements for big data vary significantly
More informationMobile Cloud Computing for Data-Intensive Applications
Mobile Cloud Computing for Data-Intensive Applications Senior Thesis Final Report Vincent Teo, vct@andrew.cmu.edu Advisor: Professor Priya Narasimhan, priya@cs.cmu.edu Abstract The computational and storage
More informationNoSQL Data Base Basics
NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS
More informationHow To Build Cloud Storage On Google.Com
Building Scalable Cloud Storage Alex Kesselman alx@google.com Agenda Desired System Characteristics Scalability Challenges Google Cloud Storage What does a customer want from a cloud service? Reliability
More informationA Study on Data Analysis Process Management System in MapReduce using BPM
A Study on Data Analysis Process Management System in MapReduce using BPM Yoon-Sik Yoo 1, Jaehak Yu 1, Hyo-Chan Bang 1, Cheong Hee Park 1 Electronics and Telecommunications Research Institute, 138 Gajeongno,
More informationKrunal Patel Department of Information Technology A.D.I.T. Engineering College (G.T.U.) India. Fig. 1 P2P Network
Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Secure Peer-to-Peer
More informationParallel Processing of cluster by Map Reduce
Parallel Processing of cluster by Map Reduce Abstract Madhavi Vaidya, Department of Computer Science Vivekanand College, Chembur, Mumbai vamadhavi04@yahoo.co.in MapReduce is a parallel programming model
More informationCloud computing doesn t yet have a
The Case for Cloud Computing Robert L. Grossman University of Illinois at Chicago and Open Data Group To understand clouds and cloud computing, we must first understand the two different types of clouds.
More informationProcessing of Hadoop using Highly Available NameNode
Processing of Hadoop using Highly Available NameNode 1 Akash Deshpande, 2 Shrikant Badwaik, 3 Sailee Nalawade, 4 Anjali Bote, 5 Prof. S. P. Kosbatwar Department of computer Engineering Smt. Kashibai Navale
More informationAnt-based Load Balancing Algorithm in Structured P2P Systems
Ant-based Load Balancing Algorithm in Structured P2P Systems Wei Mi, 2 Chunhong Zhang, 3 Xiaofeng Qiu Beijing University of Posts and Telecommunications, Beijing 876, China, {miwei985, zhangch.bupt., qiuxiaofeng}@gmail.com
More informationLarge-Scale Data Sets Clustering Based on MapReduce and Hadoop
Journal of Computational Information Systems 7: 16 (2011) 5956-5963 Available at http://www.jofcis.com Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Ping ZHOU, Jingsheng LEI, Wenjun YE
More informationLOAD BALANCING FOR OPTIMAL SHARING OF NETWORK BANDWIDTH
LOAD BALANCING FOR OPTIMAL SHARING OF NETWORK BANDWIDTH S.Hilda Thabitha 1, S.Pallavi 2, P.Jesu Jayarin 3 1 PG Scholar,,Dept of CSE,Jeppiaar Engineering College,Chennai, 2 Research Scholar,Sathyabama University,Chennai-119.
More information