# MinCopysets: Derandomizing Replication In Cloud Storage


| System | Node Size | Chunk Size | Nodes in Cluster | Replication Scheme |
|---|---|---|---|---|
| Windows Azure | 75TB | 1GB |  | Random replication within a small cluster |
| LinkedIn's HDFS | TB | 128MB |  | First replica on first rack, other replicas together on second rack |
| RAMCloud | 192GB | 8MB | ,000 | Random replication |

Table 1: Parameters of different large-scale storage systems. These parameters are estimated based on publicly available details [9, 10, 19].

Suppose we replicate a single chunk three times, randomly selecting three different machines to store the replicas. If a power outage causes 1% of the nodes in the data center to fail, the probability that the crash took down exactly the three machines that store our chunk is very small. However, suppose that instead of replicating just one chunk, we replicate 10,000 chunks, and we want to ensure that every single one of these chunks survives the massive failure. Even though each individual chunk is fairly safe, in aggregate across the entire cluster some chunk is expected to be at risk. That is, the probability of losing all the copies of at least one of the chunks is much higher.

To understand this problem quantitatively, consider the following analysis. In a random replication scheme, each chunk is replicated onto R machines chosen randomly from a cluster of N machines. In the case of F simultaneous machine failures, a chunk that has been replicated on one of these machines will be completely lost if its other replicas are also among those F failed machines. The probability of losing a single chunk is:

$$\frac{\binom{F}{R}}{\binom{N}{R}}$$

where $\binom{F}{R}$ is the number of combinations of failed replicas and $\binom{N}{R}$ is the maximum number of combinations of replicas. Therefore, the probability of losing at least one chunk in the cluster in the case of F simultaneous failures is:

$$1 - \left(1 - \frac{\binom{F}{R}}{\binom{N}{R}}\right)^{N_C}$$

where $N_C$ is the total number of chunks in the cluster. Figure 1 plots this equation as a function of N, assuming 1% of the nodes have failed (F = 0.01N), with R = 3.
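The two formulas above can be evaluated directly. The following is our own illustrative sketch (not code from the paper), with RAMCloud-like parameters taken from Table 1 and the figures' setups:

```python
# Sketch of the loss-probability formulas above. Parameters are illustrative
# (RAMCloud-like: 5000 nodes, 8000 chunks per node, 1% concurrent failures).
from math import comb

def p_chunk_lost(N, R, F):
    # A chunk is lost iff all R of its replicas fall among the F failed nodes.
    return comb(F, R) / comb(N, R)

def p_any_loss(N, R, F, n_chunks):
    # Treats each chunk's replica set as an independent random draw.
    return 1 - (1 - p_chunk_lost(N, R, F)) ** n_chunks

N = 5000                              # cluster size
F = N // 100                          # 1% of nodes fail concurrently
print(p_chunk_lost(N, 3, F))          # a single chunk is very safe
print(p_any_loss(N, 3, F, 8000 * N))  # yet some chunk is almost surely lost
```

The contrast between the two printed values is the point of this section: per-chunk safety does not compose into cluster-wide safety once the chunk count is large.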
Unless stated otherwise, Table 1 contains the node and chunk sizes we used for all the graphs presented in the paper.¹ For simplicity's sake, throughout the paper we use the random replication scheme for all three systems, even though the systems' actual schemes differ slightly (e.g., HDFS replicates the second and third replicas on the same rack [21]), since we found very little difference in data loss probability between the different schemes. We also assume the systems do not use any data coding (we will show in Section 5.4 that coding doesn't affect the data loss probability in these scenarios).

As we saw in Figure 1, data loss is nearly guaranteed if at least three nodes fail. The reason is that the random replication scheme creates a large number of copysets. A copyset is defined as a set of nodes that contain all the replicas of a particular data chunk. In other words, a copyset is a single unit of failure: when a copyset fails, at least one data chunk is irretrievably lost. With random replication, every new chunk creates, with very high probability, a distinct copyset, up to a certain point. This point is determined by the maximum number of copysets possible in a cluster: $\binom{N}{R}$, the total number of possible sets of R nodes. Hence, as the number of chunks in a system grows larger and approaches $\binom{N}{R}$, under large-scale failures the copysets of at least a few data chunks will fail, leading to data loss.

This intuition is the reason why simply changing the parameters of the random replication scheme will not fully protect against simultaneous node failures. In the next two subsections we discuss the effects of such parameter changes.

## 2.1 Alternative I: Increase the Replication Factor

The first possible parameter change is to increase the number of replicas per chunk.

¹ Windows Azure is currently intended to support only clusters (or "stamps") of limited size. Figure 1 projects the data loss if Azure continues to use its current replication scheme.
The benefit of this change is that the maximum number of copysets possible in a cluster grows with the replication factor R as $\binom{N}{R}$. However, this is a losing battle, since the amount of data stored in distributed storage systems continues to grow as clusters become larger and store more chunks. As the number of chunks increases and inevitably approaches $\binom{N}{R}$, the probability of data loss will increase.

Using the same computation as before, we varied the replication factor from 3 to 6 on a RAMCloud cluster. The results of this simulation are shown in Figure 2. As we would expect, increasing the replication factor does indeed increase the durability of the system against simultaneous failures. However, simply increasing the replication factor from 3 to 4 does not provide the necessary resiliency against simultaneous failures on large clusters with thousands of nodes. As a reference, Table 1 shows that HDFS is usually run on clusters with thousands of nodes [21], while RAMCloud is intended to support clusters of ten thousand nodes or more [19]. In order to support thousands of nodes reliably, current systems would need a replication factor of at least 5.

Increasing the replication factor to 5 significantly hurts the system's performance. For each write operation, the system needs to write out more replicas, which increases the network latency and disk bandwidth consumed by write operations.

Figure 2: Simulation of the data loss probabilities of a RAMCloud cluster with 8000 chunks per node, varying the number of replicas per chunk.

## 2.2 Alternative II: Increase the Chunk Size

Instead of increasing the replication factor, we can reduce the number of distinct copysets. A straightforward way is to increase the chunk size, so that for a fixed amount of total data, the total number of chunks in the system, and consequently the number of copysets, is reduced. Using the same model as before, we plotted the data loss probability of a RAMCloud cluster with varying chunk sizes in Figure 3. The figure shows that increasing the size of chunks increases the durability of the system.

Figure 3: Data loss probabilities of a RAMCloud cluster, varying the number of chunks generated per node.
However, we would reach the desired durability only if we increased the chunk size by 3-4 orders of magnitude. If we take this scheme to the extreme, we can make the chunk size equal to the size of an entire node. This is in fact the same as implementing disk mirroring (i.e., RAID 1 [20]). In disk mirroring, every storage node has identical backup copies that are used for recovery. Therefore, the number of copysets with disk mirroring is equal to N/R. As we can see from Figure 3, disk mirroring significantly reduces the probability of data loss in the case of simultaneous failures.

However, increasing the size of chunks incurs a significant cost, since it has the undesirable property that data objects are distributed to far fewer nodes. This compromises the parallel operation and load balancing of data center applications. For instance, if we have 10 GB of data and we use 1 GB chunks, a data center application like MapReduce [12] could operate only across 10·R machines, while if we use 64 MB chunks, MapReduce could run in parallel on 160·R machines. Similarly, large chunks would significantly degrade the performance of a system like RAMCloud, which relies on distributing its data widely for fast crash recovery [19].

Figure 4: Illustration of the MinCopysets replication scheme. The first replica (primary replica) is chosen at random from the entire cluster. The other replicas (secondary replicas) are replicated on the statically defined replication group of the first replica.
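Both alternatives can be compared with the same formula from Section 2. The following is our own sketch (parameter ranges are illustrative, mirroring the setups of Figures 2 and 3):

```python
# Sketch comparing the two parameter changes under 1% concurrent failures on
# a 5000-node, RAMCloud-like cluster. All values are illustrative.
from math import comb

def p_any_loss(N, R, F, n_chunks):
    return 1 - (1 - comb(F, R) / comb(N, R)) ** n_chunks

N, F = 5000, 50

# Alternative I: larger replication factor, chunk count fixed at 8000/node.
for R in (3, 4, 5, 6):
    print("R =", R, "->", p_any_loss(N, R, F, 8000 * N))

# Alternative II: larger chunks, i.e. fewer chunks per node, R fixed at 3.
for chunks_per_node in (8192, 512, 64, 1):  # 1 chunk/node ~ disk mirroring
    print(chunks_per_node, "chunks/node ->", p_any_loss(N, 3, F, chunks_per_node * N))
```

Running this shows the pattern described above: loss probability falls as R grows or as the chunk count shrinks, but only the extreme settings drive it near zero.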
Our goal is therefore to design a replication scheme that reduces the probability of any data loss to a negligible value (i.e., close to zero), while retaining the benefits of uniformly distributing chunks across the cluster for parallelization. In the following section we describe MinCopysets, a scheme that possesses both of these properties.

## 3. DESIGN

In this section we describe the design of MinCopysets, a replication technique that retains the parallelization benefits of uniformly distributing chunks across the entire cluster, while ensuring low data loss probabilities even in the presence of cluster-wide failures.

Previous random replication schemes randomly select a node for each of their chunks' replicas. Random replication causes these systems to have a large number of copysets and to become vulnerable to simultaneous node failures. MinCopysets' key insight is to decouple the placement of the first replica, which provides uniform data distribution, from the placement of the other replicas, which provide durability, without increasing the number of copysets.

The idea behind MinCopysets is simple. The servers are divided statically into replication groups (each server belongs to a single group). When a chunk has to be replicated, a primary node is selected randomly for its first replica, and the other replicas are stored deterministically on secondary nodes: the other nodes of the primary node's replication group. Figure 4 illustrates the design of our new replication scheme. The primary node is chosen at random, while the secondary nodes are always the members of the primary node's replication group. MinCopysets decouples data distribution from chunk replication.

Figure 5: MinCopysets compared with random replication, using different numbers of replicas with 8000 chunks per node.
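The placement rule described above can be sketched in a few lines. This is our own illustrative code; the function names are hypothetical and not taken from the RAMCloud or HDFS implementations:

```python
# Minimal sketch of MinCopysets placement. Nodes are statically partitioned
# into groups of size R; a chunk's primary is chosen uniformly at random and
# its secondaries are the rest of the primary's group, so every chunk's
# copyset is one of the N/R groups.
import random

R = 3

def make_groups(nodes, r=R):
    # Static partition: each node belongs to exactly one group.
    # (Leftover nodes, len(nodes) % r of them, are ignored in this sketch.)
    return [nodes[i:i + r] for i in range(0, len(nodes) - len(nodes) % r, r)]

def place_chunk(groups, rng=random):
    group = rng.choice(groups)   # with equal-sized groups, this makes the
    primary = rng.choice(group)  # primary uniform over all nodes
    secondaries = [n for n in group if n != primary]
    return primary, secondaries

groups = make_groups(list(range(9)))
primary, secondaries = place_chunk(groups)
print(primary, secondaries)  # the copyset is always exactly one whole group
```

Note the division of labor: the random primary choice preserves load balancing, while the fixed group membership caps the number of copysets.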
Similar to the random replication scheme, MinCopysets enables data center applications to operate in parallel on large amounts of data, because it distributes chunks uniformly across the data center. In addition, similar to disk mirroring, it limits the number of copysets generated by the cluster, because each node is a member of only a single replication group. Therefore, the number of copysets remains constant and equal to N/R. Unlike random replication, the number of copysets does not increase with the number of chunks in the system. In the next few subsections we go into further detail about the properties of MinCopysets.

## 3.1 Durability of MinCopysets

Figure 6: Probability of data loss when we vary the percentage of simultaneously failed nodes beyond the reported 1% of typical simultaneous failures, with 8000 chunks per node.

Figure 5 is the central figure of this paper. It compares the durability of MinCopysets and random replication with different replication factors. We can make several interesting observations from the graph. First, on a 1000-node cluster under a power outage, MinCopysets reduces the probability of data loss from 99.7% to 0.02%. Second, a system using MinCopysets with 3 replicas has lower data loss probabilities than a system using random replication with 5 replicas, and a system using MinCopysets with 2 replicas has lower probabilities than random replication with 4 replicas. In other words, MinCopysets achieves significantly lower data loss probabilities than random replication with more replicas. Third, MinCopysets with 3 replicas can scale up to 100,000 nodes, which means that MinCopysets can support the cluster sizes of all widely used large-scale distributed storage systems (see Table 1).

The typical number of simultaneous failures observed in commercial data centers is 0.5-1% of the nodes in the cluster [21].
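The 99.7% vs. 0.02% comparison follows directly from the formula in Section 2, since MinCopysets caps the number of distinct copysets at N/R. A sketch of the arithmetic (our own, reusing the earlier independence approximation):

```python
# Sketch: both schemes share the per-copyset failure probability
# C(F,R)/C(N,R); they differ only in how many distinct copysets exist.
# Parameters match the 1000-node power-outage example in the text.
from math import comb

def p_loss(N, R, F, n_copysets):
    return 1 - (1 - comb(F, R) / comb(N, R)) ** n_copysets

N, R, F = 1000, 3, 10            # 1000 nodes, 1% fail concurrently
chunks = 8000 * N                # random replication: ~one copyset per chunk
print(p_loss(N, R, F, chunks))   # roughly 0.997
print(p_loss(N, R, F, N // R))   # MinCopysets (N/R groups): roughly 0.0002
```

The two printed values reproduce the 99.7% and 0.02% figures quoted above for this cluster size.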
Figure 6 depicts the probability of data loss as we increase the percentage of simultaneous failures well beyond the reported 1%. Note that commercial storage systems commonly operate at the cluster sizes shown in Table 1 (see also GFS [16]). For these cluster sizes, MinCopysets prevents data loss with high probability, even in the very unlikely and hypothetical scenario where 4% of the nodes fail simultaneously.

## 3.2 Compatibility with Existing Systems

An important property of MinCopysets is that it does not require changing the architecture of the storage system. Unlike the alternative solutions that we discussed in Section 2, MinCopysets frees the storage system to use chunks of any size with any replication factor. In addition, MinCopysets provides the storage system with the freedom to decide how to split the nodes into replication groups.

Coding does not impact the probability of data loss when using random replication, and MinCopysets can be applied together with data coding. Codes are typically designed to make chunks resilient against failures of two nodes using a reduced amount of storage, but not against three failures or more. If the coded data is still distributed over a very large number of copysets, multiple simultaneous failures will still cause data loss.

In practice, existing storage systems' parity code implementations do not significantly affect the number of copysets. For example, the HDFS-RAID [2, 14] implementation encodes groups of 5 chunks in a RAID 5 and mirroring scheme, which reduces the number of distinct copysets by a factor of 5. However, this is exactly equivalent to increasing the chunk size by a factor of 5, and we showed before (in Figure 3) that such a reduction does not significantly decrease the data loss probabilities of the system.

## 6. RELATED WORK

The related work falls into two categories. The first category is large-scale storage systems that intentionally constrain the placement of replicas to prevent data loss due to concurrent node failures. The second category is RAID mirroring schemes, which were not originally designed to support the scale of Big Data applications.

## 6.1 Different Placement Policies

Similar to MinCopysets, Facebook's proprietary HDFS implementation constrains the placement of replicas to smaller groups in order to protect against concurrent failures [3, 8]. As we discussed in the previous section, while this scheme improves on the original random replication scheme, it is much more exposed to concurrent failures than MinCopysets and is not general-purpose. Ford et al. from Google [15] analyze different failure loss scenarios on GFS clusters, and have proposed geo-replication as an effective technique to prevent data loss under large-scale concurrent node failures.
Geo-replication is a fail-safe way to ensure data durability under a power outage. However, geo-replication incurs high latency and bandwidth costs. In addition, not all storage providers have the capability to support geo-replication. In contrast to geo-replication, MinCopysets can mitigate the probability of data loss under concurrent failures without moving data to a separate location.

## 6.2 RAID

Another replication scheme related to MinCopysets is RAID (for a detailed overview of RAID technology, see IBM's guide [6]). RAID 1 [20] is the basic disk mirroring scheme: each disk is fully mirrored on a set of additional disks in order to provide higher durability. This scheme was originally designed as a hardware solution for a single machine. To extend RAID 1 to multiple disks, RAID 1+0 (or RAID 10) was designed as a stripe of mirrors. In this scheme, odd-numbered chunks are replicated on one mirror, while even-numbered chunks are replicated on the second mirror. This scheme was extended further to RAID 100 and beyond, in which chunks are split into even more mirror groups. As an extension to standard RAID systems, RAID-x [17] was designed as a distributed software RAID scheme for small clusters. Like our paper, it identifies the need to decouple the distribution of data from its loss probability, using a combination of striping and mirroring.

While MinCopysets is inspired by RAID's concept of mirroring, in practice the two schemes are very different. MinCopysets operates at data center scale, while these RAID schemes are designed to serve small clusters of large servers. In addition, the RAID schemes are rigid: nodes cannot join or leave mirror groups, and each chunk is deterministically replicated on a certain mirror group. In contrast, MinCopysets can choose the first replica at random from the entire cluster, and nodes can be divided into replication groups using different flexible policies.

## 7. CONCLUSION

Conventional wisdom holds that randomization is a general-purpose technique for solving a variety of problems in large-scale distributed systems. This paper questions that assumption and shows that randomization leads to poor data durability. In particular, we show that existing widely deployed systems such as HDFS and Azure that use random replication can lose data under common events such as power outages.

This paper presents MinCopysets, a simple, general-purpose, scalable replication scheme that derandomizes data replication in order to achieve better data durability properties. MinCopysets decouples the mechanisms used for data distribution and durability. It allows system designers to use randomized node selection for data distribution, to reap the benefits of parallelization and load balancing, while using deterministic replica selection to significantly improve data durability. We showed that with MinCopysets, systems can use only 3 replicas yet achieve the same data loss probabilities as systems using random replication with 5 replicas, and can safely scale up to 100,000 nodes. With relatively straightforward implementations, we added support for MinCopysets to RAMCloud and HDFS. MinCopysets does not introduce any significant overhead on normal storage operations, and can support any data locality or network topology requirements.


### Comparative analysis of Google File System and Hadoop Distributed File System

Comparative analysis of Google File System and Hadoop Distributed File System R.Vijayakumari, R.Kirankumar, K.Gangadhara Rao Dept. of Computer Science, Krishna University, Machilipatnam, India, vijayakumari28@gmail.com

### Parallels Cloud Storage

Parallels Cloud Storage White Paper Best Practices for Configuring a Parallels Cloud Storage Cluster www.parallels.com Table of Contents Introduction... 3 How Parallels Cloud Storage Works... 3 Deploying

### BookKeeper. Flavio Junqueira Yahoo! Research, Barcelona. Hadoop in China 2011

BookKeeper Flavio Junqueira Yahoo! Research, Barcelona Hadoop in China 2011 What s BookKeeper? Shared storage for writing fast sequences of byte arrays Data is replicated Writes are striped Many processes

### Cloud Storage. Parallels. Performance Benchmark Results. White Paper. www.parallels.com

Parallels Cloud Storage White Paper Performance Benchmark Results www.parallels.com Table of Contents Executive Summary... 3 Architecture Overview... 3 Key Features... 4 No Special Hardware Requirements...

### Weekly Report. Hadoop Introduction. submitted By Anurag Sharma. Department of Computer Science and Engineering. Indian Institute of Technology Bombay

Weekly Report Hadoop Introduction submitted By Anurag Sharma Department of Computer Science and Engineering Indian Institute of Technology Bombay Chapter 1 What is Hadoop? Apache Hadoop (High-availability

### Hadoop Distributed File System. Dhruba Borthakur Apache Hadoop Project Management Committee dhruba@apache.org June 3 rd, 2008

Hadoop Distributed File System Dhruba Borthakur Apache Hadoop Project Management Committee dhruba@apache.org June 3 rd, 2008 Who Am I? Hadoop Developer Core contributor since Hadoop s infancy Focussed

### Four Orders of Magnitude: Running Large Scale Accumulo Clusters. Aaron Cordova Accumulo Summit, June 2014

Four Orders of Magnitude: Running Large Scale Accumulo Clusters Aaron Cordova Accumulo Summit, June 2014 Scale, Security, Schema Scale to scale 1 - (vt) to change the size of something let s scale the

### R.K.Uskenbayeva 1, А.А. Kuandykov 2, Zh.B.Kalpeyeva 3, D.K.Kozhamzharova 4, N.K.Mukhazhanov 5

Distributed data processing in heterogeneous cloud environments R.K.Uskenbayeva 1, А.А. Kuandykov 2, Zh.B.Kalpeyeva 3, D.K.Kozhamzharova 4, N.K.Mukhazhanov 5 1 uskenbaevar@gmail.com, 2 abu.kuandykov@gmail.com,

### Comparative analysis of mapreduce job by keeping data constant and varying cluster size technique

Comparative analysis of mapreduce job by keeping data constant and varying cluster size technique Mahesh Maurya a, Sunita Mahajan b * a Research Scholar, JJT University, MPSTME, Mumbai, India,maheshkmaurya@yahoo.co.in

### White Paper. Managing MapR Clusters on Google Compute Engine

White Paper Managing MapR Clusters on Google Compute Engine MapR Technologies, Inc. www.mapr.com Introduction Google Compute Engine is a proven platform for running MapR. Consistent, high performance virtual

### White Paper. Big Data and Hadoop. Abhishek S, Java COE. Cloud Computing Mobile DW-BI-Analytics Microsoft Oracle ERP Java SAP ERP

White Paper Big Data and Hadoop Abhishek S, Java COE www.marlabs.com Cloud Computing Mobile DW-BI-Analytics Microsoft Oracle ERP Java SAP ERP Table of contents Abstract.. 1 Introduction. 2 What is Big

### Quantcast Petabyte Storage at Half Price with QFS!

9-131 Quantcast Petabyte Storage at Half Price with QFS Presented by Silvius Rus, Director, Big Data Platforms September 2013 Quantcast File System (QFS) A high performance alternative to the Hadoop Distributed

### 16.1 MAPREDUCE. For personal use only, not for distribution. 333

For personal use only, not for distribution. 333 16.1 MAPREDUCE Initially designed by the Google labs and used internally by Google, the MAPREDUCE distributed programming model is now promoted by several

### Networking in the Hadoop Cluster

Hadoop and other distributed systems are increasingly the solution of choice for next generation data volumes. A high capacity, any to any, easily manageable networking layer is critical for peak Hadoop

Big Data Storage Architecture Design in Cloud Computing Xuebin Chen 1, Shi Wang 1( ), Yanyan Dong 1, and Xu Wang 2 1 College of Science, North China University of Science and Technology, Tangshan, Hebei,

Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop

### Web Technologies: RAMCloud and Fiz. John Ousterhout Stanford University

Web Technologies: RAMCloud and Fiz John Ousterhout Stanford University The Web is Changing Everything Discovering the potential: New applications 100-1000x scale New development style New approach to deployment

### IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY

IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY Hadoop Distributed File System: What and Why? Ashwini Dhruva Nikam, Computer Science & Engineering, J.D.I.E.T., Yavatmal. Maharashtra,

### Accelerating and Simplifying Apache

Accelerating and Simplifying Apache Hadoop with Panasas ActiveStor White paper NOvember 2012 1.888.PANASAS www.panasas.com Executive Overview The technology requirements for big data vary significantly

### Reduction of Data at Namenode in HDFS using harballing Technique

Reduction of Data at Namenode in HDFS using harballing Technique Vaibhav Gopal Korat, Kumar Swamy Pamu vgkorat@gmail.com swamy.uncis@gmail.com Abstract HDFS stands for the Hadoop Distributed File System.

DDN Solution Brief Accelerate> HadoopTM Analytics with the SFA Big Data Platform Organizations that need to extract value from all data can leverage the award winning SFA platform to really accelerate

### Take An Internal Look at Hadoop. Hairong Kuang Grid Team, Yahoo! Inc hairong@yahoo-inc.com

Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc hairong@yahoo-inc.com What s Hadoop Framework for running applications on large clusters of commodity hardware Scale: petabytes of data

### Data-Intensive Computing with Map-Reduce and Hadoop

Data-Intensive Computing with Map-Reduce and Hadoop Shamil Humbetov Department of Computer Engineering Qafqaz University Baku, Azerbaijan humbetov@gmail.com Abstract Every day, we create 2.5 quintillion

Reliable Adaptable Network RAM Tia Newhall, Daniel Amato, Alexandr Pshenichkin Computer Science Department, Swarthmore College Swarthmore, PA 19081, USA Abstract We present reliability solutions for adaptable

### HDFS: Hadoop Distributed File System

Istanbul Şehir University Big Data Camp 14 HDFS: Hadoop Distributed File System Aslan Bakirov Kevser Nur Çoğalmış Agenda Distributed File System HDFS Concepts HDFS Interfaces HDFS Full Picture Read Operation

### Future Prospects of Scalable Cloud Computing

Future Prospects of Scalable Cloud Computing Keijo Heljanko Department of Information and Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 7.3-2012 1/17 Future Cloud Topics Beyond

### International Journal of Advance Research in Computer Science and Management Studies

Volume 2, Issue 8, August 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

Nutanix Tech Note Failure Analysis A Failure Analysis of Storage System Architectures Nutanix Scale-out v. Legacy Designs Types of data to be protected Any examination of storage system failure scenarios

### Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

### Data-Intensive Programming. Timo Aaltonen Department of Pervasive Computing

Data-Intensive Programming Timo Aaltonen Department of Pervasive Computing Data-Intensive Programming Lecturer: Timo Aaltonen University Lecturer timo.aaltonen@tut.fi Assistants: Henri Terho and Antti

### Parallel Processing of cluster by Map Reduce

Parallel Processing of cluster by Map Reduce Abstract Madhavi Vaidya, Department of Computer Science Vivekanand College, Chembur, Mumbai vamadhavi04@yahoo.co.in MapReduce is a parallel programming model

### Big Fast Data Hadoop acceleration with Flash. June 2013

Big Fast Data Hadoop acceleration with Flash June 2013 Agenda The Big Data Problem What is Hadoop Hadoop and Flash The Nytro Solution Test Results The Big Data Problem Big Data Output Facebook Traditional

### Storage Architectures for Big Data in the Cloud

Storage Architectures for Big Data in the Cloud Sam Fineberg HP Storage CT Office/ May 2013 Overview Introduction What is big data? Big Data I/O Hadoop/HDFS SAN Distributed FS Cloud Summary Research Areas

### MapReduce and Hadoop Distributed File System

MapReduce and Hadoop Distributed File System 1 B. RAMAMURTHY Contact: Dr. Bina Ramamurthy CSE Department University at Buffalo (SUNY) bina@buffalo.edu http://www.cse.buffalo.edu/faculty/bina Partially

### Real-time Protection for Hyper-V

1-888-674-9495 www.doubletake.com Real-time Protection for Hyper-V Real-Time Protection for Hyper-V Computer virtualization has come a long way in a very short time, triggered primarily by the rapid rate

### Mobile Cloud Computing for Data-Intensive Applications

Mobile Cloud Computing for Data-Intensive Applications Senior Thesis Final Report Vincent Teo, vct@andrew.cmu.edu Advisor: Professor Priya Narasimhan, priya@cs.cmu.edu Abstract The computational and storage

### Distributed Filesystems

Distributed Filesystems Amir H. Payberah Swedish Institute of Computer Science amir@sics.se April 8, 2014 Amir H. Payberah (SICS) Distributed Filesystems April 8, 2014 1 / 32 What is Filesystem? Controls