Replication and Consistency in Cloud File Systems

1 Replication and Consistency in Cloud File Systems
Alexander Reinefeld and Florian Schintke, Zuse Institute Berlin
Cloud Computing Day at the IKMZ of BTU Cottbus
A. Reinefeld, F. Schintke, ZIB

2 Let's start with a little quiz
Who invented Cloud Computing?
a) Werner Vogels
b) Ian Foster
c) Konrad Zuse
The correct answer is c): "Eventually, computing centers will also be networked with one another via telecommunication lines." Konrad Zuse in Rechnender Raum (1969)

3 Zuse Institute Berlin
Research institute for applied mathematics and computer science
- Peter Deuflhard: chair for scientific computing, FU Berlin
- Martin Grötschel: chair for discrete mathematics, TU Berlin
- Alexander Reinefeld: chair for computer science, HU Berlin

4 HPC at ZIB
- 1984: Cray 1M, 160 MFlops
- 1987: Cray X-MP, 471 MFlops
- 1994: Cray T3D, 38 GFlops
- 1997: Cray T3E, 486 GFlops
- 2002: IBM p690, 2.5 TFlops
- 2008/09: SGI ICE, XE, 150 TFlops
fold performance increase in 25 years (1984 to 2009)

5 HLRN
2 sites, 98 computer racks, CPU cores, 128 TB memory, 1620 TB disk, 300 TF peak performance

6 Storage
3 SL8500 robots, 39 tape drives, slots

7 What is Cloud Computing?
Cloud Computing = Grid Computing on Datacenters? Not that simple.
Cloud and Grid both abstract resources through interfaces.
- Grid: via new middleware. Requires Grid APIs.
- Cloud: via virtualization. Allows legacy APIs.
Service layers
- Software as a Service (SaaS): Applications, Application Services
- Platform as a Service (PaaS): Programming Environment, Execution Environment
- Infrastructure as a Service (IaaS): Infrastructure Services, Resource Set

8 Why Cloud?
Pros
- It scales, because it's their resources, not yours
- It's simple, because they operate it
- Pay for what you need: don't pay for empty spinning disks
Cons
- It's expensive: Amazon S3 charges $0.15 / GB / month = $1800 / TB / year
- It's not 100% secure: S3 now allows you to bring your own RSA key pair. But: would you put your bank account into the cloud?
- It's not 100% available: S3 provides service credits if availability drops (10% for % availability)
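The yearly figure on the slide follows directly from the monthly price. A minimal sketch of the arithmetic, assuming decimal units (1 TB = 1000 GB); the function and constant names are illustrative:

```python
# Storage cost arithmetic from the slide: $0.15 per GB per month.
PRICE_PER_GB_MONTH = 0.15  # USD, the S3 price quoted on the slide
GB_PER_TB = 1000           # assumption: decimal units (1 TB = 1000 GB)
MONTHS_PER_YEAR = 12


def yearly_cost_per_tb(price_per_gb_month: float = PRICE_PER_GB_MONTH) -> float:
    """Return the yearly cost in USD of keeping 1 TB stored."""
    return price_per_gb_month * GB_PER_TB * MONTHS_PER_YEAR
```

With the quoted price this gives 0.15 x 1000 x 12 = 1800 USD per TB per year, matching the slide.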

9 File System Landscape
- PC, local system: ext3, ZFS, NTFS
- Network FS / centralized: NFS, SMB, AFS/Coda
- Cluster FS / datacenter: Lustre, Panasas, GPFS, Ceph, ...
- Cloud/Grid: Grid File System, GFarm, GDM, "gridftp"

10 Consistency, Availability, Partition tolerance: Pick two of three!
- Consistency: All clients have the same view of the data.
- Availability: Each client can always read and write.
- Partition tolerance: Operations will complete, even if individual components are unavailable.
Combinations
- C + A: single server, Linux HA (one data center)
- A + P: Amazon S3, Mercurial, Coda/AFS
- C + P: distributed databases, distributed file systems
Brewer, Eric. Towards Robust Distributed Systems. PODC Keynote.

11 Which semantics do you expect?
Distributed file systems should provide C + P.
But the recent hype was about A + P with eventual consistency (e.g. Amazon S3).

12 Grid File System
- provides access to heterogeneous storage resources, but the middleware causes additional complexity and vulnerability
- requires explicit file transfer
  - whole file: latency to first access, bandwidth, disk storage
  - also partial file access (gridftp) and pattern access (falls)
- no consistency among replicas: the user must take care
- no access control on replicas

13 Cloud File System: XtreemFS
Focus on
- data distribution
- data replication
- object based
Key features
- MRCs are separated from OSDs
- the fat client is the link
Components
- MRC = metadata & replica catalogue
- OSD = object storage device
- Client = file system interface

14 A closer look at XtreemFS
Features
- distributed, replicated, POSIX-compliant file system
- server software (Java) runs on Linux, OS X, Solaris
- client software (C++) runs on Linux, OS X, Windows
- secure: X.509 and SSL
- open source (GPL)
Assumptions
- synchronized clocks with a maximum time drift (needed for OSD lease negotiation; a reasonable assumption in clouds)
- upper limit on round-trip time
- no need for FIFO channels (runs on either TCP or UDP)

15 XtreemFS Interfaces

16 File access protocol
[Sequence diagram. Participants: user application (Linux VFS), XtreemFS client (FUSE), MRC, OSD. Recoverable messages: Update(Cap, FileSize=128k); FileSize = 128k]

17 Client
- gets the list of OSDs from the MRC
- gets a capability (signed by the MRC) per file
- selects the best OSD(s) for parallel I/O
- various striping policies: scatter/gather, RAIDx, erasure codes
- scalable and fast access: no communication between OSD and MRC needed
- the client is the missing link
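To make the striping idea concrete, here is a minimal sketch of how a client could map byte offsets to objects and OSDs. It assumes a plain round-robin (RAID 0 style) layout with fixed-size objects; the function names and the tuple layout are hypothetical and not taken from the XtreemFS code base.

```python
from typing import List, Tuple


def locate(offset: int, stripe_size: int, osds: List[str]) -> Tuple[int, str, int]:
    """Map a byte offset to (object number, OSD, offset inside the object)."""
    if offset < 0 or stripe_size <= 0 or not osds:
        raise ValueError("need offset >= 0, stripe_size > 0 and at least one OSD")
    object_no = offset // stripe_size
    osd = osds[object_no % len(osds)]  # assumed round-robin placement
    return object_no, osd, offset % stripe_size


def split_request(offset: int, length: int, stripe_size: int,
                  osds: List[str]) -> List[Tuple[str, int, int, int]]:
    """Split one contiguous read/write into per-object requests.

    Each entry is (OSD, object number, offset inside object, byte count).
    Requests for different OSDs could then be issued in parallel.
    """
    requests = []
    end = offset + length
    while offset < end:
        object_no, osd, inner = locate(offset, stripe_size, osds)
        chunk = min(stripe_size - inner, end - offset)
        requests.append((osd, object_no, inner, chunk))
        offset += chunk
    return requests
```

Because the mapping is a pure function of the offset and the OSD list, the client can compute it locally after a single round trip to the MRC, which is the point the slide makes about avoiding OSD-to-MRC communication.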

18 MRC: Metadata and Replica Catalogue
- provides open(), close(), readdir(), rename()
- attributes per file: size, last access, access rights, location (OSDs)
- capability (file handle) to authorize a client to access objects on OSDs
- implemented with a key/value store (BabuDB): fast index, append-only DB, allows snapshots
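The capability mechanism can be illustrated with a signed token that an OSD verifies without contacting the MRC. The sketch below assumes a secret shared between MRC and OSDs and an HMAC-SHA256 signature; the token format, the field names and the key are hypothetical and chosen only for illustration, and file identifiers are assumed not to contain the "|" separator.

```python
import hashlib
import hmac

SHARED_SECRET = b"demo-secret"  # hypothetical key known to the MRC and all OSDs


def issue_capability(file_id: str, mode: str, expires: int,
                     secret: bytes = SHARED_SECRET) -> str:
    """MRC side: sign (file_id, mode, expiry) so OSDs can verify it offline."""
    payload = f"{file_id}|{mode}|{expires}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"


def verify_capability(cap: str, file_id: str, mode: str, now: int,
                      secret: bytes = SHARED_SECRET) -> bool:
    """OSD side: check signature, file, mode and expiry without asking the MRC."""
    try:
        cap_file, cap_mode, cap_expires, sig = cap.split("|")
        expires = int(cap_expires)
    except ValueError:
        return False  # malformed token
    payload = f"{cap_file}|{cap_mode}|{cap_expires}"
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return (hmac.compare_digest(sig.encode(), expected.encode())
            and cap_file == file_id
            and cap_mode == mode
            and now < expires)
```

A client would present the token with every OSD request; a tampered file name, access mode or expiry changes the payload and invalidates the signature.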

19 OSD: Object Storage Device
- serves file content operations: read(), write(), truncate(), flush()
- implements object replication
  - also partial replicas for read access: data is filled on demand, gets OSD list from MRC
- slave OSD redirects to master OSD
  - write operations only on master OSD
  - POSIX requires linearizable reads, hence reads are also redirected

20 OSD: Object Storage Device
Which OSD to select?
- object list
- bandwidth
- rarest first
- network coordinates, datacenter map
- prefetching (for partial replicas)
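Selection by network coordinates can be sketched as picking the replica closest to the client in a coordinate space where distance approximates latency. The example assumes two-dimensional coordinates and Euclidean distance; the names and the data layout are illustrative only.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float]


def nearest_osd(client: Coord, osds: Dict[str, Coord]) -> str:
    """Return the name of the OSD whose coordinates are closest to the client."""
    if not osds:
        raise ValueError("no OSDs to choose from")
    return min(osds, key=lambda name: math.dist(client, osds[name]))
```

Other policies from the slide, such as bandwidth or rarest first, would replace the distance function with a different ranking key.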

21 OSD: Object Storage Device
- implements concurrency control for replica consistency
  - POSIX compliant
  - master/slave replication with failover
- group membership service provided by MRC
- lease service "Flease": distributed, scalable and failure-tolerant
  - 50,000 leases/sec with 30 OSDs
  - based on quorum consensus (Paxos)
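Why leases need the bounded clock drift assumed earlier can be shown with a generic lease check. This is not the Flease algorithm itself, only a sketch of the timing rule: the holder stops using a lease max_drift seconds before its expiry, so a node whose clock runs ahead by at most max_drift never sees the lease as free while the holder still uses it. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    holder: str        # name of the OSD that owns the lease
    expires_at: float  # absolute expiry time in seconds


def holder_may_use(lease: Lease, node: str, local_now: float, max_drift: float) -> bool:
    """The holder gives up the lease max_drift seconds early (safety margin)."""
    return lease.holder == node and local_now < lease.expires_at - max_drift


def other_may_acquire(lease: Lease, local_now: float) -> bool:
    """Any other node may take over once its own clock has passed the expiry."""
    return local_now >= lease.expires_at
```

If two clocks differ by at most max_drift, then whenever other_may_acquire is true on one node, holder_may_use is already false on the holder, so there are never two masters at once.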

22 Quorum consensus
Basic algorithm
- When a majority is informed, every other majority has at least one member with up-to-date information.
- A minority may crash at any time.
Paxos consensus
- Step 1: Check whether a consensus c was already established
- Step 2: Re-establish c or try to establish own proposal
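The majority property stated above can be checked by brute force for a small cluster. The sketch enumerates every majority of a node set and verifies that any two of them share at least one member; the function names are illustrative.

```python
from itertools import combinations
from typing import FrozenSet, Iterator, Sequence


def majorities(nodes: Sequence[int]) -> Iterator[FrozenSet[int]]:
    """Yield every subset that contains more than half of the nodes."""
    need = len(nodes) // 2 + 1
    for size in range(need, len(nodes) + 1):
        for subset in combinations(nodes, size):
            yield frozenset(subset)


def all_majorities_intersect(nodes: Sequence[int]) -> bool:
    """Return True if every pair of majorities has a common member."""
    quorums = list(majorities(nodes))
    return all(a & b for a, b in combinations(quorums, 2))
```

The general argument is a counting one: two sets that each hold more than n/2 of n nodes together hold more than n members, so they must overlap.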

23 Proposer
Init
  r = 1            // local round number
  r_latest = 0     // number of the highest acknowledged round
  latest_v = ⊥     // value of the highest acknowledged round
// Send a new proposal
ack_num = 0        // number of valid acknowledgements
Send prepare(r) to all acceptors
Receive ack(r_ack, v_i, r_i) from acceptor i
  If r == r_ack
    ack_num++
    If r_i > r_latest        // more recent accepted round
      r_latest = r_i
      latest_v = v_i         // more recent value
  If ack_num ≥ maj
    If latest_v == ⊥
      propose an own value for latest_v
    Send accept(r, latest_v) to all acceptors

Acceptor
Init
  r_ack = 0        // last acknowledged round
  r_accepted = 0   // last accepted round
  v = ⊥            // current local value
Receive prepare(r) from proposer
  If r > r_ack ∧ r > r_accepted    // higher round
    r_ack = r
    Send ack(r_ack, v, r_accepted) to proposer
// End of phase 1
Receive accept(r, w)
  If r ≥ r_ack ∧ r > r_accepted
    r_accepted = r
    v = w
    Send accepted(r_accepted, v) to learners

Learner
num_accepted = 0   // number of collected accepts
Receive accepted(r, v) from acceptor i
  When r increases: num_accepted = 0
  num_accepted++
  If num_accepted == maj
    decide v; inform client    // v is the consensus
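The roles above can be traced with a small single-process simulation. This is a sketch under simplifying assumptions: messages are delivered synchronously and are never lost, proposers run one after another, and the learner's counting is folded into the propose() function. The class and function names are illustrative and not taken from XtreemFS or Flease.

```python
from typing import List, Optional, Tuple


class Acceptor:
    def __init__(self) -> None:
        self.r_ack = 0                 # last acknowledged round
        self.r_accepted = 0            # last accepted round
        self.v: Optional[str] = None   # current local value (None = no value)

    def prepare(self, r: int) -> Optional[Tuple[int, Optional[str], int]]:
        """Phase 1: acknowledge a higher round and report any accepted value."""
        if r > self.r_ack and r > self.r_accepted:
            self.r_ack = r
            return (self.r_ack, self.v, self.r_accepted)
        return None  # stale round: no acknowledgement

    def accept(self, r: int, w: str) -> Optional[Tuple[int, str]]:
        """Phase 2: accept the value unless a higher round was acknowledged."""
        if r >= self.r_ack and r > self.r_accepted:
            self.r_accepted = r
            self.v = w
            return (self.r_accepted, w)
        return None


def propose(acceptors: List[Acceptor], r: int, own_value: str) -> Optional[str]:
    """Run one proposer round; return the decided value, or None on failure."""
    maj = len(acceptors) // 2 + 1
    acks = [a for a in (acc.prepare(r) for acc in acceptors) if a is not None]
    if len(acks) < maj:
        return None
    # Adopt the value of the highest previously accepted round, if there is one.
    r_latest, latest_v = 0, None
    for _, v_i, r_i in acks:
        if r_i > r_latest:
            r_latest, latest_v = r_i, v_i
    value = own_value if latest_v is None else latest_v
    accepted = [x for x in (acc.accept(r, value) for acc in acceptors) if x is not None]
    return value if len(accepted) >= maj else None
```

Running propose(acceptors, 1, "x") on three fresh acceptors decides "x"; a later propose(acceptors, 2, "y") also returns "x", because the second proposer learns the accepted value in phase 1 and re-establishes it instead of pushing its own.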

24 Striping Performance on Cluster
Striping: parallel transfer from/to many OSDs
- bandwidth scales with the number of OSDs
- the client is the bottleneck (slower reads are caused by the TCP ingress problem)
[Plot: READ and WRITE bandwidth]
Setup: One client writes/reads a single 4 GB file using asynchronous writes, read-ahead, 1 MB chunk size, 29 OSDs. Nodes are connected with IP over IB (1.2 GB/s).

25 Snapshots & Backups
Metadata snapshots (MRC)
- need an atomic operation without service interruption
- asynchronous consolidation in background
- granularity: subdirectories or volumes
- implemented by BabuDB or Scalaris
File snapshots (OSD)
- taken implicitly when the file is idle, or explicitly when closing the file or on fsync()
- versioning of file objects: copy-on-write
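Copy-on-write versioning of file objects can be sketched with an object map that is shared between a snapshot and the live file until the next write. This is an illustration of the general technique, not the OSD implementation; the class and method names are hypothetical.

```python
from typing import Dict, List, Optional


class VersionedFile:
    """File objects with copy-on-write snapshots (illustrative only)."""

    def __init__(self) -> None:
        self._current: Dict[int, bytes] = {}
        self._snapshots: List[Dict[int, bytes]] = []
        self._shared = False  # True while _current is also referenced by a snapshot

    def write(self, object_no: int, data: bytes) -> None:
        if self._shared:
            # First write after a snapshot: copy the object map, not the data.
            self._current = dict(self._current)
            self._shared = False
        self._current[object_no] = data

    def snapshot(self) -> int:
        """Freeze the current state, e.g. on close() or fsync(); return its version."""
        self._snapshots.append(self._current)
        self._shared = True
        return len(self._snapshots) - 1

    def read(self, object_no: int, version: Optional[int] = None) -> Optional[bytes]:
        """Read from the live file, or from a snapshot if a version is given."""
        source = self._current if version is None else self._snapshots[version]
        return source.get(object_no)
```

Taking a snapshot is cheap because nothing is copied at that moment; unchanged objects stay shared between all versions that reference them.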

26 Atomic Snapshots in MRC
- implemented with the BabuDB backend
- a large-scale DB for data that exceeds the system's main memory
- 2 components: small mutable overlay trees (LSM trees); large immutable memory-mapped index on disk
- non-transactional key-value store
- prefix and range queries
- primary design goal: performance!
  - 300,000 lookups/sec (30M entries)
  - fast crash recovery
  - fast start-up

27 Log-Structured Merge Trees
A lookup takes O(s log(n)), with s: number of snapshots, n: number of files
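The O(s log(n)) bound can be read off a lookup that searches a stack of sorted indices from newest to oldest. The sketch below uses in-memory sorted lists and binary search as a stand-in for the overlay trees and the on-disk index; it is not BabuDB's actual data structure, and the TOMBSTONE marker for deletions is an assumption of this example.

```python
from bisect import bisect_left
from typing import Any, Dict, List, Tuple

TOMBSTONE = object()  # marks a key deleted in a newer overlay (assumption)


class SortedIndex:
    """Immutable sorted key/value index searched by binary search."""

    def __init__(self, items: Dict[str, Any]) -> None:
        pairs = sorted(items.items(), key=lambda kv: kv[0])
        self.keys = [k for k, _ in pairs]
        self.values = [v for _, v in pairs]

    def get(self, key: str) -> Tuple[bool, Any]:
        """Return (found, value); one O(log n) binary search."""
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return True, self.values[i]
        return False, None


def lookup(levels: List[SortedIndex], key: str) -> Any:
    """Search the newest overlay first: at most s levels, O(log n) each."""
    for level in levels:  # levels[0] is the newest overlay
        found, value = level.get(key)
        if found:
            return None if value is TOMBSTONE else value
    return None
```

A snapshot then corresponds to freezing the current overlay and starting a new one on top of it, which is why the number of levels, and with it the lookup cost, grows with the number of snapshots.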

28 Replicating MRC, OSDs
Master/Slave Scheme
- Pros: fast local read; no distributed transactions; easy to implement
- Cons: master is a performance bottleneck; interruption when the master fails: needs stable master election
Replicated State Machine (Paxos)
- Pros: no master, no single point of failure; no extra latency on failure
- Cons: slower: 2 round trips per operation; needs distributed consensus

29 XtreemFS Features
Release (current)
- RAID and parallel I/O
- POSIX compatibility
- Read-only replication
- Partial replicas (on-demand)
- Security (SSL, X.509)
- Internet ready
- Checksums
- Extensions
- OSD and replica selection (Vivaldi, datacenter maps)
- Asynchronous MRC backups
- Metadata caching
- Graphical admin console
- Hadoop file system driver (experimental)
Release 1.3 (very soon)
- DIR and MRC replication with automatic failover
- Read/write replication
Release 2.x
- Consistent backups
- Snapshots
- Automatic replica creation, deletion and maintenance

30 Source Code
- XtreemFS: lines of C++ and Java code, GNU GPL v2 license
- BabuDB: lines of Java code, new BSD license
- Scalaris: lines of Erlang and C++ code, Apache 2.0 license

31 Summary
- Cloud file systems require replication: availability; fast access, striping
- Replication requires a consistency algorithm
  - when crashes are rare: use master/slave replication
  - with frequent crashes: use Paxos
- Only Consistency + Partition tolerance from the CAP theorem
Our next step: faster high-level data services for MapReduce, Dryad, key/value store, SQL

Data Storage in Clouds

Data Storage in Clouds Data Storage in Clouds Jan Stender Zuse Institute Berlin contrail is co-funded by the EC 7th Framework Programme 1 Overview Introduction Motivation Challenges Requirements Cloud Storage Systems XtreemFS

More information

XtreemFS a Distributed File System for Grids and Clouds Mikael Högqvist, Björn Kolbeck Zuse Institute Berlin XtreemFS Mikael Högqvist/Björn Kolbeck 1

XtreemFS a Distributed File System for Grids and Clouds Mikael Högqvist, Björn Kolbeck Zuse Institute Berlin XtreemFS Mikael Högqvist/Björn Kolbeck 1 XtreemFS a Distributed File System for Grids and Clouds Mikael Högqvist, Björn Kolbeck Zuse Institute Berlin XtreemFS Mikael Högqvist/Björn Kolbeck 1 The XtreemOS Project Research project funded by the

More information

BabuDB: Fast and Efficient File System Metadata Storage

BabuDB: Fast and Efficient File System Metadata Storage BabuDB: Fast and Efficient File System Metadata Storage Jan Stender, Björn Kolbeck, Mikael Högqvist Felix Hupfeld Zuse Institute Berlin Google GmbH Zurich Motivation Modern parallel / distributed file

More information

XtreemFS - a distributed and replicated cloud file system

XtreemFS - a distributed and replicated cloud file system XtreemFS - a distributed and replicated cloud file system Michael Berlin Zuse Institute Berlin DESY Computing Seminar, 16.05.2011 Who we are Zuse Institute Berlin operates the HLRN supercomputer (#63+64)

More information

XtreemFS Extreme cloud file system?! Udo Seidel

XtreemFS Extreme cloud file system?! Udo Seidel XtreemFS Extreme cloud file system?! Udo Seidel Agenda Background/motivation High level overview High Availability Security Summary Distributed file systems Part of shared file systems family Around for

More information

Distributed File Systems

Distributed File Systems Distributed File Systems Paul Krzyzanowski Rutgers University October 28, 2012 1 Introduction The classic network file systems we examined, NFS, CIFS, AFS, Coda, were designed as client-server applications.

More information

BlobSeer: Towards efficient data storage management on large-scale, distributed systems

BlobSeer: Towards efficient data storage management on large-scale, distributed systems : Towards efficient data storage management on large-scale, distributed systems Bogdan Nicolae University of Rennes 1, France KerData Team, INRIA Rennes Bretagne-Atlantique PhD Advisors: Gabriel Antoniu

More information

Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk

Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk Benchmarking Couchbase Server for Interactive Applications By Alexey Diomin and Kirill Grigorchuk Contents 1. Introduction... 3 2. A brief overview of Cassandra, MongoDB, and Couchbase... 3 3. Key criteria

More information

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013

F1: A Distributed SQL Database That Scales. Presentation by: Alex Degtiar (adegtiar@cmu.edu) 15-799 10/21/2013 F1: A Distributed SQL Database That Scales Presentation by: Alex Degtiar ([email protected]) 15-799 10/21/2013 What is F1? Distributed relational database Built to replace sharded MySQL back-end of AdWords

More information

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform Page 1 of 16 Table of Contents Table of Contents... 2 Introduction... 3 NoSQL Databases... 3 CumuLogic NoSQL Database Service...

More information

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance

More information

High Throughput Computing on P2P Networks. Carlos Pérez Miguel [email protected]

High Throughput Computing on P2P Networks. Carlos Pérez Miguel carlos.perezm@ehu.es High Throughput Computing on P2P Networks Carlos Pérez Miguel [email protected] Overview High Throughput Computing Motivation All things distributed: Peer-to-peer Non structured overlays Structured

More information

The Google File System

The Google File System The Google File System By Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (Presented at SOSP 2003) Introduction Google search engine. Applications process lots of data. Need good file system. Solution:

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

Apache Hadoop. Alexandru Costan

Apache Hadoop. Alexandru Costan 1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open

More information

Testing of several distributed file-system (HadoopFS, CEPH and GlusterFS) for supporting the HEP experiments analisys. Giacinto DONVITO INFN-Bari

Testing of several distributed file-system (HadoopFS, CEPH and GlusterFS) for supporting the HEP experiments analisys. Giacinto DONVITO INFN-Bari Testing of several distributed file-system (HadoopFS, CEPH and GlusterFS) for supporting the HEP experiments analisys. Giacinto DONVITO INFN-Bari 1 Agenda Introduction on the objective of the test activities

More information

Non-Stop for Apache HBase: Active-active region server clusters TECHNICAL BRIEF

Non-Stop for Apache HBase: Active-active region server clusters TECHNICAL BRIEF Non-Stop for Apache HBase: -active region server clusters TECHNICAL BRIEF Technical Brief: -active region server clusters -active region server clusters HBase is a non-relational database that provides

More information

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance.

Agenda. Enterprise Application Performance Factors. Current form of Enterprise Applications. Factors to Application Performance. Agenda Enterprise Performance Factors Overall Enterprise Performance Factors Best Practice for generic Enterprise Best Practice for 3-tiers Enterprise Hardware Load Balancer Basic Unix Tuning Performance

More information

Sun Storage Perspective & Lustre Architecture. Dr. Peter Braam VP Sun Microsystems

Sun Storage Perspective & Lustre Architecture. Dr. Peter Braam VP Sun Microsystems Sun Storage Perspective & Lustre Architecture Dr. Peter Braam VP Sun Microsystems Agenda Future of Storage Sun s vision Lustre - vendor neutral architecture roadmap Sun s view on storage introduction The

More information

Network File System (NFS) Pradipta De [email protected]

Network File System (NFS) Pradipta De pradipta.de@sunykorea.ac.kr Network File System (NFS) Pradipta De [email protected] Today s Topic Network File System Type of Distributed file system NFS protocol NFS cache consistency issue CSE506: Ext Filesystem 2 NFS

More information

HDFS Architecture Guide

HDFS Architecture Guide by Dhruba Borthakur Table of contents 1 Introduction... 3 2 Assumptions and Goals... 3 2.1 Hardware Failure... 3 2.2 Streaming Data Access...3 2.3 Large Data Sets... 3 2.4 Simple Coherency Model...3 2.5

More information

GPFS Storage Server. Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " 4 April 2013"

GPFS Storage Server. Concepts and Setup in Lemanicus BG/Q system Christian Clémençon (EPFL-DIT)  4 April 2013 GPFS Storage Server Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " Agenda" GPFS Overview" Classical versus GSS I/O Solution" GPFS Storage Server (GSS)" GPFS Native RAID

More information

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle Agenda Introduction Database Architecture Direct NFS Client NFS Server

More information

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2 DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing Slide 1 Slide 3 A style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

More information

Design and Evolution of the Apache Hadoop File System(HDFS)

Design and Evolution of the Apache Hadoop File System(HDFS) Design and Evolution of the Apache Hadoop File System(HDFS) Dhruba Borthakur Engineer@Facebook Committer@Apache HDFS SDC, Sept 19 2011 Outline Introduction Yet another file-system, why? Goals of Hadoop

More information

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store

<Insert Picture Here> Oracle NoSQL Database A Distributed Key-Value Store Oracle NoSQL Database A Distributed Key-Value Store Charles Lamb, Consulting MTS The following is intended to outline our general product direction. It is intended for information

More information

CERN Cloud Storage Evaluation Geoffray Adde, Dirk Duellmann, Maitane Zotes CERN IT

CERN Cloud Storage Evaluation Geoffray Adde, Dirk Duellmann, Maitane Zotes CERN IT SS Data & Storage CERN Cloud Storage Evaluation Geoffray Adde, Dirk Duellmann, Maitane Zotes CERN IT HEPiX Fall 2012 Workshop October 15-19, 2012 Institute of High Energy Physics, Beijing, China SS Outline

More information

GraySort on Apache Spark by Databricks

GraySort on Apache Spark by Databricks GraySort on Apache Spark by Databricks Reynold Xin, Parviz Deyhim, Ali Ghodsi, Xiangrui Meng, Matei Zaharia Databricks Inc. Apache Spark Sorting in Spark Overview Sorting Within a Partition Range Partitioner

More information

In Memory Accelerator for MongoDB

In Memory Accelerator for MongoDB In Memory Accelerator for MongoDB Yakov Zhdanov, Director R&D GridGain Systems GridGain: In Memory Computing Leader 5 years in production 100s of customers & users Starts every 10 secs worldwide Over 15,000,000

More information

GeoGrid Project and Experiences with Hadoop

GeoGrid Project and Experiences with Hadoop GeoGrid Project and Experiences with Hadoop Gong Zhang and Ling Liu Distributed Data Intensive Systems Lab (DiSL) Center for Experimental Computer Systems Research (CERCS) Georgia Institute of Technology

More information

Storage Architectures for Big Data in the Cloud

Storage Architectures for Big Data in the Cloud Storage Architectures for Big Data in the Cloud Sam Fineberg HP Storage CT Office/ May 2013 Overview Introduction What is big data? Big Data I/O Hadoop/HDFS SAN Distributed FS Cloud Summary Research Areas

More information

Distributed Data Stores

Distributed Data Stores Distributed Data Stores 1 Distributed Persistent State MapReduce addresses distributed processing of aggregation-based queries Persistent state across a large number of machines? Distributed DBMS High

More information

Distributed Storage Systems

Distributed Storage Systems Distributed Storage Systems John Leach [email protected] twitter @johnleach Brightbox Cloud http://brightbox.com Our requirements Bright box has multiple zones (data centres) Should tolerate a zone failure

More information

RADOS: A Scalable, Reliable Storage Service for Petabyte- scale Storage Clusters

RADOS: A Scalable, Reliable Storage Service for Petabyte- scale Storage Clusters RADOS: A Scalable, Reliable Storage Service for Petabyte- scale Storage Clusters Sage Weil, Andrew Leung, Scott Brandt, Carlos Maltzahn {sage,aleung,scott,carlosm}@cs.ucsc.edu University of California,

More information

Scala Storage Scale-Out Clustered Storage White Paper

Scala Storage Scale-Out Clustered Storage White Paper White Paper Scala Storage Scale-Out Clustered Storage White Paper Chapter 1 Introduction... 3 Capacity - Explosive Growth of Unstructured Data... 3 Performance - Cluster Computing... 3 Chapter 2 Current

More information

Google File System. Web and scalability

Google File System. Web and scalability Google File System Web and scalability The web: - How big is the Web right now? No one knows. - Number of pages that are crawled: o 100,000 pages in 1994 o 8 million pages in 2005 - Crawlable pages might

More information

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University

RAMCloud and the Low- Latency Datacenter. John Ousterhout Stanford University RAMCloud and the Low- Latency Datacenter John Ousterhout Stanford University Most important driver for innovation in computer systems: Rise of the datacenter Phase 1: large scale Phase 2: low latency Introduction

More information

Panasas at the RCF. Fall 2005 Robert Petkus RHIC/USATLAS Computing Facility Brookhaven National Laboratory. Robert Petkus Panasas at the RCF

Panasas at the RCF. Fall 2005 Robert Petkus RHIC/USATLAS Computing Facility Brookhaven National Laboratory. Robert Petkus Panasas at the RCF Panasas at the RCF HEPiX at SLAC Fall 2005 Robert Petkus RHIC/USATLAS Computing Facility Brookhaven National Laboratory Centralized File Service Single, facility-wide namespace for files. Uniform, facility-wide

More information

Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! [email protected]

Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl Big Data Processing, 2014/15 Lecture 5: GFS & HDFS!! Claudia Hauff (Web Information Systems)! [email protected] 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind

More information

Hadoop and Map-Reduce. Swati Gore

Hadoop and Map-Reduce. Swati Gore Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data

More information

Distributed File Systems

Distributed File Systems Distributed File Systems Mauro Fruet University of Trento - Italy 2011/12/19 Mauro Fruet (UniTN) Distributed File Systems 2011/12/19 1 / 39 Outline 1 Distributed File Systems 2 The Google File System (GFS)

More information

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 NoSQL Databases Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 Database Landscape Source: H. Lim, Y. Han, and S. Babu, How to Fit when No One Size Fits., in CIDR,

More information

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next

More information

Scalable Architecture on Amazon AWS Cloud

Scalable Architecture on Amazon AWS Cloud Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies [email protected] 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect

More information

Diagram 1: Islands of storage across a digital broadcast workflow

Diagram 1: Islands of storage across a digital broadcast workflow XOR MEDIA CLOUD AQUA Big Data and Traditional Storage The era of big data imposes new challenges on the storage technology industry. As companies accumulate massive amounts of data from video, sound, database,

More information

Snapshots in Hadoop Distributed File System

Snapshots in Hadoop Distributed File System Snapshots in Hadoop Distributed File System Sameer Agarwal UC Berkeley Dhruba Borthakur Facebook Inc. Ion Stoica UC Berkeley Abstract The ability to take snapshots is an essential functionality of any

More information

Tushar Joshi Turtle Networks Ltd

Tushar Joshi Turtle Networks Ltd MySQL Database for High Availability Web Applications Tushar Joshi Turtle Networks Ltd www.turtle.net Overview What is High Availability? Web/Network Architecture Applications MySQL Replication MySQL Clustering

More information

Lessons learned from parallel file system operation

Lessons learned from parallel file system operation Lessons learned from parallel file system operation Roland Laifer STEINBUCH CENTRE FOR COMPUTING - SCC KIT University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association

More information

Ceph. A file system a little bit different. Udo Seidel

Ceph. A file system a little bit different. Udo Seidel Ceph A file system a little bit different Udo Seidel Ceph what? So-called parallel distributed cluster file system Started as part of PhD studies at UCSC Public announcement in 2006 at 7 th OSDI File system

More information

Journal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS)

Journal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) Journal of science e ISSN 2277-3290 Print ISSN 2277-3282 Information Technology www.journalofscience.net STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) S. Chandra

More information

Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014

Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014 Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ Cloudera World Japan November 2014 WANdisco Background WANdisco: Wide Area Network Distributed Computing Enterprise ready, high availability

More information

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB

Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB Overview of Databases On MacOS Karl Kuehn Automation Engineer RethinkDB Session Goals Introduce Database concepts Show example players Not Goals: Cover non-macos systems (Oracle) Teach you SQL Answer what

More information

Big Table A Distributed Storage System For Data

Big Table A Distributed Storage System For Data Big Table A Distributed Storage System For Data OSDI 2006 Fay Chang, Jeffrey Dean, Sanjay Ghemawat et.al. Presented by Rahul Malviya Why BigTable? Lots of (semi-)structured data at Google - - URLs: Contents,

More information

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation

Facebook: Cassandra. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation Facebook: Cassandra Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/24 Outline 1 2 3 Smruti R. Sarangi Leader Election

More information

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics Overview Big Data in Apache Hadoop - HDFS - MapReduce in Hadoop - YARN https://hadoop.apache.org 138 Apache Hadoop - Historical Background - 2003: Google publishes its cluster architecture & DFS (GFS)

More information

Bigdata High Availability (HA) Architecture

Bigdata High Availability (HA) Architecture Bigdata High Availability (HA) Architecture Introduction This whitepaper describes an HA architecture based on a shared nothing design. Each node uses commodity hardware and has its own local resources

More information

Architectures Haute-Dispo Joffrey MICHAÏE Consultant MySQL

Architectures Haute-Dispo Joffrey MICHAÏE Consultant MySQL Architectures Haute-Dispo Joffrey MICHAÏE Consultant MySQL 04.20111 High Availability with MySQL Higher Availability Shared nothing distributed cluster with MySQL Cluster Storage snapshots for disaster

More information

Cloud Optimize Your IT

Cloud Optimize Your IT Cloud Optimize Your IT Windows Server 2012 The information contained in this presentation relates to a pre-release product which may be substantially modified before it is commercially released. This pre-release

More information
