XtreemFS - a distributed and replicated cloud file system



Michael Berlin, Zuse Institute Berlin
DESY Computing Seminar, 16.05.2011

Who we are
- Zuse Institute Berlin
  - operates the HLRN supercomputer (#63+64)
  - research in computer science and mathematics
- Parallel and Distributed Systems Group
  - led by Prof. Alexander Reinefeld (Humboldt University)
  - distributed and failure-tolerant storage systems

Who we are
- Michael Berlin
  - PhD student since 03/2011
  - studied computer science (Informatik) at Humboldt Universität zu Berlin
  - Diplom thesis dealt with XtreemFS
  - currently working on the XtreemFS client

Motivation
Problem: multiple copies of data
- Where?
- Copy complete?
- Different versions?
[Diagram labels: PC, internal nodes, external nodes, local file server, internal storage, external storage]

Motivation (2)
Problem: different access interfaces
[Diagram labels: laptop via 3G/Wi-Fi, VPN+?/SSHFS, local file server, PC, NFS/Samba, SCP, external storage, external nodes, <parallel file system>]

Motivation (3)
XtreemFS goals:
- Transparency
- Availability
[Diagram labels: laptop via 3G/Wi-Fi, PC, internal nodes, external nodes, XtreemFS]

File Systems Landscape

Outline
1. XtreemFS Architecture
2. Client Interfaces
3. Read-Only Replication
4. Read-Write Replication
5. Metadata Replication
6. Customization through Policies
7. Security
8. Use Case: Mosgrid
9. Snapshots

XtreemFS Architecture (1)
- Volume on a metadata server: provides a hierarchical namespace
- File content on storage servers: accessed directly by clients
[Diagram labels: PC, internal nodes, local file server, internal storage]

XtreemFS Architecture (2)
- Metadata and Replica Catalog (MRC): holds volumes
- Object Storage Devices (OSDs):
  - file content is split into objects
  - objects can be striped across OSDs
- object-based file system architecture
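The object and striping idea can be sketched as a mapping from a byte offset to an object and an OSD. This is a minimal illustration, assuming fixed-size objects and round-robin placement; the function name, the 128 KiB object size and the three OSDs are assumptions chosen for the example, not values from the talk.

```python
from typing import NamedTuple


class ObjectLocation(NamedTuple):
    object_no: int         # index of the object within the file
    osd_index: int         # which OSD (0-based) holds that object
    offset_in_object: int  # byte position inside the object


def locate(offset: int, object_size: int, num_osds: int) -> ObjectLocation:
    """Map a file offset to an object, assuming round-robin striping."""
    if offset < 0 or object_size <= 0 or num_osds <= 0:
        raise ValueError("offset must be >= 0, sizes and counts must be > 0")
    object_no = offset // object_size
    return ObjectLocation(object_no, object_no % num_osds, offset % object_size)


if __name__ == "__main__":
    # Byte 300 KiB of a file with 128 KiB objects striped over 3 OSDs.
    print(locate(300 * 1024, 128 * 1024, 3))
```

Because consecutive objects land on different OSDs under this assumption, a large sequential read can be served by several OSDs in parallel.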

Scalability
- File I/O throughput
  - parallel I/O: scales with the number of OSDs
- Storage capacity
  - OSDs can be added and removed
  - OSDs may be used by multiple volumes
- Metadata throughput
  - limited by MRC hardware
  - use many volumes spread over multiple MRCs
[Chart labels: WRITE, READ]

Accessing Components
Directory Service (DIR):
- central registry
- all servers (MRC, OSD) register there with their ID
- provides:
  - list of available volumes
  - mapping from ID to service URL
  - list of available OSDs
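The role of the DIR as a registry can be sketched with a small in-memory class. This is only an illustration of the three lookups listed on the slide; the class, its method names and the example IDs and addresses are hypothetical and do not reflect the real DIR protocol.

```python
class DirectoryService:
    """Toy in-memory registry: services register by ID, clients look them up."""

    def __init__(self):
        self._services = {}  # service ID -> (kind, url)
        self._volumes = {}   # volume name -> ID of the MRC holding it

    def register_service(self, service_id, kind, url):
        if kind not in ("MRC", "OSD"):
            raise ValueError("kind must be 'MRC' or 'OSD'")
        self._services[service_id] = (kind, url)

    def register_volume(self, name, mrc_id):
        if self._services.get(mrc_id, (None,))[0] != "MRC":
            raise KeyError("volume must be held by a registered MRC")
        self._volumes[name] = mrc_id

    def resolve(self, service_id):
        """Mapping from service ID to URL."""
        return self._services[service_id][1]

    def list_volumes(self):
        return sorted(self._volumes)

    def list_osds(self):
        return sorted(i for i, (kind, _) in self._services.items() if kind == "OSD")


if __name__ == "__main__":
    d = DirectoryService()
    d.register_service("mrc-1", "MRC", "mrc1.example.org")  # made-up address
    d.register_service("osd-1", "OSD", "osd1.example.org")  # made-up address
    d.register_volume("home", "mrc-1")
    print(d.list_volumes(), d.list_osds(), d.resolve("osd-1"))
```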

Client Interfaces
- XtreemFS supports the POSIX interface and semantics
- mount.xtreemfs: uses FUSE
  - runs on Linux, FreeBSD, OS X and Windows (Dokan)
- libxtreemfs for Java and C++
[Diagram labels: laptop via 3G/Wi-Fi, PC, internal nodes, external nodes, mount.xtreemfs (three times), XtreemFS]

Read-Only Replication
Requirement: mark the file as read-only
Replica types:
a. Full replica: requires a complete copy
b. Partial replica: fills itself on demand, instantly ready to use
[Diagram labels: external nodes, internal storage, external storage]
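A partial replica that fills itself on demand can be sketched as a cache in front of a remote fetch. The class name and the `fetch_remote` callback are assumptions made for this example; a real OSD would fetch from another replica over the network.

```python
class PartialReplica:
    """Read-only replica that starts empty and fetches objects on first read."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # callable: object number -> bytes
        self._objects = {}                 # locally stored objects
        self.remote_fetches = 0            # counter, for illustration only

    def read(self, object_no):
        if object_no not in self._objects:
            # Object is missing locally: fetch it once and keep it.
            self._objects[object_no] = self._fetch_remote(object_no)
            self.remote_fetches += 1
        return self._objects[object_no]


if __name__ == "__main__":
    full_replica = {0: b"first", 1: b"second"}
    replica = PartialReplica(full_replica.__getitem__)
    replica.read(1)
    replica.read(1)  # second read is served locally
    print(replica.remote_fetches)
```

Since the file is read-only, a fetched object never becomes stale, which is what makes this kind of caching safe.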

Read-Only Replication (2)

Read-Only Replication (3)
- Receiver-initiated transfer at object level
- OSDs exchange object lists
- Filling strategies:
  - fetch objects in order
  - rarest first
- Prefetching available
- On-close replication: automatic replica creation
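The rarest-first filling strategy can be sketched from the exchanged object lists: count how many peers hold each missing object and fetch the least replicated ones first. The function name, the data layout and the tie-break by object number are assumptions for this sketch.

```python
from collections import Counter


def rarest_first(missing, peer_object_lists):
    """Order locally missing objects by how few peers hold them.

    missing: set of object numbers the local replica lacks.
    peer_object_lists: dict mapping a peer name to the set of objects it holds.
    Objects that no peer holds are skipped, since they cannot be fetched.
    """
    holders = Counter()
    for objects in peer_object_lists.values():
        holders.update(objects)
    fetchable = [obj for obj in missing if holders[obj] > 0]
    # Fewest holders first; ties broken by object number (an assumption).
    return sorted(fetchable, key=lambda obj: (holders[obj], obj))


if __name__ == "__main__":
    peers = {"osd1": {0, 1, 2}, "osd2": {0, 1}, "osd3": {0}}
    print(rarest_first({0, 1, 2, 3}, peers))
```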

Read-Write Replication
- Availability
- Data safety
- Allow modifications
[Diagram labels: PC, local file server (important.cpp), internal storage (important.cpp)]

Read-Write Replication (2)
Primary/Backup

Read-Write Replication (3)
Primary/Backup:
1. Lease acquisition
   - at most one valid lease per file
   - revocation = lease timeout

Read-Write Replication (4)
Primary/Backup:
1. Lease acquisition
   - at most one valid lease per file
   - revocation = lease timeout
2. Data dissemination
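The two lease properties above (at most one valid lease per file, revocation only by timeout) can be sketched with a single lease table. This is a local, single-process illustration and not the distributed Flease algorithm; the class name, the explicit `now` parameter and the 15-second duration are assumptions for the example.

```python
class LeaseTable:
    """Toy lease table: one lease per file, which only ends by expiring."""

    def __init__(self, duration):
        self.duration = duration
        self._leases = {}  # file ID -> (holder, expiry time)

    def acquire(self, file_id, requester, now):
        """Return the holder of the lease after this request is processed."""
        lease = self._leases.get(file_id)
        if lease is not None:
            holder, expiry = lease
            if expiry > now and holder != requester:
                # Someone else holds a valid lease; there is no forced revocation.
                return holder
        # No lease, expired lease, or renewal by the current holder.
        self._leases[file_id] = (requester, now + self.duration)
        return requester


if __name__ == "__main__":
    table = LeaseTable(duration=15.0)
    print(table.acquire("file-1", "osd1", now=0.0))   # osd1 becomes primary
    print(table.acquire("file-1", "osd2", now=10.0))  # still osd1
    print(table.acquire("file-1", "osd2", now=15.0))  # lease timed out, osd2 takes over
```

The holder of the lease acts as primary for the file, so a crashed primary is replaced as soon as its lease times out.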

Read-Write Replication (5)
Lease acquisition:
- XtreemFS: Flease (scalable, majority-based)
[Diagram labels: Central Lock Service, Flease]
Data dissemination, update strategies:
- Write All, Read 1
- Write Quorum, Read Quorum
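The two update strategies can be compared by the number of replicas a write and a read must contact. The sketch below assumes that "quorum" means a simple majority and uses the labels `WaR1` and `WqRq` as names chosen for this example; it also checks the usual overlap condition W + R > N, which guarantees that every read set intersects every write set.

```python
def quorum_sizes(num_replicas, strategy):
    """Return (write set size, read set size) for an update strategy."""
    if num_replicas < 1:
        raise ValueError("need at least one replica")
    if strategy == "WaR1":      # Write All, Read 1
        return num_replicas, 1
    if strategy == "WqRq":      # Write Quorum, Read Quorum (majority assumed)
        majority = num_replicas // 2 + 1
        return majority, majority
    raise ValueError(f"unknown strategy: {strategy}")


def quorums_overlap(num_replicas, write_size, read_size):
    """True if every read set must share a replica with every write set."""
    return write_size + read_size > num_replicas


if __name__ == "__main__":
    for strategy in ("WaR1", "WqRq"):
        w, r = quorum_sizes(3, strategy)
        print(strategy, w, r, quorums_overlap(3, w, r))
```

Under these assumptions, Write All makes reads cheap but blocks writes when any replica is down, while majority quorums tolerate the failure of a minority of replicas for both operations.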

Metadata Replication
- Primary/backup replication
- volume = database
- transparently replicate the database
  - use leases to elect the primary
  - replicate insert/update/delete
- Database = key/value store
  - own implementation: BabuDB
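Replicating insert/update/delete on a key/value store can be sketched as a primary that applies each operation locally and forwards the same operation to its backups. This is a toy illustration with plain dictionaries, not BabuDB; the class and method names are assumptions, and primary election by lease is left out.

```python
class ReplicatedStore:
    """Primary key/value store that forwards every mutation to its backups."""

    def __init__(self, num_backups):
        self.primary = {}
        self.backups = [{} for _ in range(num_backups)]

    @staticmethod
    def _apply(db, op, key, value=None):
        if op in ("insert", "update"):
            db[key] = value
        elif op == "delete":
            db.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")

    def execute(self, op, key, value=None):
        # Apply on the primary first, then replay the same operation on each backup.
        self._apply(self.primary, op, key, value)
        for backup in self.backups:
            self._apply(backup, op, key, value)


if __name__ == "__main__":
    store = ReplicatedStore(num_backups=2)
    store.execute("insert", "/home/a.txt", {"size": 1})
    store.execute("update", "/home/a.txt", {"size": 2})
    store.execute("delete", "/home/a.txt")
    print(store.primary, store.backups)
```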

Customization through Policies
Example: which replica shall the client select? Determined by policies.
[Diagram labels: internal storage, external storage, external nodes, ???]
Policies:
- Authentication
- Authorization
- UID/GID mappings
- Replica placement
- Replica selection

Customization through Policies (2)
Replica placement/selection policies:
- filter / sort / group the replica list
- available default policies:
  - FQDN-based
  - datacenter map
  - Vivaldi (latency estimation)
- can be chained
- own policies possible (Java)
[Diagram labels: MRC, sorted replica list, open(), external nodes (node1.ext-cluster), internal storage (osd1.int-cluster), external storage (osd1.ext-cluster)]
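Chaining a filter policy and a sort policy over a replica list can be sketched as function composition. The policies below are hypothetical stand-ins written for this example (real XtreemFS policies are written in Java): one drops unavailable replicas, the other sorts by how many trailing domain labels a replica shares with the client, as a rough reading of "FQDN-based". The host names are taken from the slide's diagram, plus one made-up unavailable host.

```python
def common_suffix_labels(a, b):
    """Count matching domain labels from the right, e.g. the shared domain."""
    count = 0
    for x, y in zip(reversed(a.split(".")), reversed(b.split("."))):
        if x != y:
            break
        count += 1
    return count


def only_available(available):
    """Filter policy: keep replicas whose host is in the `available` set."""
    return lambda replicas: [r for r in replicas if r in available]


def fqdn_sort(client):
    """Sort policy: replicas sharing more of the client's domain come first."""
    return lambda replicas: sorted(
        replicas, key=lambda r: -common_suffix_labels(client, r)
    )


def chain(*policies):
    """Apply policies left to right to the replica list."""
    def run(replicas):
        for policy in policies:
            replicas = policy(replicas)
        return replicas
    return run


if __name__ == "__main__":
    select = chain(
        only_available({"osd1.int-cluster", "osd1.ext-cluster"}),
        fqdn_sort("node1.ext-cluster"),
    )
    print(select(["osd1.int-cluster", "osd1.ext-cluster", "osd2.ext-cluster"]))
```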

Security
- X.509 certificate support for authentication
- SSL to encrypt communication
[Diagram labels: laptop via 3G/Wi-Fi, external nodes, mount.xtreemfs w/ user certificate, mount.xtreemfs w/ host certificate, XtreemFS]

Use case: Mosgrid
Mosgrid:
- eases running experiments in computational chemistry
- uses grid resources through a web portal
- the portal allows users to submit and retrieve compute jobs
XtreemFS: global data repository

Use case: Mosgrid (2)
[Diagram labels: PC, browser, submit job, retrieve results, input data, nodes, results, mount.xtreemfs w/ user certificate, web portal with libxtreemfs (Java), Unicore frontend, mount.xtreemfs w/ host certificate, XtreemFS, XtreemFS scope: Berlin, Dresden, Köln]

Snapshots
Backups are needed in case of:
- accidental deletion/modification
- virus infections
Snapshot: stable image of the file system at a given point in time
[Diagram labels: PC, unlink("important.cpp"), local file server (important.cpp), internal storage (important.cpp)]

Snapshots (2)
- MRC: creates a snapshot if requested
- OSDs: copy-on-write
  - on modify: create a new object instead of overwriting
  - on delete: only mark as deleted
[Timeline labels: write("file.txt"), snapshot(), write("file.txt"); t0; file.txt: V1, t1; file.txt: V2, t2]
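The copy-on-write behaviour on the OSD can be sketched as an append-only list of versions per object: a modification adds a new version, a deletion adds a marker, and a snapshot read returns the newest version written at or before the snapshot time. The class name and the use of plain numeric timestamps are assumptions for this sketch, and writes are assumed to arrive in timestamp order.

```python
class VersionedObject:
    """Append-only version history for one object (copy-on-write)."""

    def __init__(self):
        self._versions = []  # list of (timestamp, data); data None means deleted

    def write(self, timestamp, data):
        # Never overwrite: every modification becomes a new version.
        self._versions.append((timestamp, data))

    def delete(self, timestamp):
        # Only mark as deleted; older versions stay readable for snapshots.
        self._versions.append((timestamp, None))

    def read_at(self, timestamp):
        """Return the data visible at `timestamp`, or None if absent/deleted."""
        for version_time, data in reversed(self._versions):
            if version_time <= timestamp:
                return data
        return None


if __name__ == "__main__":
    obj = VersionedObject()
    obj.write(1.0, b"V1")
    snapshot_time = 1.5
    obj.write(2.0, b"V2")
    print(obj.read_at(snapshot_time), obj.read_at(2.0))
```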

Snapshots (3)
- No exact global time: loosely synchronized clocks
  - assumption: maximum drift ε
- Time span-based snapshots
[Timeline labels: write("file.txt"), snapshot(), write("file.txt"), write("file.txt"); t0 - ε, t0, t0 + ε; file.txt: V1, t1; file.txt: V2, t2; file.txt: V2, t2]
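Why a snapshot becomes a time span rather than a point can be sketched by classifying a version's timestamp against the snapshot time t0. Assuming clocks differ by at most ε, a write stamped earlier than t0 - ε certainly happened before the snapshot and one stamped later than t0 + ε certainly happened after it; anything in between cannot be ordered reliably. The function name and the returned labels are assumptions for this example, and the sketch does not show which version XtreemFS actually selects inside the span.

```python
def classify(version_time, snapshot_time, max_drift):
    """Relate a version timestamp to a snapshot under bounded clock drift."""
    if max_drift < 0:
        raise ValueError("max_drift must be non-negative")
    if version_time < snapshot_time - max_drift:
        return "before"       # safely part of the snapshot
    if version_time > snapshot_time + max_drift:
        return "after"        # safely not part of the snapshot
    return "within span"      # order relative to the snapshot is uncertain


if __name__ == "__main__":
    t0, eps = 10.0, 0.5
    for t in (9.0, 10.2, 11.0):
        print(t, classify(t, t0, eps))
```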

Snapshots (4)
- OSDs: limit the number of versions
  - not version-on-every-write
  - instead: close-to-open
  - problem: the client sends no explicit close
  - implicit close: create a new version if the last write was at least X seconds ago
- Cleanup tool: deletes versions which belong to no snapshot
- Snapshots on directory level possible
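The implicit-close rule can be sketched as a counter that starts a new version only when a write arrives after an idle period of at least X seconds. The class name, the explicit `now` argument and the 5-second threshold are assumptions made for this example.

```python
class ImplicitCloseVersioner:
    """Start a new version when the previous write is at least X seconds old."""

    def __init__(self, idle_seconds):
        self.idle_seconds = idle_seconds  # the "X" from the slide
        self.last_write = None
        self.version = 0

    def on_write(self, now):
        """Record a write at time `now` and return the version it belongs to."""
        idle_long_enough = (
            self.last_write is None
            or now - self.last_write >= self.idle_seconds
        )
        if idle_long_enough:
            # Treat the gap as an implicit close followed by a new open.
            self.version += 1
        self.last_write = now
        return self.version


if __name__ == "__main__":
    versioner = ImplicitCloseVersioner(idle_seconds=5.0)
    print([versioner.on_write(t) for t in (0, 1, 2, 10, 11, 20)])
```

Writes that arrive in quick succession are grouped into one version, which keeps the number of stored versions far below one per write.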

Future Research
- Self-tuning
- Quota support
- Data de-duplication
- Hierarchical storage management

XtreemFS Software
- Open source: www.xtreemfs.org
- Development:
  - 5 core developers at ZIB
  - integration tests for quality assurance
- Community:
  - users and bug reporters
  - mailing list with 102 subscribers
- Release 1.3: experimental support for read/write replication and snapshots

Thank You!
References:
- http://www.xtreemfs.org/publications.php
- www.contrail-project.eu
The Contrail project is supported by funding under the Seventh Framework Programme of the European Commission: ICT, Internet of Services, Software and Virtualization. GA nr.: FP7-ICT-257438.