RozoFS: a fault tolerant I/O intensive distributed file system based on Mojette erasure code



Similar documents
The Mojette Erasure Code: Application to fault tolerant Distributed File System (DFS)

New Storage System Solutions

Milestone Solution Partner IT Infrastructure MTP Certification Report Scality RING Software-Defined Storage

THE SUMMARY. ARKSERIES - pg. 3. ULTRASERIES - pg. 5. EXTREMESERIES - pg. 9

Building Storage Clouds for Online Applications A Case for Optimized Object Storage

Cloud Storage. Parallels. Performance Benchmark Results. White Paper.

THESUMMARY. ARKSERIES - pg. 3. ULTRASERIES - pg. 5. EXTREMESERIES - pg. 9

Scientific Computing Data Management Visions

Architecting a High Performance Storage System

Freezing Exabytes of Data at Facebook s Cold Storage. Kestutis Patiejunas (kestutip@fb.com)

Certification Document macle GmbH Grafenthal-S1212M 24/02/2015. macle GmbH Grafenthal-S1212M Storage system

Intel Solid- State Drive Data Center P3700 Series NVMe Hybrid Storage Performance

F600Q 8Gb FC Storage Performance Report Date: 2012/10/30

Performance characterization report for Microsoft Hyper-V R2 on HP StorageWorks P4500 SAN storage

Parallels Cloud Storage

Building low cost disk storage with Ceph and OpenStack Swift

MESOS CB220. Cluster-in-a-Box. Network Storage Appliance. A Simple and Smart Way to Converged Storage with QCT MESOS CB220

Sheepdog: distributed storage system for QEMU

Accelerate SQL Server 2014 AlwaysOn Availability Groups with Seagate. Nytro Flash Accelerator Cards

Converged storage architecture for Oracle RAC based on NVMe SSDs and standard x86 servers

SMB Direct for SQL Server and Private Cloud

Using Synology SSD Technology to Enhance System Performance Synology Inc.

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

Cisco UCS and Fusion- io take Big Data workloads to extreme performance in a small footprint: A case study with Oracle NoSQL database

Convergence-A new keyword for IT infrastructure transformation

Achieving Higher VDI Scalability and Performance on Microsoft Hyper-V with Seagate 1200 SAS SSD Drives & Proximal Data AutoCache Software

I/O Performance of Cisco UCS M-Series Modular Servers with Cisco UCS M142 Compute Cartridges

Scala Storage Scale-Out Clustered Storage White Paper

The Pros and Cons of Erasure Coding & Replication vs. RAID in Next-Gen Storage Platforms. Abhijith Shenoy Engineer, Hedvig Inc.

Using Synology SSD Technology to Enhance System Performance Synology Inc.

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Mit Soft- & Hardware zum Erfolg. Giuseppe Paletta

ntier Verde Simply Affordable File Storage

Business-centric Storage FUJITSU Hyperscale Storage System ETERNUS CD10000

The Data Placement Challenge

Introducing NetApp FAS2500 series. Marek Stopka Senior System Engineer ALEF Distribution CZ s.r.o.

How SSDs Fit in Different Data Center Applications

Certification Document macle GmbH GRAFENTHAL R2208 S2 01/04/2016. macle GmbH GRAFENTHAL R2208 S2 Storage system

Can High-Performance Interconnects Benefit Memcached and Hadoop?

Evaluation of Dell PowerEdge VRTX Shared PERC8 in Failover Scenario

EMC SYMMETRIX VMAX 20K STORAGE SYSTEM

HP reference configuration for entry-level SAS Grid Manager solutions

Best Practices for Increasing Ceph Performance with SSD

Isilon IQ Scale-out NAS for High-Performance Applications

High-Availability and Scalable Cluster-in-a-Box HPC Storage Solution

Hadoop: Embracing future hardware

HP Proliant BL460c G7

Performance in a Gluster System. Versions 3.1.x

LEVERAGING FLASH MEMORY in ENTERPRISE STORAGE. Matt Kixmoeller, Pure Storage


Advanced Knowledge and Understanding of Industrial Data Storage

No matter what you need for Managed IT services, High-Performance Storage, you can count on us for low cost, fast and effective service.

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

Solid State Storage in the Evolution of the Data Center

FUJITSU Enterprise Product & Solution Facts

Protect Data... in the Cloud

Deploying Flash- Accelerated Hadoop with InfiniFlash from SanDisk

POSIX and Object Distributed Storage Systems

Data management challenges in todays Healthcare and Life Sciences ecosystems

What is the real cost of Commercial Cloud provisioning? Thursday, 20 June 13 Lukasz Kreczko - DICE 1

Comparing Dynamic Disk Pools (DDP) with RAID-6 using IOR

Big Fast Data Hadoop acceleration with Flash. June 2013

Certification Document bluechip STORAGEline R54300s NAS-Server 03/06/2014. bluechip STORAGEline R54300s NAS-Server system

BookKeeper. Flavio Junqueira Yahoo! Research, Barcelona. Hadoop in China 2011

The Design and Implementation of the Zetta Storage Service. October 27, 2009

Comparing SMB Direct 3.0 performance over RoCE, InfiniBand and Ethernet. September 2014

LSI MegaRAID CacheCade Performance Evaluation in a Web Server Environment

Michael Kagan.

Alexandria Overview. Sept 4, 2015

N /150/151/160 RAID Controller. N MegaRAID CacheCade. Feature Overview

PARALLELS CLOUD STORAGE

IBM System x GPFS Storage Server

Lab Evaluation of NetApp Hybrid Array with Flash Pool Technology

HYPER-CONVERGED INFRASTRUCTURE STRATEGIES

Panasas at the RCF. Fall 2005 Robert Petkus RHIC/USATLAS Computing Facility Brookhaven National Laboratory. Robert Petkus Panasas at the RCF

GraySort on Apache Spark by Databricks

Data Communications Hardware Data Communications Storage and Backup Data Communications Servers

Integrated Grid Solutions. and Greenplum

PowerVault MD1200/MD1220 Storage Solution Guide for Applications

Design and Evolution of the Apache Hadoop File System(HDFS)

High Performance Computing OpenStack Options. September 22, 2015

Hadoop & its Usage at Facebook

Oracle Exadata: The World s Fastest Database Machine Exadata Database Machine Architecture

Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components

Boost Database Performance with the Cisco UCS Storage Accelerator

Windows 8 SMB 2.2 File Sharing Performance

Hyper-converged IT drives: - TCO cost savings - data protection - amazing operational excellence

EMC VMAX3 FAMILY - VMAX 100K, 200K, 400K

Low-cost

ebay Storage, From Good to Great

Boosting Hadoop Performance with Emulex OneConnect 10GbE Network Adapters

DELL TM PowerEdge TM T Mailbox Resiliency Exchange 2010 Storage Solution

SMB Advanced Networking for Fault Tolerance and Performance. Jose Barreto Principal Program Managers Microsoft Corporation

M710 - Max 960 Drive, 8Gb/16Gb FC, Max 48 ports, Max 192GB Cache Memory

Fusionstor NAS Enterprise Server and Microsoft Windows Storage Server 2003 competitive performance comparison

Building All-Flash Software Defined Storages for Datacenters. Ji Hyuck Yun Storage Tech. Lab SK Telecom

Transcription:

RozoFS: a fault tolerant I/O intensive distributed file system based on Mojette erasure code Workshop Autonomic Oct. 16-17 2014 Laas, Europe, Toulouse Benoît Parrein, Dimitri Pertin*, Nicolas Normand, Didier Féron Université de Nantes, IRCCyN Lab, UMR 6597 * Université de Nantes - Fizians SAS Fizians SAS

Outline Storage and file systems FEC4Cloud project Mojette Erasure code RozoFS global architecture Performances 2

The storage in the world 40 Exabytes (1018 bytes) stored in 2020 15 EB (37%) in the Cloud(s) 7,5 EB (50%) video, images,... 3

High availability means... 99.999999...% reachable Copies and copies and copies... (up to 7 times) Hard disks and hard disks and hard disks... High consumption of energy Privacy problems 4

5

Requirements Convergence archiving, big data, virtualisation, hpc,... = Convergence of cold and hot data 6

Distributed File Systems HDFS CephFS, GlusterFS,... Scality (based on Chord) Facebook file system (f4)... Mix of replicas and erasure coding (sometime): none uses erasure codes always 7

FEC4Cloud Project ANR 2012 (appel Emergence) Partners: IRCCyN (lead), ISAE, SATT-Ouest Valorisation Budget: 256 K Duration: 24 months (product oriented) Goal: promoting erasure codes within Cloud storage infrastructure 8

Erasure codes (MDS property) k messages Encoding n code-words k received code-words Decoding k messages 9

Implementations Reed-Solomon (by Cauchy matrices [Byers, 1995]) Reed-Solomon (by Vandermonde matrices [Rizzo, 1998] now a RFC5510) Cauchy Good [Planck, 2008] in Jerasure 1.2 Intel ISA-L (includes SSE instructions)... 10

Mojette erasure coding Based on Radon transform (1+Ɛ)MDS Code Linear Complexity Systematic and Non Systematic Fig.1 : Computing of two projections (-2,1) and (0,1) 11

Optimizations Deterministic path of reconstruction (function of available projections pattern) [Normand, 2006] +Drastic reduction in writes [engineers of Fizians, 2013] 12

RozoFS I/O Centric Distributed File System POSIX Scale-out storage Commodity hardware Fault tolerance (up to 4 failures) Based on erasure coding (Mojette coding) Dedicated to cold and hot data Open source project 15

Client Node Metadata Server ol p r t n Co ath Rozofsmount exportd th a pa Dat metadata n Mo g rin ito Storage Node Storaged Storage Node Storage Node Storaged... Pool of storage Nodes Storaged 16

Read/Write function (in non-sys coding) 17

Testbed 18

Performances Sequential access (4K blocks) 19

Performances Random access (4K blocks) 20

RozoFS + Exportd External Network Rozofsmount Storage Rozofsmount Storage Rozofsmount Storage Rozofsmount Storage Niveau clients/applications Standard GigE Infrastructure GigE infrastructure (data storage and metadata)

Conclusions RozoFS is an I/O centric distributed file system based on a erasure code (always) Performances: 100K IOPS, throughput of 6 Gbps... RozoFS follows up the infrastructure Apps: on line video editing, virtualisation (QEMU), database... participate to the convergence of cold and hot data Next: privacy (to check), grid5000 experiments (to come), deduplication (to attach) 22

23

Credits https://github.com/rozofs UMR 6597 24

Backup slides 25

Server type Fujistu RX300-S8 (R3008S0035FR) CPU model name 2 x Intel Xeon CPU E5-2650 v2 @ 2.60GHz (8 cores & 16 threads/core) Memory (GB) 64 GB RAID card RAID Controller SAS 6Gbit/s 1GB (D3116C) Virtual DRIVE 0 - Seagate Constellaton.2, SAS 6Gb/s, 1TB, 2.5", 7200 RPM (ST91000640SS) - 11 drives - RAID 5 Virtual DRIVE 1 - Seagate Pulsar.2, SAS 6Gb/s, 100GB, 2.5", MLC (ST100FM0002) - 1 drive - RAID 0 Virtual DRIVE 2 - WD Xe, SAS 6Gb/s, 900GB, 2.5", 10000 RPM (WD9001BKHG) - 4 drives - RAID 0 Ethernet controllers - Intel 82599EB 10-Gigabit SFI/SFP+ - 2*10Gb - Intel I350 Gigabit Network - 2*1Gb - Intel I350 Gigabit Network - 4*1Gb 26