Scalable filesystems boosting Linux storage solutions



Similar documents
Performance, Reliability, and Operational Issues for High Performance NAS Storage on Cray Platforms. Cray User Group Meeting June 2007

New Storage System Solutions

The Panasas Parallel Storage Cluster. Acknowledgement: Some of the material presented is under copyright by Panasas Inc.

Moving Virtual Storage to the Cloud. Guidelines for Hosters Who Want to Enhance Their Cloud Offerings with Cloud Storage

Cluster Implementation and Management; Scheduling

Best Practices for Data Sharing in a Grid Distributed SAS Environment. Updated July 2010

Scala Storage Scale-Out Clustered Storage White Paper

Lessons learned from parallel file system operation

Introduction to Gluster. Versions 3.0.x

Building Storage as a Service with OpenStack. Greg Elkinbard Senior Technical Director

Moving Virtual Storage to the Cloud

Distributed File System Choices: Red Hat Storage, GFS2 & pnfs

Panasas at the RCF. Fall 2005 Robert Petkus RHIC/USATLAS Computing Facility Brookhaven National Laboratory. Robert Petkus Panasas at the RCF

Storage Architectures for Big Data in the Cloud

Scalable Windows Storage Server File Serving Clusters Using Melio File System and DFS

Building Storage Service in a Private Cloud

Hadoop Distributed File System. T Seminar On Multimedia Eero Kurkela

Sun Storage Perspective & Lustre Architecture. Dr. Peter Braam VP Sun Microsystems

IT of SPIM Data Storage and Compression. EMBO Course - August 27th! Jeff Oegema, Peter Steinbach, Oscar Gonzalez

Cloud storage reloaded:

Key Messages of Enterprise Cluster NAS Huawei OceanStor N8500

Data storage considerations for HTS platforms. George Magklaras -- node manager

Accelerating and Simplifying Apache

POWER ALL GLOBAL FILE SYSTEM (PGFS)

Large Scale Storage. Orlando Richards, Information Services LCFG Users Day, University of Edinburgh 18 th January 2013

pnfs State of the Union FAST-11 BoF Sorin Faibish- EMC, Peter Honeyman - CITI

Cray DVS: Data Virtualization Service

Parallele Dateisysteme für Linux und Solaris. Roland Rambau Principal Engineer GSE Sun Microsystems GmbH

Introduction. Scalable File-Serving Using External Storage

Shared Parallel File System

<Insert Picture Here> Oracle Cloud Storage. Morana Kobal Butković Principal Sales Consultant Oracle Hrvatska

LUSTRE FILE SYSTEM High-Performance Storage Architecture and Scalable Cluster File System White Paper December Abstract

Red Hat Enterprise Linux as a

Quantum StorNext. Product Brief: Distributed LAN Client

Big Data Storage Options for Hadoop Sam Fineberg, HP Storage

Clusters: Mainstream Technology for CAE

Microsoft Windows Server Hyper-V in a Flash

PolyServe Matrix Server for Linux

Clustering Windows File Servers for Enterprise Scale and High Availability

Highly-Available Distributed Storage. UF HPC Center Research Computing University of Florida

Software to Simplify and Share SAN Storage Sanbolic s SAN Storage Enhancing Software Portfolio

Open-E Data Storage Software and Intel Modular Server a certified virtualization solution

Four Reasons To Start Working With NFSv4.1 Now

THE EXPAND PARALLEL FILE SYSTEM A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING. José Daniel García Sánchez ARCOS Group University Carlos III of Madrid

Direct NFS - Design considerations for next-gen NAS appliances optimized for database workloads Akshay Shah Gurmeet Goindi Oracle

Red Hat Global File System for scale-out web services

Performance Analysis of Mixed Distributed Filesystem Workloads

BlueArc unified network storage systems 7th TF-Storage Meeting. Scale Bigger, Store Smarter, Accelerate Everything

GPFS Storage Server. Concepts and Setup in Lemanicus BG/Q system" Christian Clémençon (EPFL-DIT)" " 4 April 2013"

An Alternative Storage Solution for MapReduce. Eric Lomascolo Director, Solutions Marketing

VMware Virtual Machine File System: Technical Overview and Best Practices

Computational infrastructure for NGS data analysis. José Carbonell Caballero Pablo Escobar

Scalable Windows Server File Serving Clusters Using Sanbolic s Melio File System and DFS

Michael Thomas, Dorian Kcira California Institute of Technology. CMS Offline & Computing Week

High Availability with Windows Server 2012 Release Candidate

Understanding Enterprise NAS

COMPARING STORAGE AREA NETWORKS AND NETWORK ATTACHED STORAGE

Linux Powered Storage:

StorPool Distributed Storage Software Technical Overview

Sanbolic s SAN Storage Enhancing Software Portfolio

InfiniBand in the Enterprise Data Center

System Requirements. Version 8.2 November 23, For the most recent version of this document, visit our documentation website.

Implementing a Digital Video Archive Using XenData Software and a Spectra Logic Archive

Ultimate Guide to Oracle Storage

The Design and Implementation of the Zetta Storage Service. October 27, 2009

Big data management with IBM General Parallel File System

Scalable NAS for Oracle: Gateway to the (NFS) future

GlusterFS Distributed Replicated Parallel File System

LS-DYNA Best-Practices: Networking, MPI and Parallel File System Effect on LS-DYNA Performance

This document may be freely distributed provided that it is not modified and that full credit is given to the original author.

Ceph. A file system a little bit different. Udo Seidel

Red Hat Storage Server Administration Deep Dive

How to Choose your Red Hat Enterprise Linux Filesystem

Scalable Cloud Computing Solutions for Next Generation Sequencing Data

A Survey of Shared File Systems

CERN Cloud Storage Evaluation Geoffray Adde, Dirk Duellmann, Maitane Zotes CERN IT

File Services. File Services at a Glance

Red Hat Cluster Suite

Distributed File Systems

Distributed File System. MCSN N. Tonellotto Complements of Distributed Enabling Platforms

Object storage in Cloud Computing and Embedded Processing

Increasing Storage Performance, Reducing Cost and Simplifying Management for VDI Deployments

Flexible Scalable Hardware independent. Solutions for Long Term Archiving

Hadoop Distributed File System. Dhruba Borthakur June, 2007

Best Practice and Deployment of the Network for iscsi, NAS and DAS in the Data Center

Red Hat Storage Server

Lustre SMB Gateway. Integrating Lustre with Windows

BlueArc s Architecture for NFS v4.1 and pnfs. Delivering Performance Through Standards

XtreemFS Extreme cloud file system?! Udo Seidel

Dell High Availability Solutions Guide for Microsoft Hyper-V

Data management challenges in todays Healthcare and Life Sciences ecosystems

<Insert Picture Here> Managing Storage in Private Clouds with Oracle Cloud File System OOW 2011 presentation

Transcription:

Scalable filesystems boosting Linux storage solutions Daniel Kobras science + computing ag IT-Dienstleistungen und Software für anspruchsvolle Rechnernetze Tübingen München Berlin Düsseldorf

Motivation User's view: Storage is a scarce resource that is always too small too slow If it isn't today, it will be in shorter time than expected Admin's view: Storage is a precious resource that is always too unreliable too inflexible Boss's view: Storage is a necessary evil that is always too expensive Seite 2

Motivation Ideal storage solution can grow in capacity and bandwidth allows to transparently move around data is cheap is easy to use doesn't fail Scalable filesystems can't do miracles, but they get you closer Seite 3

Agenda Storage solutions that scale (and those that don't) Terminology Theory of operation Implementations Case study Present and future developments Seite 4

I need this future-proof storage infrastructure what are my options today? Seite 5

Grandma's networked storage (NAS) Client 1 Server LAN/WAN (NFS, CIFS) fs1 Block Device Client 2 Seite 6

Traditional NAS: Properties Re-exports of local filesystems Proxy filesystem Network protocol only, no dedicated on-disk data layout Concurrency of local and remote access Implementations: NFS, SMB/CIFS Seite 7

Traditional NAS: Pros and Cons Mature concepts and implementations Ubiqitous, usually part of standard installation Convenient integration Single fileservers cannot scale with rising demands on bandwidth and capacity Multiple fileservers partition namespace at physical boundaries Client-side virtualisation techniques (autofs, amd, DFS) mitigate problem, but still are subject to partition boundaries Virtualisation of backend storage (SAN) solves capacity constraints and increases reliability, but still requires access through conventional fileservers Seite 8

Clustered storage in the SAN age Client 1 Locking SAN/ iscsi Block Device Client 2 Seite 9

SAN-based filesystems: Properties Access same filesystem on shared block device from multiple hosts Filesystem manages concurrent access through locking service Dumb server (block device), complexity handled on client side Terminology: SAN filesystem, cluster filesystem Free implementations: OCFS2, GFS Proprietary implementations: mostly from major storage vendors (cxfs, MPFS, PolyServe, TotalStorage SFS, Veritas CFS,...) Seite 10

SAN-based filesystems: Pros and Cons Requires SAN (or SAN-like) infrastructure Seite 11 additional fabric (FibreChannel) suboptimal fabric (iscsi over Ethernet, etc.) Virtualized backend storage allows Replication Relocation Resizing Typical problems: Quorum of clients needed for filesystem operation Limited scalability, dependent on implementation characteristics of locking service and supported access patterns Usually limited to servers, uncommon on end-user machines NFS/CIFS re-export necessary

SAN-based clustered storage NFS/CIFS Client 1 Locking SAN/ iscsi Block Device NFS/CIFS Client 2 Seite 12

Serving files from a distributed system Client LAN/WAN Data Server 1 Block Device Metadata Server fs1 Data Server 2 Block Device Seite 13

Distributed Filesystems: Properties Data distributed to local storage on multiple servers Metadata service ties distributed data into single filesystem decouples namespace from physical layout metadata either on single server, or distributed across several nodes Implementations: Special purpose: Hadoop, GoogleFS,... Open Source: AFS, Lustre/HP SFS, Ceph, PVFS2 Proprietary: GPFS, PanFS, FhGFS Seite 14

Distributed Filesystems: Pros and Cons Complex system on client and server side High scalability: additional servers increase bandwidth and capacity High flexibility due to decoupling of namespace and physical storage Data servers: Block devices with intelligence attached Some locking complexity can be offloaded on single server instance, allowing to serve high numbers of clients Ubiqituous deployment to all servers and end-user systems possible Seite 15

I fancy my data - does it work for real? Seite 16

Case study: Scalable filesystem setup Automotive engineering, computational fluid dynamics Pre-existing storage: Stand-alone Linux servers local storage NFS/CIFS export partitioned namespace New storage solution: Single Lustre filesystem Linux cluster nodes, servers, workstations as Lustre clients CIFS export to Windows clients 2 redundant metadata servers, external SCSI storage 2 data servers (Sun Thumper ) Seite 17

Case study: Benefits Scalable storage link for compute clusters extensible capacity extensible bandwidth Single, unified filesystem for clusters and workstations Avoids temporary local storage on cluster nodes to prevent data loss on node failure Simplified workflow due to central storage Increased job turnaround times as copy processes become superfluous Seite 18

Case study: Network layout Lustre Storage Infiniband Backbone GM/MX GM/MX IB Cluster 1 Cluster 2 Cluster 3 Infiniband Myrinet SCSI IB (optical) IB-Switch IB-Switch Pre/Post server X4100 X4100 Pre/Post server Pre/Post server StorEdge 3320 X4500 X4500 Pre/Post server Seite 19

Case study: Lustre bulk I/O 550000 500000 450000 400000 Lustre - 8 Stripes, 2 OSS, IB bandwidth (kb/s) 350000 300000 250000 200000 150000 100000 50000 0 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 record size (kb) Write Re-write Read Re-read Seite 20

Case study: Lustre random I/O bandwidth (kb/s) Lustre - 8 Stripes, 2 OSS, IB 400000 375000 350000 325000 300000 275000 250000 225000 200000 175000 150000 125000 100000 75000 50000 25000 0 4 8 16 32 64 128 256 512 1024 2048 4096 8192 1638 4 record size (kb) Random read Random write Seite 21

Case study: Lustre scaling 1000000 900000 800000 Lustre - 8 Stripes, 2 OSS, IB agg. bandwidth (kb/s) 700000 600000 500000 400000 300000 200000 100000 1 clients 2 clients 3 clients 4 clients 0 Write Rewrite Read Reread Seite 22

Case study: Lustre scaling kb/s 32 nodes, 1GB/client, 16MB regions, fixed OSTs 32 1300000 1200000 1100000 1000000 900000 800000 700000 600000 500000 400000 300000 200000 100000 0 read write rewrite reread random read random write Back ward read test R ecor d Stri de read frea d fwrite frewrite Freread 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 Seite 23

Case study: Current status > 1 year in production Ca. 150 clients access filesystem > 1 GB/s aggregate bandwidth 33 TB net capacity Combination of open-source software (Lustre, heartbeat, Linux...) and (more or less) standard hardware components! Central storage for all project data within workgroup Additional storage servers planned in the near future Similiar installations in several neighbouring groups have followed Seite 24

Case study: Problem areas Storage servers: redundant hardware components, but systemlevel failure stalls filesystem No (trivially) scalable backup concept Backup/restore via multiple filesystem clients possible, but requires manual tuning Full/differential backups undesirable Backup software needs to support incremental-only schemes Integration with HSM systems desirable Client-side modifications required Seite 25

Does it get any more convenient? Seite 26

Scalable filesystems + NAS Idea: Seite 27 Install scalable filesystem on cluster of NAS servers Re-export filesystem from all server nodes simultaneously to many clients Requirement: Cluster-aware NFS/CIFS servers to ensure lock consistency Benefits: Easy access from clients via native protocol to whole namespace Scalable bandwidth and capacity High availability Implementations: CTDB (in Samba 3.2) with GPFS, GFS, Lustre... Alternative: pnfs/nfsv4.1 (draft standard)

Wrapping it up Seite 28

Scalable filesystems in a Linux world For numerous scalable filesystems, Linux is the primary platform Linux-based scalable filesystems allow to build fast, capable, reliable and affordable, custom-tailored storage clusters from commodity hardware, and proprietary or open-source software components We've had this before: Beowulf clusters revolutionized highperformance computing, and made Linux the pre-dominant supercomputing platform Available software provides the potential for Linux to assume a similar role for scalable storage solutions in the near future Seite 29

Thank you for your attention. Daniel Kobras science + computing ag www.science-computing.de Telefon 07071 9457-493 d.kobras@science-computing.de