The Panasas Parallel Storage Cluster. Acknowledgement: Some of the material presented is under copyright by Panasas Inc.



The Panasas Parallel Storage Cluster

What Is It? The Panasas ActiveScale Storage Cluster is a complete hardware and software storage solution. It implements:
- An asynchronous, parallel, object-based, POSIX-compliant filesystem
- A global namespace
- Strict client cache coherency

Physically, How Is It Organized? A shelf is 4U high and contains slots for 11 blades:
- 0-3 DirectorBlades per shelf, with the remaining slots for StorageBlades
- Typical configuration: 1 DirectorBlade and 10 StorageBlades

Terminology
- Metadata: the information that describes the data contained in files (size, create time, modify time, location on disk, permissions)
- Block-based filesystem: a filesystem in which the client accesses files by their physical location on disk
- File-based filesystem: a filesystem in which a client requests a file by name
- Object-based filesystem: the filename is abstracted into an identifier; we will discuss this later
- RAID: multiple disks arranged into one logical disk, tuned for redundancy or speed
- JBOD (just a bunch of disks): multiple disks accessed directly rather than as an array
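The three access models above can be contrasted in a toy sketch (all names here are hypothetical illustrations, not a real storage API): a block client addresses physical locations itself, a file client asks a server by name, and an object client resolves a name to an identifier once and then accesses storage by ID.

```python
# Toy illustration of the three access models (hypothetical, not a real API).

# Block-based: the client addresses raw physical locations itself.
disk = bytearray(4096)
def block_read(lba: int, length: int, block_size: int = 512) -> bytes:
    off = lba * block_size
    return bytes(disk[off:off + length])

# File-based: a server resolves the name and returns the data.
files = {"/home/alice/notes.txt": b"hello"}
def file_read(path: str) -> bytes:
    return files[path]

# Object-based: the filename is abstracted into an identifier; clients
# then talk to storage devices using only object IDs.
name_to_oid = {"/home/alice/notes.txt": 0x1A2B}   # the metadata server's job
objects = {0x1A2B: b"hello"}                      # what the OSD stores
def object_read(path: str) -> bytes:
    oid = name_to_oid[path]       # one metadata lookup...
    return objects[oid]           # ...then direct access by ID

print(file_read("/home/alice/notes.txt") == object_read("/home/alice/notes.txt"))
```

The point of the object model is the middle step: once a client holds the identifier, it no longer needs the name service on the data path.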

Direct Attached Storage (local filesystem)
- Private storage for a host operating system
  - IDE-connected internal hard drives
  - Serial ATA or SCSI attached drives
  - USB drives
- Example filesystems: ext3, reiserfs, NTFS, ufs, FAT32
This discussion is mostly about distributed file systems: problems of scale require lots of storage computers working together.

Network Attached Storage
- File server exports storage at the file level
- NFS/CIFS are widely deployed; NFS is the only official file system standard
- Scalability limited by server hardware
  - Moderate number of clients (10s to 100s)
  - Moderate amount of storage (a few TB)
- A nice model until it runs out of steam
  - Islands of storage
  - Bandwidth to a file limited by its server
- Examples: NetApp (ONTAP 7.x), Sun, HP, SnapServer, EMC Celerra, StorEdge NAS, IBM TotalStorage NAS, whitebox Linux NAS head

Clustered NAS
- More scalable than single-headed NAS: multiple NAS heads share back-end storage
- In-band NAS head still limits performance and drives up cost
- Two primary architectures
  - Forward requests to the owner head
  - Export NAS from a shared file system
- NFS does not provide a good mechanism for dynamic load balancing; clients permanently mount a particular head
- Examples: GPFS, Isilon OneFS, IBRIX, PolyServe, NetApp GX, BlueArc, Exanet ExaStore, ONStor, Pillar Data, IBM/Transarc AFS, IBM DFS
[Diagram: clients connecting through multiple NAS heads to shared back-end storage]

Storage Area Network
- Common management and provisioning for host storage
- Block devices (JBOD or RAID) accessible via iSCSI or FC network
- Wire-speed/RAID-speed performance potential
- Proprietary solutions for shared file systems
- Scalability limited by block management on the metadata server (e.g., 32 nodes)
- NAS access provided by a file head that re-exports the SAN file system
- Asymmetric (pictured) or symmetric implementations
[Diagram: SAN clients, metadata server(s), and storage]

Object Based Storage Clusters
- Block and file interfaces replaced with an object abstraction
- Block management pushed all the way out to the disks
- Allows for parallel and direct access to disks
- Requires a non-standards-based client (Lustre, Panasas)
[Diagram: clients performing direct data I/O to object (OSD) storage, with a separate metadata server]

pnfs: Standard Storage Clusters pnfs is an extension to the Network File System v4 protocol standard Allows for parallel and direct access From Parallel Network File System clients To Storage Devices over multiple storage protocols Moves the Network File System server out of the data path data pnfs Clients NFSv4.1 Server Block (FC) / Object (OSD) / File (NFS) Storage

RAID Redundant Array of Independent Drives
- Many physical disks bound together with hardware or software
- Multiple layouts to accommodate performance and fault tolerance requirements
- Used to create larger filesystems out of standard drive technology
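As a minimal illustration of how a parity-based RAID layout (RAID 5 style) tolerates a drive loss, the sketch below XORs data strips into a parity strip and rebuilds a missing strip from the survivors. This is generic RAID arithmetic in plain Python, not Panasas code.

```python
from functools import reduce

def parity(strips: list) -> bytes:
    """XOR equal-length strips together; used both to compute parity
    and to rebuild a lost strip from the survivors plus parity."""
    return bytes(reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), strips))

data = [b"AAAA", b"BBBB", b"CCCC"]   # strips on three data drives
p = parity(data)                     # strip on the parity drive

# Simulate losing drive 1: rebuild its strip from the rest plus parity.
rebuilt = parity([data[0], data[2], p])
print(rebuilt == data[1])            # True
```

The same XOR runs in either direction because A ^ B ^ C ^ (A ^ C) = B; that symmetry is why one parity strip protects against any single drive failure.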

Comparing Technology How does an object-based, parallel filesystem compare to traditional storage solutions?
vs. Direct Attached Storage
- Separate control and data paths; metadata and data workloads are distributed
- Multiple access points for redundancy and scalability
- No need to balance expensive server resources between applications and storage access
vs. Network Attached Storage
- Scalability and ease of management in very large installations
vs. Storage Area Networks
- Clients access storage directly, with no intermediary gateway
- All communication is IP based, so you choose your infrastructure
  - Low cost, high bandwidth: Gigabit or 10-Gigabit Ethernet
  - Higher cost, low latency: InfiniBand

Panasas Object-Based Storage Cluster Consists of two primary components:
- Object Storage Devices (OSD): StorageBlades
- Metadata Manager: DirectorBlades
Directors implement file system semantics (access control, cache consistency, user identity, etc.) and have rights to perform these object operations:
- Create, delete, create group, delete group
- Get attributes and set attributes
- Clone group, copy-on-write support for snapshots
Clients perform direct I/O with these object operations:
- Read, write
- Get attributes, set (some) attributes
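The division of labor above can be sketched as a toy exchange (hypothetical names, not the real DirectFLOW wire protocol): the director owns the namespace and hands the client a layout map, after which the client performs I/O straight to the OSDs with no server in the data path.

```python
# Toy control-path/data-path split (hypothetical names, not DirectFLOW itself).

class Director:
    """Owns metadata: names, permissions, and object layout maps."""
    def __init__(self):
        self.maps = {}
    def create(self, name: str, osd_ids: list) -> None:
        self.maps[name] = osd_ids            # metadata operation
    def get_map(self, name: str) -> list:
        return self.maps[name]               # clients fetch the layout once

class OSD:
    """Stores object components; serves direct client I/O."""
    def __init__(self):
        self.store = {}

director = Director()
osds = {i: OSD() for i in range(3)}

# Control path: the client asks the director where a file's objects live.
director.create("/data/f1", [0, 1, 2])
layout = director.get_map("/data/f1")

# Data path: the client writes component objects directly to each OSD.
for i, osd_id in enumerate(layout):
    osds[osd_id].store[("/data/f1", i)] = f"chunk{i}".encode()

print(osds[1].store[("/data/f1", 1)])   # b'chunk1'
```

Note that the director never sees the chunk bytes: it only brokers the map, which is what lets bandwidth scale with the number of OSDs rather than with the metadata server.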

Panasas StorageBlade (OSD)
- Balanced storage device: CPU, SDRAM, GE NIC, and 2 spindles (2x2TB SATA)
- Commodity parts drive low cost
- Performance scales with capacity
- Single seamless namespace!

DirectFLOW Client The DirectFLOW client is a kernel-loadable filesystem module:
- Implements the standard Vnode interface
- Uses native Panasas network protocols (RPC and iSCSI)
- Caches data, directories, attributes, and capabilities
- Responds to callbacks for cache consistency
- Does RAID I/O directly to StorageBlades with iSCSI/OSD

DirectorBlades
- Metadata manager
  - Realm Control: admit blades, start/stop services, failover
  - File Manager: access control, cache consistency, file system semantics
  - Storage Manager: file virtualization (maps), recovery, reconstruction
- Management console
  - Web-based GUI or Command Line Interface (CLI)
  - Status, charts, reporting
  - Storage management
- Gateway function (NFS/CIFS) collocated on the DirectorBlade
- Fast processor and large main memory
- Multiple DirectorBlades allow service replication for fault tolerance

Environment
AC Power
- Each shelf has dual power supplies and a battery
- Automatic graceful shutdown if you lose AC power
- Masks brownouts and short (5-sec) power glitches
Thermal
- 800 Watts in 4U!
- Power supplies and batteries have fans that cool the shelf
- Blades, power supplies, batteries, and network cards all monitor temperature
- Warnings generated near the temperature limit
- Unilateral blade shutdown if a blade gets very hot
- Graceful shutdown of a whole shelf if multiple blades are hot

Bladesets and Volumes
- A bladeset is a storage (OSD) failure domain
  - A single OSD failure results in degraded operation and reconstruction
  - Two OSD failures result in data unavailability
  - Bladesets can be expanded or merged (but not unmerged) for growth
  - Capacity balancing occurs within a bladeset
- A volume is a file hierarchy with a quota
  - One or more volumes compete for space within a bladeset
  - No physical boundaries between volumes, except quota limits
- A volume is the unit of DirectFlow metadata work
  - Each DirectorBlade manages one or more volumes
  - NFS/CIFS gateway workload is orthogonal to DirectFlow metadata
  - All DirectorBlades provide uniform/symmetric NFS/CIFS access
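A minimal sketch of the volume/bladeset relationship described above (illustrative only, with invented names and capacities): volumes draw from one bladeset's shared pool, bounded only by their own quotas and the pool's free space, with no physical partition between them.

```python
class Bladeset:
    """One storage failure domain with a shared pool of capacity (toy model)."""
    def __init__(self, capacity_tb: float):
        self.capacity_tb = capacity_tb
        self.volumes = {}                    # name -> [quota_tb, used_tb]

    def add_volume(self, name: str, quota_tb: float) -> None:
        self.volumes[name] = [quota_tb, 0.0]

    def write(self, name: str, tb: float) -> bool:
        quota, used = self.volumes[name]
        total_used = sum(u for _, u in self.volumes.values())
        # No physical boundary between volumes: a write succeeds as long as
        # the volume stays under its quota and the bladeset has free space.
        if used + tb <= quota and total_used + tb <= self.capacity_tb:
            self.volumes[name][1] = used + tb
            return True
        return False

bs = Bladeset(capacity_tb=40.0)
bs.add_volume("/home", quota_tb=30.0)
bs.add_volume("/scratch", quota_tb=30.0)   # quotas may oversubscribe the pool
print(bs.write("/home", 25.0))     # True: under quota, space available
print(bs.write("/scratch", 20.0))  # False: quota ok, but pool exhausted
```

The second write fails on pool space rather than quota, which is the practical consequence of "volumes compete for space within a bladeset."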

What Problems Does It Solve? It's all about removing the bottlenecks in traditional storage.
- No RAID engine bottleneck
  - Client-driven RAID scales as the number of clients increases
  - Multiple volumes or DirectorBlades for scalable reconstruction
- No network uplink bottleneck
  - 10GigE port or 4-port Gig-E Link Aggregation Group per shelf
- Flexible, per-file layouts (SDK required)
  - RAID 1/5 for large streaming I/O
  - RAID 10 for N-to-1 writes or random I/O
  - Customizable stripe width and depth: control the number of spindles and the parity overhead
- Global namespace: a single, web-browser-based management interface for 100s of TBs
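The stripe width and depth knobs mentioned above determine how a file byte offset maps onto spindles. The sketch below is generic round-robin striping math under assumed parameters, not the actual Panasas layout format:

```python
def locate(offset: int, stripe_width: int, stripe_depth: int):
    """Map a file byte offset to (component index, offset within component).

    stripe_width: number of storage blades (spindles) the file spans
    stripe_depth: bytes written to one blade before moving to the next
    """
    stripe_unit = offset // stripe_depth        # which unit of the file
    component = stripe_unit % stripe_width      # which blade holds that unit
    stripe_row = stripe_unit // stripe_width    # full rows before this unit
    return component, stripe_row * stripe_depth + offset % stripe_depth

# Assumed example: 64 KiB units striped across 8 blades.
print(locate(100, stripe_width=8, stripe_depth=65536))     # (0, 100)
print(locate(65536, stripe_width=8, stripe_depth=65536))   # (1, 0)
```

Widening the stripe engages more spindles per file (more bandwidth, more parity overhead); deepening it keeps sequential runs on one blade longer. Those are exactly the trade-offs the per-file layout controls expose.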

Customizing For Your Environment
- Pick your protocol: DirectFLOW, NFS, CIFS, any combination at one time
  - More DirectorBlades for NFS/CIFS performance
  - More StorageBlades for DirectFLOW performance
- Interactive vs. batch processing
  - ActiveStor 5000 with larger cache sizes on StorageBlades for interactive work
- Fault tolerance
  - Configurable spares for multiple sequential StorageBlade failures
  - Configurable bladeset sizes for simultaneous blade failure risk mitigation
  - Redundant network links
- Storage capacity options
  - Smaller capacity blades: more spindles, less data to reconstruct, more shelves
  - Larger capacity blades: fewer shelves, reduced double-disk failure risk

Logically, How Does Data Flow?
[Diagram: Linux clients with the DirectFLOW filesystem client and NFS/CIFS clients connect over an IP network to StorageBlades; DirectorBlades provide the metadata manager and NFS/CIFS gateway]

Logically, How Does Data Flow? An Example Six Shelf System
[Diagram: the same data flow in a six-shelf system, with DirectorBlades 1-6 providing the metadata manager and NFS/CIFS gateway]

Logically, How Does Data Flow? An Example Six Shelf System, with Three Bladesets
[Diagram: the six-shelf system with its StorageBlades partitioned into Bladesets 1-3]

Logically, How Does Data Flow? An Example Six Shelf System, with Three Bladesets and Eight Volumes
[Diagram: the six-shelf system (10 StorageBlades per shelf) with volumes Vol1-Vol8 distributed across Bladesets 1-3]

How Do I Manage 100s of TBs? All from a single web (HTTP) or command-line interface.
- PanActive Manager: a single GUI for entire-namespace management
  - Simple out-of-box experience; seamlessly adopt new blades
  - Capacity and load balancing
  - Volumes and quotas
  - Snapshots
- 1-touch reporting capabilities for capacity trends, asset ID, and performance
- Email and/or pager notification of errors and warnings
- Scriptable CLI for all features