High-performance storage pools for LHC
Łukasz Janyst, on behalf of the CERN IT-DSS group
CERN IT Department, CH-1211 Genève 23, Switzerland
www.cern.ch/it
Introduction
The goal of EOS is to provide a relatively simple and scalable storage solution for analysis users. In this presentation I will:
- Introduce EOS
- Discuss some of the implementation details
- Present test results
EOS requirements
- RW file access (random and sequential reads, updates)
- Fast hierarchical namespace
- Target capacity: 10⁹ files, 10⁶ containers (directories): we're not quite there yet, although things look promising
- Strong authorization
- Quota
- Checksums
- Redundancy of services and data
- Dynamic hardware pool scaling and replacement without downtime
EOS timeline
- Started in April 2010 with storage discussions within the IT-DSS group
- The development started in May
- The prototype has been under evaluation since August
- Tested within the ATLAS Large Scale Test (a pool of 1.5 PB)
- Will be open to individual users from Nov 15
- Will be run by the CERN operations team
- Still much work left to do; currently we're focusing on ironing out the operational procedures
Selected features of EOS
- A set of XRootd plug-ins; speaks the XRoot protocol
- Just a Bunch Of Disks (JBOD, no RAID arrays); network RAID within node groups
- Tunable per-directory settings: by choosing the number of replicas, users can achieve an arbitrary trade-off between availability and performance
- Fault tolerant: from the client's point of view, all files are always readable and writable
EOS architecture
- Head node: namespace, quota, strong authentication, capability engine, file placement, file location; communicates synchronously with clients and asynchronously via the message queue
- Message queue (MQ): service state messages, file transaction reports
- File server (FST): file and file metadata store, capability authorization, checksumming and verification (adler, crc32, md5, sha1), disk error detection (scrubbing)
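The checksum algorithms listed above are all standard ones. As a minimal illustration (standard-library Python, not the EOS implementation itself), computing the four supported digests for a block of data looks like this:

```python
import hashlib
import zlib

def checksums(data: bytes) -> dict:
    """Compute the four digest types the file server can verify."""
    return {
        "adler": "%08x" % zlib.adler32(data),
        "crc32": "%08x" % (zlib.crc32(data) & 0xFFFFFFFF),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }
```

In a real scrubbing pass the data would be streamed in chunks (both `zlib` functions and the `hashlib` objects support incremental updates) rather than read whole into memory.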
EOS namespace
In the typical HEP use case the number of active files at any given moment is rather small, even though the overall number of files stored in the system may be huge. EOS tries to leverage this fact to provide a memory-based namespace.
EOS namespace: version 1 (current implementation)
- In-memory hierarchical namespace using Google hash maps
- Stored on disk as a changelog file, rebuilt in memory on startup
- Two views: a hierarchical view (directory view) and a storage-location view (filesystem view)
- Very fast, but limited by the size of memory: 1 GB = ~1M files
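The changelog design above can be sketched as follows. This is a toy illustration, not the EOS code: every mutation is appended to a log file first, the in-memory views are updated second, and a restart simply replays the log (all names and the JSON record format are assumptions for the sketch):

```python
import json
import os

class ChangelogNamespace:
    """Toy append-only-changelog namespace with two in-memory views."""

    def __init__(self, path):
        self.path = path
        self.files = {}        # hierarchical view: full path -> metadata
        self.by_location = {}  # filesystem view: filesystem id -> set of paths
        if os.path.exists(path):
            self._replay()

    def _replay(self):
        # Rebuild both in-memory views on startup by replaying the log.
        with open(self.path) as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, rec):
        if rec["op"] == "create":
            self.files[rec["path"]] = rec["meta"]
            self.by_location.setdefault(rec["meta"]["fsid"], set()).add(rec["path"])
        elif rec["op"] == "unlink":
            meta = self.files.pop(rec["path"], None)
            if meta:
                self.by_location.get(meta["fsid"], set()).discard(rec["path"])

    def record(self, rec):
        # Persist to the changelog first, then update the in-memory views.
        with open(self.path, "a") as log:
            log.write(json.dumps(rec) + "\n")
        self._apply(rec)
```

Because the log is append-only, writes are sequential on disk while all lookups are served from the hash maps in memory, which is what makes the approach fast but memory-bound.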
EOS namespace: version 2 (under development)
- Only the view index is kept in memory
- Metadata is read from disk/buffer cache
- A perfect use case for SSDs (needs random IOPS)
- 10⁹ files = ~20 GB per view
HA and namespace scaling
- HA and read scale-out: an active-passive RW master (one node active in rw mode, a passive failover node ready to take over) plus active-active RO slaves serving reads
- Write scale-out: partition the namespace by subtree (e.g. /atlas, /cms, /alice, /lhcb, or finer-grained such as /atlas/data and /atlas/user) across instances
File creation test (MGM/FST, namespace size: 10 million files)
- 22 ROOT clients: ~1 kHz file creation rate
- 1 ROOT client: ~220 Hz
File read test (MGM/FST, namespace size: 10 million files, 100 million read opens)
- 350 ROOT clients: ~7 kHz open rate
- CPU usage: ~20%
File layouts
EOS supports the following layouts:
- plain
- replica (synchronous replication)
- RAID5 (untested prototype)
Replica layout
Network I/O for file creations with 3 replicas: the client writes (offset, len) to FS1, which forwards the stream to FS2, which forwards it to FS3; the return code travels back along the chain to the client.
500 MB/s of injection results in:
- 1 GB/s output on eth0 summed over all disk servers
- 1.5 GB/s input on eth0 summed over all disk servers
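The traffic figures above follow directly from the chain topology: with n replicas, every server receives the full stream once, and each of the first n-1 servers forwards it once. A sketch of that arithmetic (the function name is illustrative):

```python
def replication_traffic(injection_mb_s: float, n_replicas: int) -> tuple:
    """Total NIC traffic across all disk servers for chained replication.

    Returns (total_output, total_input) in MB/s.
    """
    # Each of the first n-1 servers forwards the full stream to the next one.
    total_output = (n_replicas - 1) * injection_mb_s
    # Every server in the chain receives the full stream exactly once.
    total_input = n_replicas * injection_mb_s
    return total_output, total_input
```

For the slide's numbers, `replication_traffic(500, 3)` gives (1000, 1500) MB/s, matching the 1 GB/s output and 1.5 GB/s input quoted above.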
Replica healing
A client rw-reopen of an existing file that has an offline replica triggers healing:
- the MGM tells the client to come back in <N> seconds and schedules a transfer
- a new replica is created
- the offline replica is dropped
Replica placement
- In order to minimize the risk of data loss, we couple disks into scheduling groups (the current default is 8 disks per group)
- The system selects a scheduling group to store a file in a round-robin manner
- All the other replicas of the file are then stored within the same group
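The placement policy above can be sketched in a few lines. This is an illustrative model, not the EOS scheduler: groups are chosen in turn for new files, and all replicas of one file stay inside the chosen group (class and method names are assumptions):

```python
import itertools

class GroupScheduler:
    """Toy round-robin placement into fixed-size scheduling groups."""

    def __init__(self, disks, group_size=8):
        # Couple disks into scheduling groups of group_size each.
        self.groups = [disks[i:i + group_size]
                       for i in range(0, len(disks), group_size)]
        self._next = itertools.cycle(range(len(self.groups)))

    def place(self, n_replicas):
        """Pick the next group round-robin; all replicas land in that group."""
        group = self.groups[next(self._next)]
        return group[:n_replicas]
```

Keeping all replicas of a file within one small group limits how many files any pair of disks shares, so a multi-disk failure can only affect files whose group contains all the failed disks.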
EOS ATLAS initial setup (August 2010)
Hammer Cloud test
- HC 10001181: EOSATLAS pool with 27 disk servers in RAID-0
- HC 10001381: EOSATLAS pool reconfigured to 8 disk servers in JBOD
Hardware replacement test
Life-cycle management exercise: migrate 27 disk servers with 10 RAID-0 filesystems each to 8 new servers with 20 JBOD filesystems each; the scratch-disk migration partially overlapped with the HC reading tests.
Data verification test
Exercise to rescan checksums for 120 TB of data (took 50 hours)
Plans
Operation:
- Open EOS ATLAS to ATLAS users on Nov 15
- Run by the CERN Castor operations team
Development:
- Implement version 2 of the namespace
Larger testbed (if available):
- Scale the instance from today's 600 disks to 2000-8000 disks (4-16 PB)
Summary
- We have been able to build the prototype rapidly
- We've been able to reuse some of the existing software
- The prototype has performed well during various exercises and tests
- We are looking forward to testing it with individual users