Scientific Computing Data Management Visions



ELI-Tango Workshop, Szeged, 24-25 February 2015
Péter Szász, Group Leader, Scientific Computing Group, ELI-ALPS

Scientific Computing Group Responsibilities

Data storage and processing

Challenges
  Very high data rates from detectors
  Online near-real-time data analysis and reduction
  Online data visualization, aggregation of detectors
  Dynamically changing needs and diversity (by experiment)
    Data rates/volumes
    Computing capacity
    Data analysis software
  Growing data volumes expected -> continuously growing IT infrastructure

Design goals
  Flexible, scalable solutions (no hard upper limit)
  Scaling out -> cost efficiency
  Avoid vendor lock-in
  Follow widely accepted standards/solutions
  Prefer well-supported, evolving solutions
  Possibility for customization
  Control total cost of ownership

Open source, state-of-the-art private cloud solution on a reliable, high-capacity, high-performance, scale-out software-defined storage platform.

Data flow vision

[Diagram. Labels: DAQ; GPU nodes (RAM); CPU cluster (RAM); VMs (OpenStack); job scheduler (Torque); Ceph; online data processing; online (scratch) storage; offline storage]

Ceph for storage platform

  Open source software-defined storage platform
  Owned by Red Hat
  Distributed object store, block device, file system
  Client has been in the Linux kernel since v2.6.34
  No single point of failure
  Scalability (scaling out!) up to the exabyte level
  Fault tolerance
  Erasure coding (no need for RAID)
  Cache tiering
  Improved support for SSDs (in newer versions)
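To make the "erasure coding (no need for RAID)" point concrete, the sketch below compares how much raw capacity stays usable under plain replication and under a k+m erasure code. The function names, the 3-copy and 4+2 settings, and the 1000 TB raw figure are illustrative assumptions, not values from the slides.

```python
from fractions import Fraction


def replication_efficiency(copies: int) -> Fraction:
    """Usable fraction of raw capacity when each object is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return Fraction(1, copies)


def erasure_coding_efficiency(k: int, m: int) -> Fraction:
    """Usable fraction when each object is split into k data and m coding chunks."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return Fraction(k, k + m)


def usable_tb(raw_tb: int, efficiency: Fraction) -> float:
    """Usable capacity in TB for a given raw capacity and efficiency."""
    return float(raw_tb * efficiency)


if __name__ == "__main__":
    raw_tb = 1000  # hypothetical raw cluster capacity, not from the slides
    schemes = [
        # (label, efficiency, number of lost copies/chunks tolerated)
        ("3x replication", replication_efficiency(3), 2),
        ("erasure code 4+2", erasure_coding_efficiency(4, 2), 2),
    ]
    for label, eff, tolerated in schemes:
        print(f"{label}: {usable_tb(raw_tb, eff):.0f} TB usable, "
              f"tolerates {tolerated} failures")
```

Under these assumed settings both schemes survive the loss of two copies or chunks, but the erasure code leaves two thirds of the raw capacity usable instead of one third, at the cost of extra computation on writes and recovery.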

Ceph as storage platform

As Object Storage
  RESTful API
  Partial or complete reads and writes
  Snapshots
  Atomic transactions

As Block Device (RBD)
  Thinly provisioned
  Resizable images
  Image import/export
  Automatic striping and replication across the cluster
  Integrates with OpenStack -> unlimited storage can be mounted

As File System (CephFS)
  POSIX-compliant network file system
  Strong data safety for mission-critical applications
  Virtually unlimited storage
  Separates metadata from data
  Automatically balances to deliver maximum performance
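The "automatic striping" of a block device image can be pictured as cutting the image into fixed-size objects that are then spread over the cluster. The following is a simplified sketch of that mapping only, assuming a fixed 4 MiB object size (the RBD default) and ignoring advanced striping parameters, object naming, and placement; the function name is hypothetical and not part of any Ceph API.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MiB, assumed object size for this sketch


def split_extent(offset: int, length: int, object_size: int = OBJECT_SIZE):
    """Split a byte range of an image into per-object pieces.

    Returns a list of (object_index, offset_in_object, piece_length) tuples.
    """
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    pieces = []
    end = offset + length
    while offset < end:
        index, inner = divmod(offset, object_size)
        # A piece stops at the object boundary or at the end of the request.
        chunk = min(object_size - inner, end - offset)
        pieces.append((index, inner, chunk))
        offset += chunk
    return pieces


if __name__ == "__main__":
    mib = 1024 * 1024
    # A 2 MiB write starting at 3 MiB crosses the boundary between objects 0 and 1.
    for piece in split_extent(3 * mib, 2 * mib):
        print(piece)
```

Because each object is placed independently, a large sequential read or write touches many storage nodes at once, which is where the scale-out throughput comes from.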

OpenStack for private cloud

  Open source cloud infrastructure-as-a-service (IaaS) platform
  Co-founded by Rackspace and NASA in 2010; since then dozens of major players have become supporters (IBM, HP, Intel, Red Hat, Cisco, NetApp, etc.)
  Supports different hypervisors (Xen, VMware, KVM)
  Supports several virtualization technologies (such as containers, bare metal, high-performance computing)
  No vendor lock-in, cost efficiency
  Scale-out architecture
  Robust role-based access controls
  Significant references worldwide (e.g. CERN)

Clusters Overview

DAQ

HPC Cluster
  Networks
    Ethernet switch, 10/40 GbE, for data at front-end
    Ethernet switch, 1 GbE, for monitoring
    InfiniBand FDR (56 Gb/s) for back-end connection of CPU/GPU nodes
  CPU Cluster Head Node, GPU Cluster Head Node
    CPU: Dual Intel Xeon E5-2600 v2
    RAM: 128 GB
    SSD: 2x 128 GB
    HDD: 2x 1 TB
  CPU Comp. Nodes 1..n
    CPU: Dual Intel Xeon E5-2600 v2
    RAM: 512 GB
    SSD: 2x 128 GB
  GPU Comp. Nodes 1..m
    CPU: Dual Intel Xeon E5-2600 v2
    GPU: 6x NVIDIA Tesla K40
    RAM: 128 GB
    SSD: 4x 256 GB

Storage Cluster
  Networks
    Ethernet switch, 10/40 GbE, for front-end
    Ethernet switch, 1 GbE, for management
    40 GbE or InfiniBand FDR for back-end network
  Monitor Nodes 1..3
    CPU: Dual Intel Xeon E5-2630 v2
    RAM: 64 GB
    HDD: 4x 300 GB (SAS3)
  OSD Nodes 1..k
    CPU: Intel Xeon E5-2630 v2
    RAM: 128 GB
    SSD: 800 GB, PCI-E 3.0 x4
    HDD: 6 TB SATA3, x30-60

HPC Cluster Plans

[Diagram. Labels: DAQ; front-end network, 40 GbE; cluster back-end network, InfiniBand (FDR); storage cluster; head nodes; CPU computing; GPU computing; cluster monitoring/management, 1 GbE]

Storage Cluster Vision

[Diagram. Labels:]
  CPU/GPU cluster
  Front-end network: 10 or 40 GbE (location I/O)
  Cluster back-end network: 40 GbE or InfiniBand FDR (object/block I/O)
  Cluster management: 1 GbE (IPMI)
  Monitor nodes (CRUSH)
  OSD nodes
    4x SSD, PCIe, 2 GB/s
    10-16x HDD (per SSD), SATA III (800 MB/s)
    Sequential writing: 150 MB/s (per HDD)
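The figures on this slide allow a rough back-of-the-envelope estimate of where the write bottleneck of one OSD node might sit. The sketch below uses the slide's numbers (150 MB/s per HDD, 2 GB/s per SSD, 10 to 16 HDDs per SSD, 4 SSDs, 10 or 40 GbE). It assumes that all 4 SSDs sit in one OSD node, that every write passes through an SSD before reaching its HDD group, and that protocol overhead and replication traffic can be ignored. These are assumptions made for illustration, and the function names are hypothetical.

```python
def group_write_mb_s(hdds_per_ssd: int, hdd_mb_s: int = 150, ssd_mb_s: int = 2000) -> int:
    """Write rate of one SSD with its HDD group: limited by the slower side."""
    return min(hdds_per_ssd * hdd_mb_s, ssd_mb_s)


def link_mb_s(gbit_per_s: float) -> float:
    """Raw line rate of a network link in MB/s (1 Gb/s = 125 MB/s)."""
    return gbit_per_s * 1000 / 8


def node_write_mb_s(hdds_per_ssd: int, ssds: int = 4, link_gbit: float = 40) -> float:
    """Estimated sustained write rate of one OSD node behind one network link."""
    return min(ssds * group_write_mb_s(hdds_per_ssd), link_mb_s(link_gbit))


if __name__ == "__main__":
    for hdds in (10, 16):
        for link in (10, 40):
            rate = node_write_mb_s(hdds, link_gbit=link)
            print(f"{hdds} HDDs per SSD, {link} GbE: about {rate:.0f} MB/s")
```

Under these assumptions the disks of a fully populated node can absorb more than a single 40 GbE link delivers, so the network rather than the drives would be the first limit to check in the planned benchmarks.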

Next steps

1. Set up and test Ceph/OpenStack on commodity hardware (workstations with GPU cards)
2. Procure a small-grade HPC and storage cluster
     4 CPU computing nodes (+1 head node)
     2 GPU computing nodes (+1 GPU head node)
     4 storage OSD nodes
     2 storage monitor nodes
3. Benchmark solutions on the small-grade clusters
4. Define and procure production-ready clusters

THANK YOU FOR YOUR ATTENTION!