Preview of a Novel Architecture for Large Scale Storage




Preview of a Novel Architecture for Large Scale Storage
Andreas Petzold, Christoph-Erdmann Pfeiler, Jos van Wezel
Steinbuch Centre for Computing (SCC)
KIT – University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association
www.kit.edu

Storage Management Systems at GridKa
- dCache for ATLAS, CMS, LHCb: 6 PB disk-only, 3 PB tape buffers, 287 pools on 58 servers; agnostic to the underlying storage technology
- Scalla/xrootd for ALICE: 2.7 PB disk-only/tape buffer, 15 servers; agnostic to the underlying storage technology
- Stack: SRM → Storage Management System → Linux File System → Storage Controller → Disks

Current GridKa Disk Storage Technologies
- 9 x DDN S2A9900: 150 enclosures, 9000 disks, 796 LUNs
- SAN: Brocade DCX
- 1 x DDN SFA10K: 10 enclosures, 600 disks
- 1 x DDN SFA12K: 5 enclosures, 360 disks

Current GridKa Tape Storage Technologies
- 2 x Oracle/Sun/STK SL8500: 2 x 10088 slots, 22 LTO5 and 16 LTO4 drives
- 1 x IBM TS3500: 5800 slots, 24 LTO4 drives
- 1 x GRAU XL: 5376 slots, 16 LTO3 and 8 LTO4 drives

GridKa Storage Units
- Servers connect to storage directly or via a Fibre Channel SAN; the SAN is not strictly required
- 10G Ethernet towards the worker nodes
- Cluster file system (GPFS) connects several servers: the file system is visible on all nodes, IO throughput is predictable, and it provides a convenient storage virtualisation layer
- Currently evaluating alternative file systems: XFS, ZFS, BTRFS, EXT4

Novel Storage Solutions for GridKa
- A large resource increase is expected in 2015; LHC LS1 is a chance to look at new solutions
- Simplification of operations and deployment is required
- Solution 1: DataDirect Networks SFA12K-E – server VMs run embedded in the storage controller
- Solution 2: Rausch Netzwerktechnik BigFoot – a more conventional setup; the server is directly connected to local disks

Shortening Long IO Paths
[Diagram: IO paths from the application on the worker node to the disks; potentially redundant components]
- Traditional storage at GridKa: application on worker node → NIC → switch/router → server (NIC, HBA) → SAN switch → storage controller → SAS/FC fabric → disks
- Previewed storage, BigFoot: application on worker node → NIC → switch/router → server → SAS/FC fabric → disks
- Previewed storage, DDN SFA12K-E: application on worker node → NIC → switch/router → storage controller with embedded server (IB HCA link to the DDN SFA12K) → SAS/FC fabric → disks

DDN SFA12K-E Architecture
[Diagram: worker nodes attach via 10 GE, with one NIC dedicated per VM and 8 VMs per SFA pair. Each VM (e.g. running xrootd) reaches the storage through an IO bridge and the DDN VM driver (sfablkdriver) with DMA access to the SFA OS cache; the SFA pair connects through SAS switches to the SSDs.]

BigFoot Architecture
[Diagram: worker nodes attach via 10 GE to the BigFoot server NIC; the server OS drives a SAS HBA/RAID controller with cache, connected to the SSDs.]

Thoughts on the Effect of SAN and Long Distance Latency
- IO is different for reads and writes: writes can be done asynchronously, but if they are done synchronously (as in NFS) one has to wait for the ACK
- Workloads are mostly reads, and at SSD speeds reads are synchronous
- Reading huge files linearly defeats the caches in the servers
- Example device: enterprise MLC SSD, Violin Memory array, Fusion-io drive: 250000 IOPS (lower estimate) at 4 kB per IO, i.e. 1 s / 250000 = 4 μs per IO
- SAN latency: DCX director 2.1 μs per traversal, i.e. 4.2 μs round trip, not accounting for the HBA and the fibre length (300 m ≈ 1 μs)

Thoughts on the Effect of SAN and Long Distance Latency (cont.)
- Effect on high-speed SSDs: 1 / (2 x 2.1 μs + 4 μs) ≈ 121951 IOPS instead of 250000 IOPS (the arithmetic is sketched below); similar for disk controller caches when used for VMs or data servers
- Negligible impact of the SAN for magnetic disks: 1 / (2 x 2.1 μs + 5000 μs) ≈ 199 IOPS
- Putting the SSD next to the server can therefore increase the number of IOPS; this assumption will be tested in the coming weeks
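A minimal sketch of the latency arithmetic above, assuming a synchronous IO must cross the SAN once in each direction; the 250000 IOPS device, the 2.1 μs DCX traversal and the ~5 ms disk access time are the figures from the slides, everything else is plain arithmetic:

def effective_iops(device_io_time_s, san_one_way_s=0.0):
    # Each synchronous IO pays the device service time plus one SAN traversal each way.
    return 1.0 / (2 * san_one_way_s + device_io_time_s)

ssd_io_time = 1.0 / 250000      # 4 us per IO for a 250k IOPS SSD (lower estimate)
disk_io_time = 5000e-6          # ~5 ms per IO for a magnetic disk

print(effective_iops(ssd_io_time, 2.1e-6))    # ~121951 IOPS through the SAN
print(effective_iops(ssd_io_time))            # 250000 IOPS when direct-attached
print(effective_iops(disk_io_time, 2.1e-6))   # ~199.8 IOPS: the SAN barely matters for disks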

Expected Benefits
- Reduced latency: no HBA and SAN hop (2.1 μs, e.g. Brocade DCX blade to blade) and no separate storage controller in the path
- Improved IO rates
- Reduced power: server and controller HBAs plus SAN switch, ~300-400 W ≈ ~600 €/server/year (see the sketch after this list)
- Reduced investment: 500-600 € per HBA, 400-600 € per switch port = 1800-2400 €
- Improved MTBF: fewer components
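A small sketch of the savings arithmetic behind these numbers; the 300-400 W and the per-component prices are from the slide, while the electricity price of 0.20 €/kWh and the assumption of two HBAs plus two switch ports per server (a redundant dual-path setup) are illustrative only:

saved_power_w = (300, 400)          # per server: HBAs in server and controller, SAN switch share
electricity_eur_per_kwh = 0.20      # assumed electricity price, not from the slide

for watts in saved_power_w:
    kwh_per_year = watts * 24 * 365 / 1000
    print(watts, "W ->", round(kwh_per_year * electricity_eur_per_kwh), "EUR/year")
    # 300 W -> ~526 EUR/year, 400 W -> ~701 EUR/year, bracketing the quoted ~600 EUR

# One-off investment: the 1800-2400 EUR range matches two HBAs plus two switch ports
low = 2 * 500 + 2 * 400     # 1800 EUR
high = 2 * 600 + 2 * 600    # 2400 EUR
print(low, "-", high, "EUR saved per server")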

Possible Drawbacks
- Loss of flexibility: without a SAN the storage building blocks are larger, server access to storage blocks is limited, and storage systems are only connected via LAN
- VMs inside the storage controller (DDN SFA12K-E): competition for resources; the limited number of VMs limits the server/TB ratio; loss of redundancy
- Simple server-attached storage (BigFoot): limited by the simple hardware controller; hardware administration doesn't scale to hundreds of boxes; no redundancy

Glimpse at Performance
- Preliminary performance evaluation: IOZONE with 30-100 parallel threads on an XFS filesystem (an example invocation is sketched below)
- An xrootd data server running in a VM performs similarly to IOZONE
- Out-of-the-box settings, no tuning
- Performance is below expectations; the reasons are still to be understood
- ZFS was also tested on BigFoot with IOZONE
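For illustration, a sketch of the kind of multi-threaded IOZONE throughput run described above; the thread counts and the XFS target come from the slide, while the mount point, file size and record size are assumptions, and the flags follow IOZONE's documented throughput mode:

import subprocess

MOUNT = "/mnt/xfs-test"                # assumed mount point of the XFS filesystem under test

for threads in (30, 60, 100):          # parallel thread counts in the range quoted above
    files = [f"{MOUNT}/iozone.{i}" for i in range(threads)]
    cmd = ["iozone",
           "-t", str(threads),         # throughput mode with N parallel threads
           "-s", "4g",                 # file size per thread (assumed)
           "-r", "1m",                 # record size (assumed)
           "-i", "0", "-i", "1",       # sequential write and read tests
           "-F", *files]               # one target file per thread
    subprocess.run(cmd, check=True)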

Conclusions
- A storage extension at GridKa requires either an expensive upgrade of the GridKa disk SAN or a novel storage solution
- Tight integration of server and storage looks promising: fewer components, lower power consumption, less complexity
- Many possible benefits, but further evaluation is required
- Performance needs to be understood together with the vendors
- More tests with other vendors in the near future
