Scientific Computing Data Management Visions
ELI-Tango Workshop, Szeged, 24-25 February 2015
Péter Szász, Group Leader, Scientific Computing Group, ELI-ALPS
Scientific Computing Group Responsibilities
Data Storage and Processing

Challenges:
- Very high data rates from the detectors
- Online, near-real-time data analysis and reduction
- Online data visualization; aggregation of detector data
- Dynamically changing needs and diversity across experiments: data rates/volumes, computing capacity, data analysis software
- Growing data volumes expected -> continuously growing IT infrastructure

Design goals:
- Flexible, scalable solutions (no hard upper limit)
- Scaling out -> cost efficiency
- Avoid vendor lock-in
- Follow widely accepted standards/solutions
- Prefer well-supported, actively evolving solutions
- Possibility for customization
- Control total cost of ownership

Vision: an open-source, state-of-the-art private cloud solution on a reliable, high-capacity, high-performance, scale-out software-defined storage platform.
Data Flow Vision (diagram)
- DAQ streams data into the RAM of GPU and CPU cluster nodes, where VMs (OpenStack) perform online data processing
- A job scheduler (Torque) dispatches the processing jobs
- Processed data lands on Ceph: first on online (scratch) storage, then on offline storage
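The flow in the diagram can be sketched as a small pipeline. This is an illustrative model only; the stage names, the toy `reduce_frame` decimation step, and the in-memory queues are assumptions standing in for the real DAQ, Torque jobs, and Ceph pools.

```python
from queue import Queue

def reduce_frame(frame):
    """Toy online reduction: keep every 4th sample (illustrative only)."""
    return frame[::4]

def run_pipeline(frames):
    scratch = Queue()   # stands in for the online (scratch) Ceph pool
    offline = []        # stands in for the offline storage pool
    for frame in frames:                   # DAQ produces raw frames
        scratch.put(reduce_frame(frame))   # online processing reduces data in flight
    while not scratch.empty():
        offline.append(scratch.get())      # reduced data migrates to offline storage
    return offline

frames = [list(range(16)) for _ in range(3)]
stored = run_pipeline(frames)              # three frames, each reduced 4x
```

The point of the structure is that reduction happens online, before data reaches scratch storage, so the offline tier only ever sees the reduced volume.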
Ceph for the Storage Platform
- Open-source software-defined storage platform
- Owned by Red Hat
- Distributed object store, block device, and file system
- Client in the Linux kernel since v2.6.34
- No single point of failure
- Scalability (scaling out!) up to the exabyte level
- Fault tolerance
- Erasure coding (no need for RAID)
- Cache tiering
- Improved SSD support (in newer versions)
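The core idea behind erasure coding (why it can replace RAID-style redundancy) can be shown with the simplest possible scheme, single XOR parity. This is a minimal sketch of the principle, not Ceph's actual erasure-code plugins, which use configurable k+m Reed-Solomon-style codes.

```python
# k data chunks plus one XOR parity chunk survive the loss of any single chunk.
def encode(data_chunks):
    parity = bytes(len(data_chunks[0]))
    for chunk in data_chunks:
        parity = bytes(a ^ b for a, b in zip(parity, chunk))
    return data_chunks + [parity]

def recover(chunks, lost_index):
    # XOR of all surviving chunks reconstructs the missing one.
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    rebuilt = bytes(len(survivors[0]))
    for chunk in survivors:
        rebuilt = bytes(a ^ b for a, b in zip(rebuilt, chunk))
    return rebuilt

data = [b"aaaa", b"bbbb", b"cccc"]
coded = encode(data)                      # 3 data chunks + 1 parity chunk
assert recover(coded, 1) == b"bbbb"       # lost data chunk rebuilt from the rest
```

With k data chunks and m parity chunks the storage overhead is m/k instead of the 100%+ overhead of full replication, which is what makes it attractive at large volumes.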
Ceph as a Storage Platform

As object storage:
- RESTful API
- Partial or complete reads and writes
- Snapshots
- Atomic transactions

As a block device (RBD):
- Thinly provisioned, resizable images
- Image import/export
- Automatic striping and replication across the cluster
- Integrates with OpenStack -> unlimited storage can be mounted

As a file system (CephFS):
- POSIX-compliant network file system
- Strong data safety for mission-critical applications
- Virtually unlimited storage
- Separates metadata from data; automatically balances to deliver maximum performance
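"Automatic striping and replication across the cluster" can be sketched as follows. This is a hedged illustration of the concept, not Ceph's real RBD layout: the round-robin OSD choice, the `unit_size` parameter, and the OSD names are all assumptions for the example.

```python
# Cut an image into fixed-size stripe units; store each unit on `replicas`
# distinct OSDs chosen round-robin, so both capacity and I/O spread out.
def stripe_and_replicate(image, unit_size, osds, replicas=2):
    placement = {}
    for offset in range(0, len(image), unit_size):
        unit = image[offset:offset + unit_size]
        start = (offset // unit_size) % len(osds)
        targets = [osds[(start + r) % len(osds)] for r in range(replicas)]
        placement[offset] = (unit, targets)
    return placement

layout = stripe_and_replicate(b"0123456789", unit_size=4,
                              osds=["osd0", "osd1", "osd2"])
# Each stripe unit lands on two different OSDs; consecutive units rotate
# across the cluster, so sequential I/O is parallelized.
```

Because every unit exists on more than one OSD, a single node failure costs no data, and reads can be served from any replica.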
OpenStack for the Private Cloud
- Open-source Infrastructure-as-a-Service (IaaS) cloud platform
- Co-founded by Rackspace and NASA in 2010; since then dozens of big players have become supporters (IBM, HP, Intel, Red Hat, Cisco, NetApp, etc.)
- Supports different hypervisors (Xen, VMware, KVM)
- Supports several virtualization technologies (containers, bare metal, high-performance computing)
- No vendor lock-in, cost efficiency
- Scale-out architecture
- Robust role-based access controls
- Significant references worldwide (e.g. CERN)
Clusters Overview

Networks:
- 10/40 GbE Ethernet switch for front-end data
- 1 GbE Ethernet switch for monitoring/management
- InfiniBand FDR (56 Gb/s) or 40 GbE for the back-end connection of CPU/GPU nodes

CPU Cluster:
- Head Node: dual Intel Xeon E5-2600 v2, 128 GB RAM, 2x 128 GB SSD, 2x 1 TB HDD
- Compute Nodes 1..n: dual Intel Xeon E5-2600 v2, 512 GB RAM, 2x 128 GB SSD

GPU Cluster:
- Head Node and Compute Nodes 1..m: dual Intel Xeon E5-2600 v2, 6x NVIDIA Tesla K40, 128 GB RAM, 4x 256 GB SSD

Storage Cluster:
- Monitor Nodes 1-3: dual Intel Xeon E5-2630 v2, 64 GB RAM, 4x 300 GB HDD (SAS3)
- OSD Nodes 1..k: Intel Xeon E5-2630 v2, 128 GB RAM, 800 GB PCIe 3.0 x4 SSD, 30-60x 6 TB SATA3 HDD
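The OSD node specs above support a quick back-of-envelope capacity estimate. The node count of 4 below is taken from the small-scale procurement plan in the next steps; it is a parameter, not a fixed design point.

```python
# Raw (pre-replication, pre-erasure-coding) capacity from the OSD node specs:
# 6 TB SATA3 drives, 30-60 per node.
def raw_capacity_tb(nodes, drives_per_node, drive_tb=6):
    return nodes * drives_per_node * drive_tb

low = raw_capacity_tb(nodes=4, drives_per_node=30)    # minimum drive fit-out
high = raw_capacity_tb(nodes=4, drives_per_node=60)   # maximum drive fit-out
# Usable capacity is lower: divide by the replication factor, or scale by
# k/(k+m) for an erasure-coded pool.
```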
HPC Cluster Plans (diagram)
- DAQ attached over a 40 GbE front-end network
- CPU computing and GPU computing nodes, each group behind a head node
- Storage cluster on the shared InfiniBand (FDR) cluster back-end network
- Cluster monitoring/management over 1 GbE
Storage Cluster Vision (diagram)
- Front-end network to the CPU/GPU cluster: 10 or 40 GbE (object/block I/O)
- Cluster back-end network: 40 GbE or InfiniBand FDR
- OSD nodes: 10-16x SATA III HDD per SSD (800 MB/s); sequential writing ~150 MB/s per HDD; 4x PCIe SSD at 2 GB/s
- Data placement: CRUSH maps object/block I/O to locations; monitor nodes maintain the cluster map
- Cluster management over 1 GbE (IPMI)
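CRUSH's key property, that any client can compute where an object lives without consulting a central lookup table, can be illustrated with a much simpler stand-in. The sketch below uses rendezvous (highest-random-weight) hashing, not the real CRUSH algorithm; the object name and OSD names are made up for the example.

```python
import hashlib

# Toy stand-in for CRUSH: deterministically map an object name to `replicas`
# distinct OSDs by hashing (object, osd) pairs and taking the lowest scores.
def place(obj_name, osds, replicas=3):
    def score(osd):
        digest = hashlib.sha256(f"{obj_name}:{osd}".encode()).hexdigest()
        return int(digest, 16)
    return sorted(osds, key=score)[:replicas]

osds = [f"osd{i}" for i in range(8)]
targets = place("shot-000123.h5", osds)
assert len(set(targets)) == 3                     # three distinct OSDs
assert targets == place("shot-000123.h5", osds)   # same answer every time
```

Because placement is a pure function of the object name and the cluster membership, there is no metadata server in the object I/O path, which is what lets the platform scale out without a central bottleneck.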
Next Steps
1. Set up and test Ceph/OpenStack on commodity hardware (workstations with GPU cards)
2. Procure a small-scale HPC and storage cluster:
   - 4 CPU computing nodes (+1 head node)
   - 2 GPU computing nodes (+1 GPU head node)
   - 4 storage OSD nodes
   - 2 storage monitor nodes
3. Benchmark solutions on the small-scale clusters
4. Define and procure the production-ready clusters
THANK YOU FOR YOUR ATTENTION!