KIT Site Report
Andreas Petzold
STEINBUCH CENTRE FOR COMPUTING - SCC
KIT - University of the State of Baden-Württemberg and National Laboratory of the Helmholtz Association
www.kit.edu

Data Intensive Science at KIT

Computational Science and Engineering at KIT
- 4 HPC machines for local, state-wide, and nation-wide use

Tier-1 Batch System & Farm
- 180 kHS06
- 154 new WNs in production for the past two weeks: dual Intel Xeon E5-2630 v3 (8-core, 2.40 GHz), 96 GB RAM, 3x 500 GB HDD
- 24 job slots/node, 4 GB RAM/slot
- UGE with cgroups for multi-core jobs: don't let UGE kill jobs, let cgroups do their job!
- Machine/Job Features fully implemented (job-side sketch after this slide)
- KIT participates in the HEPiX Benchmarking WG, the WLCG Multicore TF, and the WLCG Machine/Job Features TF
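Machine/Job Features publishes node and batch-slot parameters to the running payload. Below is a minimal job-side sketch, assuming the common file-based implementation in which $MACHINEFEATURES and $JOBFEATURES point to directories holding one value per file; key names such as hs06 and mem_limit_MB follow the draft specification and may differ at a given site.

```python
# Minimal sketch: read WLCG Machine/Job Features from inside a job, assuming
# the site exposes them as one-value-per-file directories referenced by the
# $MACHINEFEATURES and $JOBFEATURES environment variables. File names such
# as 'hs06' and 'mem_limit_MB' follow the draft spec and may vary per site.
import os

def read_feature(env_var, key):
    """Return the value of one feature file, or None if it is not published."""
    base = os.environ.get(env_var)
    if base is None:
        return None
    try:
        with open(os.path.join(base, key)) as f:
            return f.read().strip()
    except OSError:
        return None

if __name__ == "__main__":
    print("HS06 rating of this node:", read_feature("MACHINEFEATURES", "hs06"))
    print("Memory limit of this job:", read_feature("JOBFEATURES", "mem_limit_MB"))
```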

Tier-1 Disk Storage & dCache/xrootd
- 14 PB disk storage, currently all DDN: S2A9900, SFA10K, SFA12K
- 2.4 PB extension for 2015 delayed but in the pipeline; 3.2 PB replacement scheduled for 2016
- 6 dCache instances: ATLAS, CMS, LHCb, shared (including Belle II and national resources), testing
- dCache recently updated to 2.13; new DB hosts added
- xrootd for ALICE (disk-only & tape SE) updated to 4.1.3

Tape Storage
- Tier-1: TSM, 19 PB, 3 libraries, library virtualization via ERMM
- recently switched to T10K technology: simplified setup, improved reliability
- LSDF: TSM, 6 PB, 1 library
- HPSS: 1 library, currently only T10KD; testing migration for Tier-1 & LSDF
- talk at the HPSS Users Forum about Power8 performance

Config Management and Deployment
- pushing hard to puppetize as many things as possible
- gitlab & gitlab-ci (shared); foreman and puppet masters separate per project (CI validation sketch after this slide)
- SDIL completely puppetized via Red Hat Satellite connected to gitlab
- manages x86, Power8 BE, and Power8 LE (not supported by RH) machines; AIX??
- many more details in Dimitri Nilsen's talk on Friday
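As an illustration of what the gitlab-ci stage of such a workflow might do, here is a hypothetical pre-deployment check that validates every Puppet manifest in a repository. The script, its paths, and the presence of the puppet binary on the CI runner are assumptions for the sketch, not a description of KIT's actual pipeline.

```python
# Hypothetical CI helper: run 'puppet parser validate' on every manifest in
# the repository so a syntactically broken change fails the pipeline before
# it can reach a puppet master. Assumes the 'puppet' binary is on the runner.
import pathlib
import subprocess
import sys

failed = False
for manifest in pathlib.Path(".").rglob("*.pp"):
    result = subprocess.run(
        ["puppet", "parser", "validate", str(manifest)],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        print(f"FAIL {manifest}:\n{result.stderr}")
        failed = True

sys.exit(1 if failed else 0)
```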

ELK
- many ideas but limited manpower
- existing ELK prototype for dCache and UGE
- LSDF file (access) statistics: >400M files, plain-text name space dump >100 GB (ingest sketch after this slide)
- ELK handles the data of a single dump easily; now need to implement history
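A minimal sketch of how such a name space dump could be streamed into Elasticsearch with the official Python client. The record layout ("<size> <mtime> <path>" per line) and the index name are invented for illustration; the real dump format may differ.

```python
# Sketch: bulk-load a plain-text name space dump into Elasticsearch.
# Assumes one "<size> <mtime> <path>" record per line; adjust the parser
# to the actual dump layout.
from elasticsearch import Elasticsearch, helpers

def actions(dump_path, index="lsdf-files"):
    """Yield one index action per record without loading the whole dump."""
    with open(dump_path) as dump:
        for line in dump:
            size, mtime, path = line.rstrip("\n").split(" ", 2)
            yield {
                "_index": index,
                "_source": {"size": int(size), "mtime": mtime, "path": path},
            }

es = Elasticsearch("http://localhost:9200")
# Stream the >400M records in fixed-size batches; memory use stays flat
# even though the dump itself is larger than 100 GB.
helpers.bulk(es, actions("namespace_dump.txt"), chunk_size=10000)
```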

HPC: New ForHLR II
- Compute (Transtec): 23040 cores (1152 nodes) (consistency check after this slide)
- Lenovo NeXtScale nx360 M5 servers: dual Intel Xeon Haswell E5-2660 v3, 2.6 GHz, 10 cores each
- 64 GB DDR4 RAM, 480 GB SSD per node
- Mellanox InfiniBand FDR HCA, 56 Gbit/s
- Storage (DELL, DDN storage systems): 3 Lustre file systems
- /home 611 TiB @ 11 GB/s; /work1 1222 TiB @ 22 GB/s; /work2 3055 TiB @ 55 GB/s
- New building: offices, visualization lab
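The quoted compute figures are internally consistent; a quick check (the aggregate RAM figure is derived here and does not appear on the slide):

```python
# Consistency check of the ForHLR II compute numbers quoted above.
nodes = 1152
cores_per_node = 2 * 10              # dual E5-2660 v3, 10 cores per socket
print(nodes * cores_per_node)        # -> 23040 cores, matching the slide
print(nodes * 64 / 1024, "TB RAM")   # -> 72.0 TB aggregate (derived, not on slide)
```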

Last year

This week

Central Cooling at KIT Campus North
- new combined heat/power/cooling plant at KIT CN
- cooling line on campus will replace many small cooling installations across campus
- base load provided by SCC: max 2.4 MW
- existing cooling installation needs to be cut open to attach the central cooling line
- Can we keep at least all disks running during the work on cooling?
- recent maintenance downtime used for testing: bypass with external cold water supplied; increased capacity of air cooling in one room
- test successful, with a few lessons learned

Cooling Bypass Test
(annotations from the test's temperature plot, in chronological order)
- water cooling switched off, racks opened
- additional air cooling switched on
- temperature run-away caught by opening additional floor plates
- water cooling switched on again

Central Cooling at KIT Campus North

Thank you!