The DMSC datacenter: Background and introduction, Using the cluster, Summary. Lars Melwyn Jensen, Niels Bohr Institute, University of Copenhagen


1 Niels Bohr Institute University of Copenhagen

2

3 Who am I? Theoretical physics (KU, NORDITA, TF, NSC): computing non-standard superconductivity and superfluidity, condensed matter / statistical physics, several low-budget clusters, lots of research simulations and computations. System administration and development at Koncern IT, Copenhagen University: an IT organisation of >100 employees serving 9000 employees plus students, >800 servers, >100 systems; big migration project (1 to 2 new DCs), DS484/ISO27001, ITIL / IT governance.

4 What is DMSC? DMSC is a collection of in-kind contributions hosted at the Niels Bohr Institute, led by Stig Skelboe. Work packages: CC (Computing centre), DU (Design update), IS (Instrument simulations). DMSC CC is run by the admins: 1 Lars + 1/2 Christian + 1/2 Jesper. DMSC CC is in principle the computational facility for ESS, but in reality also a virtualization environment, various web services, web servers and databases, and datacenter infrastructure.

5 The iDMSC CC. The future DMSC (2019 and forward): building a new datacenter; the design update task. The interim DMSC (2010 to 201x): a computational facility of three racks of hardware, basically hosted in the DCSC server room; minimal organizational overlap with DCSC, but shared backup. Server room pics: https://portal.esss.dk/images/ess-cluster-jan-2012/

6 Background and introduction Public view

7 Background and introduction Private view

8 Activities. High performance computing: interconnect = Infiniband, HPC software. High performance storage: storage is the bottleneck, storage computing. Building a large datacenter IT infrastructure: users see only the top of the iceberg; the average user may interact with 10 out of 100 systems, and there will be many systems which no user will interact with. Skeleton for the other systems: design principles (modular, scalable, no sprawl, coherency, simplicity, open source, virtualized, appliance oriented). Gathering expertise for 2019.

9 Non-HPC internal services. Infrastructure components: DNS name service (BIND); DHCP / deployment (Cobbler, PXE boot); firewalls (OpenBSD Packet Filter); remote management (IPMI); backup system (IBM TSM tape); user management and policies (LAM / LDAP / PAM); virtualization hosts (KVM / libvirt); admin wiki; development environment / build servers; web proxy / web statistics; web servers (Plone / Drupal); project / document management (Redmine); monitoring (M/Monit / Cacti / Nagios / Jabber); automation (SNMP / XMPP); network switches.

10 Non-HPC external services. User services: wikis; mailing lists; request tracker / ticket system; SCM (Mercurial); YUM package repos; Bugzilla; Artifactory; Hudson / Jenkins continuous integration; build servers; designated users' VMs; Moodle; virtual simulation web; TDR CVS web system; Java deployment.

11 Run-of-the-mill HPC setup. Standard cluster: 2 login nodes for user login, 2 head nodes for the batch system, n compute nodes for raw computing. Standard servers: 2 to 4 CPUs, 4 to 16 cores, local storage on each server for the OS; interconnect usually Ethernet and Infiniband; global network storage for data / scratch space; multi-GPU nodes as an extra (>1000 computing cores). Price/performance aspect: next-to-newest technology at half the price.

12 Call for cluster. Call for tender: an EU call using a pre-negotiated framework; specification of requirements; a process timeline of months (call, standstill, response); months for evaluation of offers and contract negotiations; months in case of breach of contract; months for delivery, install and test. More specifications mean a longer and more rigid process.

13 The Dell solution. Compute nodes: 42 compute nodes, 6 cores per node, 48 GB RAM per node, 6 Gbps local disks, 2 x 1 GbE for management, 1 Infiniband adapter for storage and interconnect (MPI). Storage system with Lustre: 2 metadata servers, 2 object storage servers, direct-attached storage arrays, about 1 TB of metadata on RAID and bulk storage in RAID 6; active/passive and active/active, redundant with failover.

14 Why Infiniband. Infiniband solutions: Mellanox bought Voltaire; Intel bought QLogic (going mainstream?); TrueScale Intel QLE 7340 adapters / 5 x IB switches. Performance characteristics: low latency (1 µs), scalable; high bandwidth, 40 Gbps (effective 32 Gbps) over PCI Express Gen2 x8; fat-tree topology = max bandwidth and failover. (Almost) Infiniband only: avoid interconnect sprawl (1 GbE + 10 GbE + Infiniband); combined interconnect + storage is cheaper and faster than 10 GbE, even long range with an Infiniband range extender.

15 Evolution of interconnects: Infiniband tech for the future (charts of system share and performance share).

16 High performance file system. Why a high performance file system: modular setup; scalable modules, both in bandwidth and size; redundant setup with fail-over / resilience; low latency and high bandwidth (uses the Infiniband network). Top 500 filesystems: 50% IBM GPFS, 50% Lustre.

17 Why Lustre. Used by 7 out of the top 10; supported by EOFS, SFS, LLNL and many more. "Lustre is the first and foremost production-tested, object-based Linux cluster file system used by some of the world's largest commercial, university, research, and government environments." Current performance numbers: file I/O as a percentage of raw bandwidth ~90%; achieved single object storage server I/O >6 GB/s; achieved single client I/O >3 GB/s; achieved aggregate I/O 500 GB/s; metadata transaction rate 24,000 ops/s; maximum file size / maximum file system size 2 PB / >512 PB; maximum number of clients >26,000.

18 Lustre setup: scalable design, striping data across many OSTs.

19 Lustre roadmap I

20 Lustre roadmap II. Relevant roadmap items: ldiskfs = ext4, a better backend; parallel NFS (pNFS) backend server (an ISO standard, like NFS); ZFS backend filesystem (snapshots, pooling, deduplication); btrfs backend filesystem; hierarchical storage management (move old files to slow storage); Chroma console and GUI (monitoring, debugging, planning).

21 Some comments. Preliminary benchmarks: 300 MB/s per OST, >1 GB/s aggregated; simple NFS around 80 MB/s. Cons of Lustre: aggregated performance for N simultaneous clients can be worse than 1/N each; need to control striped versus unstriped dirs; avoid small files; may need an NFS server for home dirs; may need an extra global scratch dir; may need to use local scratch dirs; the metadata server setup is active/passive; best for large files (streaming to N OSTs).
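Controlling striped versus unstriped directories is typically done with the lfs tool that ships with the Lustre client. A minimal sketch, assuming a Lustre filesystem mounted under a hypothetical /lustre path (the real mount point is not given in the slides):

lfs getstripe /lustre/myproject              # show the current striping of a directory
lfs setstripe -c 4 /lustre/myproject/big     # new files in this dir are striped across 4 OSTs
lfs setstripe -c 1 /lustre/myproject/small   # keep small files on a single OST

New files inherit the striping of the directory they are created in, so large streaming files can go to striped directories while directories full of small files stay unstriped.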

22 Batch system. Batch jobs: effective use of the hardware; make it run 24/7; a shared resource is cost effective. Interactive runs can be done for Matlab-type jobs but are not effective: they require a larger installation and reduce overall utilization. Good user practice: try to estimate your runs; make sure your parallel jobs scale; plan to max out your timeslot; Darwinian survival strategy (prey on CPU cycles).

23 Why SLURM. What is SLURM: the Simple Linux Utility for Resource Management; highly scalable, up to 65,536 nodes and hundreds of thousands of processors; throughput of over 120,000 jobs per hour; large user group and good support possibilities; highly tolerant of system failures; under active development and highly recommended. Nice features: good default scheduler; flexible configuration; preemption scheduling; gang scheduling; backfill; node ranges; accounting.

24 Cluster services and software: getting access. Send a request through your group leader. Login: use ssh (this could be improved): ssh -X -C -o CompressionLevel=9 to the login node; change your password on first login; examples are under /users/examples; Windows users need PuTTY and an X server for X forwarding. Then: program, compile, run. Problems: send an e-mail to the admins.
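A minimal login sketch following the slide; the host name and user name below are placeholders, since the real login node name is not given in the transcript:

ssh -X -C -o CompressionLevel=9 username@login.dmsc.example   # placeholder host name
passwd                                                        # change the password on first login
cp -r /users/examples ~/                                      # copy the example programs to your home dir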

25 Cluster services and software: modules environment. Effective and systematic management of the environment: multiple users, multiple compilers and libraries, multiple applications, multiple versions. Module commands: module avail; module load intel/12.1; module switch intel/12.1 intel/11.2; module unload intel/12.1; module help / module list. Write your own modules: put modulefiles in ~/.modules (see /etc/modulefiles for examples), then module load use.own and module avail.
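A short example session using only the module commands named on the slide (the intel/12.1 and intel/11.2 module names are taken from the slide; your site may list different versions):

module avail                           # list available modules
module load intel/12.1                 # load the Intel 12.1 compiler environment
module list                            # show what is currently loaded
module switch intel/12.1 intel/11.2    # swap one version for another
module unload intel/11.2               # remove it again

mkdir -p ~/.modules                    # personal modulefiles go here, per the slide
module load use.own                    # enable your own modules
module avail                           # now also lists the modules in ~/.modules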

26 Cluster services and software: SLURM essentials. Current SLURM partitions: express (1 node, 2 hours, for tests) and long (39 nodes, 1 day, for production). SLURM commands: sinfo, partition (queue) information; squeue, list pending and running jobs; sbatch, submit a batch job; scancel, cancel a job; scontrol, change job parameters / partitions; smap, graphical overview; srun, run interactively. https://computing.llnl.gov/linux/slurm/quickstart.html
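A sketch of how these commands are typically strung together; job ids and script names are examples only:

sinfo                             # show the express and long partitions and node states
squeue -u $USER                   # list your own pending and running jobs
sbatch myjob.sh                   # submit a batch script (see the script on the next slide)
scancel 1234                      # cancel the job id printed by sbatch
srun -p express -N 1 --pty bash   # one common way to get an interactive shell for testing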

27 Cluster services and software: SLURM script.
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --mail-type=end
#SBATCH --partition=long
#SBATCH --ntasks-per-core=1
#SBATCH --nodes
#SBATCH --exclusive
##SBATCH --time=1:00:00
##SBATCH --mem-per-cpu=1000
mpirun mpi-hello.mpi
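To use it, one would save the script to a file and submit it; a small sketch (the file name is just an example, and the node count left blank in the transcript must be filled in first):

sbatch myjob.sh     # submit; SLURM prints the assigned job id
squeue -u $USER     # watch the job while it is pending or running
less slurm.out      # inspect the output (and slurm.err) once it has finished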

28 Cluster services and software: compiling programs. Compilers: GNU gcc / gfortran; Intel 12.1 icc / ifort (2-seat license). Libraries: BLAS, LAPACK, ScaLAPACK, FFTW2/3, ATLAS, GSL, Intel MKL (CPU-optimized libs). Hybrid games: hyperthreading, mixing OpenMP and MPI, mixing Fortran and C/C++.
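A hedged compile sketch using the compilers and module names from the slides; the source file names are placeholders, and the exact MKL link flags depend on the compiler version installed:

module load intel/12.1
icc   -O2 -mkl hello.c   -o hello            # Intel C compiler, linked against Intel MKL
ifort -O2 -mkl hello.f90 -o hello_f          # Intel Fortran compiler, same MKL shortcut
gcc   -O2 -fopenmp omp_hello.c -o omp_hello  # GNU compiler with OpenMP, e.g. for hybrid OpenMP+MPI runs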

29 Cluster services and software: running parallel jobs. Choosing a Message Passing Interface (not in the module environment): mpi-selector --list, mpi-selector --set, or mpi-selector-gui. Compiling for MPI: mpicc, which depends on the MPI implementation; use the QLogic Intel MPI drivers, and use OpenMPI or MVAPICH2.
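A sketch of the whole chain with mpi-selector and the MPI compiler wrapper; the stack name and task count below are hypothetical examples, and mpi-hello.c stands in for your own source:

mpi-selector --list                      # show the installed MPI stacks
mpi-selector --set openmpi_gcc           # pick one of the names printed by --list
# start a new login shell so the selection takes effect
mpicc -O2 mpi-hello.c -o mpi-hello.mpi   # compile with the selected MPI's wrapper
mpirun -np 12 ./mpi-hello.mpi            # run directly, or submit via the SLURM script above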

30 That's it: the history so far, and how to use the cluster.
