HPC and Big Data

EPCC, The University of Edinburgh
Adrian Jackson, Technical Architect
a.jackson@epcc.ed.ac.uk
EPCC

- EPCC is the HPC Centre of the University of Edinburgh
- Activities: facilities, technology transfer, European projects, HPC research, visitor programmes, training
- Vital statistics: ~75 staff, ~4M turnover from external sources
- Multidisciplinary and multi-funded, with a large spectrum of activities and a critical mass of expertise
- Supports research through:
  - Access to facilities
  - Training courses
  - Visitor programmes
  - Collaborative projects
HPC

- Big compute: scientific simulation
- The "third pillar" of science: explore the universe through simulation rather than experimentation
  - Test theories
  - Predict or validate experiments
  - Simulate untestable science
- Reproduce the real world in computers
  - Generally simplified: dimensions and timescales restricted
- Simulation of a scientific problem or environment
  - Input of real data, output of simulated data
- Parameter space studies
- Wide range of approaches
Performance Trend

FLOPS prefixes:
- Yotta: 10^24
- Zetta: 10^21
- Exa: 10^18
- Peta: 10^15
- Tera: 10^12
- Giga: 10^9
- Mega: 10^6
- Kilo: 10^3

[Graph of supercomputer performance over time, borrowed from Wikipedia (Lucas Wilkins)]
Performance Trend

[Performance-over-time graph]
Key challenges

- Scale: ensuring the program can utilise the resources
  - Decompose the problem over processes
- Overheads
  - Communication costs
  - Synchronisation
  - Serial parts
  - I/O
- Utilisation
  - Ensuring resources are load-balanced
  - Ensuring machines are fully utilised
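The cost of the serial parts listed above can be made concrete with Amdahl's law: if a fraction s of a program is serial, the speedup on p processes is bounded by 1/(s + (1 - s)/p). A minimal sketch (the 5% serial fraction is illustrative, not a measured figure):

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Upper bound on speedup when a fixed fraction of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# Even 5% serial work caps the achievable speedup at 20x,
# no matter how many processes are thrown at the problem:
for p in (8, 64, 512, 4096):
    print(p, round(amdahl_speedup(0.05, p), 1))
```

This is why removing serial sections and overheads matters as much as adding processors.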
Data Intensive Computing

- Large amounts of data to be processed
- Low computing requirements
- Independent tasks
- Key challenges:
  - Distributing data to compute
  - Minimising data movement
  - Fault tolerance/reliability
- Technologies: Hadoop, MapReduce, HPCC, SciDB
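The MapReduce model named above can be illustrated with a toy word count. This is a plain-Python sketch of the map/shuffle/reduce stages, not the Hadoop API; in a real framework each stage runs distributed across many nodes:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit (key, value) pairs independently for each input record.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group all values by key. This is the data-movement step
    # that real frameworks work hard to minimise.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: combine each key's values into a single result.
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle(map_phase(["big data", "big compute"])))
print(counts)  # {'big': 2, 'data': 1, 'compute': 1}
```

Because map tasks are independent, the framework can restart any failed task on another node, which is how fault tolerance is achieved on commodity hardware.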
Big data

- Worldwide LHC Computing Grid
  - Distributes and manages LHC data: 25 PB per year
  - Computing resource requirement too large for any one site
  - Grid technology driver
- OGSA-DAI: http://www.ogsadai.org.uk
  - Distributed data access and management
  - Federates and accesses resources (e.g. relational or XML databases, files or web services) via web services, on the web or within grids or clouds
  - Query, update, transform, and combine data
  - Enables users to focus on application-specific data analysis and processing
Data Projects

- ADMIRE: Architecture for Data Intensive Research, http://www.admire-project.eu
  - A single platform for knowledge discovery: data access, integration, pre-processing, data mining, statistical analysis, post-processing, transformation
  - DISPEL, a Java-like language for describing complex data-intensive workflows
  - Streaming execution engine to remove data bottlenecks
  - Visual programming tools based on the Eclipse platform
  - Library of common workflows and components
Example: Oncology

- Aim: investigate genetic causes of bowel cancer
- Collaborative project between EPCC and the Colon Cancer Genetics Group (CCGG)
- Vast amount of data: over 500,000 genetic markers from 2,000 people
- Two-stage study
- Stage 1: investigated the effect of each individual marker
  - Required ~565,000 computations: an O(N) problem
  - Predicted serial runtime of ~4 months on a single CPU
  - Parallel code took 6.5 hours on 128 processors (BlueGene/L)

(image: www.sanofi-aventis.com)
Example: Oncology

- Stage 2: investigated interactions between the gene markers
  - Every pair of markers must be tested: an O(N^2) problem
  - 565,000 × 565,000 / 2 ≈ 1.6 × 10^11 gene interactions!
- Key challenges: runtime, memory, scaling and sorting
Oncology

- Runtime
  - Code expected to take 400 days; optimisation reduced this to 130 days, but that was still too long
  - A parallel code was needed
- Memory
  - Impossible to fit all the data into memory
  - However, only ~5% of the results are actually needed
- Scaling
  - 2D decomposition used with a task farm: more chunks than processors
- Sorting
  - Parallel sorting algorithm used
- Results
  - Computed interactions between all pairs of markers (565,000 × 565,000 / 2 pairs)
  - Runtime reduced from 400 days to 5 hours on 512 CPUs on HECToR
  - 8.5 × 10^9 (192 GB) probability values obtained
  - Sorting performed in 5 minutes
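The stage-2 counts follow directly: N markers give N(N - 1)/2 unordered pairs, and because only the strongest ~5% of results are needed, each worker can retain a bounded "top fraction" instead of holding everything in memory. A sketch of that idea using a fixed-size min-heap; the scoring function is illustrative, not the study's statistic:

```python
import heapq
import itertools

def n_pairs(n):
    # Number of unordered marker pairs: n * (n - 1) / 2
    return n * (n - 1) // 2

def top_k_interactions(markers, score, k):
    """Scan all pairs but retain only the k highest-scoring ones,
    so memory stays O(k) rather than O(n^2)."""
    heap = []  # min-heap: the weakest retained score sits at heap[0]
    for pair in itertools.combinations(markers, 2):
        s = score(*pair)
        if len(heap) < k:
            heapq.heappush(heap, (s, pair))
        elif s > heap[0][0]:
            # Evict the weakest retained result in favour of this one.
            heapq.heapreplace(heap, (s, pair))
    return sorted(heap, reverse=True)

print(n_pairs(565_000))  # 159,612,217,500 pairs, i.e. ~1.6e11
```

In the real parallel code the pair space is additionally chunked over processes (the 2D decomposition above), with each chunk contributing its local top results to a final parallel sort.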
Square Kilometre Array (SKA)

- Largest and most sensitive radio telescope in the world, to be built in South Africa and Australia
- 3,000 dishes
- 1 EB of data generated per day
Facilities: EDIM1

- A machine for Data Intensive Research
- Commissioned by EPCC & Informatics
- Designed for I/O-intensive applications
  - Use commodity components
  - Combine them in a novel way
  - Use cheap, low-power processors
Facilities: EDIM1

- EDIM1: 120 nodes, each with:
  - Dual-core Intel 1.6 GHz Atom processor
  - NVIDIA ION GPU
  - 1 × 256 GB SSD
  - 3 × 2 TB HDDs
- Data staging node for hot-plugging SATA hard disks for direct data upload
- SDSC Gordon: a purpose-built DIR machine
  - 1,024 compute nodes: 2 × 8-core Intel processors, 64 GB memory
  - 64 I/O nodes: 2 × 6-core Intel processors, 48 GB memory, 16 × 300 GB SSDs, Gb Ethernet
Facilities: DiRAC and Indy

- Indy: Linux and Windows HPC cluster
  - 24 nodes: 64 cores, 256 GB memory each (1,536 cores)
  - 2 nodes: 64 cores, 512 GB memory each (128 cores)
  - Commercial usage focus
  - No job length or queue restrictions
- DiRAC: BlueGene/Q
  - 6,144 compute nodes, 98,304 compute cores
  - 1.26 PFlop/s
Facilities: HECToR

- UK National HPC Service
- Currently a 30-cabinet Cray XE6 system with 90,112 cores
- Each node has two 16-core AMD Opterons (2.3 GHz Interlagos) and 32 GB memory
- Peak of over 800 TFlop/s and 90 TB of memory
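The quoted peak follows from simple arithmetic: cores × clock rate × floating-point operations per cycle per core. A back-of-envelope check, assuming 4 double-precision FLOPs per cycle per Interlagos core (the per-cycle figure is an assumption about the architecture, not a number from the slide):

```python
cores = 90_112           # 2,816 nodes x 32 cores per node
clock_hz = 2.3e9         # 2.3 GHz Opteron Interlagos
flops_per_cycle = 4      # assumed double-precision FLOPs/cycle/core

peak = cores * clock_hz * flops_per_cycle
print(f"{peak / 1e12:.0f} TFlop/s")  # ~829 TFlop/s, i.e. "over 800 TF"
```

The same cores × clock × FLOPs-per-cycle calculation gives the advertised peak of any machine in the performance-trend graph; sustained application performance is typically a modest fraction of it.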
Cray XE6 Layout

[System diagram: compute nodes and login nodes inside the Cray XE6 supercomputer; Lustre OSS and MDS nodes, NFS server, and boot/SDB node attached via an Infiniband switch and a 1 GigE backbone; 10 GigE link out to backup and archive servers]

- Lustre: high-performance, parallel filesystem
Filesystem

- High-performance filesystem for /work (Lustre)
- Smaller space for /home
- Filesystems are globally accessible:
  - /work from the back-end (compute) nodes
  - /home from the front-end (login) nodes
- Connected over the compute network
UK Research Data Facility

- RDF consists of 7.8 PB of disk and 19.5 PB of backup tape
- Provides a high-capacity, robust file store
- Persistent infrastructure: will last beyond any one national service
- Removes end-of-service data issues: transfers at the end of services have become increasingly lengthy
- Ensures that data from the current HECToR service is secured, giving a degree of soft landing if there is ever a gap in national services
- Designed for long-term data storage
- Currently only open to HECToR users
Facilities: GPGPU Test-bed

- Evaluate how your application performs compared to traditional CPUs, or develop your application to run on GPGPUs
- Available GPGPUs:
  - NVIDIA Fermi C2050
  - NVIDIA Fermi C2070
  - AMD FireStream 9270
  - NVIDIA K20
Contact details

EPCC, The University of Edinburgh
JCMB, Mayfield Road, Edinburgh EH9 3JZ
+44 131 650 5022
info@epcc.ed.ac.uk
http://www.epcc.ed.ac.uk/