COMP/CS 605: Intro to Parallel Computing Lecture 01: Parallel Computing Overview (Part 1)

Mary Thomas
Department of Computer Science
Computational Science Research Center (CSRC)
San Diego State University (SDSU)
Posted: 01/22/15 Updated: 01/22/15

Table of Contents
1. Misc Information
2. Motivation for HPC
   HPC Performance
3. Next Time

Misc Information: Today
- Course overview, policies, etc.
- Introduction to HPC (Part 1)
- Add codes: the course is full, but it is normal for students to drop or modify their schedules. I will start an add list.

Misc Information: Course Info
- Course website: http://www-rohan.sdsu.edu/faculty/mthomas/courses/sp15/comp605/
- Add codes: the course is full, but it is normal for students to drop or modify their schedules. Be sure to sign up on the add request sheet.

Motivation for HPC: My serial code is working, why should I parallelize it?
- Compute bound
  - Large number of iterations
  - Long time to converge
- Data bound
  - Memory footprint is larger than RAM (common for science apps)
  - Output may take FOREVER to save/dump/transfer
- Contention for resources/other processes
  - What other processes are running on the machine?
  - What other data is moving over the network?
  - Timeout problems?

Motivation for HPC: General Curvilinear Environmental Model
Simulates processes along the coast.
- Serial version, Simple Seamount test case, approx. 8k lines of code
- 1 km/cell, 100x50x50 grid (small job)
- Large memory: 100 arrays, O(10^7)
- 200k iterations = 6 hours simulation time
- T_wall = 70 hours = O(10^14) ops
- Typical simulations need to run for days/weeks/months/years
Jaguar (ORNL): 2x10^5 cores (=3GHz, 16GB), 1.75x10^12 flops
- Could run this job in 0.3x10^-2 seconds (note: the estimate does not include the time to fire up 200k cores)
- This is the motivation for parallel computing
Source: http://www.xsede.org
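
The estimate above is an operation count divided by an aggregate rate. A minimal sketch of that arithmetic follows, assuming perfect parallel efficiency and no startup or communication cost, which no real machine delivers. The function name `ideal_runtime_seconds` and the petaflop-scale comparison rate are illustrative assumptions, not taken from the slide.

```python
def ideal_runtime_seconds(total_ops: float, flops_per_second: float) -> float:
    """Best-case run time: total operations divided by the aggregate rate."""
    if flops_per_second <= 0:
        raise ValueError("flops_per_second must be positive")
    return total_ops / flops_per_second


# Figures quoted on the slide.
SERIAL_OPS = 1e14      # O(10^14) operations for the 70-hour serial run
SLIDE_RATE = 1.75e12   # aggregate rate as written on the slide (flop/s)

# Assumption, for comparison only: a petaflop-scale aggregate rate.
PETAFLOP_RATE = 1.75e15

for label, rate in [("slide rate", SLIDE_RATE), ("petaflop-scale", PETAFLOP_RATE)]:
    print(f"{label}: {ideal_runtime_seconds(SERIAL_OPS, rate):.3g} s")
```

Under either rate the ideal time is about a minute or less, against 70 hours of wall-clock time for the serial run; the exact figure depends on the rate assumed.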

Motivation for HPC: Porting Code to Parallel has Many Challenges
I have 200k cores, now what do I do with them?
- How to distribute the data? Is the code memory/IO bound?
- How to distribute the computation? Is the code compute bound?
- How do you synchronize the results?
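
One common answer to "how to distribute the data?" is a block decomposition, in which each worker owns a contiguous slice of the grid. The sketch below is illustrative only: `block_range` is a hypothetical helper, and the 100-cell dimension is borrowed from the 100x50x50 grid mentioned earlier in the lecture.

```python
def block_range(n_items: int, n_workers: int, rank: int) -> range:
    """Return the contiguous index range owned by worker `rank`.

    Items are split as evenly as possible; the first `n_items % n_workers`
    workers each receive one extra item.
    """
    if not 0 <= rank < n_workers:
        raise ValueError("rank must be in [0, n_workers)")
    base, extra = divmod(n_items, n_workers)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


# Split 100 grid columns across 8 workers.
for rank in range(8):
    r = block_range(100, 8, rank)
    print(f"worker {rank}: cells {r.start}..{r.stop - 1} ({len(r)} cells)")
```

With a layout like this, synchronization typically reduces to exchanging boundary values between neighboring slices after each iteration.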

Motivation for HPC: What is Parallel Computing and Why do We Need it?
Large problems:
- Business: IBM has 433,000 employees; how do you update paycheck/employee data?
- Science: hurricane predictions. The Gulf of Mexico (GoM) is approx. 800x800 km = 6.4x10^5 points/array, and a model typically has 10^2 arrays. Forecasts use multiple models on multiple clusters and take 12 hours (wall clock) to forecast out 3-4 days on parallel computers (probably 500 nodes). See http://www.srh.noaa.gov/ssd/nwpmodel/html/nhcmodel.htm
Large data:
- These models generate massive data sets (tera- to petabytes per simulation).
- My coastal ocean model generates Vel(U,V,W), Pressure, Temp, and Salinity files once each iteration at 2.5 MB/file. A 6-hour test run is 2x10^5 iterations = 3x10^6 MBytes (but we don't have room, so we only store 100 slices).
- Where do you store the data? How do you move it? Sneaker net?
- Fast networks are required.
- Power consumption is a big issue.
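
The 3x10^6 MByte figure can be reproduced with a few lines of arithmetic. This sketch assumes one output file per listed field (U, V, W, Pressure, Temp, Salinity), which is an interpretation of the slide that matches its total; the names are illustrative.

```python
FIELDS = ("U", "V", "W", "Pressure", "Temp", "Salinity")  # assumed: one file per field
MB_PER_FILE = 2.5
ITERATIONS = 200_000  # 2x10^5 iterations in the 6-hour test run


def output_volume_mb(n_fields: int, mb_per_file: float, iterations: int) -> float:
    """Total output size in MB when every field is written at every iteration."""
    return n_fields * mb_per_file * iterations


total_mb = output_volume_mb(len(FIELDS), MB_PER_FILE, ITERATIONS)
print(f"{total_mb:.1e} MB (about {total_mb / 1e6:.0f} TB, taking 1 TB = 10^6 MB)")
```

Storing only 100 slices instead of all 2x10^5 cuts the volume by a factor of 2000, which is why the test run thins its output so aggressively.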

Motivation for HPC: Typical HPC Computing Resources
- Compute
  - HPC systems & clusters
  - GPU
  - Clouds
- Archival & storage
  - High-speed servers with large storage
  - Clouds
- Network
- Visualization
- Cyberinfrastructure

Motivation for HPC: HPC Units of Measure
What is the scale of modern parallel resources and computations?

Unit     Definition
Flops    floating point operations
Flop/s   floating point operations per second

Prefix   Flop/s    Bytes   10^n    2^n
Mega     Mflop/s   Mbyte   10^6    2^20 = 1,048,576
Giga     Gflop/s   Gbyte   10^9    2^30 = 1,073,741,824
Tera     Tflop/s   Tbyte   10^12   2^40 = 1,099,511,627,776
Peta     Pflop/s   Pbyte   10^15   2^50 = 1,125,899,906,842,624
Exa      Eflop/s   Ebyte   10^18   2^60 = 1,152,921,504,606,846,976
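
The decimal and binary columns drift apart as the prefixes grow, which matters when comparing flop rates (decimal) with memory sizes (often binary). A short sketch that regenerates both columns and their ratio; the names `PREFIXES` and `prefix_rows` are illustrative.

```python
# (prefix, decimal exponent, binary exponent)
PREFIXES = [
    ("Mega", 6, 20),
    ("Giga", 9, 30),
    ("Tera", 12, 40),
    ("Peta", 15, 50),
    ("Exa", 18, 60),
]


def prefix_rows():
    """Return (name, 10^n, 2^m, ratio) for each prefix using exact integers."""
    rows = []
    for name, dec_exp, bin_exp in PREFIXES:
        decimal = 10 ** dec_exp
        binary = 2 ** bin_exp
        rows.append((name, decimal, binary, binary / decimal))
    return rows


for name, decimal, binary, ratio in prefix_rows():
    print(f"{name:5s} {decimal:>26,d} {binary:>26,d}  ratio {ratio:.3f}")
```

The ratio rises from about 1.05 at Mega to about 1.15 at Exa, so the binary and decimal readings of "Exa" differ by roughly 15%.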

Motivation for HPC: A Large HPC Machine: Kraken

Hostname           kraken.nics.teragrid.org
Manufacturer       Cray
Model              XT5
Processor Cores    112896
Nodes              9408
Memory             147.00 TB
Peak Performance   1174.00 TFlops
Disk               2400.00 TB

Source: http://www.xsede.org
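
Dividing the machine-wide figures by the node and core counts gives a feel for what a single node provides. This is a rough sketch using only the numbers listed above; it assumes decimal units (1 TB = 1000 GB, 1 TFlops = 1000 GFlops), and the variable names are illustrative.

```python
# Machine-wide figures from the Kraken specification above.
CORES = 112_896
NODES = 9_408
MEMORY_TB = 147.0
PEAK_TFLOPS = 1174.0

cores_per_node = CORES // NODES                   # whole cores on each node
memory_per_node_gb = MEMORY_TB * 1000 / NODES     # assumes 1 TB = 1000 GB
peak_per_core_gflops = PEAK_TFLOPS * 1000 / CORES

print(f"cores per node:  {cores_per_node}")
print(f"memory per node: {memory_per_node_gb:.2f} GB")
print(f"peak per core:   {peak_per_core_gflops:.2f} GFlops")
```

Seen this way, each node is not far from a well-equipped workstation; the scale of the machine comes from connecting thousands of them.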

Motivation for HPC: Cloud Computing
Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). There are many types of public cloud computing:
- Infrastructure as a service (IaaS)
- Platform as a service (PaaS)
- Software as a service (SaaS)
- Storage as a service (STaaS)
- Security as a service (SECaaS)
- Data as a service (DaaS)
- Business process as a service (BPaaS)
- Test environment as a service (TEaaS)
- Desktop as a service (DaaS)
- API as a service (APIaaS)
Source: http://en.wikipedia.org/wiki/cloud_computing

Motivation for HPC: Cyberinfrastructure (CI)
- Cyberinfrastructure: a research environment or infrastructure based upon distributed computer, information, and communication technology. (From: Atkins PITCAC Report, 2003)
- Resources: data acquisition, data storage, data management, data integration, data mining, data visualization, and computing and information processing services distributed over the Internet beyond the scope of a single institution. *
- Scientific computing: a technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people, with the goal of enabling the derivation of novel scientific theories and knowledge. *
- Cyberinfrastructure = HPC + Cloud/Grid + Networks
* Source: http://en.wikipedia.org/wiki/cyberinfrastructure

Motivation for HPC: Example of CI Project: CyberWeb
- Provide a bridge between users and high-end resources, emerging technologies, and cyberinfrastructure.
- Simplify use by relying on common/familiar Web and emerging technologies.
- Facilitate access to and utilization of science applications.
- Apps run in a heterogeneous environment.
- Plug-n-play mode.

Motivation for HPC: GPU Computing
- GPU (graphics processing unit) + CPU accelerates scientific and engineering applications.
- CPUs: a few cores.
- GPUs: thousands of smaller, more efficient cores designed for parallel performance.
- Serial code runs on the CPU while parallel portions run on the GPU.
Source: http://www.nvidia.com

HPC Performance: HPC Requirements
- Supercomputers: Moore's Law; compute cycles are exceeding the ability to move data.
- Networks are needed to move data around to resources.
- Large data sets (GB in '96, but now petabytes).
- Computational apps will always use up resources: just increase the resolution, e.g. 3x3x3 = 27 -> 5x5x5 = 125.
- New architectures are impacting HPC libraries, programming models, and languages.
- Security: desire to protect national resources.
- NSF (and other government agencies) spend money to build infrastructure.
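
The resolution example is cubic growth: the cell count of a 3D grid scales with the cube of the number of points per side. A small sketch of that relationship, with illustrative function names:

```python
def cell_count(points_per_side: int, dims: int = 3) -> int:
    """Number of cells in a regular grid with the same resolution on every axis."""
    return points_per_side ** dims


def growth_factor(old_side: int, new_side: int, dims: int = 3) -> float:
    """How many times more cells the finer grid has than the coarser one."""
    return cell_count(new_side, dims) / cell_count(old_side, dims)


print(cell_count(3), cell_count(5))      # 27 and 125, as in the slide
print(f"{growth_factor(3, 5):.2f}x more cells")
print(f"{growth_factor(100, 200):.0f}x more cells when resolution doubles")
```

Doubling the resolution in each direction multiplies the work and memory by eight, before accounting for any smaller time step the finer grid may require.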

HPC Performance: Technology Growth Laws
- Moore's Law: the processing power of a CPU/core doubles every 18 months; as a corollary, computers become faster and the price of a given level of computing power halves every 18 months (Gordon Moore, 1970s).
- Gilder's Law: the total bandwidth of communication systems triples every twelve months (George Gilder).
- Metcalfe's Law: the value of a network is proportional to the square of the number of nodes; so, as a network grows, the value of being connected to it grows quadratically, while the cost per user remains the same or even falls (Robert Metcalfe, originator of Ethernet).
- Kryder's Law: the density of information on hard drives is growing faster than Moore's Law, doubling roughly every 13 months.
Source: http://www.jimpinto.com/writings/techlaws.html, and http://www.scientificamerican.com/article/kryders-law/
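
Three of these laws have the same form: a quantity multiplies by a fixed factor every fixed period. The sketch below compares them over a common time span, taking the periods exactly as stated above; the names `LAWS`, `growth_factor`, and `metcalfe_value` are illustrative, and the laws themselves are empirical rules of thumb rather than guarantees.

```python
def growth_factor(multiplier: float, period_months: float, elapsed_months: float) -> float:
    """Compound growth: `multiplier` applied once per `period_months`."""
    return multiplier ** (elapsed_months / period_months)


# (multiplier, period in months) as stated in the list above.
LAWS = {
    "Moore (compute)": (2, 18),
    "Gilder (bandwidth)": (3, 12),
    "Kryder (disk density)": (2, 13),
}


def metcalfe_value(nodes: int) -> int:
    """Relative network value under Metcalfe's Law: proportional to nodes squared."""
    return nodes ** 2


for name, (multiplier, period) in LAWS.items():
    print(f"{name}: x{growth_factor(multiplier, period, 60):,.0f} over 5 years")
print(f"Metcalfe: 10x the nodes gives {metcalfe_value(100) // metcalfe_value(10)}x the value")
```

Taken at face value, bandwidth and storage density outpace per-core compute under these rates, although the HPC Requirements slide notes that in practice moving data remains the bottleneck.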

HPC Performance: Technology Growth Laws
[Figure]
Source: G. Stix, "The Triumph of the Light," Scientific American, January 2001


HPC Performance: HPC Performance
[Figure]
Source: http://www.top500.org

HPC Performance: HPC and CI evolution
- The explosion of computer technologies in the late '90s had a significant impact on HPC.
- Advances in commodity computers allow researchers to build clusters equivalent to "big iron".
- Compute: TOP500 numbers
  - June 2014: National Super Computer Center, Guangzhou. Cores: 3,120,000; Tflop/s: 54902.4; Power: 17808 kW
  - '06: 367 Tflop/s, IBM BlueGene, 131072 processors. Hardware: 0.7 GHz PowerPC 440
  - '04: Total 92 TFlops; IBM BlueGene (360 GF); 32768 processors. Hardware: XEON, dual CPU, 2 GHz
  - '97: Total 17 TFlops; Sun NOW (33 GFlops)
  - '95: Total 4.4 TFlops; no clusters
Source: http://www.top500.org
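
The June 2014 entry lists performance, core count, and power together, so per-core and per-watt figures follow directly. A minimal sketch using only the numbers quoted above; the variable names are illustrative, and the slide does not say whether the Tflop/s figure is a peak or a measured value.

```python
# June 2014 TOP500 entry as quoted above.
TFLOPS = 54_902.4
CORES = 3_120_000
POWER_KW = 17_808

gflops_total = TFLOPS * 1000          # 1 Tflop/s = 1000 Gflop/s
watts_total = POWER_KW * 1000         # 1 kW = 1000 W

gflops_per_core = gflops_total / CORES
gflops_per_watt = gflops_total / watts_total

print(f"{gflops_per_core:.2f} Gflop/s per core")
print(f"{gflops_per_watt:.2f} Gflop/s per watt")
```

The per-watt figure ties back to the earlier point that power consumption is a big issue: at this scale the machine draws close to 18 MW.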

HPC Performance: Top500
[Figure]
Source: http://blog.infinibandta.org/tag/petascale/


Next Time
- Next class: 01/22/15
- Topics: Unix basics/using the student cluster, etc.; Introduction to HPC (Part 2)
- Homework 1 posted. Due date: 11:59 p.m. PST, 09/07/14