COMP/CS 605: Intro to Parallel Computing Lecture 01: Parallel Computing Overview (Part 1)

COMP/CS 605: Intro to Parallel Computing Lecture 01: Parallel Computing Overview (Part 1) Mary Thomas Department of Computer Science Computational Science Research Center (CSRC) San Diego State University (SDSU) Posted: 01/22/15 Updated: 01/22/15

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 2/22 Mary Thomas Table of Contents 1 Misc Information 2 Motivation for HPC HPC Performance 3 Next Time

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 3/22 Mary Thomas Misc Information Today Course overview, policies etc. Introduction to HPC (Part 1) Add Codes: The course is full, but it is normal for students to drop/modify schedules: I will start an add list

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 4/22 Mary Thomas Misc Information Course Info Course Website: http://www-rohan.sdsu.edu/faculty/mthomas/courses/ sp15/comp605/ Add Codes: The course is full, but it is normal for students to drop/modify schedules. Be sure to sign up on the add request sheet.

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 5/22 Mary Thomas Motivation for HPC My serial code is working, why should I parallelize it? Compute bound Large number of iterations Long time to converge Data bound Memory footprint is larger than RAM (common for science apps) Output may take FOREVER to save/dump/transfer contention for resources/other processes What other processes are running on the machine? What other data is running on the network? Timeout problems?

78"/$9'%#0':;<&";"#6%6+/#' COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 6/22 ' Motivation for HPC!!" #$%&'()*+,-./,0$&1,2*-')3, Mary Thomas 78"'0"6%+&'/A'8/B'"%-8'0+AA"$"#6';/0)&"'+C'"5"-)6"0'+#'!(,.'+C'C8/B#'+ :6'8%C'6/'I"'#/6"0'68%6'68"'<$/-"0)$"'C8/B#'+#'D+E)$"'FG'+C'I%C"0'/#'68 General Curvilinear Environmental Model C6"<';"68/0H' ' Simulate processes along the coast Serial version, Simple Seamount, appx 8k lines 1 km /cell, 100x50x50 (small job) large memory: 100 arrays O (107 ) 200k iters = 6 hours simulation time Twall = 70 hours = O(1014 ) ops Typical simulations need to run for days/weeks/months/years Jaguar (ORNL), 2x105 cores (=3GHz, 16GB), 1.75x1012 flops Could run this job in 0.3x10-2 seconds (note: est. does not include time to fire up 200k cores) Motivation for parallel computing!!"#$%&'()*'!+,-'./0%1',2'3.45673'809&:',;'!%0<1",;0+'=1&>?' ' Source: http://www.xsede.org '

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 7/22 Mary Thomas Motivation for HPC Porting Code to Parallel has Many Challenges I have 200k cores, now what do I do with them? How to distribute the data? Is it memory/io bound)? How to distribute the computation? Is it compute bound? How do you synchronize the results?

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 8/22 Mary Thomas Motivation for HPC What is Parallel Computing and Why do We Need it? Large problems: Biz: BM has 433,000 employees; how do you update paycheck/empoyee data? Science: Hurricane Predictions, GoM is appx 800x800 km = 6.4x10 5 points/array. Typically have 10 2 arrays. Forecast models use multiple models, multiple clusters, take 12 hours (wall clock) to forecast out 3-4 days on parallel computers (probably 500 nodes). See http://www.srh.noaa.gov/ssd/nwpmodel/html/nhcmodel.htm, f Large data: These models generate massive data sets (Tera-to-peta-bytes per simulation). My coastal ocean model generates Vel(U,V,W), Pressure,Temp,Salinity files once each iteration at 2.5MB/file. 6 hour test run is 2x10 5 iterations = 3x10 6 MBytes (but we don t have room so we only store 100 slices). Where do you store the data? How do you move it? Sneaker Net? Fast networks are required Power consumption is big issue

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 9/22 Mary Thomas Motivation for HPC Typical HPC Computing Resources Compute HPC systems & Clusters GPU Clouds Archival & Storage high speed servers with large storage Clouds Network Visualization Cyberinfrastructure

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 10/22 Mary Thomas Motivation for HPC HPC Units of Measure What is the scale of modern parallel resources & computations Unit Flops Flop/s Definition floating point operations floating point operations per second Prefix #Flops/sec 2 Bytes 10 n 2 n Mega Mflop/s Mbyte 10 6 2 20 = 1,048,576 Giga Gflop/s Gbyte 10 9 2 30 = 1,073,741,824 Tera Tflop/s Tbyte 10 12 2 40 = 10,995,211,627,776 Peta Pflop/s Pbyte 10 15 2 50 = 1,125,899,906,842,624 Exa Eflop/s Ebyte 10 18 2 60 = 1.15292152 18

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 11/22 Mary Thomas Motivation for HPC A Large HPC Machine: Kraken Hostname kraken.nics.teragrid.org Manufacturer Cray Model XT5 Processor Cores 112896 Nodes 9408 Memory 147.00 TB Peak Performance 1174.00 TFlops Disk 2400.00 TB Source: http://www.xsede.org COMP605-Sp15/Images

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 12/22 Mary Thomas Motivation for HPC Cloud Computing Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). There are many types of public cloud computing: Infrastructure as a service (IaaS), Platform as a service (PaaS), Software as a service (SaaS) Storage as a service (STaaS) Security as a service (SECaaS) Data as a service (DaaS) Business process as a service (BPaaS) Test environment as a service (TEaaS) Desktop as a service (DaaS) API as a service (APIaaS) Cyberinfrastructure Sp15/Images COMP605 Source: http://en.wikipedia.org/wiki/cloud_computing

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 13/22 Mary Thomas Motivation for HPC Cyberinfrastructure (CI) Cyberinfrastructure - Research environment or infrastructure based upon distributed computer, information and communication technology. (From: Atkins PITCAC Report, 2003) Resources: data acquisition, data storage, data management, data integration, data mining, data visualization, computing and information processing services distributed over the Internet beyond the scope of a single institution. * Scientific Comptuing: a technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge. * Cyberinfrastructure = HPC + Cloud/Grid + Networks * Source: http://en.wikipedia.org/wiki/cyberinfrastructure

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 14/22 Mary Thomas Motivation for HPC Example of CI Project: CyberWeb Provide a bridge between users and high-end resources, emerging technologies and cyberinfrastructure. Simplify use by using common/familiar Web and emerging technologies Facilitate access to and utilization of Science Applications. Apps run in heterogeneous environment Plug-n-play mode

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 15/22 Mary Thomas Motivation for HPC GPU Computing GPU (graphics processing unit) + CPU accelerates scientific and engineering applications. CPUs - a few cores GPUs - thousands of smaller, more efficient cores designed for parallel performance. Serial code run on the CPU while parallel portions run on the GPU Source: http://www.nvidia.com

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 16/22 Mary Thomas HPC Performance HPC Requirements Supercomputers: Moores Law - compute cycles exceeding ability to move data Networks needed to move data around to resources Large data sets (GB in 96, but now peta-bytes) Computational apps will always use up resources: Just increase resolution: e.g. 3x3x3 = 27 > 5x5x5 = 125 New architectures impacting HPC libraries, programming models, languages Security: desire to protect national resources NSF (and other gov agencies) spend money to build infrastructure COMP605-Sp15/Images

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 17/22 Mary Thomas HPC Performance Technology Growth Laws Moore s Law: Processing power of a cpu/core doubles every 18 months; corollary, computers become faster and the price of a given level of computing power halves every 18 months (Gordon Moore,70 s). Gilder s Law: Total bandwidth of communication systems triples every twelve months (George Gilder). Metcalfe s Law: The value of a network is proportional to the square of the number of nodes; so, as a network grows, the value of being connected to it grows exponentially, while the cost per user remains the same or even reduces (Robert Metcalfe, originator of Ethernet) Kryder s Law: The density of information on hard drives growing faster than Moore s law, and is doubling roughly every 13 months. Source: http://www.jimpinto.com/writings/techlaws.html, and http://www.scientificamerican.com/article/kryders-law/

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 18/22 Mary Thomas HPC Performance Technology Growth Laws Source: G. Stix, Triumph of Light. Scientific American. January 2001

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 19/22 Mary Thomas HPC Performance COMP605-Sp15/Images

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 19/22 Mary Thomas HPC Performance HPC Performance Source: http://www.top500.org COMP605-Sp15/Images

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 20/22 Mary Thomas HPC Performance HPC and CI evolution Explosion of computer technologies in the late 90s had a significant impact on HPC Advances in commodity computers allows researchers to build clusters equivalent to big iron Compute TOP 500 numbers: June 2014: National Super Computer Center, Guangzhou Cores: 3,120,000 Tflops/s: 54902.4 Power: 17808 kw 06-367 Tflops/s IBM BlueGene 131072 processors Hardware: 0.7 GHz PowerPC 440 04 - Total 92 TFlops IBM BlueGene ( 360 GF); 32768 processors Hardware: XEON, dual CPU, 2GHz 97 - Total 17 TFlops, Sun NOW (33 GFlops 95 - Total 4.4 TFlops, no clusters Source: http://www.top500.org

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 21/22 Mary Thomas HPC Performance Top500 Figure: Source: http://blog.infinibandta.org/tag/petascale/

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 22/22 Mary Thomas Next Time

COMP/CS 605: Lecture 01 Posted: 01/22/15 Updated: 01/22/15 22/22 Mary Thomas Next Time COMP605-Sp15/Images Next Time Next class: 01/22/15 Topics: Unix basics/using student cluster, etc., Introduction to HPC (Part 2) Homework 1 posted: Due date: 11:59 p.m PST, 09/07/14