Parallel Computing with MATLAB



Similar documents
Speed up numerical analysis with MATLAB

Speeding up MATLAB and Simulink Applications

High Performance Computing in CST STUDIO SUITE

Bringing Big Data Modelling into the Hands of Domain Experts

Calcul Parallèle sous MATLAB

Tackling Big Data with MATLAB Adam Filion Application Engineer MathWorks, Inc.

Accelerating CFD using OpenFOAM with GPUs

ST810 Advanced Computing

High Performance. CAEA elearning Series. Jonathan G. Dudley, Ph.D. 06/09/ CAE Associates

Introduction to GPU Programming Languages

Data Analysis with MATLAB The MathWorks, Inc. 1

2015 The MathWorks, Inc. 1

Parallel Computing using MATLAB Distributed Compute Server ZORRO HPC

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015

GPUs for Scientific Computing

Purchase of High Performance Computing (HPC) Central Compute Resources by Northwestern Researchers

Autodesk Revit 2016 Product Line System Requirements and Recommendations

Origins, Evolution, and Future Directions of MATLAB Loren Shure

A GPU COMPUTING PLATFORM (SAGA) AND A CFD CODE ON GPU FOR AEROSPACE APPLICATIONS

CUDA programming on NVIDIA GPUs

Image and Video Processing with MATLAB

What s New in MATLAB and Simulink

Introduction to GPU hardware and to CUDA

Cornell University Center for Advanced Computing

CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER

:Introducing Star-P. The Open Platform for Parallel Application Development. Yoel Jacobsen E&M Computing LTD

HPC Cluster Decisions and ANSYS Configuration Best Practices. Diana Collier Lead Systems Support Specialist Houston UGM May 2014

ultra fast SOM using CUDA

Introduction to Matlab Distributed Computing Server (MDCS) Dan Mazur and Pier-Luc St-Onge December 1st, 2015

Performance Evaluation of NAS Parallel Benchmarks on Intel Xeon Phi

HETEROGENEOUS HPC, ARCHITECTURE OPTIMIZATION, AND NVLINK

Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff

Cloud Computing through Virtualization and HPC technologies

Cluster Scalability of ANSYS FLUENT 12 for a Large Aerodynamics Case on the Darwin Supercomputer

Introduction to MATLAB Gergely Somlay Application Engineer

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

Revit products will use multiple cores for many tasks, using up to 16 cores for nearphotorealistic

Overview of HPC Resources at Vanderbilt

Understanding the Benefits of IBM SPSS Statistics Server

Cornell University Center for Advanced Computing

The High Performance Internet of Things: using GVirtuS for gluing cloud computing and ubiquitous connected devices

HPC with Multicore and GPUs

Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011

ACCELERATING COMMERCIAL LINEAR DYNAMIC AND NONLINEAR IMPLICIT FEA SOFTWARE THROUGH HIGH- PERFORMANCE COMPUTING

Development of Monitoring and Analysis Tools for the Huawei Cloud Storage

HPC Wales Skills Academy Course Catalogue 2015

Accelerating Simulation & Analysis with Hybrid GPU Parallelization and Cloud Computing

Recent Advances in HPC for Structural Mechanics Simulations

Solving Big Data Problems in Computer Vision with MATLAB Loren Shure

How To Build A Trading Engine In A Microsoft Microsoft Matlab (A Trading Engine)

High-Performance Computing

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Multicore Parallel Computing with OpenMP

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

vnas Series All-in-one NAS with virtualization platform

22S:295 Seminar in Applied Statistics High Performance Computing in Statistics

OPC COMMUNICATION IN REAL TIME

L20: GPU Architecture and Models

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

RWTH GPU Cluster. Sandra Wienke November Rechen- und Kommunikationszentrum (RZ) Fotos: Christian Iwainsky

COMP/CS 605: Intro to Parallel Computing Lecture 01: Parallel Computing Overview (Part 1)

Final Project Report. Trading Platform Server

10- High Performance Compu5ng

1 Bull, 2011 Bull Extreme Computing

Using the Windows Cluster

Parallel Programming Survey

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Integrating MATLAB into your C/C++ Product Development Workflow Andy Thé Product Marketing Image Processing Applications

Numerix CrossAsset XL and Windows HPC Server 2008 R2

Scientific Computing Programming with Parallel Objects

MATLAB Distributed Computing Server Cloud Center User s Guide

OpenPOWER Outlook AXEL KOEHLER SR. SOLUTION ARCHITECT HPC

wu.cloud: Insights Gained from Operating a Private Cloud System

PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0

GPGPU accelerated Computational Fluid Dynamics

Comparing the Network Performance of Windows File Sharing Environments

IBM Platform Computing Cloud Service Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud

Programming models for heterogeneous computing. Manuel Ujaldón Nvidia CUDA Fellow and A/Prof. Computer Architecture Department University of Malaga

MATLAB in Business Critical Applications Arvind Hosagrahara Principal Technical Consultant

Unleashing the Performance Potential of GPUs for Atmospheric Dynamic Solvers

Shared Parallel File System

Parallel Simplification of Large Meshes on PC Clusters

Converting Models from Floating Point to Fixed Point for Production Code Generation

Echtzeittesten mit MathWorks leicht gemacht Simulink Real-Time Tobias Kuschmider Applikationsingenieur

Logically a Linux cluster looks something like the following: Compute Nodes. user Head node. network

Pedraforca: ARM + GPU prototype

Large-Data Software Defined Visualization on CPUs

Evoluzione dell Infrastruttura di Calcolo e Data Analytics per la ricerca

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Integrated Grid Solutions. and Greenplum

STORAGE HIGH SPEED INTERCONNECTS HIGH PERFORMANCE COMPUTING VISUALISATION GPU COMPUTING

Enterprise HPC & Cloud Computing for Engineering Simulation. Barbara Hutchings Director, Strategic Partnerships ANSYS, Inc.

Dell High-Performance Computing Clusters and Reservoir Simulation Research at UT Austin.

Transcription:

Parallel Computing with MATLAB Scott Benway Senior Account Manager Jiro Doke, Ph.D. Senior Application Engineer 2013 The MathWorks, Inc. 1

Acceleration Strategies Applied in MATLAB Approach Options Best coding practices Preallocation, vectorization, profiling ( Speeding Up MATLAB Applications ) More hardware More Processors, Cores, or GPUs (MATLAB Parallel Computing Tools) Integration with other languages C/C++, Fortran (MEX, MATLAB Coder) 3

Agenda Introduction to parallel computing tools Using multicore/multi-processor computers Using graphics processing units (GPUs) Scaling up to a cluster 4

Using More Hardware Built-in multithreading Automatically enabled in MATLAB since R2008a Multiple threads in a single MATLAB computation engine www.mathworks.com/discovery/multicore-matlab.html Parallel Computing using explicit techniques Multiple computation engines controlled by a single session Perform MATLAB Computations on GPUs High-level constructs to let you parallelize MATLAB applications 5

Going Beyond Serial MATLAB Applications MATLAB Desktop (Client) Worker Worker Worker Worker Worker Worker 6

Parallel Computing Toolbox for the Desktop Desktop Computer Local Speed up parallel applications Take advantage of GPUs Prototype code for your cluster MATLAB Desktop (Client) 7

Scale Up to Clusters and Clouds Desktop Computer Computer Cluster Local Cluster MATLAB Desktop (Client) Scheduler 8

Agenda Introduction to parallel computing tools Using multicore/multi-processor computers Using graphics processing units (GPUs) Scaling up to a cluster 9

Ease of Use Programming Parallel Applications (CPU) Built-in support with Toolboxes Greater Control 10

Example: Optimizing Cell Tower Position Built-in parallel support With Parallel Computing Toolbox use built-in parallel algorithms in Optimization Toolbox Run optimization in parallel Use pool of MATLAB workers 11

Tools Providing Parallel Computing Support Optimization Toolbox, Global Optimization Toolbox Statistics Toolbox Signal Processing Toolbox Neural Network Toolbox Image Processing Toolbox BLOCKSETS Directly leverage functions in Parallel Computing Toolbox www.mathworks.com/builtin-parallel-support 12

Ease of Use Programming Parallel Applications (CPU) Built-in support with Toolboxes Simple programming constructs: parfor, batch, distributed Greater Control 13

Independent Tasks or Iterations Ideal problem for parallel computing No dependencies or communications between tasks Examples: parameter sweeps, Monte Carlo simulations Time Time blogs.mathworks.com/loren/2009/10/02/using-parfor-loops-getting-up-and-running/ 14

Example: Parameter Sweep of ODEs Parallel for-loops Parameter sweep of ODE system Damped spring oscillator Sweep through different values of damping and stiffness Record peak value for each simulation 5 m x b x k x 0 1,2,... 1,2,... Convert for to parfor Use pool of MATLAB workers 15

Ease of Use Programming Parallel Applications (CPU) Built-in support with Toolboxes Simple programming constructs: parfor, batch, distributed Advanced programming constructs: createjob, labsend, spmd Greater Control 17

Agenda Introduction to parallel computing tools Using multicore/multi-processor computers Using graphics processing units (GPUs) Scaling up to a cluster 18

What is a Graphics Processing Unit (GPU) Originally for graphics acceleration, now also used for scientific calculations Massively parallel array of integer and floating point processors Typically hundreds of processors per card GPU cores complement CPU cores Dedicated high-speed memory * Parallel Computing Toolbox requires NVIDIA GPUs with Compute Capability 1.3 or higher, including NVIDIA Tesla 20-series products. See a complete listing at www.nvidia.com/object/cuda_gpus.html 19

Performance Gain with More Hardware Using More Cores (CPUs) Using GPUs Core 1 Core 2 Core 3 Core 4 GPU cores Cache Device Memory 20

Ease of Use Programming Parallel Applications (GPU) Built-in support with Toolboxes Simple programming constructs: gpuarray, gather Advanced programming constructs: arrayfun, bsxfun, spmd Greater Control Interface for experts: CUDAKernel, MEX support www.mathworks.com/gpu 21

Example: Solving 2D Wave Equation GPU Computing 2 u t 2 = 2 u x 2 + 2 u y 2 Intel Xeon Processor W3690 (3.47GHz), NVIDIA Tesla K20 GPU 22

Agenda Introduction to parallel computing tools Using multicore/multi-processor computers Using graphics processing units (GPUs) Scaling up to a cluster 23

Example: Migrate from Desktop to Cloud Change hardware without changing algorithmic code 5 m x b x k x 0 1,2,... 1,2,... 24

Use MATLAB Distributed Computing Server Desktop Computer 1. Prototype code Local MATLAB Desktop (Client) MATLAB code Profile (Local) 25

Use MATLAB Distributed Computing Server Computer Cluster 1. Prototype code Cluster 2. Get access to an enabled cluster Scheduler Profile (Cluster) 26

Use MATLAB Distributed Computing Server Desktop Computer Computer Cluster 1. Prototype code Local Cluster 2. Get access to an enabled cluster MATLAB Desktop (Client) MATLAB code Scheduler MATLAB code 3. Switch cluster profile to run on cluster resources Profile (Local) Profile (Cluster) 27

Take Advantage of Cluster Hardware Offload computation: Free up desktop Access better computers Computer Cluster Cluster Scale speed-up: Use more cores Go from hours to minutes Scale memory: Utilize distributed arrays MATLAB Desktop (Client) Solve larger problems without re-coding algorithms Scheduler 28

Offloading Computations Send desktop code to cluster resources No parallelism required within code Submit directly from MATLAB Computer Cluster Cluster Leverage supplied infrastructure File transfer / path augmentation Job monitoring Simplified retrieval of results Scale offloaded computations MATLAB code Scheduler 29

Offload Computations with batch MATLAB Desktop (Client) Work Result Worker Worker Worker Worker batch( ) 30

Offload and Scale Computations with batch MATLAB Desktop (Client) Work Result Worker Worker Worker Worker batch(,'pool', ) 31

Example: Parameter Sweep of ODEs Offload and Scale Processing Offload processing to workers: batch 5 m x b x k x 0 1,2,... 1,2,... Scale offloaded processing: batch(,'pool', ) Retrieve results from job: fetchoutputs 32

Benchmark: Parameter Sweep of ODEs Scaling case study for a fixed problem size with a cluster Workers Computation (minutes) Speed-up 1 173 1 16 13 13 32 6.4 27 64 3.2 55 96 2.1 83 128 1.6 109 160 1.3 134 192 1.1 158 Processor: Intel Xeon E5-2670 16 cores per node 33

Distributing Large Data 11 26 41 12 27 42 13 28 43 MATLAB Desktop (Client) Worker Worker Worker 14 29 44 15 30 45 16 31 46 17 32 47 17 33 48 19 34 49 Worker 20 35 50 21 36 51 22 37 52 Remotely Manipulate Array from Client Distributed Array Lives on the Workers 34

Investigation: Distributed Calculations Effect of number of computers on execution time Time (s) N 1 node, multithreaded 2 nodes, 32W Distributed 4 nodes, 64W 4000 2 3 3 8000 16 14 12 16000 126 102 67 20000 244 187 118 32000-664 394 40000 - - 710 Processor: Intel Xeon E5-2670 16 cores, 60 GB RAM per compute node 10 Gigabit Ethernet 35

MATLAB Distributed Computing Server Extension of desktop parallel computing Pre-built framework and infrastructure Computer Cluster Cluster Simplified license and maintenance Scheduler 36

Dynamic Licensing Model Users have access to their licensed products Server does not check out any licenses on the client User can exit MATLAB once the job is queued MATLAB User B MATLAB User A Computer Cluster Cluster Scheduler 37

Ease of Use Job Schedulers MathWorks Job Scheduler Direct support for specific schedulers (Platform LSF, Microsoft HPCS, PBS) Greater Control Open API to support other schedulers www.mathworks.com/products/distriben/supported 38

Summary Easily develop parallel MATLAB applications without being a parallel programming expert Speed up the execution of your MATLAB applications using additional hardware Develop parallel applications on your desktop and easily scale to a cluster when needed 39

For more information Visit http://www.mathworks.com/products/parallel-computing 2013 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders. 40