Parallel Computing: Strategies and Implications. Dori Exterman CTO IncrediBuild.



Similar documents
Managing large clusters resources

Challenges for Data Driven Systems

Part I Courses Syllabus

SAP and Hortonworks Reference Architecture

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

HPC Wales Skills Academy Course Catalogue 2015

The GPU Accelerated Data Center. Marc Hamilton, August 27, 2015

Ø Teaching Evaluations. q Open March 3 through 16. Ø Final Exam. q Thursday, March 19, 4-7PM. Ø 2 flavors: q Public Cloud, available to public

Introducing PgOpenCL A New PostgreSQL Procedural Language Unlocking the Power of the GPU! By Tim Child

Bringing Big Data Modelling into the Hands of Domain Experts

Programming models for heterogeneous computing. Manuel Ujaldón Nvidia CUDA Fellow and A/Prof. Computer Architecture Department University of Malaga

Applications to Computational Financial and GPU Computing. May 16th. Dr. Daniel Egloff

Introduction to grid technologies, parallel and cloud computing. Alaa Osama Allam Saida Saad Mohamed Mohamed Ibrahim Gaber

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

Parallel Programming Survey

Stream Processing on GPUs Using Distributed Multimedia Middleware

BIG DATA What it is and how to use?

E6895 Advanced Big Data Analytics Lecture 14:! NVIDIA GPU Examples and GPU on ios devices

Parallel Computing for Data Science

Toronto 26 th SAP BI. Leap Forward with SAP

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

Contents. Preface Acknowledgements. Chapter 1 Introduction 1.1

Introduction to GPU Programming Languages

Intro to GPU computing. Spring 2015 Mark Silberstein, , Technion 1

Amazon EC2 Product Details Page 1 of 5

The Top Six Advantages of CUDA-Ready Clusters. Ian Lumb Bright Evangelist

Using In-Memory Computing to Simplify Big Data Analytics

NoSQL and Hadoop Technologies On Oracle Cloud

2015 The MathWorks, Inc. 1

GPU for Scientific Computing. -Ali Saleh

Next Generation GPU Architecture Code-named Fermi

Hadoop2, Spark Big Data, real time, machine learning & use cases. Cédric Carbone Twitter

Chapter 19 Cloud Computing for Multimedia Services

Shark Installation Guide Week 3 Report. Ankush Arora

Introduction to Cloud Computing

Accelerating Hadoop MapReduce Using an In-Memory Data Grid

Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011

Supercomputing and Big Data: Where are the Real Boundaries and Opportunities for Synergy?

Using an In-Memory Data Grid for Near Real-Time Data Analysis

Enabling the Use of Data

Enhancing Cloud-based Servers by GPU/CPU Virtualization Management

Cluster, Grid, Cloud Concepts

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015

Characterizing Task Usage Shapes in Google s Compute Clusters

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

Datacenter Operating Systems

Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu

A SURVEY ON MAPREDUCE IN CLOUD COMPUTING

Application of Predictive Analytics for Better Alignment of Business and IT

STORAGE HIGH SPEED INTERCONNECTS HIGH PERFORMANCE COMPUTING VISUALISATION GPU COMPUTING

Cloud Computing: Computing as a Service. Prof. Daivashala Deshmukh Maharashtra Institute of Technology, Aurangabad

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

How To Handle Big Data With A Data Scientist

Lecture 10 - Functional programming: Hadoop and MapReduce

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

A survey on platforms for big data analytics

Scalability in the Cloud HPC Convergence with Big Data in Design, Engineering, Manufacturing

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015

Chapter 2 Parallel Architecture, Software And Performance

10- High Performance Compu5ng

NVIDIA CUDA Software and GPU Parallel Computing Architecture. David B. Kirk, Chief Scientist

Building Scalable Applications Using Microsoft Technologies

Overview of HPC Resources at Vanderbilt

GigaSpaces Real-Time Analytics for Big Data

HPC with Multicore and GPUs

Elastic Application Platform for Market Data Real-Time Analytics. for E-Commerce

Tackling Big Data with MATLAB Adam Filion Application Engineer MathWorks, Inc.

PART IV Performance oriented design, Performance testing, Performance tuning & Performance solutions. Outline. Performance oriented design

Grid Computing for Artificial Intelligence

<Insert Picture Here> Oracle and/or Hadoop And what you need to know

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

Big Data Database Revenue and Market Forecast,

Software: Systems and. Application Software. Software and Hardware. Types of Software. Software can represent 75% or more of the total cost of an IS.

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Benchmark Study on Distributed XML Filtering Using Hadoop Distribution Environment. Sanjay Kulhari, Jian Wen UC Riverside

Zero Downtime In Multi tenant Software as a Service Systems

IMCM: A Flexible Fine-Grained Adaptive Framework for Parallel Mobile Hybrid Cloud Applications

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Graphical Processing Units to Accelerate Orthorectification, Atmospheric Correction and Transformations for Big Data

CSE-E5430 Scalable Cloud Computing Lecture 11

Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia

What s New in SharePoint 2016 (On- Premise) for IT Pros

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

Parallel Firewalls on General-Purpose Graphics Processing Units

Parallel Algorithm Engineering

Amazon Web Services 100 Success Secrets

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Next Generation Operating Systems

Hadoop and Map-Reduce. Swati Gore

Transcription:

Parallel Computing: Strategies and Implications Dori Exterman CTO IncrediBuild.

In this session we will discuss Multi-threaded vs. Multi-Process Choosing between Multi-Core or Multi- Threaded development Considerations in making your choices Scalability Summary and Q&A

Moore s Law Going Multicore Increase in clock speed, allowing software to automatically run faster. No longer obliged to Moore s law Instead of doing everything in a linear fashion, programmers need to ensure applications are designed for parallel tasking.

Multi-threaded Development: Pros Easy and fast to communicate and share data between threads. Easy to communicate with parent process. Beneficial for large datasets that can t be divided to sub sets. Supported by many libraries.

Multi-threaded Development: Cons One thread crashes the whole process crashes Multi-threaded apps are hard to debug Shared data requires a set of (difficult to maintain) locks Too many threads too much time in contextswitching. Not scalable to other machines / public cloud

Multi-process Development: Pros Includes multi-threaded by definition. Process crash won t harm other processes. Easier to debug Less lock issues depending on the infrastructure/api used. Scalable distributed computing \ public cloud

Multi-process Development: Cons Custom development required in order to communicate between processes Requires special mechanism to share memory data between processes Smaller set of supporting libraries

Multi-threaded vs. Multi-core: List of Considerations Synch/no-synch Large data sets Ease of development Scaling - Compute time and intensity

Synchronization/State Maintenance Easier in multi-threaded Easier debug in multi-threaded In multi-process, will require custom development Many mechanisms exists to allow this -- both for multi-process and multi-threaded

Ease of Development Multi-Threaded vs. Multi-Process Pre-made libraries for multi-threaded MP - Easier debug/maintenance/code understanding / less bugs will be introduced with new commits MP - Can be developed and maintained by less experienced developers MP - Requires developer to maintain encapsulation Bugs are less complicated to solve when MT is not involved

Throughput & scalability - 1/2 How many tasks do I have to execute in parallel dozens /millions How many tasks will my product need to support in the future? How much time will each task take? Milliseconds or minutes? How much memory will each task require? How large is the data that each task requires?

Throughput & scalability - 2/2 How complex is each task s code? Is my system real-time Would you benefit from scalability? Business wise - scalability as a service? Offloading

Other Questions Do my tasks require special hardware? Will my tasks perform better with specific hardware? Connection to a database

Let s have a look at a real-life scenario Maven vs Make (What are these?) Common Many tasks Tasks are atomic Minimal communication Small dataset Scalability

Scenarios Build system approach: Maven Multi-threaded code was only added recently:

Scenarios build system approach: Maven Not Scalable: Users are asking for it, but it is still missing multi-process invocation.

Single Core

Single Machine Multi-Core

Make Multi-process (same as other modern build tools) Better memory usage Scalable 3 rd party solutions (DistCC, IceCC, IncrediBuild)

Multiple machines, multi-core

Real Life Scenarios

Common high-throughput scenarios and decision Multi-threaded: Large Mesh Memory (CAD, AI (connections between objects), weather forecasting, genetic algorithms) Needs all the data to be in-memory and accessible for all tasks.

Common high-throughput scenarios and decision Multi-threaded: Real time financial transactions, defense systems Latency and initialization time are very important.

Common high-throughput scenarios and decision Multi-Threading: Applications which most of the calculation (business logic) is DB related (stored procedures) CRM, ERP, SAP

Common high-throughput scenarios and decisions Multi-threaded code: Simple scenarios Real-time applications Long initialization time with short computation time Communication, synchs and states Only a few highly intensive tasks.

Common high-throughput scenarios and decision Multi-Processing: Large independent dataset: when we don t need all data in-memory at the same time. Low dependencies between data entities. Simulations (transient analysis) Ansys, PSCad Financial derivatives Morgan Stanley Rendering

Common high-throughput scenarios and decision Multi-Processing: When we have Many calculations with small business units. Solution: batch tasks Compilations Testing (unit tests, API tests, stress tests) Code analysis Asset builds Math calculations (Banking, Insurance, Diamond industry).

Common high-throughput scenarios and decision Multi-Processing: GPU applications that scale to use many GPUs (CUDA, OpenCL)

Common high-throughput scenarios and decision Multi-Processing: When applications stream output of computation: NVIDIA Shield, Onlive Service, Netflix

Common high-throughput scenarios and decision Multi-Processing: Scalability Offloading Full cloud solution SAAS

Summary

Summary - Writing Scalable Apps: Architectural considerations : Can my execution be broken to many independent tasks? Can the work be divided over more than a single host? Do I have large datasets I need to work with? How many relations are in my dataset? How much time would it take my customers to execute a large scenario do I have large scenarios?

Summary - Writing Scalable app: Technical considerations: Do I have any special communication and synchronization requirements? Dev Complexity - What risks can we have in terms of race-conditions, timing issues, sharing violations does it justify multi-threading programming? Would I like to scale processing performance by using the cloud or cluster?

Moving to multi-core Some of the known infrastructures to using multi-core/multi-machine solutions: MapReduce (Google dedicated grid) Hadoop (Open source dedicated grid) Univa Grid Engine(Propriety dedicated grid) IncrediBuild (Propriety - ad-hoc \ process virtualization grid)

Dori Exterman, IncrediBuild CTO dori.exterman@incredibuild.com Download IncrediBuild today free on 4 core local machine http://www.incredibuild.com