Parallelism and Cloud Computing



Similar documents
Cloud Computing. What is Cloud Computing? Computing As A Utility. Resource Provisioning Problem 4/29/2014. Kai Shen

Communicating with devices

Hardware-Aware Analysis and. Presentation Date: Sep 15 th 2009 Chrissie C. Cui

Quiz for Chapter 1 Computer Abstractions and Technology 3.10

Parallel Algorithm Engineering

Clouds vs Grids KHALID ELGAZZAR GOODWIN 531

Control 2004, University of Bath, UK, September 2004

Chapter 12: Multiprocessor Architectures. Lesson 01: Performance characteristics of Multiprocessor Architectures and Speedup

Achieving Nanosecond Latency Between Applications with IPC Shared Memory Messaging

RevoScaleR Speed and Scalability

Introduction to Cloud Computing

Lecture 3: Modern GPUs A Hardware Perspective Mohamed Zahran (aka Z) mzahran@cs.nyu.edu

Making Multicore Work and Measuring its Benefits. Markus Levy, president EEMBC and Multicore Association

Icepak High-Performance Computing at Rockwell Automation: Benefits and Benchmarks

Choosing a Computer for Running SLX, P3D, and P5

Introduction to DISC and Hadoop

what operations can it perform? how does it perform them? on what kind of data? where are instructions and data stored?

Lecture 2: Computer Hardware and Ports.

Lecture 4. Parallel Programming II. Homework & Reading. Page 1. Projects handout On Friday Form teams, groups of two

Hadoop Parallel Data Processing

An Introduction to Parallel Computing/ Programming

9/26/2011. What is Virtualization? What are the different types of virtualization.

ultra fast SOM using CUDA

Multi-core architectures. Jernej Barbic , Spring 2007 May 3, 2007

FPGA-based Multithreading for In-Memory Hash Joins

Unit A451: Computer systems and programming. Section 2: Computing Hardware 1/5: Central Processing Unit

Computing at the HL-LHC

Machine Learning over Big Data

Software Engagement with Sleeping CPUs

Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance

SUBJECT: SOLIDWORKS HARDWARE RECOMMENDATIONS UPDATE

Evaluating HDFS I/O Performance on Virtualized Systems

Computer Organization

Why Computers Are Getting Slower (and what we can do about it) Rik van Riel Sr. Software Engineer, Red Hat

Software and the Concurrency Revolution

Amazing renderings of 3D data... in minutes.

10- High Performance Compu5ng

MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research

Operating Systems: Basic Concepts and History

Virtuoso and Database Scalability

Lecture 3: Evaluating Computer Architectures. Software & Hardware: The Virtuous Cycle?

How To Build A Cloud Computer

Lecture 11: Multi-Core and GPU. Multithreading. Integration of multiple processor cores on a single chip.

Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms

APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

Virtualization and Cloud Computing. Sorav Bansal

GPU System Architecture. Alan Gray EPCC The University of Edinburgh

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server)

Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011

Tableau Server 7.0 scalability

Key Attributes for Analytics in an IBM i environment

High Performance Computing in CST STUDIO SUITE

Lattice QCD Performance. on Multi core Linux Servers

Four Keys to Successful Multicore Optimization for Machine Vision. White Paper

Understanding the Performance of an X User Environment

- Nishad Nerurkar. - Aniket Mhatre

Chapter 1 Computer System Overview

22S:295 Seminar in Applied Statistics High Performance Computing in Statistics

Evoluzione dell Infrastruttura di Calcolo e Data Analytics per la ricerca

From Big Data, Data Mining, and Machine Learning. Full book available for purchase here.

Computer Performance. Topic 3. Contents. Prerequisite knowledge Before studying this topic you should be able to:

High Performance Computing. Course Notes HPC Fundamentals

Computer Architecture TDTS10

CS 159 Two Lecture Introduction. Parallel Processing: A Hardware Solution & A Software Challenge

DATA ANALYSIS II. Matrix Algorithms

ADVANCED COMPUTER ARCHITECTURE

A Simultaneous Solution for General Linear Equations on a Ring or Hierarchical Cluster

CS2101a Foundations of Programming for High Performance Computing

Cloud Computing Capacity Planning. Maximizing Cloud Value. Authors: Jose Vargas, Clint Sherwood. Organization: IBM Cloud Labs

Emerging IT and Energy Star PC Specification Version 4.0: Opportunities and Risks. ITI/EPA Energy Star Workshop June 21, 2005 Donna Sadowy, AMD

Introduction to Microprocessors

IMCM: A Flexible Fine-Grained Adaptive Framework for Parallel Mobile Hybrid Cloud Applications

Big Data Systems CS 5965/6965 FALL 2015

Parallel Scalable Algorithms- Performance Parameters

1. INTRODUCTION Graphics 2

System Requirements Table of contents

An Implementation Of Multiprocessor Linux

Unit 4: Performance & Benchmarking. Performance Metrics. This Unit. CIS 501: Computer Architecture. Performance: Latency vs.

CHAPTER 1 INTRODUCTION

Towards Energy Efficient Query Processing in Database Management System

Design and Optimization of OpenFOAM-based CFD Applications for Hybrid and Heterogeneous HPC Platforms

Spring 2011 Prof. Hyesoon Kim

The Mainframe Virtualization Advantage: How to Save Over Million Dollars Using an IBM System z as a Linux Cloud Server

Binary search tree with SIMD bandwidth optimization using SSE

Bindel, Spring 2010 Applications of Parallel Computers (CS 5220) Week 1: Wednesday, Jan 27

Generations of the computer. processors.

SQL Server Upgrading to. and Beyond ABSTRACT: By Andy McDermid

Embedded Systems: map to FPGA, GPU, CPU?

Parallel & Distributed Optimization. Based on Mark Schmidt s slides

Understanding the Benefits of IBM SPSS Statistics Server

Multi-Threading Performance on Commodity Multi-Core Processors

Cluster Computing at HRI

White Paper. Recording Server Virtualization

Flash Memory Arrays Enabling the Virtualized Data Center. July 2010

Dell Reliable Memory Technology

Tools Page 1 of 13 ON PROGRAM TRANSLATION. A priori, we have two translation mechanisms available:

GPUs for Scientific Computing

GPU Hardware and Programming Models. Jeremy Appleyard, September 2015

Muse Server Sizing. 18 June Document Version Muse

SHARED HASH TABLES IN PARALLEL MODEL CHECKING

Transcription:

Parallelism and Cloud Computing Kai Shen Parallel Computing Parallel computing: Process sub tasks simultaneously so that work can be completed faster. For instances: divide the work of matrix multiplication into subtasks to run in parallel parallelize the modeling of star movements and galaxy formation parallelize the web page ranking (at Google) parallelize the mining of social networks (at Facebook) parallelize the prediction of customer preferences (at Amazon, Netflix, VISA/Mastercard) parallelize the graphics rendering in making animated movies (at Pixar/Disney) 1 2 Why Parallel Computing? Intel x86 Evolution: Milestones The speed of uniprocessor will catch up with the need of computing? Moore s law (1965, with later revisions) circuit complexity doubles every one year (18 months, 2 years) Some argue that it is a self fulfilling prophecy Moore said in 2005: In terms of size [of transistors] you can see that we're approaching the size of atoms which is a fundamental barrier, Name Date Transistors MHz 8086 1978 29K 5 10 First 16 bit processor. Basis for IBM PC & DOS 386 1985 275K 16 33 First 32 bit processor, referred to as IA32 Pentium 4F 2004 125M 2800 3800 First 64 bit processor, referred to as x86 64 Core i7 2008 731M 2667 3333 Multicore Haswell today 1.4B Not faster Improve on core count, power reduction, not frequency increase 3 4 CSC252 - Spring 2015 1

Why Parallel Computing? Why Parallel Computing? Not always desirable to use the fastest hardware. Example: A machine with eight 3.6GHz CPUs consumes up to 100 Watts A machine with four 1.9GHz CPUs consumes no more than 4 Watts It may be more energy efficient to use multiple low power machines in parallel Multiprocessors are ubiquitous Commodity processors contain multiple computing cores because they are cost effective to produce, and power efficient to run Traditionally sequential software (e.g., desktop applications) should take advantage of it 5 6 Multicore Architecture Arguments against Parallel Computing? Last level cache becomes a significant part of the chip Multicore: add another processor core on the chip (sharing a single last level cache) Low cost (manufacturing, power) CPU performance isn t important Parallel programming is too difficult CPU CPU last level cache One chip memory Additional benefit: faster processor to processor sharing 7 8 CSC252 - Spring 2015 2

Parallelism and Dependency Computation intensive application contains tasks that can run in parallel limit available parallelism/concurrency Dependency between two operations one has to wait for another to complete Examples of dependencies? Amdahl s law: Normalized runtime = 1 fraction_enhanced + fraction_enhanced speedup_enhanced Load Balance Utilizations of parallel processors What happens if most work is assigned to one processor? Require load balancing Parallel processors are typically symmetric 9 10 Simulation of Ocean Currents Evolution of Galaxies Entity in ocean floor: a unit body of water Variables: velocity, direction, Simulation over time Velocity/direction of water body depends on the velocity/direction of nearby water bodies in the last time unit Entity in space: a star Variables: velocity, direction, location, Simulation over time Velocity/direction/location of a star depends on the mass and location of other stars in the last time unit N body simulation 11 12 CSC252 - Spring 2015 3

Ray Tracing PageRank Ray tracing PageRank scores each page the score is the converged probability for a random web surfer to arrive at the page Scores transfer through links (random click by the surfer): If pages B, C are the pages that link to A, then PR(A) = PR(B)/L(B) + PR(C)/L(C) Damping factor (for stability): The surfer has a tendency to stay at where he/she is PR(A) = (1 d)/n + d(pr(b)/l(b) + PR(C)/L(C)) A linear system of equations (N variables, N linear equations) 13 4/23/2015 CSC 258/458 - Spring 2015 14 Parallel Programming Example: Gaussian Elimination Solving a system of linear equations Reduce an equation matrix into an equivalent upper diagonal matrix A X R A1 p*a1+ A2 X = r1 r2+p*r1 More on PageRank PR(A) = (1 d)/n + d(pr(b)/l(b) + PR(C)/L(C)) Iterative solver: Start from an initial PR vector (1/N for each page) Compute a new PR vector by the above linear transformation Keep repeating the linear transformation until the PR converges (very small change after further linear transformation) Even with a large, complex web graph, often converge after dozens of iterations Each iteration is a matrix vector multiplication, so the whole computation is a series of matrix vector multiplication 15 4/23/2015 CSC 258/458 - Spring 2015 16 CSC252 - Spring 2015 4

Parallel Programming Shared Memory Parallel Systems Challenges: Control dependencies Data sharing Parallel algorithms: Identification of parallelisms in applications Design control flow and data sharing mechanism Parallel programming: Shared memory Distributed memory In shared memory parallel computers, multiple processors can access the memory simultaneously Problems: One processor s cache may contain a copy of the data that was just modified by another processor hardware support for cache coherence Two processes (semantically) atomic operations may be interleaved system support for mutual exclusion and synchronization 17 18 Distributed Memory Parallel Systems What is Cloud Computing? Parallel systems that do not share memory Less requirement on the system support Little or no hardware support Software system support for communications (point to point, group) More trouble with writing applications Data must be explicitly partitioned and transferred when needed Hard to do dynamic workload management The interesting thing about Cloud Computing is that we ve redefined Cloud Computing to include everything that we already do.... I don t understand what we would do differently in the light of Cloud Computing other than change the wording of some of our ads. Larry Ellison, quoted in the Wall Street Journal, September 26, 2008 A lot of people are jumping on the [cloud] bandwagon, but I have not heard two people say the same thing about it. Andy Isherwood, quoted in ZDnet News, December 11, 2008 19 20 CSC252 - Spring 2015 5

Resource Provisioning Problem Cloud Computing: Computing As A Utility Provisioning of computing/storage resources for Internet applications Challenges: Difficulty to predict load of Internet applications Curse of hugely successful Internet services, DoS attacks Cost of dynamic capacity expansion High upfront cost I want to try some idea and see if people are interested Cost of excess capacity (power) Pooling resources together, managed centrally (in a data center); many computing jobs and network services share the resource pool Benefits: While one application s resource needs may vary dramatically, the sum of many independent applications resource needs varies much less More efficient resource utilization through sharing Space sharing as well as time sharing Upfront setup cost is amortized over many applications Cost is proportional to use Commoditizing support for availability, data durability, 21 22 CSC252 - Spring 2015 6