# Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:

Save this PDF as:

Size: px
Start display at page:

Download "Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:"

## Transcription

1 CSE341T 08/31/2015 Lecture 3 Cost Model: Work, Span and Parallelism In this lecture, we will look at how one analyze a parallel program written using Cilk Plus. When we analyze the cost of an algorithm formally, we need to be reasonably precise in what model we are performing the analysis. As in CSE241, we will use asymptotic costs. These costs can then be used to compare algorithms in terms of how they scale to large inputs. For example, as you know, some sorting algorithms use O(n log n) work and others O(n 2 ). Clearly the O(n log n) algorithm scales better, but perhaps the O(n 2 ) is actually faster on small inputs. In this class we are concerned with how algorithms scale, and therefore asymptotic analysis is indeed what we want. 1 The RAM model for sequential computation: Traditionally, algorithms have been analyzed in the Random Access Machine (RAM) model. This model assumes a single processor executing one instruction after another, with no parallel operations, and that the processor has access to unbounded memory. The model contains instructions commonly found in real computer, including basic arithmetic and logical operations (e.g. +, -, *, and, or, not), reads from and writes to arbitrary memory locations, and conditional and unconditional jumps to other locations in the code. Each instruction is assumed to take a constant amount of time, and the cost of a computation is measured in terms of the number of instructions execute by the machine. This model has served well for analyzing the asymptotic runtime of sequential algorithms, and most work on sequential algorithms to date has used this model. One reason for its success is that there is an easy mapping from algorithmic pseudocode and sequential languages such as C and C++ to the model, and so it is reasonably easy to reason about the cost of algorithms. That said, this model should only be used to derive the asymptotic bounds (i.e., using big-o, big-theta and big-omega), and not for trying to predict exact running time. One reason for this is that on a real machine not all instructions take the same time, and furthermore not all machines have the same instructions. 2 The dag model for parallel computation To analyze a parallel algorithm, it s helpful to think of the execution of the program as a directed acyclic graph (dag), where nodes represent instructions, and edges represent dependencies between instructions. If there is an edge from node u to node v, then node u must be executed before 1

2 node v. A serial computation is just a chain of instructions. A parallel computation is a dag. If there is no path from node u to node v and vice versa, then there is no prescribed ordering between u and v, and they can be executed in any order, or in parallel with each other. For convenience, we won t represent each instruction as a node. Instead, we will group together a sequence of instructions that does not contain any parallel keywords (i.e., spawn, sync, or returning from a spawned function), referred to as a strand, into a single node, and the dag models dependencies between strands. To make this model concrete, let s take a look at how we model the dag for a parallel computation. Recall the Fibonacci code from last lecture: Fib(n) 1 if n 1 2 then return 1 3 x spawn FIB(n 1) 4 y FIB(n 2) 5 sync 6 return x + y The DAG representation of FIB(4) is shown below. initial strand 4" final strand continue edge spawn edge 3" 2" 2" 1" 1" 0" 1" 0" call edge strand return edge In this model, just like the RAM model, we will assume a basic set of instructions, and each instruction takes a constant amount of time, including access to memory. Unlike the RAM model, however, notice this Dag model is not really tied to the underlying machine but rather tied to the control structure of the program. As it turns out, there is a way to map the costs we derive using this model onto costs for running on parallel machines (machines with multiple processing units). 3 Performance measures For a parallel algorithm, we are concerned with the amount of parallelism in the algorithm, or in other words, how it scales with the number of processors. We will measure the parallelism 2

3 using two metrics: work and span. Roughly speaking, the work corresponds to the total number of instructions we perform, and span (also called depth or critical path length) to the number of instructions along the longest chain of dependences. In terms of the dag representation of a parallel computation, let s assume that we chop up the strands so that each node consists of only one instruction (i.e., each node has cost one). Then, the work is the total number of nodes in the graph, and the span is the number of nodes along the longest path in the dag. 1 Another way of thinking about it is: work is the running time of the program on one core, while span is the running time of the program on an infinite number of cores. 2 Henceforth, we will use T p to denote the time it takes to execute a computation on p processors, so we will use T 1 to denote work and T to denote span. Just as an example, let s analyze the work and span for the following code (note that any function can be represented in this manner): Foo 1 e 1 2 spawn BAR 3 spawn BAZ 4 e 2 5 sync 6 e 3 The work of a program is the running time on one core (as measured with the RAM model), so we simply ignore all the spawns and syncs in the program: T 1 (FOO) = T 1 (e 1 ) + T 1 (BAR) + T 1 (BAZ) + T 1 (e 2 ) + T 1 (e 3 ). The span calculation is more involved: T (FOO) = T (e 1 ) + max {T (BAR), T (BAZ), T (e 2 )} + T (e 3 ). Exercise 1 What s the work for FIB(n)? Hint: The closed form for Fibonacci number F ib i = φi ˆφ i 5, where φ = 1+ 5 and ˆφ = 1 5 (i.e., 2 2 the two roots of the equation x 2 = x + 1). Exercise 2 What s the span for FIB(n)? 1 If each node may cost different amount, then the span is the sum of the cost along the path in the dag that has the largest such sum. 2 Here we assume that the scheduler is perfect and has no overheads. 3

4 Once we know the work and span of an algorithm, its parallelism is simply defined as the work over span: P = T 1 T One way of thinking about parallelism is that it denotes the average amount of work that can be performed in parallel for each step along the span. This measure of parallelism also represents roughly how many processors we can use efficiently when we run the algorithm. The parallelism of an algorithm is dictated by both the work term and the span term. For example, suppose n = 10, 000 and if T 1 (n) = θ(n 3 ) and T (n) = θ(n log n) 10 5 then P(n) 10 7, which is a lot of parallelism. But, if T 1 (n) = θ(n 2 ) 10 8 then P(n) 10 3, which is much less parallelism. The decrease in parallelism is not because of the span was large, but because the work was reduced. Goals: In parallel algorithm design, we would like to keep the parallelism as high as possible but without increasing the work. In general, the goals in designing efficient algorithms are 1. First priority: to keep work as low as possible 2. Second priority: keep parallelism as high as possible (and hence the span as low as possible). In this course we will mostly cover parallel algorithms that is work efficient, meaning that its work is the same or close to the same (within a constant factor) as that of the best sequential algorithm. Now among the algorithms that are work-efficient, we will try to minimize the span to achieve the greatest parallelism. 4 Greedy Scheduling In order to get good performance, not only do we need a work-efficient algorithm with ample parallelism, we also need to ensure that the strands are mapped onto the executing processors efficiently. In the model that we just discussed, a program can generate lots of strands on the fly, which can lead to a huge amount of parallelism, typically much more than the number of processors available when running. The model itself does not say how strands are mapped onto the processors, however. It is the job of a scheduler to decide how strands are mapped onto executing processors (which and when). Today, we are going to look at a simple scheduler, called greedy scheduler. We say that a scheduler is greedy if whenever there is a processor available and a strand ready to execute, then the strand will be scheduled on the processor and start running immediately. 4

5 Greedy schedulers have a very nice property that is summarized by the following: Theorem 1 On an ideal parallel computer with p processors, a greedy scheduler executes a multithreaded computation with work T 1 and span T in time T p T 1 p + T (1) This is actually a very nice bound. For any scheduler, the time to execute the computation cannot be any better than T 1 p, since we have a total of T 1 work to do, and the best we can possibly do is divide all the work evenly among the processors. Also note that the time to execute the computation cannot be any better than T, since T represents the longest chain of sequential dependences. Therefore, the best we could do is: ( ) T1 T p max p, T We therefore see that a greedy scheduler does reasonably close to the best possible (within a factor of 2 of optimal). In particular, when W S the difference between the two is very small. p Indeed we can rewrite equation 1 above in terms of the parallelism P = W/S as follows: T p < W p + S = W p + W P = W ( 1 + p ) p P Therefore as long as P p (the parallelism is much greater than the number of processors) then we get near perfect speedup (perfect speedup would be W/p). Proof of Theorem 1. To prove this bound, let s consider consider the execution dag G representing the computation with work T 1 and span T. For convenience and without loss of generality, let s assume each node in G represents one instruction (i.e., unit time). At each time step, either there are p or more nodes ready (call it a complete step), or there are less than p nodes ready (call it an incomplete step). We show this time bound by bounding the number of complete and incomplete steps a greedy scheduler takes before completing the execution. Let s first consider complete steps. In a complete step, a greedy scheduler executes p nodes (picked arbitrarily among all the ready nodes), using all p processors. The scheduler can t take 5

6 more than T 1 /p number of complete steps, because otherwise the total work performed in those complete steps would be greater than T 1. Now we try to bound the number of incomplete steps. We assert that after each incomplete step, the longest path length decreases by one (we will come back to show this assertion later). Since G starts out having the longest path length of T, and the longest path length decreases by one at each incomplete step, there can be at most T number of incomplete steps. What remains to be shown is the assertion that the longest path length decreases by one after each incomplete step. Let s consider an complete step, and call the subdag that has yet to be executed at the beginning of this step as G. Say G has longest path length l. The path(s) (there can be multiple longest paths having the same length) with length l in G must starts with some node with in-degree 0 (i.e., a ready node), because otherwise G would have the longest path length l + 1. Since it s an incomplete step, the greedy scheduler executes all ready nodes with in-degree 0, thereby decreasing the longest path length by one. In the real world: No real schedulers are fully greedy. This is because there is overhead in scheduling the job. Therefore there will surely be some delay from when a job becomes ready until when it starts up. In practice, therefore, the efficiency of a scheduler is quite important to achieving good running time. Also the bounds we give do not account for memory affects. By moving a job we might have to move data along with it. Because of these affects the greedy scheduling principle should only be viewed as a rough estimate in much the same way that the RAM model or any other computational model should be just viewed as an estimate of real time. Exercise 3 Last time, we discussed that cilk for is just syntactic sugar for cilk spawn and cilk sync. How would you go about converting a for loop to spawns and syncs? Think about minimizing span while doing so. Exercise 4 What is the work and span of the following code from lecture 2: 1 parallel for i 1 to n 2 do B[i] A[i] + 1 Exercise 5 What is the work and span of GREEDYSS from lecture 2? 6

### Divide And Conquer Algorithms

CSE341T/CSE549T 09/10/2014 Lecture 5 Divide And Conquer Algorithms Recall in last lecture, we looked at one way of parallelizing matrix multiplication. At the end of the lecture, we saw the reduce SUM

### The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge,

The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge, cheapest first, we had to determine whether its two endpoints

### Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

### MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research

MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With

### Introduction to Scheduling Theory

Introduction to Scheduling Theory Arnaud Legrand Laboratoire Informatique et Distribution IMAG CNRS, France arnaud.legrand@imag.fr November 8, 2004 1/ 26 Outline 1 Task graphs from outer space 2 Scheduling

### Parallel Scalable Algorithms- Performance Parameters

www.bsc.es Parallel Scalable Algorithms- Performance Parameters Vassil Alexandrov, ICREA - Barcelona Supercomputing Center, Spain Overview Sources of Overhead in Parallel Programs Performance Metrics for

### Lecture 6 Online and streaming algorithms for clustering

CSE 291: Unsupervised learning Spring 2008 Lecture 6 Online and streaming algorithms for clustering 6.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line

### CSC2420 Fall 2012: Algorithm Design, Analysis and Theory

CSC2420 Fall 2012: Algorithm Design, Analysis and Theory Allan Borodin November 15, 2012; Lecture 10 1 / 27 Randomized online bipartite matching and the adwords problem. We briefly return to online algorithms

### U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009. Notes on Algebra

U.C. Berkeley CS276: Cryptography Handout 0.1 Luca Trevisan January, 2009 Notes on Algebra These notes contain as little theory as possible, and most results are stated without proof. Any introductory

### Approximation Algorithms

Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

### 8.1 Min Degree Spanning Tree

CS880: Approximations Algorithms Scribe: Siddharth Barman Lecturer: Shuchi Chawla Topic: Min Degree Spanning Tree Date: 02/15/07 In this lecture we give a local search based algorithm for the Min Degree

### Analysis of Algorithms, I

Analysis of Algorithms, I CSOR W4231.002 Eleni Drinea Computer Science Department Columbia University Thursday, February 26, 2015 Outline 1 Recap 2 Representing graphs 3 Breadth-first search (BFS) 4 Applications

### Data Structures. Algorithm Performance and Big O Analysis

Data Structures Algorithm Performance and Big O Analysis What s an Algorithm? a clearly specified set of instructions to be followed to solve a problem. In essence: A computer program. In detail: Defined

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

### 2.3 Scheduling jobs on identical parallel machines

2.3 Scheduling jobs on identical parallel machines There are jobs to be processed, and there are identical machines (running in parallel) to which each job may be assigned Each job = 1,,, must be processed

CS787: Advanced Algorithms Lecture 14: Online algorithms We now shift focus to a different kind of algorithmic problem where we need to perform some optimization without knowing the input in advance. Algorithms

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### [Refer Slide Time: 05:10]

Principles of Programming Languages Prof: S. Arun Kumar Department of Computer Science and Engineering Indian Institute of Technology Delhi Lecture no 7 Lecture Title: Syntactic Classes Welcome to lecture

### Simulation-Based Security with Inexhaustible Interactive Turing Machines

Simulation-Based Security with Inexhaustible Interactive Turing Machines Ralf Küsters Institut für Informatik Christian-Albrechts-Universität zu Kiel 24098 Kiel, Germany kuesters@ti.informatik.uni-kiel.de

### Lecture 4 Online and streaming algorithms for clustering

CSE 291: Geometric algorithms Spring 2013 Lecture 4 Online and streaming algorithms for clustering 4.1 On-line k-clustering To the extent that clustering takes place in the brain, it happens in an on-line

### Outline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits

Outline NP-completeness Examples of Easy vs. Hard problems Euler circuit vs. Hamiltonian circuit Shortest Path vs. Longest Path 2-pairs sum vs. general Subset Sum Reducing one problem to another Clique

### An improved on-line algorithm for scheduling on two unrestrictive parallel batch processing machines

This is the Pre-Published Version. An improved on-line algorithm for scheduling on two unrestrictive parallel batch processing machines Q.Q. Nong, T.C.E. Cheng, C.T. Ng Department of Mathematics, Ocean

### The PRAM Model and Algorithms. Advanced Topics Spring 2008 Prof. Robert van Engelen

The PRAM Model and Algorithms Advanced Topics Spring 2008 Prof. Robert van Engelen Overview The PRAM model of parallel computation Simulations between PRAM models Work-time presentation framework of parallel

### Chap 4 The Simplex Method

The Essence of the Simplex Method Recall the Wyndor problem Max Z = 3x 1 + 5x 2 S.T. x 1 4 2x 2 12 3x 1 + 2x 2 18 x 1, x 2 0 Chap 4 The Simplex Method 8 corner point solutions. 5 out of them are CPF solutions.

### 1 Approximating Set Cover

CS 05: Algorithms (Grad) Feb 2-24, 2005 Approximating Set Cover. Definition An Instance (X, F ) of the set-covering problem consists of a finite set X and a family F of subset of X, such that every elemennt

### Mathematical Induction

Mathematical Induction Victor Adamchik Fall of 2005 Lecture 2 (out of three) Plan 1. Strong Induction 2. Faulty Inductions 3. Induction and the Least Element Principal Strong Induction Fibonacci Numbers

### princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora Scribe: One of the running themes in this course is the notion of

### ! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. #-approximation algorithm.

Approximation Algorithms 11 Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of three

Lecture 10 Union-Find The union-nd data structure is motivated by Kruskal's minimum spanning tree algorithm (Algorithm 2.6), in which we needed two operations on disjoint sets of vertices: determine whether

### ! Solve problem to optimality. ! Solve problem in poly-time. ! Solve arbitrary instances of the problem. !-approximation algorithm.

Approximation Algorithms Chapter Approximation Algorithms Q Suppose I need to solve an NP-hard problem What should I do? A Theory says you're unlikely to find a poly-time algorithm Must sacrifice one of

### CS/COE 1501 http://cs.pitt.edu/~bill/1501/

CS/COE 1501 http://cs.pitt.edu/~bill/1501/ Lecture 01 Course Introduction Meta-notes These notes are intended for use by students in CS1501 at the University of Pittsburgh. They are provided free of charge

### Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

### 6.080 / 6.089 Great Ideas in Theoretical Computer Science Spring 2008

MIT OpenCourseWare http://ocw.mit.edu 6.080 / 6.089 Great Ideas in Theoretical Computer Science Spring 008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

### IE 680 Special Topics in Production Systems: Networks, Routing and Logistics*

IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* Rakesh Nagi Department of Industrial Engineering University at Buffalo (SUNY) *Lecture notes from Network Flows by Ahuja, Magnanti

### CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Linda Shapiro Winter 2015

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis Linda Shapiro Today Registration should be done. Homework 1 due 11:59 pm next Wednesday, January 14 Review math essential

### A Comparison of General Approaches to Multiprocessor Scheduling

A Comparison of General Approaches to Multiprocessor Scheduling Jing-Chiou Liou AT&T Laboratories Middletown, NJ 0778, USA jing@jolt.mt.att.com Michael A. Palis Department of Computer Science Rutgers University

### CMPSCI611: Approximating MAX-CUT Lecture 20

CMPSCI611: Approximating MAX-CUT Lecture 20 For the next two lectures we ll be seeing examples of approximation algorithms for interesting NP-hard problems. Today we consider MAX-CUT, which we proved to

### Distributed Computing over Communication Networks: Maximal Independent Set

Distributed Computing over Communication Networks: Maximal Independent Set What is a MIS? MIS An independent set (IS) of an undirected graph is a subset U of nodes such that no two nodes in U are adjacent.

### Exponential time algorithms for graph coloring

Exponential time algorithms for graph coloring Uriel Feige Lecture notes, March 14, 2011 1 Introduction Let [n] denote the set {1,..., k}. A k-labeling of vertices of a graph G(V, E) is a function V [k].

### The Classes P and NP

The Classes P and NP We now shift gears slightly and restrict our attention to the examination of two families of problems which are very important to computer scientists. These families constitute the

### arxiv:1112.0829v1 [math.pr] 5 Dec 2011

How Not to Win a Million Dollars: A Counterexample to a Conjecture of L. Breiman Thomas P. Hayes arxiv:1112.0829v1 [math.pr] 5 Dec 2011 Abstract Consider a gambling game in which we are allowed to repeatedly

### CMSC 858T: Randomized Algorithms Spring 2003 Handout 8: The Local Lemma

CMSC 858T: Randomized Algorithms Spring 2003 Handout 8: The Local Lemma Please Note: The references at the end are given for extra reading if you are interested in exploring these ideas further. You are

### , each of which contains a unique key value, say k i , R 2. such that k i equals K (or to determine that no such record exists in the collection).

The Search Problem 1 Suppose we have a collection of records, say R 1, R 2,, R N, each of which contains a unique key value, say k i. Given a particular key value, K, the search problem is to locate the

### SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

### Applied Algorithm Design Lecture 5

Applied Algorithm Design Lecture 5 Pietro Michiardi Eurecom Pietro Michiardi (Eurecom) Applied Algorithm Design Lecture 5 1 / 86 Approximation Algorithms Pietro Michiardi (Eurecom) Applied Algorithm Design

### Algebraic Computation Models. Algebraic Computation Models

Algebraic Computation Models Ζυγομήτρος Ευάγγελος ΜΠΛΑ 201118 Φεβρουάριος, 2013 Reasons for using algebraic models of computation The Turing machine model captures computations on bits. Many algorithms

### Theoretical Computer Science (Bridging Course) Complexity

Theoretical Computer Science (Bridging Course) Complexity Gian Diego Tipaldi A scenario You are a programmer working for a logistics company Your boss asks you to implement a program that optimizes the

### Linear Programming I

Linear Programming I November 30, 2003 1 Introduction In the VCR/guns/nuclear bombs/napkins/star wars/professors/butter/mice problem, the benevolent dictator, Bigus Piguinus, of south Antarctica penguins

### Lecture 3: Finding integer solutions to systems of linear equations

Lecture 3: Finding integer solutions to systems of linear equations Algorithmic Number Theory (Fall 2014) Rutgers University Swastik Kopparty Scribe: Abhishek Bhrushundi 1 Overview The goal of this lecture

### Solutions to Homework 6

Solutions to Homework 6 Debasish Das EECS Department, Northwestern University ddas@northwestern.edu 1 Problem 5.24 We want to find light spanning trees with certain special properties. Given is one example

Load Balancing Backtracking, branch & bound and alpha-beta pruning: how to assign work to idle processes without much communication? Additionally for alpha-beta pruning: implementing the young-brothers-wait

### Virtual Time and Timeout in Client-Server Networks

Virtual Time and Timeout in Client-Server Networks Jayadev Misra July 13, 2011 Contents 1 Introduction 2 1.1 Background.............................. 2 1.1.1 Causal Model of Virtual Time...............

### System Copy GT Manual 1.8 Last update: 2015/07/13 Basis Technologies

System Copy GT Manual 1.8 Last update: 2015/07/13 Basis Technologies Table of Contents Introduction... 1 Prerequisites... 2 Executing System Copy GT... 3 Program Parameters / Selection Screen... 4 Technical

### Data Structures and Algorithms Written Examination

Data Structures and Algorithms Written Examination 22 February 2013 FIRST NAME STUDENT NUMBER LAST NAME SIGNATURE Instructions for students: Write First Name, Last Name, Student Number and Signature where

### Performance metrics for parallelism

Performance metrics for parallelism 8th of November, 2013 Sources Rob H. Bisseling; Parallel Scientific Computing, Oxford Press. Grama, Gupta, Karypis, Kumar; Parallel Computing, Addison Wesley. Definition

V. Adamchik 1 Graph Theory Victor Adamchik Fall of 2005 Plan 1. Basic Vocabulary 2. Regular graph 3. Connectivity 4. Representing Graphs Introduction A.Aho and J.Ulman acknowledge that Fundamentally, computer

### Find-The-Number. 1 Find-The-Number With Comps

Find-The-Number 1 Find-The-Number With Comps Consider the following two-person game, which we call Find-The-Number with Comps. Player A (for answerer) has a number x between 1 and 1000. Player Q (for questioner)

### Efficiency of algorithms. Algorithms. Efficiency of algorithms. Binary search and linear search. Best, worst and average case.

Algorithms Efficiency of algorithms Computational resources: time and space Best, worst and average case performance How to compare algorithms: machine-independent measure of efficiency Growth rate Complexity

### Discuss the size of the instance for the minimum spanning tree problem.

3.1 Algorithm complexity The algorithms A, B are given. The former has complexity O(n 2 ), the latter O(2 n ), where n is the size of the instance. Let n A 0 be the size of the largest instance that can

Approximation Algorithms Chapter Approximation Algorithms Q. Suppose I need to solve an NP-hard problem. What should I do? A. Theory says you're unlikely to find a poly-time algorithm. Must sacrifice one

### From Last Time: Remove (Delete) Operation

CSE 32 Lecture : More on Search Trees Today s Topics: Lazy Operations Run Time Analysis of Binary Search Tree Operations Balanced Search Trees AVL Trees and Rotations Covered in Chapter of the text From

### Page 1. CSCE 310J Data Structures & Algorithms. CSCE 310J Data Structures & Algorithms. P, NP, and NP-Complete. Polynomial-Time Algorithms

CSCE 310J Data Structures & Algorithms P, NP, and NP-Complete Dr. Steve Goddard goddard@cse.unl.edu CSCE 310J Data Structures & Algorithms Giving credit where credit is due:» Most of the lecture notes

### Load Balancing and Switch Scheduling

EE384Y Project Final Report Load Balancing and Switch Scheduling Xiangheng Liu Department of Electrical Engineering Stanford University, Stanford CA 94305 Email: liuxh@systems.stanford.edu Abstract Load

### Single machine parallel batch scheduling with unbounded capacity

Workshop on Combinatorics and Graph Theory 21th, April, 2006 Nankai University Single machine parallel batch scheduling with unbounded capacity Yuan Jinjiang Department of mathematics, Zhengzhou University

### Intro. to the Divide-and-Conquer Strategy via Merge Sort CMPSC 465 CLRS Sections 2.3, Intro. to and various parts of Chapter 4

Intro. to the Divide-and-Conquer Strategy via Merge Sort CMPSC 465 CLRS Sections 2.3, Intro. to and various parts of Chapter 4 I. Algorithm Design and Divide-and-Conquer There are various strategies we

### Lectures 5-6: Taylor Series

Math 1d Instructor: Padraic Bartlett Lectures 5-: Taylor Series Weeks 5- Caltech 213 1 Taylor Polynomials and Series As we saw in week 4, power series are remarkably nice objects to work with. In particular,

### Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

### ON THE COMPLEXITY OF THE GAME OF SET. {kamalika,pbg,dratajcz,hoeteck}@cs.berkeley.edu

ON THE COMPLEXITY OF THE GAME OF SET KAMALIKA CHAUDHURI, BRIGHTEN GODFREY, DAVID RATAJCZAK, AND HOETECK WEE {kamalika,pbg,dratajcz,hoeteck}@cs.berkeley.edu ABSTRACT. Set R is a card game played with a

### SIMS 255 Foundations of Software Design. Complexity and NP-completeness

SIMS 255 Foundations of Software Design Complexity and NP-completeness Matt Welsh November 29, 2001 mdw@cs.berkeley.edu 1 Outline Complexity of algorithms Space and time complexity ``Big O'' notation Complexity

### Reminder: Complexity (1) Parallel Complexity Theory. Reminder: Complexity (2) Complexity-new

Reminder: Complexity (1) Parallel Complexity Theory Lecture 6 Number of steps or memory units required to compute some result In terms of input size Using a single processor O(1) says that regardless of

### Reminder: Complexity (1) Parallel Complexity Theory. Reminder: Complexity (2) Complexity-new GAP (2) Graph Accessibility Problem (GAP) (1)

Reminder: Complexity (1) Parallel Complexity Theory Lecture 6 Number of steps or memory units required to compute some result In terms of input size Using a single processor O(1) says that regardless of

### Bounded Cost Algorithms for Multivalued Consensus Using Binary Consensus Instances

Bounded Cost Algorithms for Multivalued Consensus Using Binary Consensus Instances Jialin Zhang Tsinghua University zhanggl02@mails.tsinghua.edu.cn Wei Chen Microsoft Research Asia weic@microsoft.com Abstract

### 1. (First passage/hitting times/gambler s ruin problem:) Suppose that X has a discrete state space and let i be a fixed state. Let

Copyright c 2009 by Karl Sigman 1 Stopping Times 1.1 Stopping Times: Definition Given a stochastic process X = {X n : n 0}, a random time τ is a discrete random variable on the same probability space as

### NP-Completeness I. Lecture 19. 19.1 Overview. 19.2 Introduction: Reduction and Expressiveness

Lecture 19 NP-Completeness I 19.1 Overview In the past few lectures we have looked at increasingly more expressive problems that we were able to solve using efficient algorithms. In this lecture we introduce

### Parallelization: Binary Tree Traversal

By Aaron Weeden and Patrick Royal Shodor Education Foundation, Inc. August 2012 Introduction: According to Moore s law, the number of transistors on a computer chip doubles roughly every two years. First

### Algorithm Design and Analysis

Algorithm Design and Analysis LECTURE 27 Approximation Algorithms Load Balancing Weighted Vertex Cover Reminder: Fill out SRTEs online Don t forget to click submit Sofya Raskhodnikova 12/6/2011 S. Raskhodnikova;

### Topic: Greedy Approximations: Set Cover and Min Makespan Date: 1/30/06

CS880: Approximations Algorithms Scribe: Matt Elder Lecturer: Shuchi Chawla Topic: Greedy Approximations: Set Cover and Min Makespan Date: 1/30/06 3.1 Set Cover The Set Cover problem is: Given a set of

### Recall that two vectors in are perpendicular or orthogonal provided that their dot

Orthogonal Complements and Projections Recall that two vectors in are perpendicular or orthogonal provided that their dot product vanishes That is, if and only if Example 1 The vectors in are orthogonal

### THE COST OF CAPITAL THE EFFECT OF CHANGES IN GEARING

December 2015 Examinations Chapter 19 Free lectures available for - click here THE COST OF CAPITAL THE EFFECT OF CHANGES IN GEARING 103 1 Introduction In this chapter we will look at the effect of gearing

### PRACTICE BOOK COMPUTER SCIENCE TEST. Graduate Record Examinations. This practice book contains. Become familiar with. Visit GRE Online at www.gre.

This book is provided FREE with test registration by the Graduate Record Examinations Board. Graduate Record Examinations This practice book contains one actual full-length GRE Computer Science Test test-taking

### 8.1 Makespan Scheduling

600.469 / 600.669 Approximation Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programing: Min-Makespan and Bin Packing Date: 2/19/15 Scribe: Gabriel Kaptchuk 8.1 Makespan Scheduling Consider an instance

### LECTURE 1 I. Inverse matrices We return now to the problem of solving linear equations. Recall that we are trying to find x such that IA = A

LECTURE I. Inverse matrices We return now to the problem of solving linear equations. Recall that we are trying to find such that A = y. Recall: there is a matri I such that for all R n. It follows that

### Algorithms and Data Structures

Algorithms and Data Structures CMPSC 465 LECTURES 20-21 Priority Queues and Binary Heaps Adam Smith S. Raskhodnikova and A. Smith. Based on slides by C. Leiserson and E. Demaine. 1 Trees Rooted Tree: collection

### GRAPH THEORY LECTURE 4: TREES

GRAPH THEORY LECTURE 4: TREES Abstract. 3.1 presents some standard characterizations and properties of trees. 3.2 presents several different types of trees. 3.7 develops a counting method based on a bijection

### Homework 5 Solutions

Homework 5 Solutions 4.2: 2: a. 321 = 256 + 64 + 1 = (01000001) 2 b. 1023 = 512 + 256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = (1111111111) 2. Note that this is 1 less than the next power of 2, 1024, which

### Scheduling Shop Scheduling. Tim Nieberg

Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations

Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650

### Graphs without proper subgraphs of minimum degree 3 and short cycles

Graphs without proper subgraphs of minimum degree 3 and short cycles Lothar Narins, Alexey Pokrovskiy, Tibor Szabó Department of Mathematics, Freie Universität, Berlin, Germany. August 22, 2014 Abstract

### JUST-IN-TIME SCHEDULING WITH PERIODIC TIME SLOTS. Received December May 12, 2003; revised February 5, 2004

Scientiae Mathematicae Japonicae Online, Vol. 10, (2004), 431 437 431 JUST-IN-TIME SCHEDULING WITH PERIODIC TIME SLOTS Ondřej Čepeka and Shao Chin Sung b Received December May 12, 2003; revised February

### 1. LINEAR EQUATIONS. A linear equation in n unknowns x 1, x 2,, x n is an equation of the form

1. LINEAR EQUATIONS A linear equation in n unknowns x 1, x 2,, x n is an equation of the form a 1 x 1 + a 2 x 2 + + a n x n = b, where a 1, a 2,..., a n, b are given real numbers. For example, with x and

### An Innocent Investigation

An Innocent Investigation D. Joyce, Clark University January 2006 The beginning. Have you ever wondered why every number is either even or odd? I don t mean to ask if you ever wondered whether every number

### Big Data Systems CS 5965/6965 FALL 2015

Big Data Systems CS 5965/6965 FALL 2015 Today General course overview Expectations from this course Q&A Introduction to Big Data Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2015.html

### Data storage Tree indexes

Data storage Tree indexes Rasmus Pagh February 7 lecture 1 Access paths For many database queries and updates, only a small fraction of the data needs to be accessed. Extreme examples are looking or updating

### A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions

A Second Course in Mathematics Concepts for Elementary Teachers: Theory, Problems, and Solutions Marcel B. Finan Arkansas Tech University c All Rights Reserved First Draft February 8, 2006 1 Contents 25

### CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.

Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure

### Sequential Data Structures

Sequential Data Structures In this lecture we introduce the basic data structures for storing sequences of objects. These data structures are based on arrays and linked lists, which you met in first year

### Performance Metrics for Parallel Programs. 8 March 2010

Performance Metrics for Parallel Programs 8 March 2010 Content measuring time towards analytic modeling, execution time, overhead, speedup, efficiency, cost, granularity, scalability, roadblocks, asymptotic

### MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS

MODEL DRIVEN DEVELOPMENT OF BUSINESS PROCESS MONITORING AND CONTROL SYSTEMS Tao Yu Department of Computer Science, University of California at Irvine, USA Email: tyu1@uci.edu Jun-Jang Jeng IBM T.J. Watson