Analysis of Algorithms: A Brief Introduction

Why? A central concept in Computer Science. Algorithms are ubiquitous: using the Internet (sending email, transferring files, using search engines, online shopping), document preparation/processing, scheduling flights, manufacturing, medicine. 1
Algorithm: A well-defined computational procedure to solve a problem. Focus: Combinatorial problems (i.e., problems whose solution space is discrete). Problem Specification: Input to the problem. Output to be produced. Problem 1: Finding the maximum value. Input: An array A[1.. n] of n integers. Output: The maximum value in the input. 2
Problem 2: Sorting into non-decreasing order. Input: An array A[1.. n] of n integers. Output: A permutation of the array elements so that A[1] ≤ A[2] ≤ ... ≤ A[n]. Problem 3: Zero Sum. Input: An array A[1.. n] of n integers. (The integer values may be positive, negative or zero.) Output: True if the array has two elements A[i] and A[j] (i ≠ j) such that the sum A[i] + A[j] is zero; False otherwise. Note: Zero Sum is a decision problem. 3
Definition: Given an array A[1.. n], a block is a subarray A[i.. j], where 1 ≤ i ≤ j ≤ n. Note: Each element A[i] is a block by itself. Problem 4: Maximum Block Sum. Input: An array A[1.. n] of n integers. (The integer values may be positive, negative or zero.) Output: The maximum value among the sums of all the blocks of A. Note: Maximum Block Sum is an optimization problem. 4
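As a small illustration: if A = [2, -5, 3, 4], the block A[3.. 4] has sum 3 + 4 = 7, and no other block of A has a larger sum, so the maximum block sum is 7.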
Problem 5: Shortest Path. Input: A network G consisting of nodes and edges, with each edge having a length (non-negative number); two nodes u and v. Output: A shortest path between u and v in G. Example: [figure: a network with nodes a, b, c, d, e, f, u, v and edge lengths; diagram omitted]. Note: Shortest Path is also an optimization problem. 5
Boolean Satisfiability Problem: Terminology and Notation: Boolean variable x: Takes on a value from {True, False}. Complement of x is denoted by x̄. (The symbols x and x̄ are called literals.) Boolean operators (connectives): e.g. And (denoted by ∧), Or (denoted by ∨). A Boolean formula is constructed using literals, connectives and parentheses. 6
Example: The following formula F uses three Boolean variables x1, x2 and x3. F = (x1 ∨ x2 ∨ x3) ∧ (x̄2 ∨ x̄3) ∧ (x1 ∨ x3) (a) Let x1 = True, x2 = True and x3 = True. For this assignment, F evaluates to False. (b) Let x1 = True, x2 = True and x3 = False. For this assignment, F evaluates to True. Definition: A Boolean formula F is satisfiable if there is at least one assignment of values to the variables in F for which F evaluates to True. 7
Examples: Formula F = (x1 ∨ x2 ∨ x3) ∧ (x̄2 ∨ x̄3) ∧ (x1 ∨ x3) is satisfiable. (One satisfying assignment for F is x1 = True, x2 = True and x3 = False.) Formula F1 defined by F1 = (x1) ∧ (x2) ∧ (x̄1 ∨ x̄2) is not satisfiable. Problem 6: Boolean Satisfiability (SAT) Input: A formula F constructed using Boolean variables x1, x2,..., xn, their complements and Boolean operators. Output: True if F is satisfiable and False otherwise. Note: SAT is also a decision problem. 8
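A brute-force Python sketch of checking satisfiability by trying all 2^n assignments (the clause-list encoding and the function name is_satisfiable are choices made here for illustration; they are not part of the notes):

from itertools import product

def is_satisfiable(num_vars, clauses):
    # clauses: list of clauses; each clause is a list of nonzero integers,
    # where k stands for the variable x_k and -k for its complement.
    # Try every assignment of True/False to x_1, ..., x_n.
    for assignment in product([True, False], repeat=num_vars):
        if all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True    # found a satisfying assignment
    return False

# F1 above, encoded as clauses: (x1) AND (x2) AND (complement(x1) OR complement(x2)).
print(is_satisfiable(2, [[1], [2], [-1, -2]]))   # prints False

Note that this sketch examines exponentially many assignments; as discussed later in these notes, no polynomial-time method for SAT is known.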
Exercises: 1. Find the number of satisfying assignments for the formula F given on page 7. 2. Construct a formula F2 using three variables x1, x2 and x3 such that F2 has exactly one satisfying assignment. Indicate the satisfying assignment. 3. Construct a Boolean formula F3 using two variables x1 and x2 such that F3 evaluates to True for every assignment. 9
An Algorithm for Finding the Maximum: Input: Array A[1.. n] of integers.
Pseudocode:
1. Max = A[1]
2. for i = 2 to n do
       if (A[i] > Max) then Max = A[i]
3. Print Max
Correctness of an Algorithm: Algorithm must halt for every input instance. Must produce the correct output for every input instance. 10
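For concreteness, a minimal executable Python version of this procedure (using 0-based indexing; the name find_max is just a label chosen here):

def find_max(A):
    # Mirrors Steps 1-3 of the pseudocode above.
    max_val = A[0]                  # Step 1: Max = A[1]
    for i in range(1, len(A)):      # Step 2: for i = 2 to n
        if A[i] > max_val:
            max_val = A[i]
    return max_val                  # Step 3 prints Max; here it is returned

# Example: find_max([3, -1, 7, 2]) returns 7.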
Analyzing an Algorithm: Estimating the resources (e.g. running time, memory) needed in a machine-independent manner. Useful in comparing candidate algorithms for a problem. Computational Model: Each primitive operation (e.g. arithmetic operation, comparison, assignment) takes one unit of time. Running time for a specific input: Number of primitive operations executed by the algorithm for that input (expressed as a function of the input size). 11
Running Time of an Algorithm: The longest running time for any input of a certain size. Also called worst-case time (or time complexity). Input Size: Depends on the problem. Examples: (a) Sorting: Number (n) of input values. (b) Graph problems: Number of nodes + Number of edges. (c) SAT problem: Number of literals + Number of operators + Number of parentheses. 12
Running Time Analysis Example 1:
Pseudocode:
1. Max = A[1]
2. for i = 2 to n do
       if (A[i] > Max) then Max = A[i]
3. Print Max
Analysis: Input size = n. Step 1: No. of operations = 1. Step 2: The for loop executes n - 1 times. Each time through the loop, we have: 13
(a) Two operations (comparison and increment) on i. (b) At most two operations (a comparison and an assignment) for the if statement. So, the total number of operations over all the iterations of the loop is at most 4(n - 1). Step 3: No. of operations = 1. So, the total number of operations carried out by the algorithm is at most 1 + 4(n - 1) + 1 = 4n - 2. Running Time Analysis Example 2: Zero Sum: Input: Array A[1.. n] containing n integers. 14
Pseudocode:
1. for i = 1 to n - 1 do
   1.1 for j = i + 1 to n do
           if (A[i] + A[j] = 0) then Print True and stop.
2. Print False.
Analysis: Step 1: The for loop runs n - 1 times. For each iteration of this loop: (a) The for loop in Step 1.1 runs n - i times. (b) During each iteration of this inner for loop, at most 6 operations are carried out: 15
Comparison and increment for j, the sum A[i] + A[j] and its comparison with 0, and the print and stop operations. (c) So, the total number of operations carried out during all iterations of the inner for loop is at most 6(n - i). So, the total number of operations carried out during all the iterations of the outer for loop is at most the sum of 6(n - i) over i = 1 to n - 1, which equals 3n(n - 1). Step 2: No. of operations = 1. Conclusion: The total number of operations carried out by the algorithm is at most 3n(n - 1) + 1. 16
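A Python sketch of the same quadratic-time procedure (0-based indexing; the name has_zero_sum_pair is a label chosen here):

def has_zero_sum_pair(A):
    n = len(A)
    for i in range(n - 1):              # Step 1: i = 1 to n - 1
        for j in range(i + 1, n):       # Step 1.1: j = i + 1 to n
            if A[i] + A[j] == 0:
                return True             # Step 1.1 prints True and stops
    return False                        # Step 2

# Example: has_zero_sum_pair([4, -7, 2, 7]) returns True, since -7 + 7 = 0.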
Disadvantages: This type of detailed analysis is too tedious. The exact number of operations is not too insightful. Simplified Representation: Order or Big-O Notation. Running time as the input size becomes large (also called asymptotic running time). Facilitates the comparison of algorithms for a problem. Basic Ideas: Use only the most dominant term in the expression for the number of operations. Suppress additive and multiplicative constants. 17
Examples: The number of steps used by the algorithm for finding the maximum is at most 4n - 2. This is expressed as O(n). So, the running time of the maximum finding algorithm is O(n). The number of steps used by the algorithm for the Zero Sum problem is at most 3n(n - 1) + 1. So, the running time of the algorithm for the Zero Sum problem is O(n^2). Additional Examples: (a) Let f(n) = n^3 + 24n^2 - 17. Then, f(n) = O(n^3). (b) Let g(n) = 2n log_2 n + 31n + 97. Then, g(n) = O(n log_2 n). Note: O(1) denotes a constant. 18
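As a small worked illustration of these ideas: the operation count 3n(n - 1) + 1 for the Zero Sum algorithm satisfies 3n(n - 1) + 1 = 3n^2 - 3n + 1 ≤ 3n^2 for every n ≥ 1; suppressing the multiplicative constant 3 leaves the dominant term n^2, which is why the count is written as O(n^2).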
Exercises: 1. Find the big-O representation for the sum of 2i^2 over i = 1 to n. 2. Suppose f(n) = 8n^3 + 2n^2 - 17 and g(n) = n^5 + n^2 + 19. Find the big-O representations for f(n) + g(n) and f(n) · g(n). Algorithms for the Maximum Block Sum Problem: Input: An array A[1.. n] of integers. Idea behind Algorithm I: Consider each block of A and compute its sum. Output the largest sum found. 19
Pseudocode for Algorithm I:
1. MaxSum = A[1]
2. for i = 1 to n do
   2.1 for j = i to n do
           temp = FindSum(A, i, j)
           if (temp > MaxSum) then MaxSum = temp
3. Print MaxSum

function FindSum(A, i, j)
1. sum = 0
2. for k = i to j do
       sum = sum + A[k]
3. return sum
20
Analysis: Each call to FindSum takes O(n) time. Each iteration of the loop in Step 2.1 runs in O(n) time (since the dominant time is due to the call to FindSum). Since the loop itself runs at most n times, the running time of the loop in Step 2.1 is O(n^2). The loop in Step 2 runs n times. Each iteration of this loop executes the loop in Step 2.1. Since the latter takes O(n^2) time, the time to complete Step 2 is O(n · n^2) = O(n^3). Steps 1 and 3 take O(1) time. So, the overall running time is O(n^3). 21
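A direct Python sketch of Algorithm I (0-based indexing; the function names are labels chosen here):

def find_sum(A, i, j):
    # Sum of the block A[i..j] (inclusive), as in the FindSum function.
    total = 0
    for k in range(i, j + 1):
        total += A[k]
    return total

def max_block_sum_v1(A):
    # Algorithm I: examine every block A[i..j] and keep the largest sum.
    max_sum = A[0]
    for i in range(len(A)):
        for j in range(i, len(A)):
            temp = find_sum(A, i, j)
            if temp > max_sum:
                max_sum = temp
    return max_sum        # O(n^3) time overall

# Example: max_block_sum_v1([2, -5, 3, 4]) returns 7.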
Algorithm II for Maximum Block Sum. Idea: For each i (1 ≤ i ≤ n), there are n - i + 1 blocks whose starting point is A[i]. For each such block, Algorithm I computes the sum in O(n) time (using the FindSum function). So, the time used to compute the sums for all the blocks with starting point A[i] is O(n^2). It is possible to compute the sums of all the blocks that start at A[i] in O(n) time as follows. Let S(i, j) denote the sum of the block A[i.. j]. Then
S(i, i) = A[i]
S(i, i+1) = S(i, i) + A[i+1]
S(i, i+2) = S(i, i+1) + A[i+2]
...
22
S(i, n) = S(i, n-1) + A[n]
The resulting algorithm runs in time O(n^2). Exercise: Write pseudocode for Algorithm II using the above idea. Verify that the running time of the algorithm is O(n^2). Algorithm III for Maximum Block Sum. Idea: (a) Observation: A block with the maximum sum ends at A[i] for some i, 1 ≤ i ≤ n. (b) For each i, compute and store the maximum sum among all blocks that end at A[i]. (c) The largest sum found in (b) is the solution. 23
How to carry out Step (b): Use an auxiliary array B[1.. n]; for each i, let B[i] store the maximum sum among the blocks that end at A[i]. Note that B[1] = A[1]. For any i ≥ 2, B[i] = B[i-1] + A[i] if B[i-1] > 0, and B[i] = A[i] otherwise. Example: To be presented in class. 24
Pseudocode for Algorithm III:
1. B[1] = A[1]
2. for i = 2 to n do
       if (B[i-1] > 0) then B[i] = B[i-1] + A[i]
       else B[i] = A[i]
3. Find and print the maximum value in B[1.. n]
Running time of Algorithm III: Step 1: O(1) time. Step 2: The loop executes at most n times and each iteration of the loop uses O(1) time. So, the total time for the loop is O(n). 25
Step 3: O(n) time. So, the overall running time is O(n). Conclusion: Algorithm III has the best running time for the Maximum Block Sum problem. Definition: An algorithm is efficient if its running time is a polynomial function of the input size. Note: Polynomial means that the running time can be expressed as O(n^k), where n is the problem size and k is a constant independent of n. 26
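Before moving on, a Python sketch of Algorithm III, which meets this definition of efficiency with an O(n) running time (0-based indexing; the name max_block_sum_v3 is a label chosen here):

def max_block_sum_v3(A):
    # B[i] = maximum sum among all blocks that end at A[i].
    n = len(A)
    B = [0] * n
    B[0] = A[0]                      # Step 1
    for i in range(1, n):            # Step 2
        if B[i - 1] > 0:
            B[i] = B[i - 1] + A[i]
        else:
            B[i] = A[i]
    return max(B)                    # Step 3: overall maximum block sum

# Example: max_block_sum_v3([2, -5, 3, 4]) returns 7.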
Examples: The algorithm for finding the maximum runs in O(n) time. The algorithm for the Zero Sum problem runs in O(n^2) time. All three algorithms for the Maximum Block Sum problem are efficient algorithms. (Their respective running times are O(n^3), O(n^2) and O(n).) Problems such as Sorting and Shortest Path can also be solved efficiently. Exercise: Several algorithms with running times of O(n log_2 n) are known for the sorting problem. Also, given a sorted array A[1.. n] and a value q, binary search can be used to determine whether or not A contains the value q in O(log_2 n) time. Use these facts to devise an O(n log_2 n) algorithm for the Zero Sum problem. 27
NP-Complete Problems: Terminology: The class P contains all the problems for which a solution can be found in polynomial time. The class NP contains all the problems for which a given solution can be verified in polynomial time. Notes: 1. NP denotes Nondeterministic Polynomial. 2. For mathematical convenience, the class NP is restricted to decision problems. 28
Example: SAT is in NP. Given an assignment of values to the Boolean variables of a formula F, we can evaluate F and thus determine whether the given assignment satisfies the formula in polynomial time. However, we don't know how to find a satisfying assignment in polynomial time. Note: Every problem in P is also in NP. (Thus, P ⊆ NP.) Other Problems in NP: Problem 7: Clique. Input: A set S of n people, a set P of pairs of people from S who know each other and an integer K ≤ n. Question: Is there a subset S′ of S containing at least K people such that each person in S′ knows every other person in S′? 29
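Returning to the SAT example above, a Python sketch of such a polynomial-time verifier (it reuses the clause-list encoding from the earlier brute-force sketch; the name verify_assignment is a label chosen here):

def verify_assignment(clauses, assignment):
    # assignment: list of booleans, where assignment[k-1] is the value of x_k.
    # A clause is satisfied if at least one of its literals is True;
    # the whole formula is satisfied if every clause is.
    # This check takes time linear in the size of the formula.
    return all(any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses)

# The satisfying assignment from the SAT example (x1 = x2 = True, x3 = False):
print(verify_assignment([[1, 2, 3], [-2, -3], [1, 3]], [True, True, False]))  # True

Checking a proposed assignment is easy; finding one is the hard part.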
Problem 8: Subset Sum. Input: A set S of n integers and another integer Q. Question: Is there a subset S′ of S such that the sum of the integers in S′ is equal to Q? Problem 9: Longest Path. Input: A graph G with vertex set V, edge set E, two vertices u and v and an integer K ≤ |V|. Question: Is there a path of length at least K between u and v? NP-Complete Problems: The hardest problems in NP. 30
These problems are equivalent in the following sense: (a) If any one of the NP-complete problems can be solved in polynomial time, then all of them can be solved in polynomial time. (If this happens, then P = NP.) (b) If we can show that there is no polynomial algorithm for any one of the NP-complete problems, then none of them can be solved in polynomial time. (If this happens, then P ≠ NP.) The problems SAT, Clique, Subset Sum and Longest Path defined above are known to be NP-complete. Thousands of problems that arise in practical applications are known to be NP-complete. Whether P = NP is a major open question. 31