Unit 5: Greedy Algorithms


Course contents: elements of the greedy strategy, activity selection, the knapsack problem, Huffman codes, task scheduling, matroid theory. Reading: Chapter 16.

Greedy Algorithm: Vertex Cover

A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that if (u, v) ∈ E, then u ∈ V' or v ∈ V', or both; the set of vertices covers all the edges. The size of a vertex cover is the number of vertices in the cover. The vertex-cover problem is to find a vertex cover of minimum size in a graph.

Greedy heuristic: cover as many edges as possible (pick a vertex with maximum degree) at each stage, then delete the covered edges. The greedy heuristic cannot always find an optimal solution! The vertex-cover problem is NP-complete.

A Greedy Algorithm: Activity Selection

A greedy algorithm always makes the choice that looks best at the moment.

An activity-selection problem: given a set S = {1, 2, ..., n} of n proposed activities, with a start time s_i and a finish time f_i for each activity i, select a maximum-size set of mutually compatible activities. If selected, activity i takes place during the half-open time interval [s_i, f_i). Activities i and j are compatible if [s_i, f_i) and [s_j, f_j) do not overlap (i.e., s_i ≥ f_j or s_j ≥ f_i).

The Activity-Selection Algorithm

Greedy-Activity-Selector(s, f)    // assume f_1 ≤ f_2 ≤ ... ≤ f_n
1. n = s.length
2. A = {1}    // a_1 in 3rd ed.
3. j = 1
4. for i = 2 to n
5.     if s_i ≥ f_j
6.         A = A ∪ {i}
7.         j = i
8. return A

Time complexity excluding sorting: O(n).

Optimality Proofs

Theorem: Algorithm Greedy-Activity-Selector produces solutions of maximum size for the activity-selection problem.
(Greedy-choice property) Suppose A ⊆ S is an optimal solution. Show that if the first activity in A is some activity k ≠ 1, then B = A - {k} ∪ {1} is an optimal solution.
(Optimal substructure) Show that if A is an optimal solution to S, then A' = A - {1} is an optimal solution to S' = {i ∈ S : s_i ≥ f_1}.
Prove by induction on the number of choices made.
(Greedy-choice property) Suppose A ⊆ S is an optimal solution. Show that if the first activity in A is some activity k ≠ 1, then B = A - {k} ∪ {1} is an optimal solution: since f_1 ≤ f_k, activity 1 is compatible with every remaining activity in A, and |B| ≥ |A|.
(Optimal substructure) Show that if A is an optimal solution to S, then A' = A - {1} is an optimal solution to S' = {i ∈ S : s_i ≥ f_1}. Example (from the activity-selection instance): A' = {4, 8, 11}, S' = {4, 6, 7, 8, 9, 11, 12}. Proof by contradiction: if A' is not an optimal solution to S', we can find a better solution A''. Then A'' ∪ {1} is a better solution than A to S, contradicting the original claim that A is an optimal solution to S.
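The selector above translates directly into Python. A minimal sketch with 0-based indices, assuming the activities are already sorted by finish time (the function name and the sample instance, the familiar textbook one, are illustrative):

```python
def greedy_activity_selector(s, f):
    """Select a maximum-size set of mutually compatible activities.

    s[i], f[i]: start and finish time of activity i (0-based).
    Assumes f[0] <= f[1] <= ... <= f[n-1].
    Returns the list of selected activity indices.
    """
    n = len(s)
    A = [0]          # greedily take the first activity
    j = 0            # index of the most recently selected activity
    for i in range(1, n):
        if s[i] >= f[j]:      # compatible with the last selected activity
            A.append(i)
            j = i
    return A

# Activities (start, finish), pre-sorted by finish time:
s = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(greedy_activity_selector(s, f))  # -> [0, 3, 7, 10]
```

In 1-based slide notation the result is the set {1, 4, 8, 11}, a maximum-size compatible set for this instance.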

Elements of the Greedy Strategy

When to apply greedy algorithms?
- Greedy-choice property: a globally optimal solution can be arrived at by making a locally optimal (greedy) choice. Dynamic programming, in contrast, needs to check the solutions to subproblems.
- Optimal substructure: an optimal solution to the problem contains within it optimal solutions to subproblems. E.g., if A is an optimal solution to S, then A' = A - {1} is an optimal solution to S' = {i ∈ S : s_i ≥ f_1}.

Greedy heuristics do not always produce optimal solutions.

Greedy algorithms vs. dynamic programming (DP):
- Common: optimal substructure.
- Difference: greedy-choice property.
- DP can be used if greedy solutions are not optimal.

Knapsack Problem

Knapsack problem: given n items, with the ith item worth v_i dollars and weighing w_i pounds, a thief wants to take as valuable a load as possible, but can carry at most W pounds in his knapsack.
- The 0-1 knapsack problem: each item is either taken or not taken (a 0-1 decision).
- The fractional knapsack problem: fractions of items may be taken.

Example: v = (60, 100, 120), w = (10, 20, 30), W = 50. The greedy solution that takes items in order of greatest value per pound is optimal for the fractional version, but not for the 0-1 version. The 0-1 knapsack problem is NP-complete, but can be solved in O(nW) time by DP. (Is this a polynomial-time DP? No: W is not polynomial in the input size, so the DP is only pseudo-polynomial.)

Coding

Coding is used for data compression, instruction-set encoding, etc. A binary character code represents each character by a unique binary string.
- Fixed-length code (block code): a: 000, b: 001, ..., f: 101; "ace" is encoded as 000 010 100.
- Variable-length code: frequent characters get short codewords; infrequent characters get long codewords.

Binary Tree vs. Prefix Code

Prefix code: no codeword is a prefix of another codeword. A prefix code corresponds to a binary tree whose leaves are the characters.
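The greedy rule for the fractional version can be sketched as follows (the function name and structure are illustrative; the instance is the example above):

```python
def fractional_knapsack(v, w, W):
    """Greedy fractional knapsack: take items in order of value per pound.

    v, w: item values and weights; W: knapsack capacity.
    Returns the maximum total value attainable.
    """
    # Sort items by value density, greatest first.
    items = sorted(zip(v, w), key=lambda it: it[0] / it[1], reverse=True)
    total = 0.0
    for value, weight in items:
        if W <= 0:
            break
        take = min(weight, W)           # whole item, or the fraction that fits
        total += value * (take / weight)
        W -= take
    return total

# The example from the text: greedy is optimal for the fractional version.
print(fractional_knapsack([60, 100, 120], [10, 20, 30], 50))  # -> 240.0
```

On the same instance, the 0-1 variant of this greedy rule takes items 1 and 2 for a value of only 160, while the optimal 0-1 load (items 2 and 3) is worth 220, which is why the greedy choice fails for the 0-1 version.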
Optimal Prefix Code Design

Cost of a code tree T: B(T) = Σ_{c ∈ C} c.freq · d_T(c), where
- c: a character in the alphabet C,
- c.freq: the frequency of c,
- d_T(c): the depth of c's leaf in T (the length of the codeword of c).

Code design: given c_1.freq, c_2.freq, ..., c_n.freq, construct a binary tree with n leaves such that B(T) is minimized. Idea: more frequently used characters get smaller depth.

Huffman's Procedure

Pair the two nodes with the least costs at each step.

Huffman's Algorithm

Huffman(C)
1. n = |C|
2. Q = C    // build a min-priority queue keyed on freq
3. for i = 1 to n - 1
4.     allocate a new node z
5.     z.left = x = Extract-Min(Q)
6.     z.right = y = Extract-Min(Q)
7.     z.freq = x.freq + y.freq
8.     Insert(Q, z)
9. return Extract-Min(Q)    // return the root of the tree

Time complexity: O(n lg n). Each Extract-Min(Q) takes O(lg n) with a binary heap; the initial heap can be built in O(n) time with Build-Heap (O(n lg n) by repeated Insert).

Huffman's Algorithm: Greedy Choice

Greedy choice: there is an optimal prefix tree in which the two characters x and y with the lowest frequencies are sibling leaves of maximum depth, so their codewords have the same length and differ only in the last bit.

Huffman's Algorithm: Optimal Substructure

Optimal substructure: let T be a full binary tree for an optimal prefix code over C, and let z be the parent of two leaf characters x and y, with z.freq = x.freq + y.freq. Then the tree T' = T - {x, y} represents an optimal prefix code for C' = C - {x, y} ∪ {z}.
Proof sketch: B(T) = B(T') + x.freq + y.freq. If T' were not optimal for C', some tree T'' for C' would satisfy B(T'') < B(T'). But then B(T'') + x.freq + y.freq < B(T') + x.freq + y.freq = B(T), contradicting the optimality of T.

Task Scheduling

The task-scheduling problem: schedule unit-time tasks with deadlines and penalties so that the total penalty for missed deadlines is minimized.
- S = {1, 2, ..., n}: a set of n unit-time tasks.
- Deadlines d_1, d_2, ..., d_n for the tasks, 1 ≤ d_i ≤ n.
- Penalties w_1, w_2, ..., w_n: w_i is incurred if task i misses its deadline.

A set A of tasks is independent if there exists a schedule with no late tasks. Let N_t(A) be the number of tasks in A with deadline t or earlier, t = 1, 2, ..., n. Three equivalent statements for any set of tasks A:
1. A is independent.
2. N_t(A) ≤ t, for t = 1, 2, ..., n.
3. If the tasks in A are scheduled in order of nondecreasing deadlines, then no task is late.

Greedy Algorithm: Task Scheduling

The optimal greedy scheduling algorithm:
1. Sort penalties in nonincreasing order.
2. Greedily build a maximum-penalty independent set of tasks (no late task in the set).
3. Schedule the tasks in the maximum independent set in order of nondecreasing deadlines.
4. Schedule the remaining tasks (which miss their deadlines) at the end, in arbitrary order.
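Huffman(C) maps naturally onto a binary heap. A Python sketch using heapq (the tiebreak counter only keeps the heap from comparing trees on equal frequencies; the sample frequencies are the usual textbook ones):

```python
import heapq

def huffman_codes(freq):
    """Build Huffman codewords from a {character: frequency} map.

    Mirrors Huffman(C): repeatedly merge the two lowest-frequency nodes.
    Returns {character: codeword}.
    """
    # Heap entries: (frequency, tiebreak, tree); a tree is a char or [left, right].
    heap = [(f, i, c) for i, (c, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate single-character alphabet
        return {heap[0][2]: "0"}
    tiebreak = len(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)       # x = Extract-Min(Q)
        fy, _, y = heapq.heappop(heap)       # y = Extract-Min(Q)
        heapq.heappush(heap, (fx + fy, tiebreak, [x, y]))  # z.freq = x.freq + y.freq
        tiebreak += 1
    codes = {}
    def walk(tree, prefix):                  # read codewords off the tree
        if isinstance(tree, list):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

# Standard textbook frequencies:
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_codes(freq)
cost = sum(freq[c] * len(codes[c]) for c in freq)  # B(T)
print(codes, cost)  # optimal cost B(T) = 224
```

The resulting code is prefix-free by construction, since characters appear only at the leaves of the merge tree.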
Matroids

Let S be a finite set, and F a nonempty family of subsets of S, that is, F ⊆ P(S). We call (S, F) a matroid if and only if:
M1) If B ∈ F and A ⊆ B, then A ∈ F. [The family F is called hereditary.]
M2) If A, B ∈ F and |A| < |B|, then there exists x ∈ B \ A such that A ∪ {x} ∈ F. [This is called the exchange property.]
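For small set systems the two axioms can be checked by brute force. A sketch (is_matroid is an illustrative helper, not a library function; the examples are a uniform matroid and a family that violates the exchange property):

```python
from itertools import combinations

def is_matroid(S, F):
    """Brute-force check of the matroid axioms for a small set system.

    S: iterable of elements; F: collection of independent sets.
    """
    F = {frozenset(A) for A in F}
    if not F:
        return False                       # F must be nonempty
    # M1: hereditary -- every subset of an independent set is independent.
    for B in F:
        for r in range(len(B) + 1):
            if any(frozenset(A) not in F for A in combinations(B, r)):
                return False
    # M2: exchange -- |A| < |B| implies some x in B \ A with A + {x} independent.
    for A in F:
        for B in F:
            if len(A) < len(B):
                if not any(A | {x} in F for x in B - A):
                    return False
    return True

# Uniform matroid U(2, 3): all subsets of {1, 2, 3} with at most 2 elements.
S = {1, 2, 3}
F = [frozenset(A) for r in range(3) for A in combinations(S, r)]
print(is_matroid(S, F))   # True
# All subsets of {1, 2} plus the lone set {3}: hereditary, but the
# exchange property fails for A = {3}, B = {1, 2}.
print(is_matroid(S, [frozenset(x) for x in [(), (1,), (2,), (1, 2), (3,)]]))  # False
```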

Example 1 (Matric Matroids)

Let M be a matrix. Let S be the set of rows of M and F = { A | A ⊆ S, A is linearly independent }.
Claim: (S, F) is a matroid.
Clearly, F is nonempty (it contains every row of M).
M1) If B is a set of linearly independent rows of M, then any subset A ⊆ B is also linearly independent. Thus, F is hereditary.
M2) If A, B are sets of linearly independent rows of M and |A| < |B|, then dim span A < dim span B. Choose a row x ∈ B that is not contained in span A. Then A ∪ {x} is a linearly independent set of rows of M. Therefore, F satisfies the exchange property.

Undirected Graphs

Let V be a finite set and E a subset of { e | e ⊆ V, |e| = 2 }. Then (V, E) is called an undirected graph. We call V the set of vertices and E the set of edges of the graph.
Example: V = {a, b, c, d}, E = { {a,b}, {a,c}, {a,d}, {b,d}, {c,d} }.

Induced Subgraphs

Let G = (V, E) be a graph. We call a graph (V, E') an induced subgraph of G if and only if its edge set E' is a subset of E.

Spanning Trees

Given a connected graph G, a spanning tree of G is an induced subgraph of G that happens to be a tree and connects all vertices of G. If the edges are weighted, then a spanning tree of G with minimum total weight is called a minimum spanning tree.

Example 2 (Graphic Matroids)

Let G = (V, E) be an undirected graph. Choose S = E and F = { A | H = (V, A) is an induced subgraph of G such that H is a forest }.
Claim: (S, F) is a matroid.
M1) F is a nonempty hereditary set system.
M2) Let A, B ∈ F with |A| < |B|. Then the forest (V, B) has fewer trees (connected components) than (V, A). Therefore, B must contain an edge x whose endpoints lie in different trees of the forest (V, A). Adding x to A connects those two trees and yields another forest (V, A ∪ {x}).

Weight Functions

A matroid (S, F) is called weighted if it is equipped with a weight function w: S → R+, i.e., all weights are positive real numbers. If A is a subset of S, then w(A) := Σ_{a ∈ A} w(a). Weight functions of this form are sometimes called linear weight functions.
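The independence test for the graphic matroid — is (V, A) a forest? — can be sketched with union-find. The example reuses the graph V = {a, b, c, d} above, with vertices renamed 0..3:

```python
def is_forest(num_vertices, edges):
    """Independence oracle for the graphic matroid: is (V, A) acyclic?

    Vertices are 0..num_vertices-1; edges is a list of (u, v) pairs.
    Union-find: an edge inside an existing component closes a cycle.
    """
    parent = list(range(num_vertices))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False    # u, v already connected: (u, v) would make a cycle
        parent[ru] = rv
    return True

# a, b, c, d -> 0, 1, 2, 3.
# {a,b}, {a,c}, {a,d} is a forest; adding {b,d} closes the cycle a-b-d-a.
print(is_forest(4, [(0, 1), (0, 2), (0, 3)]))          # True
print(is_forest(4, [(0, 1), (0, 2), (0, 3), (1, 3)]))  # False
```

Each edge costs nearly constant amortized time, so this oracle makes the matroid greedy algorithm below efficient for graphs.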

Greedy Algorithm for Matroids

Greedy(M = (S, F), w)
    A := ∅;
    sort S into monotonically decreasing order by weight w;
    for each x in S, taken in monotonically decreasing order do
        if A ∪ {x} ∈ F then
            A := A ∪ {x};
        fi;
    od;
    return A;

Correctness

Theorem: Let M = (S, F) be a weighted matroid with weight function w. Then Greedy(M, w) returns a set in F of maximal weight. [Thus, even though greedy algorithms in general do not produce optimal results, the greedy algorithm for matroids does! This algorithm is applicable to a wide class of problems, yet the correctness proof for Greedy is no more difficult than the correctness proof for a special instance such as Huffman coding. This is economy of thought!]

Complexity

Let n = |S| = the number of elements in the set S.
- Sorting S: O(n log n).
- The for-loop iterates n times. In the body of the loop one needs to check whether A ∪ {x} ∈ F; if each check takes f(n) time, the loop takes O(n f(n)) time.
Thus, Greedy takes O(n log n + n f(n)) time.

Minimizing or Maximizing?

Let M = (S, F) be a matroid. The algorithm Greedy(M, w) returns a set A ∈ F maximizing the weight w(A). If we would like to find a set A ∈ F with minimal weight, we can use Greedy with the weight function w'(a) = m - w(a), where m is a real number such that m > max_{s ∈ S} w(s).

Matric Matroids

Let M be a matrix. Let S be the set of rows of M and F = { A | A ⊆ S, A is linearly independent }, with weight function w(A) = |A|. What does Greedy((S, F), w) compute? The algorithm yields a basis of the vector space spanned by the rows of the matrix M.

Graphic Matroids

Let G = (V, E) be an undirected connected graph. Let S = E and F = { A | H = (V, A) is an induced subgraph of G such that H is a forest }. Let w be a weight function on E, and define w'(a) = m - w(a), where m > w(a) for all a ∈ E. Then Greedy((S, F), w') returns a minimum spanning tree of G. This algorithm is known as Kruskal's algorithm.
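Greedy(M, w) can be sketched generically: the matroid enters only through an independence oracle, and the m − w(a) trick above then recovers Kruskal's algorithm on the graphic matroid (the edge weights below are illustrative, not from the notes; is_forest is restated so the sketch is self-contained):

```python
def matroid_greedy(S, w, independent):
    """Greedy(M, w): scan elements by decreasing weight, keep x if A + {x} stays independent.

    S: list of elements; w: element -> weight; independent: list -> bool (oracle for F).
    """
    A = []
    for x in sorted(S, key=w, reverse=True):
        if independent(A + [x]):
            A.append(x)
    return A

def is_forest(n, edges):
    """Independence oracle for the graphic matroid (union-find cycle check)."""
    parent = list(range(n))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True

# Weighted graph on vertices 0..3 (the example graph, with made-up weights).
edges = {(0, 1): 2, (0, 2): 1, (0, 3): 3, (1, 3): 4, (2, 3): 5}
m = 6  # any m greater than the maximum weight
mst = matroid_greedy(
    list(edges),
    lambda e: m - edges[e],          # minimize w by maximizing m - w
    lambda A: is_forest(4, A),
)
print(mst, sum(edges[e] for e in mst))  # MST edges, total weight 6
```

Maximizing m − w(a) visits the edges in increasing order of w, so the run above is exactly Kruskal's algorithm on this graph.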

Kruskal's MST Algorithm

Consider the edges in increasing order of weight; add an edge if and only if it does not create a cycle.

Matroids characterize a class of problems for which the greedy algorithm yields an optimal solution. Kruskal's minimum spanning tree algorithm fits nicely into this framework.