CS2223 Algorithms B Term 2013 Exam 2 Solutions


Dec. 3, 2013
By Prof. Carolina Ruiz
Dept. of Computer Science, WPI

PROBLEM 1: Solving Recurrences (15 points)

Solve the recurrence T(n) = 2T(n/3) + n using the substitution method (= "guess + induction"). Assume T(1) = 1. Hint: Use the master theorem to make a good initial guess for the substitution method. Show your work.

The above recurrence satisfies case 3 of the master theorem. Here f(n) = n, and log_3 2 < 1. Note that there is a constant ε > 0 such that f(n) = Ω(n^(log_3 2 + ε)), namely ε = 1 - log_3 2; and there is a constant c < 1 such that 2f(n/3) = 2n/3 ≤ cn = c·f(n) for all n ≥ 0, namely c = 2/3. Hence, according to the master theorem, T(n) = Θ(f(n)) = Θ(n).

Let's use this result from the master theorem to set our initial guess for the substitution method.

Guess: T(n) ≤ dn, where d is a constant.

Proof by (strong) induction: Let's prove that the inequality T(n) ≤ dn holds for all n ≥ 1.

Base case: Let's prove that the inequality holds for n = 1. For that, we'd need T(1) = 1 ≤ d. We'll use this constraint on d later in the proof.

Induction step: Let's prove that the inequality holds for n > 1.

Induction hypothesis: Let n > 1, and assume that T(m) ≤ dm for all m < n.

Now let's prove that the inequality holds for n as well:

    T(n) = 2T(n/3) + n
         ≤ 2d(n/3) + n       by the induction hypothesis, since n/3 < n
         = ((2/3)d + 1)n

and we want this to be ≤ dn. Therefore we need (2/3)d + 1 ≤ d, and so 3 ≤ d. From the base case, we also need 1 ≤ d. Take, for instance, d = 3. Hence the inequality T(n) ≤ 3n holds for all n ≥ 1, and therefore T(n) = O(n).
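The bound T(n) ≤ 3n can also be checked numerically. The sketch below is an illustration, not part of the exam solution: it discretizes n/3 as floor(n/3), and it adds the assumption T(0) = 1 (the exam only states T(1) = 1) so that the recursion bottoms out for every n.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Recurrence from Problem 1, discretized with floor division.

    Base cases: T(0) = T(1) = 1. T(0) is an added assumption so that,
    e.g., T(2) = 2*T(0) + 2 is defined; the exam only gives T(1) = 1.
    """
    if n <= 1:
        return 1
    return 2 * T(n // 3) + n

# The substitution-method bound T(n) <= 3n holds on this whole range:
assert all(T(n) <= 3 * n for n in range(1, 5000))
```

The same induction as above goes through with floors: for n ≥ 3, T(n) ≤ 2·3·(n/3) + n = 3n, and the small cases n = 1, 2 hold directly.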

PROBLEM 2: Divide and Conquer: Sorting (25 points)

In this problem you will show that the running time of QuickSort is Θ(n^2) when the input array A contains distinct elements and is sorted in decreasing order. The QuickSort algorithm discussed in the textbook and in class is provided below.

    QuickSort(A, p, r)
        if p < r
            q = Partition(A, p, r)
            QuickSort(A, p, q - 1)
            QuickSort(A, q + 1, r)

    Partition(A, p, r)
        x = A[r]                  # x is the pivot
        i = p - 1
        for j = p to r - 1
            if A[j] ≤ x
                i = i + 1
                exchange A[i] with A[j]
        exchange A[i + 1] with A[r]
        return i + 1

1. (10 points) Write a recurrence for the runtime T(n) of the QuickSort algorithm above if the input array A contains distinct elements and is sorted in decreasing order. Explain your work.

In this case, the pivot element x = A[r] is smaller than all the elements in A[p...r-1]. Hence q = p after the pivot element has been relocated to the beginning of the array. The recursive calls to QuickSort will be QuickSort(A, p, p-1) and QuickSort(A, p+1, r). Note also that Partition takes linear time. Therefore, the recurrence satisfied by T(n) in this case is:

    T(n) = T(0) + T(n-1) + n.

Here T(0) = 1, and so T(n) = T(n-1) + n + 1.
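As a sanity check on this analysis, here is a direct Python transcription of the pseudocode above (an illustrative sketch; the comparison counter is my instrumentation, not part of the exam). On a decreasing input, the successive Partition calls perform (n-1) + (n-2) + ... + 1 = n(n-1)/2 key comparisons in total, matching the quadratic recurrence:

```python
comparisons = 0

def partition(A, p, r):
    """Lomuto partition from the exam's pseudocode (0-indexed here)."""
    global comparisons
    x = A[r]                          # pivot
    i = p - 1
    for j in range(p, r):
        comparisons += 1              # count each A[j] <= x test
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)

n = 200
A = list(range(n, 0, -1))             # distinct elements, decreasing order
quicksort(A, 0, n - 1)
assert A == sorted(A)
assert comparisons == n * (n - 1) // 2   # quadratic, as the recurrence predicts
```

Note that n is kept modest (200) because this worst-case input drives the recursion depth to roughly n.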

2. (15 points) Solve your recurrence to show that T(n) = Θ(n^2). For this, either use the recursion-tree method (= unrolling the recurrence), the substitution method (= "guess + induction"), or the master theorem. Show your work and explain your answer.

Note that the master theorem doesn't apply here, since the recurrence is not of the form T(n) = aT(n/b) + f(n). That is, there is no constant b for which T(n-1) = T(n/b) for all n.

I'll use the recursion-tree method to solve this recurrence. Each level of the tree is listed with its cost:

    T(n)      cost: n + 1
    T(n-1)    cost: (n - 1) + 1 = n
    T(n-2)    cost: n - 1
    ...
    T(0)      cost: 1

Thus, T(n) = sum_{i=1}^{n+1} i = (n + 2)(n + 1)/2 = (n^2 + 3n + 2)/2 = Θ(n^2).
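A quick numeric check of this closed form (illustrative, not part of the exam solution): unroll T(n) = T(n-1) + n + 1 from T(0) = 1 and compare against (n + 2)(n + 1)/2.

```python
def T(n):
    """Unroll T(n) = T(n-1) + n + 1 iteratively, starting from T(0) = 1."""
    t = 1
    for i in range(1, n + 1):
        t = t + i + 1
    return t

# Matches the closed form (n + 2)(n + 1)/2 exactly:
assert all(T(n) == (n + 2) * (n + 1) // 2 for n in range(500))
```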

PROBLEM 3: Divide and Conquer: Search (60 points)

Consider the following search problem.

Input: an array A[1...n] of length n, containing a sorted (in ascending order) sequence of numbers, that is, A[1] ≤ A[2] ≤ ... ≤ A[n]; and a value v.

Output: i, if A[i] = v, where 1 ≤ i ≤ n; 0, otherwise.

1. (10 points) Naïve solution. Show that there is a simple algorithm that solves this problem in O(n) time. Write the algorithm and show that its runtime is O(n). Explain your answer.

The naïve algorithm traverses the array cell by cell from left to right looking for the value v, until it finds v or it runs out of cells. (Technically, it would be enough to look until it finds v or it encounters an element greater than v, since the array is sorted in ascending order.) Since the above algorithm checks each cell of the array at most once, and checking an individual cell takes constant time, the algorithm runs in O(n) time. Here is the pseudo-code for this algorithm, with a more detailed time complexity analysis:

    naïveSearch(A[1..n], v)            Time
        for i = 1 to n                 c1 * (n + 1)
            if A[i] == v               c2 * n
                return i               c3 * 1
        return 0                       c4 * 1

    Total time: (c1 + c2)n + (c1 + c3 + c4) = O(n)
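The pseudocode translates directly to Python. This sketch (not part of the exam) uses a 0-indexed list internally but returns the exam's 1-based index, and includes the early exit mentioned in the parenthetical remark:

```python
def naive_search(A, v):
    """Linear scan of a sorted list A; returns a 1-based index, or 0 if absent."""
    for i, x in enumerate(A):
        if x == v:
            return i + 1              # convert to the exam's 1-based index
        if x > v:                     # early exit: A is sorted ascending
            return 0
    return 0

A = [3, 3, 4, 6, 7, 10, 12, 17, 25, 32, 41]
assert naive_search(A, 12) == 7
assert naive_search(A, 5) == 0
```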

2. Divide and Conquer solution. Let's construct a more efficient, divide-and-conquer solution to this problem. This solution is called binary search: compare the element at the midpoint of the array A with v and eliminate half of the array from further consideration, until either v is found in A, or no elements of A remain under consideration.

For example, let n = 11 and:

    index: 1  2  3  4  5  6   7   8   9   10  11
    A:     3  3  4  6  7  10  12  17  25  32  41

For each of the following sample values of v, our BinarySearch(A, v, 1, n) algorithm would work as follows.

If v = 12: The midpoint of A[1...11] is the index 6 = floor((1 + 11)/2). Since v = 12 > 10 = A[6], we should look for v in A[7...11] and eliminate A[1...6] from consideration. Recursively, the midpoint of A[7...11] is the index 9 = floor((7 + 11)/2). Since v = 12 < 25 = A[9], we should look for v in A[7...8] and eliminate A[9...11] from consideration. Recursively, the midpoint of A[7...8] is the index 7 = floor((7 + 8)/2). Since v = 12 = A[7], v is found in the array and the index 7 is returned as the answer.

Another example, if v = 5: The midpoint of A[1...11] is the index 6 = floor((1 + 11)/2). Since v = 5 < 10 = A[6], we should look for v in A[1...5] and eliminate A[6...11] from consideration. Recursively, the midpoint of A[1...5] is the index 3 = floor((1 + 5)/2). Since v = 5 > 4 = A[3], we should look for v in A[4...5] and eliminate A[1...3] from consideration. Recursively, the midpoint of A[4...5] is the index 4 = floor((4 + 5)/2). Since v = 5 < 6 = A[4], we should look for v in A[4...3], which is an empty array, and therefore 0 is returned: v = 5 is not in the original array A[1...11].

Solve this problem by answering the questions below.

(a) (20 points) Algorithm. Write a detailed algorithm (in pseudo-code) implementing the BinarySearch(A, v, p, r) divide-and-conquer, recursive solution described above. Here, p and r are indexes into the array A. In the initial call to your algorithm, p = 1 and r = n. Explain your work.

    BinarySearch(A, v, p, r)
        if p > r
            return 0
        else
            k = floor((p + r)/2)
            if A[k] == v
                return k
            else if v < A[k]
                return BinarySearch(A, v, p, k - 1)
            else                  # in this case, v > A[k]
                return BinarySearch(A, v, k + 1, r)

(b) (10 points) Correctness. Prove that your pseudo-code is correct with respect to its input-output specification (that is, it always terminates and returns the right answer for the given input). Explain your answer.

This solution is taken from the solutions to Homework 5, Problem 4 (by Piotr Mardziel) from my offering of CS2223 in B term 2005: http://www.cs.wpi.edu/~cs2223/b05/hw/hw5/solutionshw5/. The algorithm as described progressively narrows down the portion of A that is considered. To show the correctness of the method, let's show the following three claims:

Claim 1: At any point in the execution, if v appears in the array A, then the correct index (i.e., the position of v in the array) is somewhere between p and r.

Proof: Initially p = 1 and r = n, bounding the entire array A, and in this case the right index is certainly somewhere in the claimed range. Next, consider the midpoint index k between p and r. Since the array A is sorted in ascending order, the condition A[k] > v implies that the correct element cannot have an index greater than k, as the array values there are all greater than A[k], which is already greater than v. Hence this case implies that the proper value must be in the other half of the array. A similar argument handles the opposite case, A[k] < v. Finally, note the last case, in which A[k] = v. Here the correct index is simply returned (though this doesn't have much to do with the claimed statement).

Claim 2: If there is an index i such that A[i] = v, then the algorithm will output this index.

Proof: Having established that the correct index is in the range between p and r at all times, we now note that the difference between those two decreases strictly with each new recursive call. Thus, if the correct index is not found as a midpoint k earlier, eventually p = r and the correct index will be returned as claimed.

Claim 3: If there is NO index i such that A[i] = v, then the algorithm will output 0.

Proof: Note that the only way for an index > 0 to be returned is if the value v is actually found in the array. Next, notice that after each successive recursive call, the difference between p and r decreases. These two facts lead to the conclusion that eventually the condition p > r will hold. At that point the algorithm returns 0, as claimed.

(c) Time Complexity. Analyze the time complexity of your algorithm.

i. (10 points) Recurrence. Write a recurrence for the runtime T(n) of your algorithm. Explain your work.

    T(0) = 1
    T(n) = T(n/2) + c.

This is because in the recursive step, the problem of finding v in an array of length n is reduced to the problem of finding v in an array of length n/2; and it takes constant time to construct the subproblem to be solved (searching for v in the left half or the right half of the array) and to produce the final solution from the subproblem's solution.
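Both the pseudocode from part (a) and the claims from part (b) can be exercised concretely. The sketch below (not part of the exam) is a 1-indexed Python transcription of BinarySearch; the `None` placeholder at index 0 is my device to keep the exam's 1-based indexing. It is checked on the worked examples and then cross-checked against a plain membership test on random sorted inputs:

```python
import random

def binary_search(A, v, p, r):
    """BinarySearch(A, v, p, r) from part (a); A is 1-indexed via A[0] = None."""
    if p > r:
        return 0                      # v is not in A[p..r]
    k = (p + r) // 2                  # floor((p + r)/2)
    if A[k] == v:
        return k
    elif v < A[k]:
        return binary_search(A, v, p, k - 1)
    else:                             # v > A[k]
        return binary_search(A, v, k + 1, r)

A = [None, 3, 3, 4, 6, 7, 10, 12, 17, 25, 32, 41]   # the exam's example array
assert binary_search(A, 12, 1, 11) == 7              # found at index 7
assert binary_search(A, 5, 1, 11) == 0               # 5 is absent

# Cross-check Claims 2 and 3 on random sorted arrays:
random.seed(0)
for _ in range(200):
    xs = sorted(random.randrange(50) for _ in range(random.randrange(1, 30)))
    B = [None] + xs
    for v in range(-1, 51):
        i = binary_search(B, v, 1, len(xs))
        if i == 0:
            assert v not in xs                       # Claim 3
        else:
            assert B[i] == v                         # Claim 2
```

With duplicates (as in the example array, where 3 appears twice), the algorithm may return any index holding v, so the cross-check asserts B[i] == v rather than a particular position.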

ii. (10 points) Solving the recurrence. Use the master theorem, the recursion-tree method, OR the substitution method to solve your recurrence. Find the tightest asymptotic upper bound g(n) for T(n) you can, for which T(n) = O(g(n)). Explain your answer and justify each step.

Although only one of the alternative proofs below is needed, I provide all three of them here for illustration purposes.

Solution using the master theorem: The recurrence T(n) = T(n/2) + c has the form T(n) = aT(n/b) + f(n), where a = 1, b = 2, and f(n) = c = Θ(n^(log_b a)) (note that log_b a = log_2 1 = 0). Hence, according to case 2 of the master theorem, T(n) = Θ(n^(log_b a) · log n) = Θ(n^0 · log n) = Θ(log n).

Solution using the recursion-tree method: Each level of the tree is listed with its cost:

    T(n)      cost: c
    T(n/2)    cost: c
    T(n/4)    cost: c
    ...
    T(1)      cost: c
    T(0)      cost: 1

Thus, since the height of the tree is log_2 n, T(n) = (sum_{i=1}^{log_2 n} c) + 1 = c·log_2 n + 1 = Θ(log n).
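The logarithmic growth can also be observed numerically. The sketch below (an illustration, not part of the exam) discretizes n/2 as floor(n/2); under that discretization, with T(0) = 1, the recurrence has the exact solution T(n) = 1 + c·(⌊log₂ n⌋ + 1) for n ≥ 1, since each halving step adds one c and n is halved ⌊log₂ n⌋ + 1 times before reaching 0. Python's `int.bit_length()` computes ⌊log₂ n⌋ + 1 directly:

```python
C = 5                                  # any constant cost per level

def T(n):
    """T(0) = 1; T(n) = T(floor(n/2)) + C for n >= 1."""
    return 1 if n == 0 else T(n // 2) + C

# Exact logarithmic growth, matching the recursion-tree sum:
assert all(T(n) == 1 + C * n.bit_length() for n in range(1, 4096))
```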

Solution using the substitution method:

Guess: T(n) ≤ d·log_k n, where d and k are constants.

Proof by (strong) induction: Let's prove that the inequality T(n) ≤ d·log_k n holds for all n ≥ 2.

Base case: Let's prove that the inequality holds for n = 2. For that, we'd need T(2) = c ≤ d·log_k 2. We'll use this constraint on c, d, and k later in the proof.

Induction step: Let's prove that the inequality holds for n > 2.

Induction hypothesis: Let n > 2, and assume that T(m) ≤ d·log_k m for all m < n.

Now let's prove that the inequality holds for n as well:

    T(n) = T(n/2) + c
         ≤ d·log_k(n/2) + c          by the induction hypothesis, since n/2 < n
         = d·log_k n - d·log_k 2 + c

and we want this to be ≤ d·log_k n. Therefore we need -d·log_k 2 + c ≤ 0, and so c ≤ d·log_k 2. From the base case, we also need c ≤ d·log_k 2. Take, for instance, k = 2 and d = c. Hence the inequality T(n) ≤ c·log_2 n holds for all n ≥ 2, and therefore T(n) = O(log n).