CS 360: Data Structures and Algorithms
Divide-and-Conquer (part 1)

Format of Divide-and-Conquer algorithms:
  Divide: Split the array or list into smaller pieces
  Conquer: Solve the same problem recursively on smaller pieces
  Combine: Build the full solution from the recursive results
Merge sort example, A = [26, 48, 17, 31, 50, 9, 21, 16] (indices 0..7):

  Split:  [26 48 17 31]          [50 9 21 16]
  Split:  [26 48]   [17 31]      [50 9]   [21 16]
  Split:  [26] [48] [17] [31]    [50] [9] [21] [16]
  Merge:  [26 48]   [17 31]      [9 50]   [16 21]
  Merge:  [17 26 31 48]          [9 16 21 50]
  Merge:  [9 16 17 21 26 31 48 50]
MergeSort(A[ ], low, high) {          // initially low = 0, high = 7; n = high - low + 1
    if (low == high) return;
    mid = (low + high) / 2;
    MergeSort(A, low, mid);
    MergeSort(A, mid+1, high);
    Merge(A, low, mid, high);
}

Merge(A[ ], low, mid, high) {         // some details omitted
    allocate array B[ ] of size n;
    repeat n times {
        compare the next values from each half of A;
        copy the smaller of these two values from A to B;
    }
    repeat n times
        copy the next value from B to A;
}

Analysis of merge sort:
  Merge function takes θ(n) time, where n = high - low + 1
  Merge sort has lg n levels of recursive calls
  All the merges at any given level take θ(n) total time
  So θ(n lg n) total time for all merges at all levels
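A concrete version of the pseudocode above, as a Python sketch (the leftover-half handling that the pseudocode omits is filled in here; the variable names are mine):

def merge_sort(A, low, high):
    """Sort A[low..high] in place by divide-and-conquer."""
    if low >= high:                      # zero or one element: already sorted
        return
    mid = (low + high) // 2
    merge_sort(A, low, mid)              # conquer: sort the left half
    merge_sort(A, mid + 1, high)         # conquer: sort the right half
    merge(A, low, mid, high)             # combine: merge the two sorted halves

def merge(A, low, mid, high):
    """Merge sorted A[low..mid] and A[mid+1..high] using a temporary list B."""
    B = []
    i, j = low, mid + 1
    while i <= mid and j <= high:        # copy the smaller front value of the two halves
        if A[i] <= A[j]:
            B.append(A[i]); i += 1
        else:
            B.append(A[j]); j += 1
    B.extend(A[i:mid + 1])               # at most one of these two slices is non-empty
    B.extend(A[j:high + 1])
    A[low:high + 1] = B                  # copy the merged values back into A

A = [26, 48, 17, 31, 50, 9, 21, 16]      # the array from the diagram above
merge_sort(A, 0, len(A) - 1)
print(A)                                 # [9, 16, 17, 21, 26, 31, 48, 50]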
Merge sort:
  Divide: Split array of size n into two subarrays of size n/2
  Conquer: Two recursive calls on subarrays of size n/2
  Combine: Merge two sorted subarrays to create sorted array of size n

Let T(n) = running time of merge sort on array of size n
  T(n) = T_divide(n) + T_conquer(n) + T_combine(n)
  T(n) = θ(1) + [T(n/2) + T(n/2)] + θ(n)
  T(n) = 2 T(n/2) + θ(n)

  Number of recursive subproblems = 2
  Size of each subproblem = n/2
  Time for all the non-recursive steps = θ(n)

Later we'll see how to solve these kinds of recurrence equations to determine T(n)
The solution for T(n) = 2 T(n/2) + θ(n) is T(n) = θ(n lg n)
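As a preview of that solution, here is a rough unrolling of the recurrence (a sketch only: it assumes n is a power of 2 and replaces the θ(n) term with cn for some constant c):

  T(n) = 2 T(n/2) + cn
       = 4 T(n/4) + cn + cn
       = 8 T(n/8) + cn + cn + cn
       ...
       = 2^k T(n/2^k) + k*cn
       = n T(1) + cn lg n          when k = lg n
       = θ(n lg n)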
Binary search example, Find(27) in A = [13, 18, 24, 27, 34, 41, 49, 55, 60, 66, 72] (indices 0..10):

  27 < 41, so search the left half, indices 0..4:  [13, 18, 24, 27, 34]
  27 > 24, so search the right half, indices 3..4: [27, 34]
  found 27 at index 3

Let T(n) = running time of binary search on array of size n
  T(n) = T_divide(n) + T_conquer(n) + T_combine(n)
  T(n) = θ(1) + T(n/2) + 0
  T(n) = T(n/2) + θ(1)

  Number of recursive subproblems = 1
  Size of each subproblem = n/2
  Time for all the non-recursive steps = θ(1)

Later we'll see how to solve these kinds of recurrence equations to determine T(n)
The solution for T(n) = T(n/2) + θ(1) is T(n) = θ(lg n)
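A recursive Python sketch of the search just traced (the function name and the convention of returning -1 when the value is absent are my own choices):

def binary_search(A, target, low, high):
    """Return an index of target in sorted A[low..high], or -1 if it is absent."""
    if low > high:                           # empty range: not found
        return -1
    mid = (low + high) // 2
    if A[mid] == target:                     # found it
        return mid
    elif target < A[mid]:
        return binary_search(A, target, low, mid - 1)    # search the left half
    else:
        return binary_search(A, target, mid + 1, high)   # search the right half

A = [13, 18, 24, 27, 34, 41, 49, 55, 60, 66, 72]
print(binary_search(A, 27, 0, len(A) - 1))               # 3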
Polynomial multiplication and Large integer multiplication:

Given two polynomials:
  P = 5x^3 + 7x^2 + 6x + 2        [5, 7, 6, 2] = [p3, p2, p1, p0]
  Q = x^3 - 8x^2 + 9x - 1         [1, -8, 9, -1] = [q3, q2, q1, q0]
  P*Q = (5x^3 + 7x^2 + 6x + 2)(x^3 - 8x^2 + 9x - 1)

Alternatively, given two large integers:
  P = 9863 = 9x^3 + 8x^2 + 6x + 3   where x = 10
  Q = 7245 = 7x^3 + 2x^2 + 4x + 5   where x = 10
  P*Q = (9863)(7245) = (9x^3 + 8x^2 + 6x + 3)(7x^3 + 2x^2 + 4x + 5)   where x = 10

So large integer multiplication is a special case of polynomial multiplication,
where x = the base in which the numbers are represented
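For comparison, the straightforward (non-divide-and-conquer) approach multiplies every term of P by every term of Q, which is θ(n^2) work for n-term inputs. A small Python sketch of that baseline, using the slide's coefficient lists with the highest-degree coefficient first (the baseline itself is not on the slides; it is here only as a reference point):

def poly_mult_naive(P, Q):
    """Multiply coefficient lists P and Q (highest-degree coefficient first)."""
    result = [0] * (len(P) + len(Q) - 1)
    for i, p in enumerate(P):                # every term of P times every term of Q
        for j, q in enumerate(Q):
            result[i + j] += p * q
    return result

P = [5, 7, 6, 2]                             # 5x^3 + 7x^2 + 6x + 2
Q = [1, -8, 9, -1]                           # x^3 - 8x^2 + 9x - 1
print(poly_mult_naive(P, Q))                 # [5, -33, -5, 12, 31, 12, -2]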
Divide-and-conquer Algorithm 1 for Polynomial multiplication:

  P = 5x^3 + 7x^2 + 6x + 2 = (5x + 7)x^2 + (6x + 2) = Ax^2 + B
  Q = x^3 - 8x^2 + 9x - 1  = (x - 8)x^2 + (9x - 1)  = Cx^2 + D

  A = 5x + 7    [5, 7]
  B = 6x + 2    [6, 2]
  C = x - 8     [1, -8]
  D = 9x - 1    [9, -1]

Let n = number of terms in P and Q
  P = Ax^(n/2) + B
  Q = Cx^(n/2) + D
  A, B, C, D are polynomials with n/2 terms

  P*Q = (Ax^(n/2) + B)(Cx^(n/2) + D) = (AC)x^n + (BC + AD)x^(n/2) + (BD)

Four recursive subproblems:  A*C   B*C   A*D   B*D
Stop the recursion when n = 1
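A Python sketch of Algorithm 1 (assuming n is a power of 2 and that coefficient lists store the highest-degree coefficient first, as on the slides; the helper poly_add and the function names are mine):

def poly_add(P, Q):
    """Add coefficient lists (highest degree first), aligning constant terms."""
    if len(P) < len(Q):
        P, Q = Q, P
    Q = [0] * (len(P) - len(Q)) + Q          # pad the shorter list with leading zeros
    return [a + b for a, b in zip(P, Q)]

def poly_mult_dc1(P, Q):
    """Algorithm 1: four recursive products; len(P) = len(Q) = n, a power of 2."""
    n = len(P)
    if n == 1:                               # stop the recursion when n = 1
        return [P[0] * Q[0]]
    A, B = P[:n // 2], P[n // 2:]            # P = A*x^(n/2) + B
    C, D = Q[:n // 2], Q[n // 2:]            # Q = C*x^(n/2) + D
    AC = poly_mult_dc1(A, C)
    BC = poly_mult_dc1(B, C)
    AD = poly_mult_dc1(A, D)
    BD = poly_mult_dc1(B, D)
    mid = poly_add(BC, AD)
    # P*Q = (AC)x^n + (BC + AD)x^(n/2) + (BD); multiplying by x^k just appends k zeros
    return poly_add(poly_add(AC + [0] * n, mid + [0] * (n // 2)), BD)

print(poly_mult_dc1([5, 7, 6, 2], [1, -8, 9, -1]))   # [5, -33, -5, 12, 31, 12, -2]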
Let T(n) = running time for the preceding Algorithm 1
  T(n) = 4 T(n/2) + θ(n)

  Number of recursive subproblems = 4
  Size of each subproblem = n/2
  Time for all the non-recursive steps = θ(n):
    o adding n-term polynomials
    o multiplying by x^n and x^(n/2), which are just shifts (not products)
    o also note the final product P*Q has < 2n terms

Again, later we'll see how to solve these kinds of recurrence equations to determine T(n)
The solution for T(n) = 4 T(n/2) + θ(n) is T(n) = θ(n^2)
Divide-and-conquer Algorithm 2 for Polynomial multiplication (Karatsuba's algorithm):

As before, P*Q = (Ax^(n/2) + B)(Cx^(n/2) + D) = (AC)x^n + (BC + AD)x^(n/2) + (BD)

This time we write (BC + AD) = (A+B)(C+D) - (AC) - (BD)
So only three recursive calls:  A*C   B*D   (A+B)*(C+D)
Again stop the recursion when n = 1

Let T(n) = running time for the preceding Algorithm 2
  T(n) = 3 T(n/2) + θ(n)

  Number of recursive subproblems = 3
  Size of each subproblem = n/2
  Time for all the non-recursive steps = θ(n)

Again, later we will see how to solve these kinds of recurrence equations to determine T(n)
The solution for T(n) = 3 T(n/2) + θ(n) is T(n) = θ(n^(lg 3)) ≈ θ(n^1.58)
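A matching Python sketch of Algorithm 2 (it reuses poly_add from the Algorithm 1 sketch above; the subtraction helper and function names are mine, and n is again assumed to be a power of 2):

def poly_sub(P, Q):
    """Compute P - Q for coefficient lists (highest degree first)."""
    return poly_add(P, [-q for q in Q])

def poly_mult_karatsuba(P, Q):
    """Algorithm 2 (Karatsuba): three recursive products instead of four."""
    n = len(P)
    if n == 1:                               # stop the recursion when n = 1
        return [P[0] * Q[0]]
    A, B = P[:n // 2], P[n // 2:]            # P = A*x^(n/2) + B
    C, D = Q[:n // 2], Q[n // 2:]            # Q = C*x^(n/2) + D
    AC = poly_mult_karatsuba(A, C)
    BD = poly_mult_karatsuba(B, D)
    cross = poly_mult_karatsuba(poly_add(A, B), poly_add(C, D))   # (A+B)*(C+D)
    mid = poly_sub(poly_sub(cross, AC), BD)  # (BC + AD) = (A+B)(C+D) - AC - BD
    return poly_add(poly_add(AC + [0] * n, mid + [0] * (n // 2)), BD)

print(poly_mult_karatsuba([5, 7, 6, 2], [1, -8, 9, -1]))   # same product as before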
Majority element problem:
Given array A of size n, does there exist any value M that appears more than n/2 times in array A?

Majority element algorithm:
  Phase 1: Use divide-and-conquer to find a candidate value M
  Phase 2: Check if M really is a majority element; θ(n) time, simple loop

Phase 1 details:
  Divide:
    o Group the elements of A into n/2 pairs
    o If n is odd, there is one unpaired element, x
        Check if this x is a majority element of A
        If so, then return x; otherwise discard x
    o Compare each pair (y, z)
        If (y == z) keep y and discard z
        If (y != z) discard both y and z
    o So we keep at most n/2 elements
  Conquer: One recursive call on the subarray of kept elements (size at most n/2)
  Combine: Nothing remains to be done, so omit this step
Example: A = [7, 7, 5, 2, 5, 5, 4, 5, 5, 5, 7]
  (7, 7) (5, 2) (5, 5) (4, 5) (5, 5)  (7)   ->  A = [7, 5, 5]
  (7, 5)  (5)                               ->  return 5  (candidate, also majority)

Example: A = [1, 2, 3, 1, 2, 3, 1, 2, 9, 9]
  (1, 2) (3, 1) (2, 3) (1, 2) (9, 9)        ->  A = [9]
                                                 return 9  (candidate, but not majority)

Let T(n) = running time of Phase 1 on array of size n
  T(n) = T(n/2) + θ(n)

  Number of recursive subproblems = 1
  Size of each subproblem = n/2  [worst case]
  Time for all the non-recursive steps = θ(n)

Later we'll see how to solve these kinds of recurrence equations to determine T(n)
The solution for T(n) = T(n/2) + θ(n) is T(n) = θ(n)
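A Python sketch of both phases (the function names, and using None for "no majority element", are my own choices; Phase 1 follows the pairing rule described above):

def majority_candidate(A):
    """Phase 1: find a candidate value by repeated pairing (may not be a majority)."""
    if len(A) == 0:
        return None
    if len(A) == 1:
        return A[0]
    if len(A) % 2 == 1:                      # odd n: one unpaired element x
        x = A[-1]
        if A.count(x) > len(A) // 2:         # if x is a majority of A, return it
            return x
        A = A[:-1]                           # otherwise discard x
    kept = []
    for i in range(0, len(A), 2):            # compare each pair (y, z)
        if A[i] == A[i + 1]:
            kept.append(A[i])                # equal: keep one copy, discard the other
    return majority_candidate(kept)          # one recursive call on the kept elements

def majority_element(A):
    """Phase 2: verify the candidate with a simple θ(n) count."""
    M = majority_candidate(A)
    if M is not None and A.count(M) > len(A) // 2:
        return M
    return None                              # no majority element exists

print(majority_element([7, 7, 5, 2, 5, 5, 4, 5, 5, 5, 7]))   # 5
print(majority_element([1, 2, 3, 1, 2, 3, 1, 2, 9, 9]))      # None (9 is only a candidate)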