How fast can we sort? Sorting. Decision-tree model. Decision-tree for insertion sort Sort a 1, a 2, a 3. CS 3343 -- Spring 2009




CS 3343 -- Spring 2009
Sorting
Carola Wenk
Slides courtesy of Charles Leiserson with small changes by Carola Wenk
CS 3343 Analysis of Algorithms

How fast can we sort?
All the sorting algorithms we have seen so far are comparison sorts: they use only comparisons to determine the relative order of elements. E.g., insertion sort, merge sort, quicksort, heapsort.
The best worst-case running time that we've seen for comparison sorting is O(n log n).
Is O(n log n) the best we can do? Decision trees can help us answer this question.

Decision-tree model
A decision tree models the execution of any comparison sorting algorithm:
One tree per input size n.
The tree contains all possible comparisons (= if-branches) that could be executed for any input of size n.
The tree contains all comparisons along all possible instruction traces (= control flows) for all inputs of size n.
For one input, only one path to a leaf is executed.
Running time = length of the path taken.
Worst-case running time = height of tree.

Decision-tree for insertion sort
Sort ⟨a1, a2, a3⟩.
[Figure: decision tree for insertion sort on three elements. The root 1:2 corresponds to inserting a2; the nodes 2:3 and 1:3 below it correspond to inserting a3; the six leaves are the permutations of ⟨1, 2, 3⟩.]
Each internal node is labeled i:j for i, j ∈ {1, 2, …, n}.
The left subtree shows subsequent comparisons if a_i < a_j.
The right subtree shows subsequent comparisons if a_i ≥ a_j.

Decision-tree for insertion sort: example
Sort ⟨a1, a2, a3⟩ = ⟨9, 4, 6⟩.
At node 1:2, 9 ≥ 4, so the path goes right.
At node 1:3, 9 ≥ 6, so the path goes right.
At node 2:3, 4 < 6, so the path goes left.
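The walk through the tree can be reproduced by recording every comparison insertion sort makes. The following is a minimal sketch, not part of the original slides; the names `insertion_sort_trace` and `all_paths` are hypothetical, and each comparison is labeled with the original 1-based positions i and j to match the i:j node labels.

```python
from itertools import permutations


def insertion_sort_trace(values):
    """Sort a copy of `values`; return (sorted_list, trace).

    Each trace entry is (i, j, outcome): i and j are the 1-based positions
    of the compared elements in the ORIGINAL input (the i:j node label),
    and outcome is "<" (go left) or ">=" (go right).
    """
    items = [(v, idx + 1) for idx, v in enumerate(values)]  # (value, original index)
    trace = []
    for pos in range(1, len(items)):
        k = pos
        while k > 0:
            left, right = items[k - 1], items[k]
            if left[0] < right[0]:
                trace.append((left[1], right[1], "<"))
                break  # element is in place, insertion done
            trace.append((left[1], right[1], ">="))
            items[k - 1], items[k] = right, left  # shift the new element left
            k -= 1
    return [v for v, _ in items], trace


def all_paths(n):
    """Set of distinct root-to-leaf paths over all inputs of n distinct keys."""
    return {tuple(insertion_sort_trace(p)[1]) for p in permutations(range(1, n + 1))}
```

For the input `[9, 4, 6]` the trace is `(1, 2, ">=")`, `(1, 3, ">=")`, `(2, 3, "<")`, which is the right, right, left path. `all_paths(3)` contains six paths, one per leaf, and the longest has three comparisons, which is the height of the tree.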

For the input ⟨9, 4, 6⟩, the path ends at the leaf ⟨2, 3, 1⟩, which establishes 4 ≤ 6 ≤ 9.
Each leaf contains a permutation ⟨π(1), π(2), …, π(n)⟩ to indicate that the ordering a_π(1) ≤ a_π(2) ≤ … ≤ a_π(n) has been established.

Lower bound for comparison sorting
Theorem. Any decision tree that can sort n elements must have height Ω(n log n).
Proof. The tree must contain at least n! leaves, since there are n! possible permutations. A height-h binary tree has at most 2^h leaves. Thus, n! ≤ 2^h. Since log is monotonically increasing, and since the largest n/2 factors of n! are each at least n/2,
$$h \ge \log(n!) \ge \log\left((n/2)^{n/2}\right) = \frac{n}{2}\log\frac{n}{2} \;\Rightarrow\; h \in \Omega(n \log n).$$
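The bound can be checked numerically for small n. This is a minimal sketch under the assumption that logs are base 2; the function name `comparison_lower_bound` is hypothetical. It computes the smallest h with 2^h ≥ n! using exact integer arithmetic.

```python
import math


def comparison_lower_bound(n):
    """Smallest h with 2**h >= n!, i.e. ceil(log2(n!)), computed exactly.

    For an integer x >= 1, ceil(log2(x)) equals (x - 1).bit_length().
    """
    return (math.factorial(n) - 1).bit_length()


def half_bound(n):
    """The weaker bound (n/2) * log2(n/2) used in the proof."""
    return (n / 2) * math.log2(n / 2)
```

For n = 3 this gives 3, which matches the height of the insertion-sort tree for three elements. For n = 10 it gives 22, while `half_bound(10)` is about 11.6, so the chain of inequalities in the proof holds with room to spare.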

Lower bound for comparison sorting
Corollary. Heapsort and merge sort are asymptotically optimal comparison sorting algorithms.

Sorting in linear time
Counting sort: No comparisons between elements.
Input: A[1..n], where A[j] ∈ {1, 2, …, k}.
Output: B[1..n], sorted.
Auxiliary storage: C[1..k].

Counting sort
1. for i ← 1 to k do C[i] ← 0
2. for j ← 1 to n do C[A[j]] ← C[A[j]] + 1      ⊳ C[i] = |{key = i}|
3. for i ← 2 to k do C[i] ← C[i] + C[i−1]      ⊳ C[i] = |{key ≤ i}|
4. for j ← n downto 1 do B[C[A[j]]] ← A[j]; C[A[j]] ← C[A[j]] − 1

Counting-sort example

The trace below is for A = ⟨4, 1, 3, 4, 3⟩ (n = 5, k = 4).

Loop 1: for i ← 1 to k do C[i] ← 0
C: 0 0 0 0

Loop 2: for j ← 1 to n do C[A[j]] ← C[A[j]] + 1      ⊳ C[i] = |{key = i}|
j = 1: C: 0 0 0 1
j = 2: C: 1 0 0 1
j = 3: C: 1 0 1 1
j = 4: C: 1 0 1 2
j = 5: C: 1 0 2 2

Loop 3: for i ← 2 to k do C[i] ← C[i] + C[i−1]      ⊳ C[i] = |{key ≤ i}|
i = 2: C': 1 1 2 2
i = 3: C': 1 1 3 2
i = 4: C': 1 1 3 5

Loop 4: for j ← n downto 1 do B[C[A[j]]] ← A[j]; C[A[j]] ← C[A[j]] − 1
j = 5: B[3] ← 3; C': 1 1 2 5
j = 4: B[5] ← 4; C': 1 1 2 4
j = 3: B[2] ← 3; C': 1 1 1 4
j = 2: B[1] ← 1; C': 0 1 1 4
j = 1: B[4] ← 4; C': 0 1 1 3
B: 1 3 3 4 4

Analysis
1. for i ← 1 to k do C[i] ← 0                      Θ(k)
2. for j ← 1 to n do C[A[j]] ← C[A[j]] + 1          Θ(n)
3. for i ← 2 to k do C[i] ← C[i] + C[i−1]          Θ(k)
4. for j ← n downto 1 do B[C[A[j]]] ← A[j]; C[A[j]] ← C[A[j]] − 1      Θ(n)
Total: Θ(n + k)
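The loops above translate almost line for line into Python. This is a sketch, not the slides' own code: the function name `counting_sort` is hypothetical, and the final placement loop (scan the input right to left, place each key at position C[key], then decrement) is the standard last step of counting sort, assumed here to match the Θ(n) term in the analysis. The slides index arrays from 1; Python lists index from 0, so the output position is shifted by one.

```python
def counting_sort(a, k):
    """Stable counting sort of integer keys in the range 1..k."""
    c = [0] * (k + 1)          # loop 1: zero the counters (c[0] is unused padding)
    for key in a:              # loop 2: c[i] = number of keys equal to i
        c[key] += 1
    for i in range(2, k + 1):  # loop 3: c[i] = number of keys <= i
        c[i] += c[i - 1]
    b = [None] * len(a)
    for key in reversed(a):    # loop 4: right-to-left scan keeps equal keys in input order
        b[c[key] - 1] = key    # 1-based slot c[key] becomes 0-based index c[key] - 1
        c[key] -= 1
    return b
```

Calling `counting_sort([4, 1, 3, 4, 3], 4)` returns `[1, 3, 3, 4, 4]`. No two elements of `a` are ever compared with each other; keys are used only as array indices.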

Running time
If k = O(n), then counting sort takes Θ(n) time.
But, sorting takes Ω(n log n) time! Where's the fallacy?
Answer: Comparison sorting takes Ω(n log n) time. Counting sort is not a comparison sort. In fact, not a single comparison between elements occurs!

Stable sorting
Counting sort is a stable sort: it preserves the input order among equal elements.
[Figure: input array A and sorted output array B from the counting-sort example.]
Exercise: What other sorts have this property?

Radix sort
Origin: Herman Hollerith's card-sorting machine for the 1890 U.S. Census. (See Appendix.)
Digit-by-digit sort.
Hollerith's original (bad) idea: sort on most-significant digit first.
Good idea: Sort on least-significant digit first with an auxiliary stable sorting algorithm (like counting sort).

Operation of radix sort
[Figure: a column of three-digit numbers sorted in three passes, one per digit, from least significant to most significant.]
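The least-significant-digit-first idea can be sketched as follows, assuming non-negative integers and a stable counting sort on one digit per pass. The names `counting_sort_by_digit` and `radix_sort`, the default base 10, and the sample input in the note below are illustrative choices, not taken from the slides.

```python
def counting_sort_by_digit(nums, exp, base=10):
    """Stable counting sort of nums on the digit (x // exp) % base."""
    count = [0] * base
    for x in nums:
        count[(x // exp) % base] += 1
    for d in range(1, base):       # prefix sums: count[d] = number of digits <= d
        count[d] += count[d - 1]
    out = [0] * len(nums)
    for x in reversed(nums):       # right-to-left scan preserves input order on ties
        d = (x // exp) % base
        count[d] -= 1
        out[count[d]] = x
    return out


def radix_sort(nums, base=10):
    """LSD radix sort of non-negative integers; returns a new list."""
    out = list(nums)
    if not out:
        return out
    largest = max(out)
    exp = 1
    while largest // exp > 0:      # one pass per digit of the largest key
        out = counting_sort_by_digit(out, exp, base)
        exp *= base
    return out
```

For example, `radix_sort([329, 457, 657, 839, 436, 720, 355])` returns the same list as Python's built-in `sorted`. Replacing the per-digit pass with an unstable sort would break the result, because later passes rely on ties keeping the order established by earlier passes.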

Correctness of radix sort
Induction on digit position.
Assume that the numbers are sorted by their low-order t − 1 digits.
Sort on digit t:
Two numbers that differ in digit t are correctly sorted.
Two numbers equal in digit t are put in the same order as the input ⇒ correct order.
[Figure: the radix-sort example, showing the list before and after the sort on digit t.]

Analysis of radix sort
Sort n computer words of b bits each.
View each word as having b/r base-2^r digits.
Example: 32-bit word (b = 32)
r = 1: 32 base-2 digits (2^31, …, 2^1, 2^0); b/r = 32 passes of counting sort on base-2 digits.
r = 8: 32/8 = 4 base-2^8 digits ((2^8)^3, (2^8)^2, (2^8)^1, (2^8)^0); b/r = 4 passes of counting sort on base-2^8 digits.
r = 16: 32/16 = 2 base-2^16 digits ((2^16)^1, (2^16)^0); b/r = 2 passes of counting sort on base-2^16 digits.

Analysis of radix sort
Sort n computer words of b bits each.
View each word as having b/r base-2^r digits.
Assume counting sort is the auxiliary stable sort.
Make b/r passes of counting sort on base-2^r digits.
How many passes should we make?

Analysis (continued)
Recall: Counting sort takes Θ(n + k) time to sort n numbers in the range from 0 to k − 1.
If each b-bit word is broken into r-bit pieces, each pass of counting sort takes Θ(n + 2^r) time. Since there are b/r passes, we have
$$T(n, b) = \Theta\left(\frac{b}{r}\,(n + 2^r)\right).$$
Choose r to minimize T(n, b): Increasing r means fewer passes, but as r ≫ log n, the time grows exponentially.

Choosing r
Minimize T(n, b) by differentiating with respect to r and setting to 0.
Or, just observe that we don't want 2^r > n, and there's no harm asymptotically in choosing r as large as possible subject to this constraint.
Choosing r = log n implies T(n, b) = Θ(bn / log n).

Radix sort with optimized r
Assume counting sort is the auxiliary stable sort.
Sort n computer words of b bits each.
The runtime of radix sort is T(n, b) = Θ(bn / log n).
Example: For numbers in the range from 0 to n^d − 1, we have b = d log n ⇒ radix sort runs in Θ(dn) time.
Notice that counting sort runs in O(n + k) time, where all numbers are in the range 1 through k.
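The trade-off in r can be explored with a small cost model. This is a sketch only: it evaluates the expression (b/r)(n + 2^r) with the pass count rounded up and all constant factors ignored, so the numbers indicate shape, not real running time. The function names are hypothetical.

```python
import math


def radix_passes(b, r):
    """Number of counting-sort passes for b-bit words split into r-bit digits."""
    return math.ceil(b / r)


def radix_cost(n, b, r):
    """Cost model passes * (n + 2**r); constant factors are ignored."""
    return radix_passes(b, r) * (n + 2 ** r)


def best_r(n, b):
    """Digit width r in 1..b that minimizes the cost model."""
    return min(range(1, b + 1), key=lambda r: radix_cost(n, b, r))
```

With b = 32, `radix_passes` gives 32, 4, and 2 passes for r = 1, 8, and 16. For n = 2000, choosing r = 11 (about log n) gives 3 passes and a model cost of 12144, while r = 16 costs 135072 because the 2^r term dominates. Under this particular model `best_r(2000, 32)` is 8, which is within a small constant factor of the r = log n choice, consistent with the asymptotic argument.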

Conclusions
In practice, radix sort is fast for large inputs, as well as simple to code and maintain.
Example (32-bit numbers): At most 3 passes when sorting ≥ 2000 numbers. Merge sort and quicksort do at least ⌈log 2000⌉ = 11 passes.
Downside: Unlike quicksort, radix sort displays little locality of reference, and thus a well-tuned quicksort fares better on modern processors, which feature steep memory hierarchies.

Appendix: Punched-card technology
Herman Hollerith (1860-1929)
Punched cards
Hollerith's tabulating system
Operation of the sorter
Origin of radix sort
Modern IBM card

Herman Hollerith (1860-1929)
The 1880 U.S. Census took almost 10 years to process.
While a lecturer at MIT, Hollerith prototyped punched-card technology.
His machines, including a card sorter, allowed the 1890 census total to be reported in 6 weeks.
He founded the Tabulating Machine Company in 1911, which merged with other companies in 1924 to form International Business Machines.

Punched cards
Punched card = data record.
Hole = value.
Algorithm = machine + human operator.
[Figure: replica of a punch card from the 1900 U.S. census. [Howells 2000]]

Hollerith's tabulating system
Pantograph card punch
Hand-press reader
Dial counters
Sorting box
[Figure from [Howells 2000].]

Operation of the sorter
An operator inserts a card into the press.
Pins on the press reach through the punched holes to make electrical contact with mercury-filled cups beneath the card.
Whenever a particular digit value is punched, the lid of the corresponding sorting bin lifts.
The operator deposits the card into the bin and closes the lid.
When all cards have been processed, the front panel is opened, and the cards are collected in order, yielding one pass of a stable sort.
[Figure: Hollerith Tabulator, Pantograph, Press, and Sorter.]

Origin of radix sort
Hollerith's original 1889 patent alludes to a most-significant-digit-first radix sort:
"The most complicated combinations can readily be counted with comparatively few counters or relays by first assorting the cards according to the first items entering into the combinations, then reassorting each group according to the second item entering into the combination, and so on, and finally counting on a few counters the last item of the combination for each group of cards."
Least-significant-digit-first radix sort seems to be a folk invention originated by machine operators.

Modern IBM card
One character per column.
[Figure: card produced by the WWW Virtual Punch-Card Server.]
So, that's why text windows have 80 columns!