Lecture Notes CMSC 251

[Figure 14: Partitioning intermediate structure. Panels: initial configuration, intermediate configuration, final configuration. The subarray A[p..r] is drawn with the pivot x followed by the regions "< x", ">= x", and "?" (not yet processed).]

all of the elements have been processed. To finish things off we swap $A[p]$ (the pivot) with $A[q]$, and return the value of $q$. Here is the complete code:

Partition

    Partition(int p, int r, array A) {      // 3-way partition of A[p..r]
        x = A[p]                            // pivot item is A[p]
        q = p
        for s = p+1 to r do {
            if (A[s] < x) {
                q = q+1
                swap A[q] with A[s]
            }
        }
        swap A[p] with A[q]                 // put the pivot into final position
        return q                            // return location of pivot
    }

An example is shown below.

Lecture 15: QuickSort
(Tuesday, Mar 17, 1998)

Revised: March 18. Fixed a bug in the analysis.

Read: Chapt 8 in CLR. My presentation and analysis are somewhat different from the text's.

QuickSort and Randomized Algorithms: Early in the semester we discussed the fact that we usually study the worst-case running times of algorithms, but sometimes average-case is a more meaningful measure. Today we will study QuickSort. It is a worst-case $\Theta(n^2)$ algorithm, whose expected-case running time is $\Theta(n \log n)$.

We will present QuickSort as a randomized algorithm, that is, an algorithm which makes random choices. There are two common types of randomized algorithms:

Monte Carlo algorithms: These algorithms may produce the wrong result, but the probability of this occurring can be made arbitrarily small by the user. Usually the lower you make this probability, the longer the algorithm takes to run.
Las Vegas algorithms: These algorithms always produce the correct result, but the running time is a random variable. In these cases the expected running time, averaged over all possible random choices, is the measure of the algorithm's running time.

[Figure 15: Partitioning example. The array 5 3 8 6 4 7 3 1 is partitioned about the pivot 5 (with p at the first element and r at the last). The successive states are 5 3 4 6 8 7 3 1, then 5 3 4 3 8 7 6 1, then 5 3 4 3 1 7 6 8, and the final swap gives 1 3 4 3 5 7 6 8.]

The most well known Monte Carlo algorithm is one for determining whether a number is prime. This is an important problem in cryptography. The QuickSort algorithm that we will discuss today is an example of a Las Vegas algorithm. Note that QuickSort does not need to be implemented as a randomized algorithm, but as we shall see, this is generally considered the safest implementation.

QuickSort Overview: QuickSort is also based on the divide-and-conquer design paradigm. Unlike MergeSort, where most of the work is done after the recursive calls return, in QuickSort the work is done before the recursive calls are made. Here is an overview of QuickSort. Note the similarity with the selection algorithm, which we discussed earlier. Let $A[p..r]$ be the (sub)array to be sorted. The initial call is to $A[1..n]$.

Basis: If the list contains 0 or 1 elements, then return.

Select pivot: Select a random element $x$ from the array, called the pivot.

Partition: Partition the array into three subarrays: those elements $A[1..q-1] \le x$, $A[q] = x$, and $A[q+1..n] \ge x$.

Recurse: Recursively sort $A[1..q-1]$ and $A[q+1..n]$.

The pseudocode for QuickSort is given below. The initial call is QuickSort(1, n, A). The Partition routine was discussed last time. Recall that Partition assumes that the pivot is stored in the first element of A. Since we want a random pivot, we pick a random index i from p to r, and then swap A[i] with A[p].

QuickSort

    QuickSort(int p, int r, array A) {      // Sort A[p..r]
        if (r <= p) return                  // 0 or 1 items, return
        i = a random index from [p..r]      // pick a random element
        swap A[i] with A[p]                 // swap pivot into A[p]
        q = Partition(p, r, A)              // partition A about pivot
        QuickSort(p, q-1, A)                // sort A[p..q-1]
        QuickSort(q+1, r, A)                // sort A[q+1..r]
    }

QuickSort Analysis: The correctness of QuickSort should be pretty obvious. However, its analysis is not so obvious. It turns out that the running time of QuickSort depends heavily on how good a job we do in selecting the pivot. In particular, if the rank of the pivot (recall that this means its position in the final sorted list) is very large or very small, then the partition will be unbalanced. We will see that unbalanced partitions (like unbalanced binary trees) are bad, and result in poor running times. However, if the rank of the pivot is anywhere near the middle portion of the array, then the split will be reasonably well balanced, and the overall running time will be good. Since the pivot is chosen at random by our algorithm, we may do well most of the time and poorly occasionally. We will see that the expected running time is $O(n \log n)$.

Worst-case Analysis: Let's begin by considering the worst-case performance, because it is easier than the average case. Since this is a recursive program, it is natural to use a recurrence to describe its running time. But unlike MergeSort, where we had control over the sizes of the recursive calls, here we do not. It depends on how the pivot is chosen. Suppose that we are sorting an array of size $n$, $A[1..n]$, and further suppose that the pivot that we select is of rank $q$, for some $q$ in the range 1 to $n$. It takes $\Theta(n)$ time to do the partitioning and other overhead, and we make two recursive calls. The first is to the subarray $A[1..q-1]$, which has $q-1$ elements, and the other is to the subarray $A[q+1..n]$, which has $n-(q+1)+1 = n-q$ elements. So if we ignore the $\Theta$ (as usual) we get the recurrence:

\[ T(n) = T(q-1) + T(n-q) + n. \]

This depends on the value of $q$. To get the worst case, we maximize over all possible values of $q$. As a basis we have that $T(0) = T(1) = \Theta(1)$. Putting this together we have

\[
T(n) =
\begin{cases}
1 & \text{if } n \le 1, \\
\max_{1 \le q \le n} \bigl( T(q-1) + T(n-q) + n \bigr) & \text{otherwise.}
\end{cases}
\]

Recurrences that have max's and min's embedded in them are very messy to solve. The key is determining which value of $q$ gives the maximum.
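To get a feel for which value of $q$ attains the maximum, the worst-case recurrence can simply be evaluated for small $n$. The following is a minimal Python sketch and not part of the original notes; the function name `worst_case_table` is hypothetical, and the basis $T(0) = T(1) = 1$ is an assumption standing in for $\Theta(1)$.

```python
def worst_case_table(n_max):
    """Evaluate T(n) = max over q in 1..n of T(q-1) + T(n-q) + n.

    Assumes the basis T(0) = T(1) = 1 and n_max >= 1.
    Returns (T, argmax), where argmax[n] lists every rank q attaining the max.
    """
    T = [1, 1] + [0] * (n_max - 1)
    argmax = [[], []] + [[] for _ in range(n_max - 1)]
    for n in range(2, n_max + 1):
        # Cost of each possible pivot rank q for a subarray of size n.
        costs = {q: T[q - 1] + T[n - q] + n for q in range(1, n + 1)}
        T[n] = max(costs.values())
        argmax[n] = [q for q, cost in costs.items() if cost == T[n]]
    return T, argmax


T, argmax = worst_case_table(12)
for n in range(2, 13):
    print(n, T[n], argmax[n])
```

For the sizes tried here, the maximum is attained only at the two extreme ranks, $q = 1$ and $q = n$, which suggests where to look when solving the recurrence by hand.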
(A rule of thumb of algorithm analysis is that the worst cases tend to happen either at the extremes or in the middle. So I would plug in the values $q = 1$, $q = n$, and $q = n/2$ and work each out.) In this case, the worst case happens at either of the extremes (but see the book for a more careful analysis based on an analysis of the second derivative). If we expand the recurrence in the case $q = 1$ we get:

\[
\begin{aligned}
T(n) &\le T(0) + T(n-1) + n = 1 + T(n-1) + n \\
     &= T(n-1) + (n+1) \\
     &= T(n-2) + n + (n+1) \\
     &= T(n-3) + (n-1) + n + (n+1) \\
     &= T(n-4) + (n-2) + (n-1) + n + (n+1) \\
     &= \cdots \\
     &= T(n-k) + \sum_{i=-1}^{k-2} (n-i).
\end{aligned}
\]
For the basis, $T(1) = 1$, we set $k = n-1$ and get

\[
\begin{aligned}
T(n) &\le T(1) + \sum_{i=-1}^{n-3} (n-i) \\
     &= 1 + \bigl( 3 + 4 + 5 + \cdots + (n-1) + n + (n+1) \bigr) \\
     &\le \sum_{i=1}^{n+1} i = \frac{(n+1)(n+2)}{2} \in O(n^2).
\end{aligned}
\]

In fact, a more careful analysis reveals that it is $\Theta(n^2)$ in this case.

Average-case Analysis: Next we show that in the average case QuickSort runs in $\Theta(n \log n)$ time. When we talked about average-case analysis at the beginning of the semester, we said that it depends on some assumption about the distribution of inputs. However, in this case, the analysis does not depend on the input distribution at all; it only depends on the random choices that the algorithm makes. This is good, because it means that the analysis of the algorithm's performance is the same for all inputs. In this case the average is computed over all possible random choices that the algorithm might make for the choice of the pivot index in the second step of the QuickSort procedure above.

To analyze the average running time, we let $T(n)$ denote the average running time of QuickSort on a list of size $n$. It will simplify the analysis to assume that all of the elements are distinct. The algorithm has $n$ random choices for the pivot element, and each choice has an equal probability of $1/n$ of occurring. So we can modify the above recurrence to compute an average rather than a max, giving:

\[
T(n) =
\begin{cases}
1 & \text{if } n \le 1, \\
\dfrac{1}{n} \displaystyle\sum_{q=1}^{n} \bigl( T(q-1) + T(n-q) + n \bigr) & \text{otherwise.}
\end{cases}
\]

This is not a standard recurrence, so we cannot apply the Master Theorem. Expansion is possible, but rather tricky. Instead, we will attempt a constructive induction to solve it. We know that we want a $\Theta(n \log n)$ running time, so let's try $T(n) \le a n \lg n + b$. (Properly we should write $\lceil \lg n \rceil$ because unlike MergeSort, we cannot assume that the recursive calls will be made on array sizes that are powers of 2, but we'll be sloppy because things will be messy enough anyway.)

Theorem: There exists a constant $c$ such that $T(n) \le c n \ln n$, for all $n \ge 2$. (Notice that we have replaced $\lg n$ with $\ln n$. This has been done to make the proof easier, as we shall see.)

Proof: The proof is by constructive induction on $n$. For the basis case $n = 2$ we have

\[
\begin{aligned}
T(2) &= \frac{1}{2} \sum_{q=1}^{2} \bigl( T(q-1) + T(2-q) + 2 \bigr) \\
     &= \frac{1}{2} \bigl( (T(0) + T(1) + 2) + (T(1) + T(0) + 2) \bigr) = \frac{8}{2} = 4.
\end{aligned}
\]

We want this to be at most $c \cdot 2 \ln 2$, implying that $c \ge 4/(2 \ln 2) \approx 2.885$.
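For the induction step, we assume that $n \ge 3$, and the induction hypothesis is that for any $n' < n$, we have $T(n') \le c n' \ln n'$. We want to prove it is true for $T(n)$. By expanding the definition of $T(n)$, and moving the term $n$ outside the sum, we have:

\[
\begin{aligned}
T(n) &= \frac{1}{n} \sum_{q=1}^{n} \bigl( T(q-1) + T(n-q) + n \bigr) \\
     &= \frac{1}{n} \sum_{q=1}^{n} \bigl( T(q-1) + T(n-q) \bigr) + n.
\end{aligned}
\]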
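Observe that if we split the sum into two sums, they both add the same values $T(0) + T(1) + \cdots + T(n-1)$, just that one counts up and the other counts down. Thus we can replace this with $2 \sum_{q=0}^{n-1} T(q)$. Because they don't follow the formula, we'll extract $T(0)$ and $T(1)$ and treat them specially. If we make this substitution and apply the induction hypothesis to the remaining sum (which we can because $q < n$) we have

\[
\begin{aligned}
T(n) &= \frac{2}{n} \left( \sum_{q=0}^{n-1} T(q) \right) + n
      = \frac{2}{n} \left( T(0) + T(1) + \sum_{q=2}^{n-1} T(q) \right) + n \\
     &\le \frac{2}{n} \left( 1 + 1 + \sum_{q=2}^{n-1} c\, q \ln q \right) + n \\
     &= \frac{2c}{n} \left( \sum_{q=2}^{n-1} q \ln q \right) + n + \frac{4}{n}.
\end{aligned}
\]

We have never seen this sum before. Later we will show that

\[ S(n) = \sum_{q=2}^{n-1} q \ln q \le \frac{n^2}{2} \ln n - \frac{n^2}{4}. \]

Assuming this for now, we have

\[
\begin{aligned}
T(n) &\le \frac{2c}{n} \left( \frac{n^2}{2} \ln n - \frac{n^2}{4} \right) + n + \frac{4}{n} \\
     &= c n \ln n - \frac{cn}{2} + n + \frac{4}{n} \\
     &= c n \ln n + n \left( 1 - \frac{c}{2} \right) + \frac{4}{n}.
\end{aligned}
\]

To finish the proof, we want all of this to be at most $c n \ln n$. If we cancel the common $c n \ln n$ we see that this will be true if we select $c$ such that

\[ n \left( 1 - \frac{c}{2} \right) + \frac{4}{n} \le 0. \]

After some simple manipulations we see that this is equivalent to:

\[
\begin{aligned}
0 &\ge n - \frac{cn}{2} + \frac{4}{n} \\
\frac{cn}{2} &\ge n + \frac{4}{n} \\
c &\ge 2 + \frac{8}{n^2}.
\end{aligned}
\]

Since $n \ge 3$, we only need to select $c$ so that $c \ge 2 + \frac{8}{9}$, and so selecting $c = 3$ will work. From the basis case we have $c \ge 2.885$, so we may choose $c = 3$ to satisfy both constraints.

The Leftover Sum: The only missing element of the proof is dealing with the sum

\[ S(n) = \sum_{q=2}^{n-1} q \ln q. \]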
To bound this, recall the integration formula for bounding summations (which we paraphrase here). For any monotonically increasing function $f(x)$,

\[ \sum_{i=a}^{b-1} f(i) \le \int_a^b f(x)\,dx. \]

The function $f(x) = x \ln x$ is monotonically increasing, and so we have

\[ S(n) \le \int_2^n x \ln x \,dx. \]

If you are a calculus macho man, then you can integrate this by parts, and if you are a calculus wimp (like me) then you can look it up in a book of integrals:

\[
\int_2^n x \ln x \,dx
  = \left[ \frac{x^2}{2} \ln x - \frac{x^2}{4} \right]_{x=2}^{n}
  = \left( \frac{n^2}{2} \ln n - \frac{n^2}{4} \right) - (2 \ln 2 - 1)
  \le \frac{n^2}{2} \ln n - \frac{n^2}{4}.
\]

This completes the summation bound, and hence the entire proof.

Summary: So even though the worst-case running time of QuickSort is $\Theta(n^2)$, the average-case running time is $\Theta(n \log n)$. Although we did not show it, it turns out that this doesn't just happen much of the time. For large values of $n$, the running time is $\Theta(n \log n)$ with high probability. In order to get $\Theta(n^2)$ time the algorithm must make poor choices for the pivot at virtually every step. Poor choices are rare, and so continuously making poor choices is very rare. You might ask, could we make QuickSort deterministically $\Theta(n \log n)$ by calling the selection algorithm to use the median as the pivot? The answer is that this would work, but the resulting algorithm would be so slow in practice that no one would ever use it.

QuickSort (like MergeSort) is not formally an in-place sorting algorithm, because it does make use of a recursion stack. In MergeSort and in the expected case for QuickSort, the size of the stack is $O(\log n)$, so this is not really a problem.

QuickSort is the most popular algorithm for implementation because its actual performance (on typical modern architectures) is so good. The reason for this stems from the fact that, unlike Heapsort, which can make large jumps around in the array, the main work in QuickSort (partitioning) spends most of its time accessing elements that are close to one another. The reason it tends to outperform MergeSort (which also has good locality of reference) is that most comparisons are made against the pivot element, which can be stored in a register. In MergeSort we are always comparing two array elements against each other.
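The Partition and QuickSort pseudocode from this lecture translates fairly directly into Python. The following is a minimal sketch, not the notes' own code: it uses 0-based inclusive indices instead of the notes' 1-based ones, the names `partition` and `quicksort` and the `rng` parameter are assumptions made for illustration, and the comparison counter is added only so the cost can be set beside the $3 n \ln n$ bound from the theorem.

```python
import math
import random


def partition(A, p, r):
    """Partition A[p..r] (inclusive) about the pivot A[p]; return its final index."""
    x = A[p]
    q = p
    for s in range(p + 1, r + 1):
        if A[s] < x:
            q += 1
            A[q], A[s] = A[s], A[q]
    A[p], A[q] = A[q], A[p]        # put the pivot into its final position
    return q


def quicksort(A, p=0, r=None, rng=random):
    """Sort A[p..r] in place; return the number of element comparisons made."""
    if r is None:
        r = len(A) - 1
    if r <= p:                     # 0 or 1 items
        return 0
    i = rng.randint(p, r)          # random pivot index, both ends inclusive
    A[i], A[p] = A[p], A[i]        # move the pivot into A[p]
    q = partition(A, p, r)         # partition makes r - p comparisons
    return (r - p) + quicksort(A, p, q - 1, rng) + quicksort(A, q + 1, r, rng)


rng = random.Random(1998)          # fixed seed so runs are repeatable
n = 1000
data = list(range(n, 0, -1))       # reverse-sorted input
comparisons = quicksort(data, rng=rng)
print(data == sorted(data), comparisons, round(3 * n * math.log(n)))
```

Because the pivot is chosen at random, the comparison count varies with the seed but does not depend on the input order, which is the point made in the average-case analysis.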
The most efficient versions of QuickSort use recursion for large subarrays, but once the size of the subarray falls below some minimum size (e.g. 20) they switch to a simple iterative algorithm, such as selection sort.

Lecture 16: Lower Bounds for Sorting
(Thursday, Mar 19, 1998)

Read: Chapt. 9 of CLR.

Review of Sorting: So far we have seen a number of algorithms for sorting a list of numbers in ascending order. Recall that an in-place sorting algorithm is one that uses no additional array storage (however, we allow QuickSort to be called in-place even though it needs a stack of size $O(\log n)$ for keeping track of the recursion). A sorting algorithm is stable if duplicate elements remain in the same relative position after sorting.
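A small illustration of stability may help. The sketch below is not from the notes; it relies on Python's built-in `sorted`, which is documented to be stable, and the `records` data is made up for the example. Each record is a (key, label) pair, and only the key is used for ordering.

```python
# Records are (key, label) pairs; two records share key 1 and two share key 2.
records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]

# Sort by key only. A stable sort keeps "b" before "d" and "a" before "c".
by_key = sorted(records, key=lambda rec: rec[0])
print(by_key)   # prints [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Records with equal keys come out in the same relative order in which they went in, which is exactly the property the definition describes.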