Lecture 9. Binary Trees (3) Instructor: 罗国杰 gluo@pku.edu.cn School of EECS Peking University
Outline
- Sets and Maps
- Priority queues (heaps)

Sets
- The set is an ordered container that does not allow duplicates
  - Traversal of the set
    - objects: iterator, const_iterator
    - methods: begin, end, size, empty
  - Other operations
    - insert, erase, find
Sets
    // O(log N) insert
    pair<iterator,bool> insert(const Object & x);

    // Amortized O(1) insert if the hint is accurate; otherwise O(log N).
    // Note: in std::set the hinted insert returns an iterator, not a pair.
    iterator insert(iterator hint, const Object & x);

    // this code segment might be faster than calling s.insert(i)
    set<int> s;
    for (int i = 0; i < 1000000; i++)
        s.insert(s.end(), i);
Sets
    int erase(const Object & x);
    iterator erase(iterator itr);
    iterator erase(iterator start, iterator end);
    iterator find(const Object & x) const;

    // in the following code, the set s has size 1
    class CaseInsensitiveCompare
    {
    public:
        // a set comparator must be callable, so the test is operator()
        bool operator()(const string & lhs, const string & rhs) const
        { return stricmp(lhs.c_str(), rhs.c_str()) < 0; }  // stricmp is non-standard
    };

    set<string,CaseInsensitiveCompare> s;
    s.insert("hello");
    s.insert("HELLO");
    cout << "The size is: " << s.size() << endl;
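The case-insensitive set above can also be sketched without the non-standard stricmp. The comparator name CaseInsensitiveLess and the helper caseInsensitiveDemo are hypothetical names chosen for this illustration; the comparison uses std::lexicographical_compare with std::tolower, which is one possible portable substitute, not the slide's own method.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical portable comparator: orders two strings character by
// character after converting each character to lower case.
struct CaseInsensitiveLess {
    bool operator()(const std::string& lhs, const std::string& rhs) const {
        return std::lexicographical_compare(
            lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
            [](unsigned char a, unsigned char b) {
                return std::tolower(a) < std::tolower(b);
            });
    }
};

// Inserts "hello" and "HELLO" and returns the resulting set size.
std::size_t caseInsensitiveDemo() {
    std::set<std::string, CaseInsensitiveLess> s;
    s.insert("hello");
    s.insert("HELLO");  // equivalent to "hello" under the comparator, so rejected
    return s.size();
}
```

Two keys count as duplicates when neither compares less than the other, which is why the second insert has no effect.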
Maps
- A map stores a collection of ordered entries, each consisting of a key and its value
- The map behaves like a set of pairs whose comparison refers only to the key
- Basic operations (like the set)
  - iterator itr
    - (*itr) is of type pair<KeyType,ValueType>
  - begin, end, size, empty; insert, erase, find
- Extra operation
  - ValueType & operator[] (const KeyType & key);
Maps: Access Values
    map<string,double> salaries;
    salaries["Pat"] = 75000.00;
    cout << salaries["Pat"] << endl;
    cout << salaries["Jan"] << endl;   // "Jan" is absent: operator[] inserts it with value 0

    map<string,double>::const_iterator itr;
    itr = salaries.find("Chris");
    if (itr == salaries.end())
        cout << "Not an employee of this company!" << endl;
    else
        cout << itr->second << endl;
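The difference between the two lookups above is easy to miss: operator[] inserts a default value for a missing key, while find leaves the map unchanged. A minimal sketch, with hypothetical helper names sizeAfterBracketLookup and sizeAfterFindLookup:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Returns the map size after reading a missing key through operator[].
std::size_t sizeAfterBracketLookup() {
    std::map<std::string, double> salaries;
    salaries["Pat"] = 75000.00;
    double jan = salaries["Jan"];  // "Jan" is absent: inserted with value 0.0
    (void)jan;
    return salaries.size();
}

// Returns the map size after looking up a missing key with find.
std::size_t sizeAfterFindLookup() {
    std::map<std::string, double> salaries;
    salaries["Pat"] = 75000.00;
    auto itr = salaries.find("Chris");  // no insertion takes place
    assert(itr == salaries.end());
    return salaries.size();
}
```

This is why find is the safer choice when the code only needs to test membership.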
Implementation of Set and Map
- STL requires that set and map support the basic insert, erase, and find in O(log N) worst-case time
- The implementation is a balanced BST
  - Typically an AVL tree is not used
  - Instead, top-down red-black trees are used (covered later)
- How to support the iterator classes under these constraints?
  - Internally, the iterator maintains a pointer to the current node
  - How to efficiently advance to the next node?
Implementation of Set and Map
- How to efficiently advance an iterator to the next node?
- Solution 1
  - Each iterator stores an array containing the set items as its data
  - This does not work for efficient erase and insert
- Solution 2
  - Have the iterator maintain a stack storing the nodes on the path to the current node
  - The iterator is large, and the iteration code is clumsy
Implementation of Set and Map
- How to efficiently advance an iterator to the next node?
- Solution 3
  - Have each node store an extra link to its parent
  - Extra memory is always required, and the iteration code is still clumsy
- Solution 4
  - Have each node maintain two extra links: one to the next smaller node and one to the next larger node
  - It takes extra space, but the iteration is very simple
Implementation of Set and Map
- How to efficiently advance an iterator to the next node?
- Solution 5
  - Maintain the extra links only for nodes that have NULL left or right links (threaded tree)
  - It is used in many STL implementations
Threaded Binary Trees
- Reuse the lchild and rchild pointers for efficient access to the predecessor and successor
- Node layout: lchild | ltag | data | rtag | rchild
  - ltag = 0: lchild is the left child
  - ltag = 1: lchild is the predecessor
  - rtag = 0: rchild is the right child
  - rtag = 1: rchild is the successor
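A minimal sketch of how the tags make "advance to the next node" simple. The struct ThreadNode and the functions successor, inorder, and threadedDemo are hypothetical names for this illustration; the sketch assumes the first and last nodes in inorder carry a null thread.

```cpp
#include <cassert>
#include <vector>

// Node layout from the slide; a tag of true means the link is a thread.
struct ThreadNode {
    int data;
    ThreadNode* lchild; bool ltag;  // ltag true: lchild is the predecessor
    ThreadNode* rchild; bool rtag;  // rtag true: rchild is the successor
};

// Inorder successor of p, or nullptr if p is the last node.
const ThreadNode* successor(const ThreadNode* p) {
    if (p->rtag) return p->rchild;        // follow the thread directly
    const ThreadNode* q = p->rchild;      // real right child exists
    while (!q->ltag) q = q->lchild;       // leftmost node of the right subtree
    return q;
}

// Collects the keys in inorder without recursion or a stack.
std::vector<int> inorder(const ThreadNode* root) {
    std::vector<int> out;
    if (root == nullptr) return out;
    const ThreadNode* p = root;
    while (!p->ltag) p = p->lchild;       // find the first node in inorder
    for (; p != nullptr; p = successor(p)) out.push_back(p->data);
    return out;
}

// Builds the three-node tree 1 <- 2 -> 3 by hand and traverses it.
std::vector<int> threadedDemo() {
    ThreadNode n1{1, nullptr, true, nullptr, true};
    ThreadNode n2{2, nullptr, false, nullptr, false};
    ThreadNode n3{3, nullptr, true, nullptr, true};
    n2.lchild = &n1;  n2.rchild = &n3;    // real children
    n1.rchild = &n2;                      // thread to successor
    n3.lchild = &n2;                      // thread to predecessor
    return inorder(&n2);
}
```

The iterator only needs a pointer to the current node; no parent link or explicit stack is involved.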
An Inorder Threaded Binary Tree
(figure)

Priority Queue: Motivation
- Have you ever been stuck behind a huge print job while waiting for just a one-page printout?
- This is a typical situation for a simple first-in first-out (FIFO) queue
- Can there be a smarter printer (multi-user computing)?
Simple Implementations
- There are multiple possibilities for the implementation.
- Simple linked list (solution 1)
  - insert at the front: O(1)
  - delete minimum: O(N)
- Sorted linked list (solution 2)
  - insert: O(N)
  - delete minimum: O(1)

Simple Implementations
- Binary search tree (solution 3)
  - O(log N) on average for insertion/deletion
- Binary heap (solution in this lecture)
  - O(1) for finding the minimum
  - O(log N) worst-case for insert, but O(1) on average
  - O(N) worst-case to build a priority queue
Priority Queue: Model
- Function level: DeleteMin (H) and Insert (H) on a priority queue H
- Logical level: a complete binary tree with nodes A to J
- Implementation level: an array holding A to J
(figure)
Binary Heap (Heap)
- Structure property
  - A heap is a complete binary tree
    - that is completely filled, except at the bottom level, which is filled from left to right
  - A complete binary tree of height (depth) $h$ has between $2^h$ and $2^{h+1} - 1$ nodes
  - The height of a complete binary tree with $N$ nodes is $\lfloor \log_2 N \rfloor$
Binary Heap
- A complete binary tree can be represented in an array without using pointers
(figure: complete binary tree with nodes A to J)

(figure: nodes A to J stored at array positions 1 to 10)
- The root is at position 1
  - (reserve position 0 for a sentinel, MinData).
- For an element at position i,
  - its left child is at position 2i;
  - its right child is at position 2i + 1;
  - its parent is at position ⌊i/2⌋
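The index arithmetic can be sketched as three small functions. The names leftChild, rightChild, parent, and kNodes are hypothetical; the array mirrors the figure's nodes A to J at positions 1 to 10, with position 0 left for the sentinel.

```cpp
#include <cassert>
#include <cstddef>

// 1-based positions; position 0 is reserved for the sentinel.
constexpr std::size_t leftChild(std::size_t i)  { return 2 * i; }
constexpr std::size_t rightChild(std::size_t i) { return 2 * i + 1; }
constexpr std::size_t parent(std::size_t i)     { return i / 2; }  // integer division floors

// Nodes A to J in level order, as in the figure.
const char kNodes[] = {'\0', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'};
```

For instance, J sits at position 10, so its parent is at position 5, which holds E.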
Binary Heap
- Heap Order Property
  - The value at any node should be smaller than all of its descendants (this guarantees that the node with the minimum value is at the root).
(figure: two complete binary trees on the keys 13, 21, 16, 31, 19, 68, 65, 26, 32 that differ in one node: 24 in the first tree, 11 in the second)

Binary Heap
(figure: a binary search tree next to a heap)
- Note the difference in node ordering!
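A sketch of a heap-order check on the array representation. The function name isMinHeap is hypothetical. Comparing each node with its parent is enough, since the order then holds along every root-to-leaf path; equal keys are accepted here, which is an assumption the slide does not state. The test data reads the figure's differing node (24 versus 11) as the left child of 21, which is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// a[0] is unused (sentinel slot); keys occupy positions 1 .. a.size() - 1.
// Returns true if no key is smaller than its parent.
bool isMinHeap(const std::vector<int>& a) {
    for (std::size_t i = 2; i < a.size(); ++i)
        if (a[i / 2] > a[i]) return false;  // child smaller than its parent
    return true;
}
```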
Binary Heap
    // definition
    struct HeapStruct
    {
        int Capacity;            // maximum size
        int Size;                // actual size
        ElementType *Elements;
    };
    typedef struct HeapStruct *PriorityQueue;

Binary Heap
    // some operations
    PriorityQueue Initialize (int MaxElements);
    void Destroy (PriorityQueue H);
    void MakeEmpty (PriorityQueue H);
    void Insert (ElementType X, PriorityQueue H);
    ElementType DeleteMin (PriorityQueue H);
    ElementType FindMin (PriorityQueue H);
    int IsEmpty (PriorityQueue H);
    int IsFull (PriorityQueue H);
Binary Heap: Initialize
    PriorityQueue Initialize (int MaxElements)
    {
        PriorityQueue H;

        if (MaxElements < MinPQSize)
            Error ("Priority queue size is too small");

        H = malloc (sizeof (struct HeapStruct));
        if (H == NULL)
            FatalError ("Out of space!!!");

        // Allocate the array plus one extra for sentinel
        H->Elements = malloc ((MaxElements + 1) * sizeof (ElementType));
        if (H->Elements == NULL)
            FatalError ("Out of space!!!");

        H->Capacity = MaxElements;
        H->Size = 0;
        H->Elements [0] = MinData;   // sentinel: smaller than any valid key

        return H;
    }
Binary Heap: Insert
- Attempt to insert 14
  - Create a hole, and bubble the hole up
(figure, in level order; _ marks the hole)
    13 | 21 16 | 24 31 19 68 | 65 26 32 _
    13 | 21 16 | 24 _ 19 68 | 65 26 32 31

Binary Heap: Insert
- The remaining two steps to insert 14
(figure, in level order; _ marks the hole)
    13 | _ 16 | 24 21 19 68 | 65 26 32 31
    13 | 14 16 | 24 21 19 68 | 65 26 32 31
Binary Heap: Insert
- To insert an element X,
  - Create a hole in the next available location.
  - If X can be placed in the hole without violating heap order, insertion is complete.
  - Otherwise bubble the hole up towards the root.
  - Continue this process until X can be placed in the hole (a percolating up process).
- Worst case running time is O(log N), which occurs when the new element percolates all the way up to the root.
Binary Heap: Insert
    // H->Elements [0] is a sentinel
    void Insert (ElementType X, PriorityQueue H)
    {
        int i;

        if (IsFull (H))
        {
            Error ("Priority queue is full");
            return;
        }

        // The sentinel stops the loop at the root without an explicit i > 1 test
        for (i = ++H->Size; H->Elements [i / 2] > X; i /= 2)
            H->Elements [i] = H->Elements [i / 2];
        H->Elements [i] = X;
    }

Binary Heap: Insert
(figure: the heap array, with the sentinel stored in H->Elements[0])
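The percolate-up loop can be tried on the slides' example with a small sketch over std::vector. The function name heapInsert is hypothetical, and this version replaces the sentinel with an explicit i > 1 test, which is a design variation rather than the slide's method.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// heap[0] is unused so that the children of position i sit at 2i and 2i + 1.
void heapInsert(std::vector<int>& heap, int x) {
    assert(!heap.empty());                    // slot 0 must already exist
    heap.push_back(x);                        // create a hole at the next free position
    std::size_t i = heap.size() - 1;
    for (; i > 1 && heap[i / 2] > x; i /= 2)  // explicit root test instead of a sentinel
        heap[i] = heap[i / 2];                // slide the parent down into the hole
    heap[i] = x;                              // place x in the final hole
}
```

Inserting 14 into the example heap moves 31 and then 21 down one level each and places 14 at position 2.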
Binary Heap: DeleteMin
- Creation of the hole at the root
(figure, in level order; _ marks the hole)
    13 | 14 16 | 24 21 19 68 | 65 26 32 31
    _ | 14 16 | 24 21 19 68 | 65 26 32 31

Binary Heap: DeleteMin
- Next two steps in DeleteMin
(figure, in level order; _ marks the hole)
    14 | _ 16 | 24 21 19 68 | 65 26 32 31
    14 | 21 16 | 24 _ 19 68 | 65 26 32 31

Binary Heap: DeleteMin
- Last step in DeleteMin
(figure, in level order)
    14 | 21 16 | 24 31 19 68 | 65 26 32
Binary Heap: DeleteMin
- The element at the root (position 1) is to be removed, and a hole is created.
- Let X be the last element of the heap; it must move because the heap shrinks by one.
- If X is smaller than the hole's child(ren), place X in the hole and the job is done.
- Otherwise slide the smaller of the hole's children into the hole, thus pushing the hole down one level.
Binary Heap: DeleteMin
- Repeat the previous step until X can be placed in the hole (percolating down).
- Some node may have only one child (be careful when coding!).
- Worst case running time is O(log N).
- On average, the hole is percolated almost to the bottom of the heap, so the average running time is O(log N) again.
Binary Heap: DeleteMin
    ElementType DeleteMin (PriorityQueue H)
    {
        int i, Child;
        ElementType MinElement, LastElement;

        if (IsEmpty (H))
        {
            Error ("Priority queue is empty");
            return H->Elements [0];
        }

        MinElement = H->Elements [1];
        LastElement = H->Elements [H->Size--];

        for (i = 1; i * 2 <= H->Size; i = Child)
        {
            // Find smaller child (the last node may have only a left child)
            Child = i * 2;
            if (Child != H->Size &&
                H->Elements [Child + 1] < H->Elements [Child])
                Child++;

            // Percolate one level
            if (LastElement > H->Elements [Child])
                H->Elements [i] = H->Elements [Child];
            else
                break;
        }
        H->Elements [i] = LastElement;
        return MinElement;
    }
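The same percolate-down logic, sketched over std::vector so it can be run on the slides' example. The function name heapDeleteMin is hypothetical; slot 0 is kept unused to preserve the 2i and 2i + 1 indexing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// heap[0] is unused; keys occupy positions 1 .. heap.size() - 1.
int heapDeleteMin(std::vector<int>& heap) {
    assert(heap.size() > 1);                   // at least one key must be present
    const int minElement = heap[1];
    const int last = heap.back();              // element that must be re-placed
    heap.pop_back();
    const std::size_t size = heap.size() - 1;  // number of keys after removal
    if (size == 0) return minElement;          // the heap held a single key
    std::size_t i = 1;
    for (std::size_t child; i * 2 <= size; i = child) {
        child = i * 2;
        // Pick the smaller child; the last internal node may have only one.
        if (child != size && heap[child + 1] < heap[child]) ++child;
        if (last > heap[child]) heap[i] = heap[child];  // push the hole down
        else break;
    }
    heap[i] = last;
    return minElement;
}
```

On the example heap, the hole moves from the root to position 2 and then to position 5, where 31 is placed.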
Binary Heap: Other Operations
- There is no way to find any particular key without a linear scan through the entire heap.
- However, if we know the position, we can access the key immediately.
Binary Heap: Other Operations
- DecreaseKey (P, Δ, H)
  - Lower the key value at position P by a positive amount Δ.
  - Fix the heap order by percolating up.
  - (Advance the priority of a job.)
- Delete (P, H)
  - Remove the node at position P.
  - DecreaseKey (P, ∞, H) and then DeleteMin (H)
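A sketch of both operations on a std::vector heap with slot 0 unused. The names decreaseKey and deleteAt are hypothetical. Instead of literally subtracting an infinite amount, deleteAt moves the hole straight up to the root, which has the same effect, and then finishes like DeleteMin; that shortcut is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Lowers the key at position p by delta and restores heap order by percolating up.
void decreaseKey(std::vector<int>& heap, std::size_t p, int delta) {
    assert(p >= 1 && p < heap.size() && delta >= 0);
    const int x = heap[p] - delta;
    for (; p > 1 && heap[p / 2] > x; p /= 2)
        heap[p] = heap[p / 2];                 // slide the parent down
    heap[p] = x;
}

// Removes the key at position p.
void deleteAt(std::vector<int>& heap, std::size_t p) {
    assert(p >= 1 && p < heap.size());
    for (; p > 1; p /= 2)                      // same effect as decreasing by infinity:
        heap[p] = heap[p / 2];                 // the hole rises to the root
    const int last = heap.back();              // from here on: DeleteMin
    heap.pop_back();
    const std::size_t size = heap.size() - 1;
    if (size == 0) return;
    std::size_t i = 1;
    for (std::size_t child; i * 2 <= size; i = child) {
        child = i * 2;
        if (child != size && heap[child + 1] < heap[child]) ++child;
        if (last > heap[child]) heap[i] = heap[child];
        else break;
    }
    heap[i] = last;
}
```

Both operations cost O(log N), since each follows a single root-to-leaf path at most once in each direction.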
Binary Heap: BuildHeap
- BuildHeap (H)
  - Given N unordered keys, how to build a heap?
  - Solution 1:
    - N successive Inserts, each appending at the end of the array and percolating up
    - each takes O(1) average and O(log N) worst-case time
    - total runtime is O(N) average but O(N log N) worst-case
  - Solution 2:
    - PercolateDown (i), for i = ⌊N/2⌋ down to 1.
Binary Heap: BuildHeap
- Initial heap
(figure, in level order)
    150 | 80 40 | 30 10 70 110 | 100 20 90 60 50 120 140 130

Binary Heap: BuildHeap
- After PercolateDown(7)
  - each dashed line = 2 comparisons (one to find the smaller child, one to compare it with the node)
    150 | 80 40 | 30 10 70 110 | 100 20 90 60 50 120 140 130

Binary Heap: BuildHeap
- After PercolateDown(6)
    150 | 80 40 | 30 10 50 110 | 100 20 90 60 70 120 140 130

Binary Heap: BuildHeap
- After PercolateDown(5)
    150 | 80 40 | 30 10 50 110 | 100 20 90 60 70 120 140 130

Binary Heap: BuildHeap
- After PercolateDown(4)
    150 | 80 40 | 20 10 50 110 | 100 30 90 60 70 120 140 130

Binary Heap: BuildHeap
- After PercolateDown(3)
    150 | 80 40 | 20 10 50 110 | 100 30 90 60 70 120 140 130

Binary Heap: BuildHeap
- After PercolateDown(2)
    150 | 10 40 | 20 60 50 110 | 100 30 90 80 70 120 140 130

Binary Heap: BuildHeap
- After PercolateDown(1)
    10 | 20 40 | 30 60 50 110 | 100 150 90 80 70 120 140 130
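The sequence of PercolateDown calls can be reproduced with a short sketch. The function names percolateDown and buildHeap are hypothetical, and the initial array assumes the figure's tree reads 150, 80, 40, 30, 10, 70, 110, 100, 20, 90, 60, 50, 120, 140, 130 in level order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// heap[0] is unused; keys occupy positions 1 .. heap.size() - 1.
// Moves the key at position i down until heap order holds below it.
void percolateDown(std::vector<int>& heap, std::size_t i) {
    const std::size_t size = heap.size() - 1;
    const int x = heap[i];
    for (std::size_t child; i * 2 <= size; i = child) {
        child = i * 2;
        if (child != size && heap[child + 1] < heap[child]) ++child;  // smaller child
        if (x > heap[child]) heap[i] = heap[child];                   // hole moves down
        else break;
    }
    heap[i] = x;
}

// Solution 2: PercolateDown(i) for i = floor(N/2) down to 1.
void buildHeap(std::vector<int>& heap) {
    assert(!heap.empty());                     // slot 0 must exist
    for (std::size_t i = (heap.size() - 1) / 2; i >= 1; --i)
        percolateDown(heap, i);
}
```

Positions above ⌊N/2⌋ are leaves, so the loop can skip them.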
BuildHeap: Complexity Analysis
- For a heap with $n$ nodes
  - the depth is $d = \lfloor \log_2 n \rfloor$
  - there are at most $2^i$ nodes at depth $i$
- The number of levels moved by the PercolateDown operations
  - About half of the nodes are at depth $d$ and do not move
  - About a quarter of the nodes are at depth $d-1$, and percolate down at most one level
  - At every upper level, the maximum distance to move increases by 1, but the number of nodes to move is halved
  - Time complexity: $\sum_{i=1}^{\log n} (i-1)\, n / 2^i = O(n)$
Applications of Priority Queues
- Find the kth smallest element
  - It requires k DeleteMin operations.
  - O(N) to create the heap.
  - O(log N) for each DeleteMin.
  - Total running time is O(N + k log N).
  - If k = O(N / log N), running time is O(N).
  - For large values of k, running time is O(k log N).
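A sketch of the selection procedure using the standard library's heap algorithms in place of the lecture's own heap. The function name kthSmallest is hypothetical; std::make_heap with std::greater builds a min-heap in O(N), and each std::pop_heap plays the role of one DeleteMin.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Returns the k-th smallest key (k is 1-based). Takes its argument by value
// so the caller's data is left untouched.
int kthSmallest(std::vector<int> keys, std::size_t k) {
    assert(k >= 1 && k <= keys.size());
    std::make_heap(keys.begin(), keys.end(), std::greater<int>());  // O(N) build
    for (std::size_t i = 1; i < k; ++i) {                           // k - 1 removals
        std::pop_heap(keys.begin(), keys.end(), std::greater<int>());
        keys.pop_back();
    }
    return keys.front();  // the element the k-th DeleteMin would return
}
```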
Applications of Priority Queues
- Discrete Event Simulation
  - e.g., bank waiting line
  - Given
    - customer interarrival distribution
    - number of tellers (servers)
    - arriving customer joins the shortest queue
    - customers are served FIFO within the queue
    - no queue switching
    - service time (transaction time) distribution

Applications of Priority Queues
- Statistics required
  - average waiting time
  - average banking time (waiting time + service time)
  - maximum waiting time, banking time, queue length
- Generate the service time of each customer
- Generate the arrival time of each customer (arrival time of the current customer + time interval until the next customer comes)
- One customer queue for each teller
Applications of Priority Queues
- Event queue with 2 types of events (ordered by time of occurrence)
  - customer arrival
  - completion of service of one customer
- At customer arrival event
  - generate the service time for this customer
  - insert the customer at the end of the shortest teller queue

Applications of Priority Queues
- At customer arrival event (continued)
  - generate the interarrival time and then compute the arrival time of the next customer
  - using the arrival time of the next customer, generate an arrival event and insert it into the event queue (not necessarily as the last in the queue)
- At service completion event
  - remove the customer from the teller queue
  - compute the relevant statistics for this customer

Applications of Priority Queues
- At service completion event (continued)
  - if this teller queue is not empty, serve the next customer in the queue
  - compute the service completion time (current time + service time)
  - generate a service completion event, and insert it into the event queue
Applications of Priority Queues
(figure: customers arriving, k servers, customers departing)
- Average queue length?
- Average waiting time?

Applications of Priority Queues
(figure: customers arriving, k servers, customers departing)
- Average queue length?
- Average waiting time?
- Departure time = Arrival time + Waiting time + Service time

    Arrive        9:15   9:19   9:23   9:24   9:30   9:31
    Serve (min)   10     12     4      15     2      5
Applications of Priority Queues
- Approach: using a heap as the event queue

    Arrive        9:15   9:19   9:23   9:24   9:30   9:31
    Serve (min)   10     6      4      15     2      5

- DeleteMin, if it is an arrival event
  - Server available?
    - N: the customer goes into the FIFO teller queue
    - Y: compute the departure time and insert the departure event
  - Check the next arrival and insert its arrival event
(figure: FIFO teller queue and event queue (heap))

Applications of Priority Queues
- Approach: using a heap as the event queue

    Arrive        9:15   9:19   9:23   9:24   9:30   9:31
    Serve (min)   10     6      4      15     2      5

- DeleteMin, if it is a departure event
  - Queue empty?
    - Y: go to the next DeleteMin
    - N: remove one customer from the FIFO teller queue, compute the departure time, and insert it as a departure event
(figure: FIFO teller queue and event queue (heap))
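The event loop can be sketched with std::priority_queue standing in for the heap. This is a simplified model under stated assumptions, not the slides' exact setup: a single shared FIFO queue feeds all tellers (instead of one queue per teller), every arrival event is inserted up front (instead of being generated one at a time), and times are minutes after 9:00. The names SimResult and simulateBank are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

struct SimResult { int totalWait; int maxWait; int eventCount; };

// arrive[c] and serve[c] give the arrival time and service time of customer c.
SimResult simulateBank(const std::vector<int>& arrive,
                       const std::vector<int>& serve, int tellers) {
    assert(arrive.size() == serve.size() && tellers > 0);
    const int kArrival = 0, kDeparture = 1;
    using Event = std::tuple<int, int, std::size_t>;  // (time, type, customer)
    // Min-heap ordered by time, then type, then customer number.
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> events;
    for (std::size_t c = 0; c < arrive.size(); ++c)
        events.emplace(arrive[c], kArrival, c);

    std::queue<std::size_t> waiting;                  // shared FIFO teller queue
    int freeTellers = tellers;
    SimResult r{0, 0, 0};
    while (!events.empty()) {
        const Event e = events.top();                 // DeleteMin
        events.pop();
        ++r.eventCount;
        const int now = std::get<0>(e);
        if (std::get<1>(e) == kArrival) waiting.push(std::get<2>(e));
        else ++freeTellers;                           // a teller finished
        // Start service while a teller is free and someone is waiting.
        while (freeTellers > 0 && !waiting.empty()) {
            const std::size_t c = waiting.front();
            waiting.pop();
            --freeTellers;
            const int wait = now - arrive[c];
            r.totalWait += wait;
            if (wait > r.maxWait) r.maxWait = wait;
            events.emplace(now + serve[c], kDeparture, c);  // departure event
        }
    }
    return r;
}
```

With the table's six customers and two tellers, each customer produces one arrival and one departure event, and each event costs one heap insertion and one DeleteMin.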
Applications of Priority Queues
- Given N customers, how many events are there altogether?
- What is the running (CPU) time for each event?
- What is the overall complexity?

Recommended Readings
- Weiss, DS & Algo. Analysis in C++ (3rd ed.)
  - Section 4.8 Sets and Maps in the STL
  - Sections 6.1-6.4 Priority Queues (Heaps)

Acknowledgments
- Zhang Ming (张铭), Wang Tengjiao (王腾蛟), Zhao Haiyan (赵海燕), Data Structures and Algorithms (数据结构与算法), Higher Education Press, 2008.
- Weiss, Data structures and algorithm analysis in C++ (3rd ed.), Posts & Telecom Press (人民邮电出版社), 2006.
- Lecture notes, COMP1200, HKBU
  - http://www.comp.hkbu.edu.hk/~comp1200/