CS29003: ALGORITHMS LABORATORY
Assignment 4

The (binary) heap data structure is an array that can be viewed as a complete binary tree, as shown in the figure below. Each node of the tree corresponds to an element of the array that stores the record with key (simply called the value) in the node. Being a complete binary tree, it is completely filled on all levels except possibly the lowest, which is filled from the left up to a point. The root of the tree is A[1], and given the index i of a node, the indices of its parent PARENT(i), left child LEFT(i), and right child RIGHT(i) can be computed simply as follows.

    PARENT(i) : return i/2
    LEFT(i)   : return 2i
    RIGHT(i)  : return 2i + 1

Additionally, heaps also satisfy the heap property: for every node i other than the root, A[PARENT(i)] <= A[i] for a min-heap and A[PARENT(i)] >= A[i] for a max-heap. The basic heap operations are described in the sections that follow.

1 Maintaining the heap property

HEAPIFY is an important subroutine for manipulating heaps. Its inputs are an array A and an index i into the array. When HEAPIFY is called, it is assumed that the binary trees rooted at LEFT(i) and RIGHT(i) are heaps, but that A[i] may be smaller than its children, thus violating the (max-)heap property. The function of HEAPIFY is to let the value at A[i] float down in the heap so that the subtree rooted at index i becomes a heap.

    HEAPIFY(A, i)
        l = LEFT(i); r = RIGHT(i);
        if ((l <= heap-size) && (A[l] > A[i]))
            largest = l;
        else
            largest = i;
        if ((r <= heap-size) && (A[r] > A[largest]))
            largest = r;
        if (largest != i) {
            exchange(A[i], A[largest]);
            HEAPIFY(A, largest);
        }

Figure 1: The action of HEAPIFY(A, 2), where heap-size[A] = 10. (a) The initial configuration of the heap, with A[2] at node i = 2 violating the heap property since it is not larger than both children. The heap property is restored for node 2 in (b) by exchanging A[2] with A[4], which destroys the heap property for node 4. The recursive call HEAPIFY(A, 4) now sets i = 4. After swapping A[4] with A[9], as shown in (c), node 4 is fixed up, and the recursive call HEAPIFY(A, 9) yields no further change to the data structure.

2 Building a heap

We can use the procedure HEAPIFY in a bottom-up manner to convert an array A[1..n], where n = length[A], into a heap. Since the elements in the subarray A[(n/2 + 1)..n] are all leaves of the tree, each is a 1-element heap to begin with. The procedure BUILD-HEAP goes through the remaining nodes of the tree and runs HEAPIFY on each one. The order in which the nodes are processed guarantees that the subtrees rooted at the children of a node i are heaps before HEAPIFY is run at that node.

    BUILD-HEAP(A)
        heap-size = length(A);
        for (i = heap-size/2; i >= 1; i--)
            HEAPIFY(A, i);

3 The heapsort algorithm

The heapsort algorithm starts by using BUILD-HEAP to build a heap on the input array A[1..n], where n = length[A]. Since the maximum element of the array is stored at the root A[1], it can be put into its correct final position by exchanging it with A[n]. If we now discard node n from the heap (by decrementing heap-size(A)), we observe that A[1..(n-1)] can easily be made into a heap. The children of the root remain heaps, but the new root element may violate the heap

property. All that is needed to restore the heap property, however, is one call to HEAPIFY(A, 1), which leaves a heap in A[1..(n-1)]. The heapsort algorithm then repeats this process for the heap of size n-1 down to a heap of size 2.

Figure 2: The operation of BUILD-HEAP, showing the data structure before the call to HEAPIFY in line 3 of BUILD-HEAP. (a) A 10-element input array A and the binary tree it represents. The figure shows that the loop index i points to node 5 before the call HEAPIFY(A, i). (b) The data structure that results. The loop index i for the next iteration points to node 4. (c)-(e) Subsequent iterations of the for loop in BUILD-HEAP. Observe that whenever HEAPIFY is called on a node, the two subtrees of that node are both heaps. (f) The heap after BUILD-HEAP finishes.

    HEAPSORT(A)
        BUILD-HEAP(A);
        for (i = length(A); i >= 2; i--) {
            exchange(A[1], A[i]);
            heap-size--;
            HEAPIFY(A, 1);
        }

Figure 3: The operation of HEAPSORT. (a) The heap data structure just after it has been built by BUILD-HEAP. (b)-(j) The heap just after each call of HEAPIFY. The value of i at that time is shown. Only lightly shaded nodes remain in the heap. (k) The resulting sorted array A.

    sort    in-place   stable   worst      average    best       remarks
    quick   X                   N^2        N log N    N log N    fastest in practice, N log N probabilistic guarantee
    merge              X        N log N    N log N    N log N    N log N guarantee, stable
    heap    X                   N log N    N log N    N log N    N log N guarantee, in-place

4 Properties of Heapsort

So, if you have to sort in O(N log N) worst-case time without using extra memory, you may consider using heapsort. Mergesort consumes a linear amount of extra space, and quicksort takes quadratic time in the worst case. As a downside, however, the inner loop of heapsort is longer than quicksort's, leading to extra costs, and heapsort makes poor use of cache memory.

5 Priority Queues

Heapsort is an excellent algorithm, but a good implementation of quicksort usually beats it in practice. Nevertheless, the heap data structure itself has enormous utility. In this section, we present one of the most popular applications of a heap: its use as an efficient priority queue. A priority queue is a data structure for maintaining a set S of elements, each with an associated value called a key. A priority queue supports the following operations.

INSERT(S, x) inserts the element x into the set S.
MAXIMUM/MINIMUM(S) returns the element of S with the largest/smallest key.
EXTRACT-MAX/MIN(S) removes and returns the element of S with the largest/smallest key.

One application of priority queues is to schedule jobs on a shared computer. The priority queue keeps track of the jobs to be performed and their relative priorities. When a job is finished or interrupted, the highest-priority job is selected from those pending using EXTRACT-MAX. A new job can be added to the queue at any time using INSERT.

A priority queue can also be used in an event-driven simulator. The items in the queue are events to be simulated, each with an associated time of occurrence that serves as its key. The events must be simulated in order of their time of occurrence, because the simulation of an event can cause other events to be simulated in the future. For this application, it is natural to reverse the linear order of the priority queue and support the operations MINIMUM and EXTRACT-MIN instead of MAXIMUM and EXTRACT-MAX. The simulation program uses EXTRACT-MIN at each step to choose the next event to simulate. As new events are produced, they are inserted into the priority queue using INSERT.

Not surprisingly, we can use a heap to implement a priority queue. The operation HEAP-MAXIMUM returns the maximum heap element in constant time by simply returning the value A[1] in the heap. The HEAP-EXTRACT-MAX procedure is similar to the body of the for loop of the HEAPSORT procedure.

    HEAP-EXTRACT-MAX(A)
        if (heap-size < 1)
            error "heap underflow";
        max = A[1];
        A[1] = A[heap-size];
        heap-size = heap-size - 1;
        HEAPIFY(A, 1);
        return max;

The running time of HEAP-EXTRACT-MAX is O(lg n), since it performs only a constant amount of work on top of the O(lg n) time for HEAPIFY.

The HEAP-INSERT procedure inserts a node into heap A. To do so, it first expands the heap by adding a new leaf to the tree. Then, it traverses a path from this leaf toward the root to find a proper place for the new element.
    HEAP-INSERT(A, key)
        // in the array representation, you should check for overflow here
        heap-size = heap-size + 1;
        i = heap-size;
        while (i > 1 && A[PARENT(i)] < key) {
            A[i] = A[PARENT(i)];
            i = PARENT(i);
        }
        A[i] = key;

Observe the key difference between a heap and a BST in this context. A min-heap is a rooted tree such that the key stored at every node is less than or equal to the keys of all its descendants. Similarly, a max-heap is one in which the key at a node is greater than or equal to the keys of all its descendants. A search tree is a rooted tree such that the key stored at every node is greater than (or equal to) all the keys in its left subtree and less than all the keys in its right subtree. Heaps maintain only a partial ordering, whereas search trees maintain a total ordering. Finding the min/max in a BST incurs logarithmic time, while in a heap it takes constant time. Hence, for a priority queue implementation, we go for a heap.

6 Event Driven Simulation

Enough of theory! The objective of this assignment is to learn the usage of priority queues as data structures which help in carrying out event-driven simulation. First of all, what are the very basics of carrying out a simulation? You are provided a set of objects and a set of rules governing some dynamical properties of the objects. Your program keeps updating the properties/states by executing the rules after a specific time stamp. For example, suppose there is a set of spherical particles moving on a 2-d plane with some predefined constant speed. You solve the standard equations of motion to compute the position and velocity of each particle after every 10 milliseconds (say). The idea discussed this far is the basic concept behind time-driven simulation of particle dynamics, for which the steps are as follows.

Discretize time in quanta of size dt.
Update the position of each particle after every dt units of time and check for overlaps.
If there is an overlap, roll back the clock to the time of the collision, update the velocities of the colliding particles, and continue the simulation.

Consider the elaboration as provided in the figure.

[Figure: a timeline marked t, t+dt, t+2dt; a collision is detected between quanta, so the simulation is rolled back and the collision executed.]

However, the approach has the following issues.

N^2 overlap checks per time quantum.
May miss collisions if dt is too large and colliding particles fail to overlap when we are looking.
Simulation is too slow if dt is very small.

An alternative strategy is event-driven simulation, which essentially asks for doing something only when something other than usual (an event) is supposed to occur. In such a strategy, we change state instantaneously only when an event (a collision) happens. Such a simulation strategy focuses only on the time-points when collisions occur, resulting in a sudden change of state of the system. For each particle, you have to initialize the following information fields: radius, position co-ordinates, and velocity along each of the x and y axes.
For each particle, compute the time instants when it may (possibly) collide with every other particle and every wall. In that way, you get a set of pending events <particle1/wall, particle2/wall, time, valid/invalid>. Initially, all of these are valid events. Insert the pending events into a priority queue. Your main loop iterates as follows.

1. Extract and delete the event which is immediately pending (with time-stamp t, say) from the queue.
2. If the event is invalid, ignore it.
3. Advance all particles to time t on a straight-line trajectory.

4. If the extracted event is valid,
   - update the velocities of the colliding particle(s);
   - mark every future event involving the particle(s) as invalid;
   - predict future particle-wall and particle-particle collisions involving the colliding particle(s) and insert these new events onto the queue.

7 Optimizations

It is a good idea to have the entire state of the system maintained in a single data structure (let us call it STATE). For every particle, store its color, radius, last computed position (px, py), velocity (vx, vy), and time of update. While forming the dynamical equations for collision detection, keep in mind the radius of the particles and do not treat them as points without dimensions.

We need an efficient routine which marks some future events as invalid after processing a collision event. We want to avoid a linear search (which would be slow in the case of a large number of particles). Hence, we maintain a linked list for each particle in the structure STATE (which stores the information for every particle as discussed earlier). Thus, for any particle i (say), every node in the linked list contains a pointer to a collision event stored in the heap in which particle i is involved. For marking the events which involve i, you simply traverse the linked list for i inside STATE, get directly to the requisite positions in the heap, and mark those as invalid. You should deallocate the list once all the events are marked invalid. Again, whenever you compute future collisions involving i, you need to create the nodes in the list for i. The list implementation adds one overhead which you need to be careful about: whenever the elements in the heap change their positions, the pointers in the list need to be carefully updated.

8 I/O

Initialize your simulation with a pre-defined area (length and breadth specification), a set of five particles with initial velocities (as per your convenience), and radii. The user may provide a time horizon (100 sec, say).
Your output should be a plot exhibiting the trajectories of the five particles (in different colors) up to 100 sec. Also, in a separate text file, you should log the velocity and position vectors of each particle at the moment of each collision.

Generating plot data: Store the trajectory of each particle in a separate log file. Suppose particle i suffers two collisions at instants t1 followed by t2. You should compute intermediate position co-ordinates for the particle at timesteps t1 + k(t2 - t1)/10 for k = 0, 1, ..., 10 and store these in the file for particle i. With each location, you definitely need to store the time when the particle was at that location. Use the data which you have generated for each particle to obtain the plots.