Binary Heap Algorithms



Similar documents
Heaps & Priority Queues in the C++ STL 2-3 Trees

Binary Heaps * * * * * * * / / \ / \ / \ / \ / \ * * * * * * * * * * * / / \ / \ / / \ / \ * * * * * * * * * *

Algorithms and Data Structures

Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2)

Node-Based Structures Linked Lists: Implementation

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles

Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb

Converting a Number from Decimal to Binary

6 March Array Implementation of Binary Trees

Cpt S 223. School of EECS, WSU

Binary Trees and Huffman Encoding Binary Search Trees

B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees

Binary Heaps. CSE 373 Data Structures

Binary search algorithm

Big Data and Scripting. Part 4: Memory Hierarchies

Tables so far. set() get() delete() BST Average O(lg n) O(lg n) O(lg n) Worst O(n) O(n) O(n) RB Tree Average O(lg n) O(lg n) O(lg n)

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key, and:

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team

From Last Time: Remove (Delete) Operation

Lecture 6: Binary Search Trees CSCI Algorithms I. Andrew Rosenberg

Linked Lists: Implementation Sequences in the C++ STL

PES Institute of Technology-BSC QUESTION BANK

Sequences in the C++ STL

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

Questions 1 through 25 are worth 2 points each. Choose one best answer for each.

Analysis of Binary Search algorithm and Selection Sort algorithm

Sample Questions Csci 1112 A. Bellaachia

Ordered Lists and Binary Trees

root node level: internal node edge leaf node Data Structures & Algorithms McQuain

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) Total 92.

Quiz 4 Solutions EECS 211: FUNDAMENTALS OF COMPUTER PROGRAMMING II. 1 Q u i z 4 S o l u t i o n s

Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Linda Shapiro Winter 2015

CS711008Z Algorithm Design and Analysis

Data Structures and Algorithms

Efficiency of algorithms. Algorithms. Efficiency of algorithms. Binary search and linear search. Best, worst and average case.

Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees

Binary Search Trees (BST)

Algorithms and Data Structures Written Exam Proposed SOLUTION

CSE 326: Data Structures B-Trees and B+ Trees

A binary heap is a complete binary tree, where each node has a higher priority than its children. This is called heap-order property

Data Structures. Jaehyun Park. CS 97SI Stanford University. June 29, 2015

Lecture Notes on Binary Search Trees

How To Create A Tree From A Tree In Runtime (For A Tree)

DATABASE DESIGN - 1DL400

Exam study sheet for CS2711. List of topics

10CS35: Data Structures Using C

The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge,

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

1/1 7/4 2/2 12/7 10/30 12/25

Binary Search Trees CMPSC 122

Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R

Analysis of a Search Algorithm

The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).

Recursion vs. Iteration Eliminating Recursion

Symbol Tables. Introduction

- Easy to insert & delete in O(1) time - Don t need to estimate total memory needed. - Hard to search in less than O(n) time

Data storage Tree indexes

Chapter 14 The Binary Search Tree

Algorithms and Data Structures

Lecture Notes on Binary Search Trees

A Comparison of Dictionary Implementations

Binary Search Trees. basic implementations randomized BSTs deletion in BSTs

Persistent Binary Search Trees

The ADT Binary Search Tree

APP INVENTOR. Test Review

Abstract Data Type. EECS 281: Data Structures and Algorithms. The Foundation: Data Structures and Abstract Data Types

Physical Data Organization

THIS CHAPTER studies several important methods for sorting lists, both contiguous

Data Structure and Algorithm I Midterm Examination 120 points Time: 9:10am-12:10pm (180 minutes), Friday, November 12, 2010

Output: struct treenode{ int data; struct treenode *left, *right; } struct treenode *tree_ptr;

CompSci-61B, Data Structures Final Exam

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

DATA STRUCTURES USING C

Data Structure [Question Bank]

CS 2112 Spring Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

TREE BASIC TERMINOLOGIES

Binary Search Trees 3/20/14

Analysis of Algorithms I: Binary Search Trees

CSC148 Lecture 8. Algorithm Analysis Binary Search Sorting

Data Structures. Level 6 C Module Descriptor

The Tower of Hanoi. Recursion Solution. Recursive Function. Time Complexity. Recursive Thinking. Why Recursion? n! = n* (n-1)!

Algorithms and Data Structures Exercise for the Final Exam (17 June 2014) Stack, Queue, Lists, Trees, Heap

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.

Algorithms. Margaret M. Fleck. 18 October 2010

6. Standard Algorithms

AP Computer Science AB Syllabus 1

LINKED DATA STRUCTURES

Algorithms and data structures

Persistent Data Structures and Planar Point Location

Data Structure with C

Analysis of Algorithms I: Optimal Binary Search Trees

Chapter Objectives. Chapter 9. Sequential Search. Search Algorithms. Search Algorithms. Binary Search

Lecture 2 February 12, 2003

Binary Search Trees. Ric Glassey

DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT

Keys and records. Binary Search Trees. Data structures for storing data. Example. Motivation. Binary Search Trees

External Memory Geometric Data Structures

Transcription:

CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005 2009 Glenn G. Chappell

Review Binary Search Trees Efficiency B.S.T. (balanced & average case) Sorted Array B.S.T. (worst case) Retrieve Logarithmic Logarithmic Insert Logarithmic Delete Logarithmic Binary Search Trees have poor worst-case performance. But they have very good performance: On average. If balanced. But we do not know an efficient way to make them stay balanced. Can we efficiently keep a Binary Search Tree balanced? 5 Apr 2009 CS Spring 2009 2

Unit Overview Tables & Priority Queues Major Topics Introduction to Tables Priority Queues Binary Heap algorithms Heaps & Priority Queues in the C++ STL 2- Trees Other balanced search trees Hash Tables Prefix Trees Tables in various languages 5 Apr 2009 CS Spring 2009

Review Introduction to Tables [/4] Our ultimate value-oriented ADT is Table. Three primary operations: Retrieve (by key). Insert (key-data pair). Delete (by key). What do we use a Table for? To hold data accessed by key fields. For example: Customers accessed by phone number. Students accessed by student ID number. Any other kind of data with an ID code. To hold set data. Data in which the only question we ask is whether a key lies in the data set. To hold arrays whose indices are not nonnegative integers. arr2["hello"] = ; To hold array-like data sets that are sparse. arr[6] = ; arr[000000000] = 2; 5 Apr 2009 CS Spring 2009 4

Review Introduction to Tables [2/4] How can we implement a Table? We know several lousy ways: A Sequence holding key-data pairs. Sorted or unsorted. Array-based or Linked-List-based. A Binary Search Tree holding key-data pairs. Sorted Array Unsorted Array Sorted Linked List Unsorted Linked List Binary Search Tree Balanced** Binary Search Tree Retrieve Logarithmic Logarithmic Insert Constant (?)* Constant Logarithmic Delete Logarithmic *Amortized constant time, if we might have to reallocate. (Also, we are allowing multiple equivalent keys here. If we do not, then when we insert, we need to find any existing item with the given key, which is linear time.) **Of course, we do not (yet!) know any way to guarantee the tree will stay balanced, unless we can restrict ourselves to read-only operations (no insert, delete). 5 Apr 2009 CS Spring 2009 5

Review Introduction to Tables [/4] In special situations, the (amortized) constant-time insertion for an unsorted array and the logarithmic-time retrieve for a sorted array can be combined! Insert all data into an unsorted array, sort the array, then use Binary Search to retrieve data. This is a good way to handle Table data with separate filling & searching phases (and few or no deletes). Note: We will be talking about some complicated Table implementations. But sometimes a simple solution is the best. 5 Apr 2009 CS Spring 2009 6

Review Introduction to Tables [4/4] Sorted Array Unsorted Array Sorted Linked List Unsorted Linked List Binary Search Tree Balanced (how?) Binary Search Tree Retrieve Logarithmic Logarithmic Insert Constant (?) Constant Logarithmic Delete Logarithmic Idea #: Restricted Table Perhaps we can do better if we do not implement a Table in its full generality. Idea #2: Keep a Tree Balanced Balanced Binary Search Trees look good, but how to keep them balanced efficiently? Idea #: Magic Functions Use an unsorted array of key-data pairs. Allow array items to be marked as empty. Have a magic function that tells the index of an item. Retrieve/insert/delete in constant time? (Actually no, but this is still a worthwhile idea.) We will look at what results from these ideas: From idea #: Priority Queues From idea #2: Balanced search trees (2- Trees, Red-Black Trees, B-Trees, etc.) From idea #: Hash Tables 5 Apr 2009 CS Spring 2009 7

Review Priority Queues What a Priority Queue Is A Priority Queue is a restricted-access Table. Just as a Queue is a restricted Sequence. A Priority Queue is not a Queue! The key is called the priority. We can: Retrieve (getfront) highest priority item. Insert any item (key, data). Delete highest priority item. 5 Apr 2009 CS Spring 2009 8

Review Priority Queues Applications, Implementation A PQ is useful when we have items to process and some are more important than others. PQs can be used to do sorting. Insert all items, then retrieve/delete all items. The resulting sequence is sorted by priority. Note: Once again, a sorted container gives us a sorting algorithm. However, as with Insertion Sort, instead of using a separate container to sort with, we prefer to use an in-place version of the algorithm. We will call this Heap Sort. The most interesting thing about Priority Queues is their usual implementation: a Binary Heap. 5 Apr 2009 CS Spring 2009 9

What is a Binary Heap? Definition We define a Binary Heap (usually just Heap) to be a complete Binary Tree that Notes Is empty, Or else The root s key (priority) is than the key of each of the root s children, if any, and Each of the root s subtrees is a Binary Heap. This is a Maxheap. If we reverse the order, so that the root s key is than the keys of its children, we get a Minheap. The text presents Heap as an ADT with essentially the same operations as a Priority Queue. I am not doing this. As we will see, a Binary Heap is a good basis for an implementation of a Priority Queue. 5 56 50 25 0 22 25 2 5 Apr 2009 CS Spring 2009 0

What is a Binary Heap? Complete BTs (again) We discussed an array implementation for a complete Binary Tree: Put the nodes in an array, in the order in which they would be added to a complete Binary Tree. No pointers/arrows/indices are required. We store only the array of data items and the number of nodes. 0 7 8 2 4 5 6 9 Logical Structure 0 2 4 5 6 7 8 9 Physical Structure Put the root, if any, at index 0. The left child of node k is at index 2k +. It exists if 2k + < size. The right child is similar, but at 2k + 2. The parent of node k is at index (k )/2 [int division]. It exists if k > 0. 5 Apr 2009 CS Spring 2009

What is a Binary Heap? Implementation The usual implementation of a Binary Heap uses this array-based complete Binary Tree. 5 56 50 25 0 22 25 2 Logical Structure 56 50 25 5 22 25 0 2 Physical Structure There are no required order relationships between siblings. None of the standard traversals gives any sensible ordering. In practice, we usually use Heap to mean a Binary Heap using this array representation. In order to base a Priority Queue on a Heap, we need to know how to implement the operations. getfront is easy (right?). Next we look at delete & insert. 5 Apr 2009 CS Spring 2009 2

Operations Delete [/2] In a Priority Queue, we can delete the highest-priority item. In a Maxheap, this corresponds to the root. How do we delete the root item, while maintaining the Heap properties? We cannot delete the root node (unless it is the only node). The Heap will have one less item, and so the last node must go away. But the last item is not going away. Solution: Move the last node s item to the root; delete the last node. We do this by swapping the items (which has other advantages, as we will see). 56 50 25 2 50 25 Ick! 5 22 25 5 22 25 0 2 0 But now we have another problem: This is no longer a Heap. How do we fix it? 5 Apr 2009 CS Spring 2009

Operations Delete [2/2] To fix: Trickle down the new root item. If the root is all its children, stop. Otherwise, swap the root item with its largest child and recursively fix the proper subtree. (Why largest?) 2 50 50 25 2 25 5 22 25 5 22 25 0 0 50 5 22 25 Done 2 20 0 5 Apr 2009 CS Spring 2009 4

Operations Insert To insert into a Heap, add a new node at the end. But if we put our new value in this node, then we may not have a Heap. Solution: Trickle up. If new value is greater than its parent, swap them. Repeat at new position. 56 56 50 25 50 25 5 22 25 5 22 2 0 2 2 0 2 25 56 50 2 Done 5 22 25 0 2 25 5 Apr 2009 CS Spring 2009 5

Operations Using an Array Heap insert and delete are usually given a random-access range. The item to insert or delete is last item of the range; the rest is a Heap. Action of Heap insert: That s where we want to put the item, initially (right?). Given Heap Action of Heap delete: Item to insert New Heap That s where the swap puts it (right?). Given Heap New Heap Item deleted Note that Heap algorithms can do all their work using swap. This usually allows for both speed and safety. 5 Apr 2009 CS Spring 2009 6

Efficiency What is the order of the three main Priority Queue operations, if we use a Binary Heap implementation based on a complete Binary Tree stored in an array? getfront Constant time. insert Logarithmic time. Assuming no reallocation, that is, assuming the array is large enough to hold the new item. As on the previous slide, the way that Heaps are used often guarantees that this is the case. ( time if possible reallocation.) The number of operations is roughly the height of the tree. Since the tree is balanced, the height is O(log n). delete Logarithmic time. No reallocation, of course. Other comments as for insert. We conclude that a Heap is an excellent basis for an implementation of a Priority Queue. Better than linear time! 5 Apr 2009 CS Spring 2009 7

Write It! TO DO Write the Heap insert algorithm. Prototype is shown below. The item to be inserted is the final item in the given range. All other items should form a Heap already. Done. See heapalgs.h, on the web page. // Requirements on types: // RAIter is a random-access iterator type. template<typename RAIter> void heapinsert(raiter first, RAIter last); 5 Apr 2009 CS Spring 2009 8

An Efficient Make Operation To turn a random-access range (array?) into a Heap, we could do n Heap inserts. Each insert operation is O(log n), and so making a Heap in this way is O(n log n). However, we can make a Heap faster than this. Place each item into a partially-made Heap, in backwards order. Trickle each item down through its descendants. For most items, there are not very many of these. 9 4 8 9 Bottom items: no trickling necessary 9 9 9 8 4 8 4 8 4 8 4 9 9 9 8 9 8 8 4 8 4 4 4 This Heap make method is linear time! 9 4 8 5 Apr 2009 CS Spring 2009 9

Heap Sort Introduction Our last sorting algorithm is Heap Sort. This is a sort that uses Heap algorithms. We can think of it as using a Priority Queue, where the priority of an item is its value except that the algorithm is in-place, using no separate data structure. Procedure: Make a Heap, then delete all items, using the delete procedure that places the deleted item in the top spot. We do a make operation, which is O(n), and n getfront/delete operations, each of which is O(log n). Total: O(n log n). 5 Apr 2009 CS Spring 2009 20

Heap Sort Properties Heap Sort can be done in-place. We can create a Heap in a given array. As each item is removed from the Heap, put it in the array element that is removed from the Heap. Starting the delete by swapping root and last items does this. Results Ascending order, if we used a Maxheap. Only constant additional memory required. Reallocation is avoided. Heap Sort uses less additional space than Introsort or array Merge Sort. Heap Sort: O(). Introsort: O(log n). Merge Sort on an array: O(n). Heap Sort also can easily be generalized. Doing Heap inserts in the middle of the sort. Stopping before the sort is completed. 5 Apr 2009 CS Spring 2009 2

Heap Sort Illustration [/2] Below: Heap make operation. Next slide: Heap deletion phase. Start 2 4 Add 2 2 4 4 2 4 2 4 Add 4 2 4 2 Add 2 4 Note: This is what happens in memory. This is just a picture of the logical structure. 4 4 Add Now the entire array is a Heap. 4 2 4 2 5 Apr 2009 CS Spring 2009 22 2 2 4 4

Heap Sort Illustration [2/2] Heap deletion phase: Delete 2 4 Start Delete 4 4 2 4 2 2 4 2 4 2 4 2 2 2 4 2 Delete 2 2 4 2 4 2 4 2 Delete (does nothing; can be skipped) 2 4 2 4 Now the array is sorted. 5 Apr 2009 CS Spring 2009 2

Heap Sort Analysis Efficiency Heap Sort is O(n log n). Requirements on Data Heap Sort requires random-access data. Space Usage Heap Sort is in-place. Stability Heap Sort is not stable. Performance on Nearly Sorted Data We have seen these together before (Iterative Merge Sort on a Linked List), but never for an array. Heap Sort is not significantly faster or slower for nearly sorted data. Notes Heap Sort can be generalized to handle sequences that are modified (in certain ways) in the middle of sorting. Recall that Heap Sort is used by Introsort, when the recursion depth of Quicksort exceeds the maximum allowed. 5 Apr 2009 CS Spring 2009 24

Thoughts In practice, a Heap is not so much a data structure as it is an ordinary random-access sequence with a particular ordering property. Associated with Heaps are a collection of algorithms that allow us to efficiently create Priority Queues and do comparison sorting. These algorithms are the things to remember. Thus the subject heading. 5 Apr 2009 CS Spring 2009 25