Heaps implemented without arrays

Peaps: Heaps implemented without arrays
Author: Paul Picazo (ppicazo@gmail.com)
Course: CS375 - Algorithms
Professor: Dr. Paul Hriljac
Date Submitted: 4/30/2008

Table of Contents

Introduction
The Problem
Background Information
The Solution
    Locating Nodes, Complex
    Locating Nodes, Simple
Inserting or Removing
Conclusion
References

Introduction

Typically, heap data structures are implemented using arrays. Arrays allow for time-efficient operations on the heap. Unfortunately, the array implementation does not allow students to properly visualize the heap and its operations. Many sources state that heaps built with pointers are not time efficient or must be threaded. Threaded trees have links between successor and predecessor nodes. They are highly complex data structures and require that the threading be kept current in order to function properly. Storing the threading data hurts space efficiency, while keeping the threads current destroys time efficiency.

The purpose of this paper is to explore implementation options which preserve the tree structure of the heap and are time efficient. The goal is not only to have heaps which help students learn, but also to provide time- and space-efficient alternatives to the array implementation.

Two implementations have been created and studied. Both offer advantages over traditional array implementations. While they do not match the practical performance of traditional array implementations, they have the same asymptotic time efficiency for every operation other than locating the last node or next open slot in the heap (Table 1). As a result, heap sort has the same time efficiency as it does with traditional array implementations. Heaps built with these implementations are called Peaps, named both after their creator, Paul Picazo, and because they use pointers rather than arrays.

| Abstract Method | Array Implementation | Pointer Implementation(s) |
|---|---|---|
| Find last node | O(1) | O(log n) |
| Insert item into heap | O(log n) | O(log n) |
| Remove min or max from heap* | O(log n) | O(log n) |
| Get value of min or max of heap* | O(1) | O(1) |
| Heap sort | O(n log n) | O(n log n) |

Table 1

The Problem

There are a few critical issues to overcome when implementing heaps without arrays. The most significant is locating the last node (when removing) or the next open slot (when inserting). Heaps implemented in arrays allow these nodes to be located in O(1) time. Locating these nodes in a threaded tree is also trivial; however, maintaining the threading must be considered part of these operations. Maintaining the threading requires complicated algorithms which severely damage the time efficiency. If we can solve the problem of efficiently finding the last node or next open slot, all other operations will be straightforward and efficient.
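For comparison, the O(1) lookup that the array implementation enjoys can be sketched as follows. This is a minimal illustration written for this discussion, not code from the paper; the function name `array_insert` and the choice of a max-heap stored in a 0-based Python list are assumptions.

```python
def array_insert(heap, value):
    """Insert into a max-heap stored in a Python list (0-based indices)."""
    heap.append(value)              # the next open slot is the end of the list: O(1)
    i = len(heap) - 1
    while i > 0:                    # bubble up: at most one swap per level, O(log n)
        parent = (i - 1) // 2
        if heap[parent] >= heap[i]:
            break
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


heap = []
for v in [3, 15, 1, 8, 5, 2]:
    array_insert(heap, v)
```

With pointers there is no index to append at, so the slot has to be reached by walking down from the root, which is the problem the rest of the paper addresses.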

Background Information

In order to keep track of specific locations in the almost perfect binary tree, we number each location as shown in Figure 1.

Level 0: Node 1
Level 1: Node 2, Node 3
Level 2: Node 4, Node 5, Node 6, Node 7
Level 3: Node 8, Node 9, Node 10, Node 11, Node 12, Node 13, Node 14, Node 15

Figure 1: Tree node location and level number system

The abstract properties of the heap data structure are best represented as a binary tree. Comparing the tree (shown in Figure 2) and array (shown in Figure 3) representations clearly shows the advantage that the tree representation has for learning and understanding.

        15
       /  \
      8    5
     / \  /
    3   1 2

Figure 2: Tree Representation of Example Heap

    15 | 8 | 5 | 3 | 1 | 2

Figure 3: Array Representation of Example Heap
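The relationship between the two representations can be sketched in a few lines. The `Node` class, `build_example`, and `level_order` below are hypothetical names introduced for illustration; the sketch builds the example heap with left and right pointers and shows that visiting it level by level reproduces both the location numbers of Figure 1 and the array of Figure 3.

```python
from collections import deque


class Node:
    """A heap node with only left and right pointers (no parent link)."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def build_example():
    """Build the example heap 15 / 8 5 / 3 1 2 by hand."""
    root = Node(15)
    root.left, root.right = Node(8), Node(5)
    root.left.left, root.left.right = Node(3), Node(1)
    root.right.left = Node(2)
    return root


def level_order(root):
    """Return (location, value) pairs, numbering locations from 1."""
    out, queue, location = [], deque([root]), 1
    while queue:
        node = queue.popleft()
        out.append((location, node.value))
        location += 1
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return out


pairs = level_order(build_example())
values = [value for _, value in pairs]
```

The visiting order matches the location numbers only because the tree is almost perfect, that is, filled level by level from the left.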

The Solution

To achieve the goal of an efficient heap implemented without arrays, we must devise an efficient method for reaching specific locations in the heap. The only reference to the tree we have is a pointer to the root node. Each node has up to two children, which are accessed through its left and right pointers. For any child that does not exist, the pointer is null.

Locating Nodes, Complex

Two methods were researched. The first method calculates the horizontal position of a given node within its level. In the example (Figure 4), when X is inserted into the heap, it is put in location six.

Level 0: 15
Level 1: 8, 5
Level 2: 3, 1, X

Figure 4: Example Heap

We know how to calculate the depth of any location in an almost perfect binary tree: $\text{level} = \lfloor \log_2(\text{location}) \rfloor$. Location six must be on level 2 of the almost perfect binary tree, because $\lfloor \log_2 6 \rfloor = 2$. The maximum capacity of a given level $n$ is $2^n$. Therefore level 2 has a maximum capacity of four nodes. The total capacity of a full tree whose deepest level is $n$ is $2^{n+1} - 1$. This gives us a maximum capacity of 7 in the example tree.

| Level | Max Nodes in Level | Max Nodes in Tree |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 2 | 3 |
| 2 | 4 | 7 |
| 3 | 8 | 15 |
| 4 | 16 | 31 |
| n | $2^n$ | $2^{n+1} - 1$ |

Table 2

In order to locate the node's position horizontally, we subtract the node location from the maximum capacity of the tree and then subtract that amount from the maximum capacity of the current level. In our example this gives 4 - (7 - 6) = 3. Therefore our desired node is 75% of the way through the level horizontally. Using this property we can then determine its parent node's horizontal position in the parent's level: parent position in its level = ceiling(child's horizontal position as a fraction * maximum nodes in the parent's level). This gives ceiling(0.75 * 2) = 2. If the position in the level is even, then the node is a right child; otherwise it is a left child. By repeating the process for each level until we reach level zero, we can generate the complete traversal path to the desired node. The pseudocode for this is shown in Code Listing 1. The loop runs once for each level, giving it a time efficiency of O(log n).

    $stack = array();
    $current_level = (int) floor(log($n) / log(2));        // level of location $n
    $max_nodes_in_level = pow(2, $current_level);
    $max_nodes_in_tree = pow(2, $current_level + 1) - 1;
    $spot_in_level = $max_nodes_in_level - ($max_nodes_in_tree - $n);
    while ($current_level > 0) {
        // an even position within the level is a right child, an odd one a left child
        if ($spot_in_level % 2 == 0) array_push($stack, "right");
        else array_push($stack, "left");
        $percent = $spot_in_level / $max_nodes_in_level;
        $max_nodes_in_level = $max_nodes_in_level / 2;     // capacity of the parent's level
        $spot_in_level = (int) ceil($percent * $max_nodes_in_level);
        $current_level--;
    }
    while (count($stack) > 0) {
        go(array_pop($stack));                             // go left or right
    }

Code Listing 1
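The same position-within-level idea can be sketched in Python for readers who want to run it. This is an illustrative translation rather than the author's code: the function name `path_by_level_position` is an assumption, and two details differ from the listing. The level is taken from `int.bit_length()` so that no floating-point logarithm is involved, and the percentage step is folded into integer arithmetic, since ceiling(spot / capacity * capacity / 2) equals ceiling(spot / 2).

```python
def path_by_level_position(n):
    """Return the root-to-node path for 1-based location n as 'left'/'right' steps."""
    level = n.bit_length() - 1                  # floor(log2(n)) using integers only
    max_in_level = 2 ** level                   # capacity of the node's level
    max_in_tree = 2 ** (level + 1) - 1          # capacity of a tree with that many levels
    spot = max_in_level - (max_in_tree - n)     # 1-based position within the level

    stack = []
    while level > 0:
        # An even position within the level is a right child, an odd one a left child.
        stack.append("right" if spot % 2 == 0 else "left")
        spot = (spot + 1) // 2                  # parent's position: ceil(spot / 2)
        level -= 1

    stack.reverse()                             # steps were collected from the bottom up
    return stack


example_path = path_by_level_position(6)
```

For location 6 the position within level 2 is 3, so the last step is "left"; the parent's position within level 1 is 2, so the first step is "right".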

Locating Nodes, Simple

The second method is similar to the first in that it incrementally builds the traversal path to the desired location. It differs, however, in how it calculates that path. After observing almost perfect binary trees, a clear pattern emerged. This pattern is also used when building heaps in arrays. Each parent node's location is half of either of its children's locations, rounded down. Using the example heap, we can see that the parent of location 6 is at location 3. Once again, if we repeat this process until we reach level zero, we can determine which nodes are in the traversal path.

Level 0: 1
Level 1: 2, 3
Level 2: 4, 5, 6

Figure 5

This alone does not allow us to traverse that path through the tree. But another pattern emerges when examining the tree in Figure 5: all even locations are left children and all odd locations (other than the root) are right children. This algorithm (Code Listing 2) is also O(log n).

    function find_node($n) {
        $stack = array();
        $current_node = $n;
        while ($current_node > 1) {
            if ($current_node % 2 == 1) array_push($stack, "right"); // if node is odd it is a right child
            else array_push($stack, "left");                         // otherwise it is even and a left child
            $current_node = intdiv($current_node, 2);                // set the current node to the parent
        }
        return $stack; // this stack now contains the path to node n
    }

Code Listing 2
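A Python sketch of the halving method follows; `path_to` and `path_from_bits` are names chosen for this example. The second function is an observation that is not stated in the paper: repeatedly halving a location and testing whether it is odd reads its binary digits from the right, so the path from the root is simply the binary representation of the location with its leading 1 removed (0 for left, 1 for right).

```python
def path_to(n):
    """Root-to-node path for 1-based location n, built by repeated halving."""
    stack = []
    current = n
    while current > 1:
        # Odd locations are right children, even locations are left children.
        stack.append("right" if current % 2 == 1 else "left")
        current //= 2                      # move to the parent
    stack.reverse()                        # collected bottom-up, so flip to root-first
    return stack


def path_from_bits(n):
    """Same path, read directly from the binary digits after the leading 1."""
    return ["right" if bit == "1" else "left" for bit in bin(n)[3:]]


example_path = path_to(6)                  # 6 is 110 in binary: right, then left
```

Both functions do work proportional to the number of levels, which matches the O(log n) bound given above.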

Inserting or Removing

In practice, when we insert into or remove from a heap, we must also bubble up or sift down. Both of these processes are O(log n). The array implementation allows the correct node to be found in O(1), but the bubbling up or sifting down makes the overall time efficiency of the insert or remove O(log n). The peap implementation allows the necessary node to be found in O(log n) time, making the overall time efficiency of an insert into a peap O(log n).

| Nodes | Execution Time | Rate |
|---|---|---|
| 10 | 0.000022888 | 436906.7 |
| 100 | 0.000011921 | 8388608 |
| 1000 | 0.000015974 | 62601552 |
| 10000 | 0.000020027 | 4.99E+08 |
| 100000 | 0.000022888 | 4.37E+09 |
| 1000000 | 0.000026941 | 3.71E+10 |
| 10000000 | 0.000030994 | 3.23E+11 |
| 100000000 | 0.000032902 | 3.04E+12 |
| 1000000000 | 0.000037909 | 2.64E+13 |
| 10000000000 | 0.000041008 | 2.44E+14 |
| 1E+11 | 0.000045061 | 2.22E+15 |
| 1E+12 | 0.000048161 | 2.08E+16 |
| 1E+13 | 0.000051975 | 1.92E+17 |
| 1E+14 | 0.000053883 | 1.86E+18 |
| 1E+15 | 0.000056982 | 1.75E+19 |
| 1E+16 | 0.000061035 | 1.64E+20 |
| 1E+17 | 0.000063896 | 1.57E+21 |
| 1E+18 | 0.000068188 | 1.47E+22 |

In summary, the steps required to insert into a peap are: find the path to the necessary node, traverse to that node, and sift up to restore the heap property. Each of these steps is O(log n).

[Figure: Single Item Insert. Execution time versus number of items in heap.]
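The three steps in the summary can be sketched as a small Python class. Everything here is illustrative rather than the paper's implementation: the names `Node`, `Peap`, and `_path` are assumptions, the heap is taken to be a max-heap, and sifting up is done by swapping values along the list of nodes visited on the way down, which is one way to avoid parent links.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


class Peap:
    """Max-heap on linked nodes: no parent pointers and no array of items."""

    def __init__(self):
        self.root = None
        self.size = 0

    @staticmethod
    def _path(n):
        """Step 1: path from the root to 1-based location n (halving method)."""
        steps = []
        while n > 1:
            steps.append("right" if n % 2 else "left")
            n //= 2
        steps.reverse()
        return steps

    def insert(self, value):
        self.size += 1
        new = Node(value)
        if self.root is None:
            self.root = new
            return
        # Step 2: walk to the parent of the open slot, remembering each node passed.
        steps = self._path(self.size)
        visited = [self.root]
        for step in steps[:-1]:
            node = visited[-1]
            visited.append(node.left if step == "left" else node.right)
        parent = visited[-1]
        if steps[-1] == "left":
            parent.left = new
        else:
            parent.right = new
        visited.append(new)
        # Step 3: sift up by swapping values back along the visited path.
        i = len(visited) - 1
        while i > 0 and visited[i].value > visited[i - 1].value:
            visited[i].value, visited[i - 1].value = visited[i - 1].value, visited[i].value
            i -= 1


peap = Peap()
for v in [3, 15, 1, 8, 5, 2]:
    peap.insert(v)
```

The temporary `visited` list holds one entry per level, so it costs O(log n) extra space per insert and is discarded afterwards; no per-node threading or parent link is stored.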

Heap Sort

Given that we can now do peap inserts with the same time efficiency as heaps implemented with arrays, we can also achieve the same time efficiency with heap sort. We insert n items and remove n items, resulting in O(n log n) behavior.

| Nodes | Time | N / S |
|---|---|---|
| 10 | 0.000 | |
| 100 | 0.010 | 10000 |
| 1000 | 0.020 | 50000 |
| 4000 | 0.040 | 100000 |
| 6000 | 0.060 | 100000 |
| 8000 | 0.080 | 100000 |
| 10000 | 0.090 | 111111 |

[Figure: Heap Sort Execution Time. Execution time (s) versus number of elements to sort.]
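A compact sketch of heap sort on a pointer-based heap is given below, under the same assumptions as before (a max-heap, hypothetical names `Node`, `Peap`, `heap_sort`). The removal shown here, which moves the last node's value to the root and sifts it down, is the standard heap removal and is assumed rather than taken from the paper, which only details insertion.

```python
class Node:
    def __init__(self, value):
        self.value, self.left, self.right = value, None, None


def _path(n):
    """Steps ('left'/'right') from the root to 1-based location n."""
    steps = []
    while n > 1:
        steps.append("right" if n % 2 else "left")
        n //= 2
    return steps[::-1]


class Peap:
    def __init__(self):
        self.root, self.size = None, 0

    def insert(self, value):
        self.size += 1
        new = Node(value)
        if self.root is None:
            self.root = new
            return
        steps = _path(self.size)
        visited = [self.root]
        for step in steps[:-1]:
            visited.append(getattr(visited[-1], step))
        setattr(visited[-1], steps[-1], new)      # attach at the next open slot
        visited.append(new)
        for i in range(len(visited) - 1, 0, -1):  # sift up along the path
            if visited[i].value <= visited[i - 1].value:
                break
            visited[i].value, visited[i - 1].value = visited[i - 1].value, visited[i].value

    def remove_max(self):
        if self.root is None:
            raise IndexError("remove from empty peap")
        top = self.root.value
        steps = _path(self.size)
        self.size -= 1
        if not steps:                             # the root was the only node
            self.root = None
            return top
        parent = self.root
        for step in steps[:-1]:
            parent = getattr(parent, step)
        last = getattr(parent, steps[-1])
        setattr(parent, steps[-1], None)          # detach the last node
        self.root.value = last.value
        node = self.root
        while True:                               # sift down toward the larger child
            child = node.left
            if node.right is not None and node.right.value > child.value:
                child = node.right
            if child is None or child.value <= node.value:
                return top
            node.value, child.value = child.value, node.value
            node = child


def heap_sort(items):
    """Insert n items, then remove n items; a max-heap yields descending order."""
    peap = Peap()
    for item in items:
        peap.insert(item)
    return [peap.remove_max() for _ in range(len(items))]


result = heap_sort([3, 15, 1, 8, 5, 2])
```

Each insert and each removal walks one root-to-leaf path a constant number of times, so n of each gives the O(n log n) total described above.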

Conclusion

With acceptable performance in both theory and practice, this implementation should be considered when either using heaps or teaching students about heaps. Having an alternative method gives programmers more flexibility. Asymptotically, peaps have the same efficiency as traditional array implementations for inserting, removing, and heap sort, while being more visually appealing and intuitive. Storage space is minimized because there is no need for threading or links to parent nodes. Many scenarios would benefit from a heap implemented with pointers, especially those which use a programming language that does not allow arrays to be resized. In such languages, growing an array-based heap means allocating a new segment of memory of the new size and copying the old array into it. By keeping the abstract properties of a binary tree when implementing heaps, we can help computer science students understand heaps visually and intuitively and achieve their full potential.

References

Matthew, Jaffe. Lectures from CS315 at Embry-Riddle, Prescott Campus.