Trees. Tree Definitions: Let T = (V, E, r) be a tree. The size of a tree denotes the number of nodes of the tree.

Similar documents
Analysis of Algorithms I: Binary Search Trees

Full and Complete Binary Trees

Binary Search Trees CMPSC 122

Binary Trees and Huffman Encoding Binary Search Trees

Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child

Binary Search Trees (BST)

Ordered Lists and Binary Trees

Converting a Number from Decimal to Binary

Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb

Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R

Questions 1 through 25 are worth 2 points each. Choose one best answer for each.

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

From Last Time: Remove (Delete) Operation

Data Structures Fibonacci Heaps, Amortized Analysis

root node level: internal node edge leaf node Data Structures & Algorithms McQuain

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key, and:

Data Structures and Algorithms

A binary search tree is a binary tree with a special property called the BST-property, which is given as follows:

Output: struct treenode{ int data; struct treenode *left, *right; } struct treenode *tree_ptr;

Chapter 14 The Binary Search Tree

Binary Heap Algorithms

Algorithms and Data Structures

Binary Heaps. CSE 373 Data Structures

GRAPH THEORY LECTURE 4: TREES

Section IV.1: Recursive Algorithms and Recursion Trees

TREE BASIC TERMINOLOGIES

Data Structures and Algorithms Written Examination

Lecture Notes on Binary Search Trees

Lecture Notes on Binary Search Trees

Lecture 6: Binary Search Trees CSCI Algorithms I. Andrew Rosenberg

Heaps & Priority Queues in the C++ STL 2-3 Trees

B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees

Exam study sheet for CS2711. List of topics

A binary heap is a complete binary tree, where each node has a higher priority than its children. This is called heap-order property

Lecture 1: Course overview, circuits, and formulas

1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.

How To Create A Tree From A Tree In Runtime (For A Tree)

The following themes form the major topics of this chapter: The terms and concepts related to trees (Section 5.2).

Data Structures, Practice Homework 3, with Solutions (not to be handed in)

Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2)

Binary Heaps * * * * * * * / / \ / \ / \ / \ / \ * * * * * * * * * * * / / \ / \ / / \ / \ * * * * * * * * * *

CSE 326: Data Structures B-Trees and B+ Trees

Algorithms and Data Structures Written Exam Proposed SOLUTION

Operations: search;; min;; max;; predecessor;; successor. Time O(h) with h height of the tree (more on later).

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles

Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees

Sample Questions Csci 1112 A. Bellaachia

The ADT Binary Search Tree

Classification/Decision Trees (II)

Cpt S 223. School of EECS, WSU

ECE 250 Data Structures and Algorithms MIDTERM EXAMINATION /5:15-6:45 REC-200, EVI-350, RCH-106, HH-139

Optimal Binary Search Trees Meet Object Oriented Programming

Analysis of Algorithms I: Optimal Binary Search Trees

Algorithms Chapter 12 Binary Search Trees

CS711008Z Algorithm Design and Analysis

Sequential Data Structures

Why Use Binary Trees?

Introduction to Algorithms March 10, 2004 Massachusetts Institute of Technology Professors Erik Demaine and Shafi Goldwasser Quiz 1.

Algorithms and Data Structures

Data Structure with C

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) Total 92.

Symbol Tables. Introduction

6 March Array Implementation of Binary Trees

CompSci-61B, Data Structures Final Exam

Class Notes CS Creating and Using a Huffman Code. Ref: Weiss, page 433

Mathematical Induction. Lecture 10-11

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team

Big Data and Scripting. Part 4: Memory Hierarchies

The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge,

Binary Trees (1) Outline and Required Reading: Binary Trees ( 6.3) Data Structures for Representing Trees ( 6.4)

A Note on Maximum Independent Sets in Rectangle Intersection Graphs

DATA STRUCTURES USING C

SECTION 10-2 Mathematical Induction

ER E P M A S S I CONSTRUCTING A BINARY TREE EFFICIENTLYFROM ITS TRAVERSALS DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A

OPTIMAL BINARY SEARCH TREES

Binary Search Trees. Ric Glassey

The Tower of Hanoi. Recursion Solution. Recursive Function. Time Complexity. Recursive Thinking. Why Recursion? n! = n* (n-1)!

Data Structure [Question Bank]

Data Structures. Jaehyun Park. CS 97SI Stanford University. June 29, 2015

S. Muthusundari. Research Scholar, Dept of CSE, Sathyabama University Chennai, India Dr. R. M.

PES Institute of Technology-BSC QUESTION BANK

Analysis of Algorithms, I

An Immediate Approach to Balancing Nodes of Binary Search Trees

Binary Search Trees. basic implementations randomized BSTs deletion in BSTs

CPSC 211 Data Structures & Implementations (c) Texas A&M University [ 221] edge. parent

10CS35: Data Structures Using C

6.3 Conditional Probability and Independence

Data Structures and Algorithm Analysis (CSC317) Intro/Review of Data Structures Focus on dynamic sets

Any two nodes which are connected by an edge in a graph are called adjacent node.

Rotation Operation for Binary Search Trees Idea:

Graphs without proper subgraphs of minimum degree 3 and short cycles

6.2 Permutations continued

Binary Search Trees 3/20/14

Home Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit

Cost Model: Work, Span and Parallelism. 1 The RAM model for sequential computation:

Data Structures. Level 6 C Module Descriptor

Lecture 12 Doubly Linked Lists (with Recursion)

MAX = 5 Current = 0 'This will declare an array with 5 elements. Inserting a Value onto the Stack (Push)

CS 103X: Discrete Structures Homework Assignment 3 Solutions

Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

Transcription:

Trees After lists (including stacks and queues) and hash tables, trees represent one of the most widely used data structures. On one hand trees can be used to represent objects that are recursive in nature (including programming, logical, and arithmetic expressions), as well as structured data, such as data stored in XML documents. On the other hand, trees are often used as the basis for more advanced data structures, including binary search trees. Tree Definitions: Let T = (V, E, r) be a tree. The size of a tree denotes the number of nodes of the tree. For v V, parent(v) denotes the unique u V such that (u, v) E. Note that parent(r) is undefined, but is defined for every other node. For u V, children(u) = v (u, v) E. leaf: any node n for which children(n) =. A node that is not a leaf is called an internal node. For v V, siblings(v) = u (parent(v), u) E For v V, depth(v) = length of the path (in terms of number of edges traversed) from the root to v. For v V, height(v) = length of longest path from v to a leaf of T. The height of T is defined as height(r). T is called m-ary iff v V, children(v) m. T is called binary iff the property holds for m = 2. If T is m-ary, then it is called full iff every internal node has exactly m children. It is called perfect if it is full, and every leaf of T has the same depth. 1

2

Tree Implementation. The most common type of tree to implement is the binary tree. public class BinaryTree Object root; BinaryTree left; //reference to left child BinaryTree right; //reference to right child ; For general trees, the left and right references are replaced with a single reference to the list of subtrees that are rooted at the given node that stores the data. public class Tree Object root; List children; //a list of Trees that are rooted at root Note: some implementations will include a reference to the parent of the tree. This can seem useful in some applications. 3

Example 1. Implement a method for the BinaryTree class above that provides the height of the binary tree. Example 2. Implement a method for the BinaryTree class above that provides the size of the binary tree. 4

Example 3. Implement a method for the BinaryTree class above that finds the first binary subtree (according to the tree pre-ordering) whose root is equal to that of some given object. 5

Binary Search Trees A binary search tree is a binary tree T = (V, E, r) such that the objects stored in the tree are totally ordered via some ordering relation <. A binary search tree data structure is defined similarly to that of a BinaryTree, but now the root must be an Ordered object. Moreover, for each node n of the tree we have the following statements holding. if n 1 is a left descendant of n, then n 1 < n. if n 2 is a right descendant of n, then n 1 > n. Example 4. A binary search tree is to be constructed by adding integer nodes (in the order in which they are shown) n 1, n 2,..., n 7 to an initially empty tree. If the integer values of the nodes are respectively 5, 8, 2, 9, 6, 10, 1, what will the tree look like? 6

Binary search trees are mostly used for the purpose of efficiently finding stored data. Finding data with a bst is efficient since one only needs to search along one of the branches of the tree, as opposed to an array which would require searching through possibly the entire set of data. For these reasons, the following operations are considered most essential: insert(): insert an object into the tree find(): find an object in the tree find min(): find the least object in the tree find max(): find the greatest object in the tree delete(): delete an object from the tree 7

//Insert obj into the this bst. //Assume obj is currently not in the tree. public void insert(ordered obj) if(obj.compare(root) < 0) //insert into left subtree if(root.left == null) root.left = new BinarySearchTree(obj); else root.left.insert(obj); //insert into right subtree if(root.right == null) root.right = new BinarySearchTree(obj); else root.right.insert(obj); Example 5. Insert a node with integer value 7 into the tree constructed during Example 4. 8

//find and return a stored object equal to obj Object find(ordered obj) int value = root.compare(obj); if(value == 0) return root; if(value < 0) //look in left subtree if(root.left == null) return null; return root.left.find(obj); //must be in right subtree if(root.right == null) return null; return root.right.find(obj); Example 6. Trace through the find algorithm using the tree from Example 4, and using inputs 7, and 12. 9

//find and return the least element Object minimum() if(root.left == null) return root; return root.left.minimum(); 10

Deleting a binary-search-tree node n: case 1: n has no children. Then directly removing n does not affect the bst. case 2: n has one child. Upon removing n make child(n) a child of parent(n). case 3: n has two children. Replace n with the least right descendant n of n. n will have at most one child, so either case 1 or case 2 applies to its relocation. Example 7. Demonstrate all three cases using the tree from Example 4. 11

Complexity analysis of bst operations worst case: occurs when all n data exist on one branch of the tree. In this case, all operations will equal O(n). best case: occurs when the bst is perfect. In this case the tree will have depth log n and the complexity of all operations equal O(log n). average case: the average-case complexity depends on the average length of a node in randomly constructed tree. Theorem: The average length of a node in a randomly constructed binary search tree is O(log n). Proof: Given a tree T, the sum of the depths of all the nodes of the tree is called the internal path length. Let D(n) denote the internal path length of a binary tree having n nodes. Then D(1) = 0 and, for n > 1, D(n) = D(i) + D(n i 1) + n 1, assuming that the tree has i nodes in its left subtree, and n i 1 nodes in its right subtree. The n 1 term comes from the fact that D(i) and D(n i 1) gives a measurement of the internal path lengths of the left and right subtrees from the perspective that both roots start at depth 0. But since they are subtrees of one level down, each of the n 1 nodes will have an extra unit of depth. Hence, we must add n 1 to the recurrence relation. Now, in the case of a binary search tree for which the root node is randomly selected from a list of n choices, each with equal probability, we see that the average internal path length for a randomly constructed tree is 1 [(D(0) + D(n 1) + (n 1)) + (D(1) + D(n 2) + (n 1)) + + (D(n 1) + D(0) + (n 1))] = n 2 n 1 D(i) + (n 1). n i=0 We now show that this recurrence relation yields D(n) = O(n log n). We prove this by induction. To begin, we certainly can find a constant C > 0 such that D(1) = 0 C 1 log 1 = 0. Any C will work. Now assume that there is a constant C > 0 such that, for all m < n, D(m) Cm log m. We now show that D(n) Cn log n, by choosing the right condition on C. To start, D(n) = 2 n 1 D(i) + (n 1) n i=0 12

(by the inductive assumption) 2 n 2 n 1 Ci log i + (n 1) n i=0 n 1 2 n (Cn2 2 log n 1 2 ln 2 Cx log xdx + (n 1) = n 1 xdx) + (n 1) = Cn log n C(n2 1) + (n 1) Cn log n, 2n ln 2 provided that C > 2 ln 2 = ln 4, and n is sufficiently large. Therefore, the average internal path length is D(n) = O(n log n). Dividing this quanity by n yields the average depth of a node in a randomly generated binary search tree, which in this case is O(log n). Note: this same analysis can be used to analyze the average-case running time of Quicksort when the median is chosen at random. 13

Exercises. 1. Prove that, when a binary tree with n nodes is implemented by using links to the left and right child (as was done in the BinarySearchTree class), then there will be a total of n + 1 null links. 2. Prove that the maximum number of nodes in a binary tree with height h is 2 h+1 1. 3. A full node for a binary tree is one that has two children. Prove that the number of full nodes plus one equals the number of leaves of a binary tree. 4. Using the BinaryTree class described in lecture, write a class method for deciding if a binary tree is balanced. Hint: a binary tree is balanced if the heights of its left and right subtrees are no more than 1 apart in absolute value, and both its left and right subtrees are balanced. Note: an empty tree is assumed to have a height of -1. 5. Using the BinaryTree class described in lecture, and assuming the data is of type integer, write a class method for deciding if a binary tree is a binary search tree. 6. Using the BinaryTree data structure described in lecture and assuming that the data is of type integer, and that the tree is a binary search tree, write a function that prints in order the integers stored in the tree. 7. Suppose a binary search tree is to hold keys 1,4,5,10,16,17,21. Draw possible binary search trees for these keys, and having heights 2,3,4,5, and 6. For each tree, compute its internal path length. Hint: the height-2 tree should be a perfect tree. 8. Write a method called print() for the BinarySearchTree class presented in lecture, and which prints all the objects of the tree in sorted order. You may assume that each object has a tostring() method that may be called. 9. Prove that it takes Ω(n log n) steps in the best case to build a binary search tree having n keys. Show work and explain. 10. Suppose we have a bst that stores keys between 1 and 1000. If we perform a find on key 363, then which of the following sequences could not be the sequence of nodes examined when performing using the find() method provided in lecture. a) 2,252,401,398,330,344,397 b) 924,220,911,244,898,258,362,363 c) 925,202,911,240,912,245,363 d) 2,399,387,219,266,382,381,278,363 e) 935,278,347,621,299,392,358,363 11. Suppose a bst is constructed by repeatedly inserting distinct keys into the tree. Argue that the number of nodes examined when searching for a key is equal to one more than the number examined when inserting that key. 12. Prove or disprove: deleting keys x and y from a bst is commutative. In other words, it does not matter which order the keys are deleted. The final trees will be identical. If true, provide a proof. If false, provide a counterexample. 14

13. Consider a branch of a binary search tree. This branch divides the tree into three sets: set A, the elements to the left of the branch; set B, the elements on the branch; and set C, the elements to the right of the branch. Provide a small counterexample to the statement for all a A, b B, and c C, a b c. 14. Consider a bst T whose keys are distinct. We call key y a successor of key x iff y > x and there is no key z for which y > z > x. Show that if the right subtree of the node storing key x in T is empty and x has a successor y, then the node that stores y must be the lowest ancestor of x whose left child is also an ancestor of x. Show two examples of this, one in which y is the parent of x, and one for which y is not the parent of x. Label both x and y. 15

Exercise Hints and Answers. 1. Using mathematical induction, if there is one node, then there are 2 null links. Now suppose a tree with n nodes has n + 1 null links. Let T be a tree with n + 1 nodes, and l a leaf of T. Removing l from T yields a tree with n nodes, and hence n + 1 null links (by the inductive assumption). Hence, adding l back to T removes one null link, but adds two more, which implies that T has n + 1 1 + 2 = n + 2 null links. 2. This follows from the identity 1 + 2 + + 2 h = 2 h+1 1. 3. Proof by induction. If a tree has one node then it has 0 full nodes, and 0+1 leaves. Now suppose that a tree with n full nodes has n + 1 leaves. Let T be a tree with n + 1 full nodes. Let l be a leaf of T. If the parent of l is not full, then removing l from T produces a new tree with the same number of leaves and full nodes. So, by continually plucking leaves from T, eventually we will find a leaf l whose parent is full. In this case, removing l from T produces a tree with n full nodes and hence, by the inductive assumption, n + 1 leaves. Thus, T must have n + 1 + 1 = n + 2 leaves. 4. boolean is_balanced() if(left!= null) if(!left.is_balanced()) return false; if(right!= null) if(!right.is_balanced()) return false; return abs(left.height()-right.height()) <= 1 return left.height() == 0; if(right == null) return true; return right.height() == 0; 5. boolean is_bst() if(left!= null) if(!left.is_bst()) return false; 16

int max = left.maximum(); if(maximum >= root) return false; if(right!= null) if(!right.is_bst()) return false; int min = right.minimum(); return minimum > root; return true; if(right == null) return true; int min = right.minimum(); return minimum > root; 6. void print() if(left!= null) left.print(); print root; if(right!= null) right.print(); print root; if(right!= null) right.print() 7. height 2: insert in the order of 10,4,17,1,5,16,21; height 4: insert in the order of 16,10,5,4,1,17,21, height 6: insert in order 21,17,16,10,5,4,1 Internal path lengths: height 2 = 7, height 4 = 13, height 6 = 21 8. See Exercise 6. 17

9. Best case number of steps for i th key is O( log i ). Hence, T (n) = Ω( n log i) = Ω(n log n) by the Integral theorem. 10. a) No. Does not end with 363 c) No. 912 cannot be in the left subtree of 911 i=1 11. True since the key will be found along exactly the same path that it was inserted. 12. Hint: consider the binary search tree with nodes inserted in the following order: 1,10,5,2,8,7,9 13. Use the tree of the previous problem. 14. If y is the parent of x and x is the left child of y, then the statement is true, since technically x is an ancestor of itself. So suppose y is not the parent of x. Consider the first ancestor y of x whose left child is also an ancestor of x. Then we must have x < y and x > z for every descendant z of y that lies between y and x (exclusive) on the branch from y to x. It remains to show that there can be no element z for which x < z < y. Certainly no such z will be found in the left subtree of y, since x has no right subtree, and every ancestor of x between y and x (exclusive) is less than x. Also, no such z will be found in the right subtree of y, since all these elements are greater than y. Moreover, for any ancestor z of y, if y is in z s left subtree, then z > y. If y is in z s right subtree, then y > z, but also x > z. Finally, if z is an element for which z is in one subtree of some ancestor a of y, while y is in the other subtree, then, if y < a, then y < z. Similarly, if y > a, then y > z. But also, x > a, and hence x > z. Therefore, no such z can exist. Note: we call this a proof by cases. 18