Binary Search Trees




A Generic Tree

[Figure: a generic tree with nodes A through L.]

Nodes in a binary search tree (BST) are of the form:

    P (parent) | Key | Satellite data | L (left child) | R (right child)

The BST has a root node, which is the only node whose parent pointer is NIL.

Binary Trees

A binary tree is:
- a root
- a left subtree (maybe empty)
- a right subtree (maybe empty)

Representation: each node holds a left pointer, the data, and a right pointer.

[Figure: a binary tree with nodes A through J.]

Properties (max # of leaves, max # of nodes, average height for N nodes):
1. Is this binary tree complete? No. Why not? C has just one child, and the right side is much deeper than the left.
2. What is the maximum number of leaves a binary tree of depth d (height h) can have? 2^d.
3. What is the maximum number of nodes a binary tree of depth d (height h) can have? 2^(d+1) - 1. Minimum? 2^d - 1 + 1 = 2^d if the tree must be complete (a full tree of depth d - 1 plus one node on the last level); an arbitrary binary tree of depth d needs only d + 1 nodes (a single chain).
4. We won't go into this, but if you take N nodes and assume all distinct trees of the nodes are equally likely, you get an average depth/height on the order of sqrt(N). Is that bigger or smaller than log N? Bigger, so it is not good enough! We will see that we need to impose structure to get the bounds we want.
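The leaf and node bounds above can be checked on a small tree. The following is a minimal sketch, assuming a node with only a data field and two child pointers; the names Node, height, countNodes, countLeaves and buildPerfect are illustrative and do not come from the slides.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node layout: a data area plus left and right pointers.
struct Node {
    char data;
    Node* left;
    Node* right;
};

// Depth of the deepest node. The root is at depth 0; an empty tree has height -1.
int height(const Node* t) {
    if (t == nullptr) return -1;
    return 1 + std::max(height(t->left), height(t->right));
}

// Total number of nodes in the tree.
int countNodes(const Node* t) {
    if (t == nullptr) return 0;
    return 1 + countNodes(t->left) + countNodes(t->right);
}

// Number of nodes that have no children.
int countLeaves(const Node* t) {
    if (t == nullptr) return 0;
    if (t->left == nullptr && t->right == nullptr) return 1;
    return countLeaves(t->left) + countLeaves(t->right);
}

// Builds a tree in which every level down to the given depth is full.
// The nodes are never freed; this is a sketch, not production code.
Node* buildPerfect(int depth) {
    if (depth < 0) return nullptr;
    Node* n = new Node{'x', nullptr, nullptr};
    n->left = buildPerfect(depth - 1);
    n->right = buildPerfect(depth - 1);
    return n;
}
```

For a tree built with buildPerfect(3), the counts come out to 2^3 = 8 leaves and 2^4 - 1 = 15 nodes, which are the maxima for depth 3.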

Representation

Implementations of Binary Trees

A binary tree node is similar to a doubly linked list node in that it has two pointers. The graphic of a node shows a data area, a left pointer and a right pointer.

[Figure: a tree of labelled nodes (A, B, C, ...), each drawn as a data area with a left pointer and a right pointer.]

In this tree of five nodes, the following properties can be seen:
- there are 5 nodes
- 3 are internal nodes and 2 are leaf nodes
- there are a total of 10 pointers (two per node)
- 6 of the pointers are null pointers

Some trees only store data in the leaf nodes. Here is an example of a union binary tree implementation where the internal nodes are of a different construct than the leaf nodes. The internal nodes contain pointers and a data element representing a mathematical operator. The leaf nodes are data nodes only and contain the operands.

[Figure: expression tree for 4 * x * (2 * x + a) - c]

For the above equation:
- Post-order traversal of the tree regenerates the postfix notation of the equation.
- Pre-order traversal regenerates the prefix notation.
- In-order traversal regenerates the infix form depicted to the left of the tree.
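The three traversals of an expression tree can be sketched as follows. This is an illustration under assumptions: ExprNode, leaf, op and sample are hypothetical names, every operator is assumed to be binary, and the sample expression (2 * x + a) - c is a small stand-in rather than the tree from the slide. The in-order version adds parentheses so that the grouping survives.

```cpp
#include <cassert>
#include <string>

// Hypothetical expression-tree node: operators live in internal nodes,
// operands live in leaves.
struct ExprNode {
    std::string token;
    ExprNode* left;
    ExprNode* right;
};

ExprNode* leaf(const std::string& t) { return new ExprNode{t, nullptr, nullptr}; }
ExprNode* op(const std::string& t, ExprNode* l, ExprNode* r) { return new ExprNode{t, l, r}; }

bool isLeaf(const ExprNode* n) { return n->left == nullptr && n->right == nullptr; }

// Left subtree, right subtree, then the operator: postfix notation.
std::string postOrder(const ExprNode* n) {
    if (isLeaf(n)) return n->token;
    return postOrder(n->left) + " " + postOrder(n->right) + " " + n->token;
}

// Operator first, then both subtrees: prefix notation.
std::string preOrder(const ExprNode* n) {
    if (isLeaf(n)) return n->token;
    return n->token + " " + preOrder(n->left) + " " + preOrder(n->right);
}

// Left subtree, operator, right subtree: fully parenthesised infix notation.
std::string inOrder(const ExprNode* n) {
    if (isLeaf(n)) return n->token;
    return "(" + inOrder(n->left) + " " + n->token + " " + inOrder(n->right) + ")";
}

// Builds the tree for (2 * x + a) - c. Nodes are never freed in this sketch.
ExprNode* sample() {
    return op("-", op("+", op("*", leaf("2"), leaf("x")), leaf("a")), leaf("c"));
}
```

On the sample tree, postOrder gives "2 x * a + c -", preOrder gives "- + * 2 x a c", and inOrder gives "(((2 * x) + a) - c)".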

Binary Search Tree: Dictionary Data Structure

Every BST satisfies two properties.

Binary tree property:
- each node has at most 2 children
- result: storage is small, operations are simple, and average depth is normally small

Search tree property:
- all keys in the left subtree are smaller than the root's key
- all keys in the right subtree are larger than the root's key
- result: it is easy to find any given key

[Figure: an example binary search tree.]

Stated for any node x:
i. If y is in the LEFT subtree of x, then key[y] < key[x].
ii. If y is in the RIGHT subtree of x, then key[y] > key[x].

This is the BST property. Data in a BST must always be stored in such a way that it holds at every node.

Examples

[Figure: two example trees. NEITHER IS A BINARY SEARCH TREE.]
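A tree can look fine when each node is compared only with its own children and still fail the BST property, because the property speaks about whole subtrees. The sketch below checks the property by passing down the open interval of keys allowed in each subtree. The names Node, isBstInRange and isBst are illustrative, and integer keys are an assumption.

```cpp
#include <cassert>
#include <climits>

// Hypothetical node with an integer key.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// True if every key in the subtree lies strictly between lo and hi
// and the same holds recursively with the tightened bounds.
bool isBstInRange(const Node* t, long long lo, long long hi) {
    if (t == nullptr) return true;
    if (t->key <= lo || t->key >= hi) return false;
    return isBstInRange(t->left, lo, t->key) &&
           isBstInRange(t->right, t->key, hi);
}

// Checks the BST property for a whole tree with distinct keys.
bool isBst(const Node* t) {
    return isBstInRange(t, LLONG_MIN, LLONG_MAX);
}
```

For example, a root 8 whose left child 4 has a right child 11 passes every parent-child comparison, yet isBst rejects it because 11 sits in the left subtree of 8.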

Binary Search Trees (BST)

The defining property of a BST is that each node has left and right links pointing to another binary search tree or to external nodes (which have no non-NIL links). To search, compare the key values in internal nodes with the search key and use the result to control the progress of the search.

Insert A S E R C G H I N into an initially empty BST. Notice that each insertion follows a search miss at the bottom of the tree. Insertion is as easy to implement as search.

Run times of BST algorithms depend on the shape of the tree:
- Best case: the tree is perfectly balanced, with about log n nodes from the root to the bottom.
- Worst case: there could be n nodes from the root to the bottom.

Searches on a BST require, on average, about log n comparisons on a tree with n nodes (about 2 ln n, roughly 1.39 lg n, when the keys were inserted in random order).
Proof: the number of compares for a node is 1 + the distance of the node to the root. Adding over all nodes gives the internal path length. If C_n = average internal path length of a BST with n nodes, ...

Sorting

Looked at in the proper manner, a BST represents a sorted file: read the tree from left to right, ignoring the level (height) of the nodes in the tree. This is an in-order traversal of the tree (left subtree => root => right subtree). BSTs are a dual model to quicksort: the node at the root corresponds to the pivot element.

Traversals

Many algorithms involve walking through a tree and performing some computation at each node. Walking through a tree is called a traversal. Common kinds of traversal:
- Pre-order
- Post-order
- Level-order

Insert { A S E R A H C G I E N X M P E A L } into an empty BST.
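Search and insertion as described above can be sketched in a few lines. This is one possible version, not the lecturer's code: the names Node, insert, search, height and buildFrom are illustrative, keys are single characters, and duplicate keys are simply ignored (an assumption, since the slides do not say how duplicates are handled).

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical node with a character key and no parent pointer.
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Insert follows the search path; t is a reference to the link where the
// search misses, so the new node is attached exactly there.
void insert(Node*& t, char key) {
    if (t == nullptr) { t = new Node{key, nullptr, nullptr}; return; }
    if (key < t->key) insert(t->left, key);
    else if (key > t->key) insert(t->right, key);
    // Equal keys are ignored in this sketch.
}

// Compare the search key with each node and go left or right accordingly.
const Node* search(const Node* t, char key) {
    while (t != nullptr && t->key != key)
        t = (key < t->key) ? t->left : t->right;
    return t;
}

// Root at depth 0; an empty tree has height -1.
int height(const Node* t) {
    if (t == nullptr) return -1;
    return 1 + std::max(height(t->left), height(t->right));
}

// Inserts the characters of keys, in order, into an initially empty tree.
// Nodes are never freed in this sketch.
Node* buildFrom(const std::string& keys) {
    Node* root = nullptr;
    for (char c : keys) insert(root, c);
    return root;
}
```

The shape depends on insertion order: buildFrom("DBFACEG") gives a perfectly balanced tree of height 2, while buildFrom("ABCDEF") degenerates into a chain of height 5.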

Consider the following pseudocode:

    InOrder_Traversal(x)
      If x ≠ NIL Then
        InOrder_Traversal(Left[x])
        Print key[x]
        InOrder_Traversal(Right[x])

In Order Listing

What is printed if this is applied to the BST in the graph? An InOrder_Traversal prints the node values in monotonically increasing order.

[Figure: an example BST and its in-order listing.]

How long does the tree traversal take? O(n) time for a tree with n items: each node is visited once and its value is printed.

Operations on a BST

Searching: find D in the preceding BST. The search ends when the key is found or when the link to follow is NIL.
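The traversal pseudocode translates almost line for line. In this sketch the printed keys are appended to a string instead of being written to the screen, so the result is easy to inspect; Node, insert, inOrder and inOrderListing are illustrative names, and ignoring duplicate keys is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical node with a character key.
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Plain BST insertion; duplicate keys are ignored in this sketch.
void insert(Node*& t, char key) {
    if (t == nullptr) { t = new Node{key, nullptr, nullptr}; return; }
    if (key < t->key) insert(t->left, key);
    else if (key > t->key) insert(t->right, key);
}

// Left subtree, then the node itself, then the right subtree.
void inOrder(const Node* x, std::string& out) {
    if (x != nullptr) {
        inOrder(x->left, out);
        out.push_back(x->key);   // stands in for "Print key[x]"
        inOrder(x->right, out);
    }
}

// Builds a tree from the given keys and returns its in-order listing.
// Nodes are never freed in this sketch.
std::string inOrderListing(const std::string& keys) {
    Node* root = nullptr;
    for (char c : keys) insert(root, c);
    std::string out;
    inOrder(root, out);
    return out;
}
```

Whatever the insertion order, the listing comes out sorted: inOrderListing("ASERCGHIN") yields "ACEGHINRS".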

What happens if we search for C?

Maximum and Minimum

These are very straightforward from the structure of the BST:

    Tree_Minimum(x)
      While left[x] ≠ NIL
        Do x ← left[x]
      Return x

    Tree_Maximum(x)
      While right[x] ≠ NIL
        Do x ← right[x]
      Return x

How long does each procedure take to run? O(h), where h = height of the tree: each one just travels down the tree one level at a time.

Successor and Predecessor

If all keys are distinct, then the successor of a node x is the node with the smallest key greater than key[x], and the predecessor of x is the node with the largest key less than key[x].

The structure of the BST allows determination of the successor without any comparison of keys:

    Tree_Successor(x)
    1. If right[x] ≠ NIL
    2.   Then return Tree_Minimum(right[x])
    3. y ← p[x]
    4. While y ≠ NIL and x = right[y] do x ← y
    5.   y ← p[y]
    6. Return y

What is happening in the situation when the node has no right subtree? In this case, if x has a successor, then it is the lowest ancestor of x whose left child is also an ancestor of x (counting x as its own ancestor). To find the successor in this case, move up the tree from x until you reach a node that is the left child of its parent; that parent is the successor.
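The three procedures above can be sketched with explicit parent pointers. The names below (Node, insert, find, treeMinimum, treeMaximum, treeSuccessor) are illustrative, and the integer keys used in the usage note are an arbitrary example set.

```cpp
#include <cassert>

// Hypothetical node with a parent pointer, as in the P | Key | L | R layout.
struct Node {
    int key;
    Node* parent;
    Node* left;
    Node* right;
};

// Helper: iterative insertion that also sets the parent pointer.
// Returns the (possibly new) root. Nodes are never freed in this sketch.
Node* insert(Node* root, int key) {
    Node* z = new Node{key, nullptr, nullptr, nullptr};
    Node* y = nullptr;
    Node* x = root;
    while (x != nullptr) { y = x; x = (key < x->key) ? x->left : x->right; }
    z->parent = y;
    if (y == nullptr) return z;
    if (key < y->key) y->left = z; else y->right = z;
    return root;
}

// Helper: ordinary BST search.
Node* find(Node* x, int key) {
    while (x != nullptr && x->key != key) x = (key < x->key) ? x->left : x->right;
    return x;
}

// Follow left links to the bottom. Requires x != nullptr.
Node* treeMinimum(Node* x) {
    while (x->left != nullptr) x = x->left;
    return x;
}

// Follow right links to the bottom. Requires x != nullptr.
Node* treeMaximum(Node* x) {
    while (x->right != nullptr) x = x->right;
    return x;
}

// No key comparisons are needed: either go down into the right subtree,
// or climb until we leave a left subtree.
Node* treeSuccessor(Node* x) {
    if (x->right != nullptr) return treeMinimum(x->right);
    Node* y = x->parent;
    while (y != nullptr && x == y->right) { x = y; y = y->parent; }
    return y;   // nullptr when x holds the largest key
}
```

With the keys 15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9 inserted in that order, the successor of 15 is found by going down (17), and the successor of 13 is found by climbing (15).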

1. Find the successor of 15.
   right[x] is not NIL, so execute a call to Extract_Min on right[15], which points to 18, and return 17.

2. Find the successor of 13.
   i.   y gets p[x] and points to node 7.
   ii.  y is not NIL and x = right[y].
   iii. x is set to point to node 7; y is set to point to node 6.
   iv.  y is not NIL and x = right[y].
   v.   x is set to point to node 6; y is set to point to node 15.
   vi.  y is not NIL and x ≠ right[y] (x is the left child of y), so the loop ends.
   vii. Return y, the node 15.

As long as we move up and to the left in the tree (from a right child to its parent), we visit smaller keys. Our successor is the node of which we are the predecessor.

What is the running time? In either case we follow a path up the tree or down the tree (and only one of these paths), so the run time is O(h).

What would the code look like for the predecessor of x?

Theorem: The dynamic set operations Search, Minimum, Maximum, Successor and Predecessor can run in O(h) time on a BST of height h.

Idea behind Insertion
1. The goal of the algorithm is to find a place to insert a new node.
2. It is similar to the search code, but with a few twists.
3. As you go, keep two pointers: one to where you are and one to where you have been (to allow for a quick connection).
4. Trace a path from the root to a null; this locates where the node will go.
5. What if there is no tree? Set this new node to be the root.

What if the input string is B D F H J L and no tree exists at the first insert?
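One possible answer to the predecessor question is the mirror image of the successor procedure: swap left with right and minimum with maximum. The sketch below assumes nodes with parent pointers; Node, insert, find, treeMaximum and treePredecessor are illustrative names, and the keys in the usage note are an arbitrary example set.

```cpp
#include <cassert>

// Hypothetical node with a parent pointer.
struct Node {
    int key;
    Node* parent;
    Node* left;
    Node* right;
};

// Helper: iterative insertion that sets the parent pointer and returns the root.
// Nodes are never freed in this sketch.
Node* insert(Node* root, int key) {
    Node* z = new Node{key, nullptr, nullptr, nullptr};
    Node* y = nullptr;
    Node* x = root;
    while (x != nullptr) { y = x; x = (key < x->key) ? x->left : x->right; }
    z->parent = y;
    if (y == nullptr) return z;
    if (key < y->key) y->left = z; else y->right = z;
    return root;
}

// Helper: ordinary BST search.
Node* find(Node* x, int key) {
    while (x != nullptr && x->key != key) x = (key < x->key) ? x->left : x->right;
    return x;
}

// Follow right links to the bottom. Requires x != nullptr.
Node* treeMaximum(Node* x) {
    while (x->right != nullptr) x = x->right;
    return x;
}

// Mirror of the successor: the largest key in the left subtree if there is one,
// otherwise the lowest ancestor reached by climbing out of a right subtree.
Node* treePredecessor(Node* x) {
    if (x->left != nullptr) return treeMaximum(x->left);
    Node* y = x->parent;
    while (y != nullptr && x == y->left) { x = y; y = y->parent; }
    return y;   // nullptr when x holds the smallest key
}
```

With the keys 15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9 inserted in that order, the predecessor of 15 is 13 (found by going down) and the predecessor of 9 is 7 (found by climbing). Like the successor, it follows a single path, so it runs in O(h).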

Insertion and Deletion

These operations cause the dynamic set represented by the BST to change. Changes are made so that the BST property is preserved.

Insertion

Begin at the root and trace a path downward in the tree:
- x traces the path; y retains the pointer to the parent of x
- directional choices are determined by the comparison key[x] vs key[z], until x is set to NIL
- NIL occupies the location where z is to be stored
- Running time: as with the others, this is O(h)

Deleting a Node

1. If the node is an external node, simply replace it with a NIL value.
2. If it is an internal node, then it has 1 or 2 children that cannot simply be orphaned; they need to be reattached to the BST while preserving the BST property.
   Case 1: the node has one child. Replace the node with its child.
   Case 2: the node has two children. Find the successor (or predecessor) of the node to be removed and replace the node's value (key) with the value of the successor (or predecessor). Then remove the successor (or predecessor) node itself, moving to the earlier cases to resolve any created orphans.

Deletion

This operation is a bit more complicated. It depends basically on whether the node to be deleted, z, has:
- No children: remove the node by replacing z with NIL as the child of its parent, p[z].
- A single child: remove z and create a spliced link from the parent, p[z], to the child of z.
- Two children: a bit more complicated. Find the successor y and replace the contents of z with the contents of y, then splice y out. Because z has a right subtree, its successor is the minimum of that right subtree, and so the successor has no left child.

The procedure is Tree_Delete(T, z).
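The insertion walk with a leading pointer x and a trailing pointer y can be sketched as follows. The names Node, Tree, treeInsert, height and buildFrom are illustrative, character keys are an assumption, and equal keys are sent to the right (the slides do not specify this).

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical node with a parent pointer.
struct Node {
    char key;
    Node* parent;
    Node* left;
    Node* right;
};

// Hypothetical tree handle holding only the root.
struct Tree {
    Node* root;
};

// x traces the path down; y trails one step behind as the parent of x.
// When x becomes null, y is the node to attach z to.
void treeInsert(Tree& T, Node* z) {
    Node* y = nullptr;
    Node* x = T.root;
    while (x != nullptr) {
        y = x;
        x = (z->key < x->key) ? x->left : x->right;
    }
    z->parent = y;
    if (y == nullptr) T.root = z;             // there was no tree: z is the root
    else if (z->key < y->key) y->left = z;
    else y->right = z;
}

// Root at depth 0; an empty tree has height -1.
int height(const Node* t) {
    if (t == nullptr) return -1;
    return 1 + std::max(height(t->left), height(t->right));
}

// Inserts the characters of keys in order. Nodes are never freed in this sketch.
Tree buildFrom(const std::string& keys) {
    Tree T{nullptr};
    for (char c : keys) treeInsert(T, new Node{c, nullptr, nullptr, nullptr});
    return T;
}
```

Feeding already sorted input such as B D F H J L produces a right-leaning chain of height 5, so every later operation costs O(n). The same keys in the order H D L B F J give height 2.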

Theorem: The dynamic set operations Insert and Delete can run in O(h) time in a binary search tree of height h. Note: h, not n.

Sorting

    Sort(A)
      for i ← 1 to n
        do Tree_Insert(A[i])
      InOrder_Traversal(root)

What should you expect for a lower bound on the run time? Ω(n lg n). Why? Is this a comparison-based sort? It is, and every comparison-based sort needs Ω(n lg n) comparisons in the worst case.

Average Case Analysis (same as Quicksort)

The algorithm is a quicksort in which the partitioning process maintains the order of the elements in each partition. Consider the input 3 1 8 6: in turn, everything is compared to 3, then to 1 or 8, etc. The order of the comparisons is different than in quicksort, but the concept is the same: at each level there are about n compares and the depth is ~ lg n, which gives O(n lg n) running time on average, matching the lower bound.

For a priority queue, to extract the minimum:

    Extract_Min(x)    - returns a pointer to the Min key
      while left[x] ≠ NIL
        do x ← left[x]
      return x

*** Examine the first tree (F B) and see what happens.
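Sort(A) can be sketched directly: insert every element, then read the tree in order. In this version equal keys go to the right subtree so that duplicates are kept, which is an assumption, since the pseudocode does not say how duplicates are treated. The names Node, insert, inOrder, destroy and treeSort are illustrative.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node with an integer key.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Plain BST insertion; equal keys go right so duplicates survive.
void insert(Node*& t, int key) {
    if (t == nullptr) { t = new Node{key, nullptr, nullptr}; return; }
    if (key < t->key) insert(t->left, key);
    else insert(t->right, key);
}

// In-order traversal appends the keys in non-decreasing order.
void inOrder(const Node* t, std::vector<int>& out) {
    if (t == nullptr) return;
    inOrder(t->left, out);
    out.push_back(t->key);
    inOrder(t->right, out);
}

// Frees every node of the tree.
void destroy(Node* t) {
    if (t == nullptr) return;
    destroy(t->left);
    destroy(t->right);
    delete t;
}

// Sort(A): n insertions followed by one O(n) traversal.
std::vector<int> treeSort(const std::vector<int>& a) {
    Node* root = nullptr;
    for (int v : a) insert(root, v);
    std::vector<int> out;
    inOrder(root, out);
    destroy(root);
    return out;
}
```

The traversal is always O(n); the cost of the n insertions depends on the height of the tree as it grows, which is why the average case behaves like quicksort.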

Deletion

Why might deletion be harder than insertion?

Lazy Deletion

Instead of physically deleting nodes, just mark them as deleted.
+ simpler
+ some adds just flip the deleted flag
+ physical deletions are done in batches
- extra memory for the deleted flag
- many lazy deletions slow finds
- some operations may have to be modified (e.g., min and max)

[Figure: lazy deletion on an example BST, with a sequence of Delete, Find and Insert operations.]

Deletion - Leaf Case

[Figure: deleting a leaf from the example BST.]
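Lazy deletion can be sketched with one extra flag per node. The names below (Node, locate, find, lazyDelete, insert, liveMin) are illustrative. The sketch shows each point from the list: deleting only sets the flag, re-inserting a deleted key only clears it, and the minimum has to be modified so that it skips flagged nodes.

```cpp
#include <cassert>

// Hypothetical node carrying a deleted flag.
struct Node {
    int key;
    bool deleted;
    Node* left;
    Node* right;
};

// Ordinary BST search that ignores the flag.
Node* locate(Node* t, int key) {
    while (t != nullptr && t->key != key) t = (key < t->key) ? t->left : t->right;
    return t;
}

// A key counts as present only if its node is not flagged.
bool find(Node* t, int key) {
    Node* n = locate(t, key);
    return n != nullptr && !n->deleted;
}

// Marks the node instead of unlinking it. Returns false if the key is absent.
bool lazyDelete(Node* t, int key) {
    Node* n = locate(t, key);
    if (n == nullptr || n->deleted) return false;
    n->deleted = true;
    return true;
}

// Inserting a key whose node is still in the tree just clears the flag.
// Nodes are never freed in this sketch.
void insert(Node*& t, int key) {
    if (t == nullptr) { t = new Node{key, false, nullptr, nullptr}; return; }
    if (key == t->key) { t->deleted = false; return; }
    insert(key < t->key ? t->left : t->right, key);
}

// Modified minimum: the first node in in-order sequence that is not flagged.
const Node* liveMin(const Node* t) {
    if (t == nullptr) return nullptr;
    if (const Node* m = liveMin(t->left)) return m;
    if (!t->deleted) return t;
    return liveMin(t->right);
}
```

Note that liveMin may have to visit many flagged nodes, which is the "many lazy deletions slow finds" cost in concrete form.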

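Deletion - One Child Case

[Figure: deleting a node with one child.]

Deletion - Two Child Case

Replace the node with a value guaranteed to be between the left and right subtrees: the successor. Could we have used the predecessor instead?

[Figure: deleting a node with two children.]

It is always easy to delete the successor: it always has either 0 or 1 children!

Delete Code

    // Removes key x from the subtree rooted at p. p is a reference to the
    // parent's link, so assigning to p re-links the tree in place.
    // Renamed from delete, which is a reserved word in C++.
    void deleteKey(comparable x, Node *& p) {
      if (p == NULL) return;                        // key not found
      if (p->key < x)
        deleteKey(x, p->right);
      else if (p->key > x)
        deleteKey(x, p->left);
      else {                                        // p->key == x
        if (p->left == NULL || p->right == NULL) {  // 0 or 1 children
          Node * old = p;
          p = (p->left == NULL) ? p->right : p->left;
          delete old;                               // free the unlinked node
        } else {                                    // 2 children
          Node * q = successor(p);                  // minimum of p->right
          p->key = q->key;
          deleteKey(q->key, p->right);              // successor has 0 or 1 children
        }
      }
    }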

Beauty is Only Θ(log n) Deep

Binary search trees are fast if they are shallow:
- e.g., perfectly complete
- e.g., perfectly complete except the fringe (leaves)
- any other good cases?

What matters here? Problems occur when one branch is much longer than the other!

Balance

Balance measure: height(left subtree) - height(right subtree)
- zero everywhere => perfectly balanced
- small everywhere => balanced enough

Balance between -1 and 1 everywhere => maximum height of about 1.44 log n
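The balance measure can be computed directly from subtree heights. The sketch below uses illustrative names (Node, height, balance, isBalanced) and recomputes heights at every node, which is simple but can cost O(n^2) on a degenerate tree; a production version would cache heights in the nodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical node with an integer key.
struct Node {
    int key;
    Node* left;
    Node* right;
};

// Root at depth 0; an empty tree has height -1.
int height(const Node* t) {
    if (t == nullptr) return -1;
    return 1 + std::max(height(t->left), height(t->right));
}

// Balance measure at one node: height(left subtree) - height(right subtree).
// Requires t != nullptr.
int balance(const Node* t) {
    return height(t->left) - height(t->right);
}

// True if the balance is between -1 and 1 at every node of the tree.
bool isBalanced(const Node* t) {
    if (t == nullptr) return true;
    return std::abs(balance(t)) <= 1 &&
           isBalanced(t->left) && isBalanced(t->right);
}
```

A three-node right chain 1 -> 2 -> 3 has balance -2 at its root and fails the check, while the same keys arranged with 2 at the root have balance 0 everywhere.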