Binary Search Trees. basic implementations randomized BSTs deletion in BSTs



Similar documents
Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R

Ordered Lists and Binary Trees

Binary Heap Algorithms

Data Structures and Algorithms

Binary Search Trees CMPSC 122

Binary Search Trees (BST)

From Last Time: Remove (Delete) Operation

DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT

Analysis of Algorithms I: Binary Search Trees

Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb

Converting a Number from Decimal to Binary

Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key, and:

Binary Search Tree Intro to Algorithms Recitation 03 February 9, 2011

Algorithms. Algorithms GEOMETRIC APPLICATIONS OF BSTS. 1d range search line segment intersection kd trees interval search trees rectangle intersection

TREE BASIC TERMINOLOGIES

Operations: search;; min;; max;; predecessor;; successor. Time O(h) with h height of the tree (more on later).

Data Structures Fibonacci Heaps, Amortized Analysis

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

How To Create A Tree From A Tree In Runtime (For A Tree)

Questions 1 through 25 are worth 2 points each. Choose one best answer for each.

Lecture Notes on Binary Search Trees

Binary Trees and Huffman Encoding Binary Search Trees

Algorithms Chapter 12 Binary Search Trees

Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees

A binary search tree is a binary tree with a special property called the BST-property, which is given as follows:

Cpt S 223. School of EECS, WSU

Lecture 6: Binary Search Trees CSCI Algorithms I. Andrew Rosenberg

Binary Heaps * * * * * * * / / \ / \ / \ / \ / \ * * * * * * * * * * * / / \ / \ / / \ / \ * * * * * * * * * *

Output: struct treenode{ int data; struct treenode *left, *right; } struct treenode *tree_ptr;

Algorithms and Data Structures

Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2)

Keys and records. Binary Search Trees. Data structures for storing data. Example. Motivation. Binary Search Trees

Lecture Notes on Binary Search Trees

Data Structures. Jaehyun Park. CS 97SI Stanford University. June 29, 2015

Symbol Tables. Introduction

root node level: internal node edge leaf node Data Structures & Algorithms McQuain

B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees

Heaps & Priority Queues in the C++ STL 2-3 Trees

Chapter 14 The Binary Search Tree

The ADT Binary Search Tree

CSE 326: Data Structures B-Trees and B+ Trees

DATA STRUCTURES USING C

Binary search algorithm

Data Structures. Level 6 C Module Descriptor

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles

MAX = 5 Current = 0 'This will declare an array with 5 elements. Inserting a Value onto the Stack (Push)

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.

CompSci-61B, Data Structures Final Exam

recursion, O(n), linked lists 6/14

Home Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit

Sample Questions Csci 1112 A. Bellaachia

Sorting Algorithms. Nelson Padua-Perez Bill Pugh. Department of Computer Science University of Maryland, College Park

Data Structures CSC212 (1) Dr Muhammad Hussain Lecture - Binary Search Tree ADT

Data Structures and Data Manipulation

Data Structure and Algorithm I Midterm Examination 120 points Time: 9:10am-12:10pm (180 minutes), Friday, November 12, 2010

The Tower of Hanoi. Recursion Solution. Recursive Function. Time Complexity. Recursive Thinking. Why Recursion? n! = n* (n-1)!

A binary heap is a complete binary tree, where each node has a higher priority than its children. This is called heap-order property

dictionary find definition word definition book index find relevant pages term list of page numbers

6. Standard Algorithms

API for java.util.iterator. ! hasnext() Are there more items in the list? ! next() Return the next item in the list.

Big Data and Scripting. Part 4: Memory Hierarchies

Binary Search Trees. Ric Glassey

Data Structure with C

Data Structures, Practice Homework 3, with Solutions (not to be handed in)

cs2010: algorithms and data structures

Binary Search Trees 3/20/14

Analysis of Algorithms I: Optimal Binary Search Trees

Union-Find Algorithms. network connectivity quick find quick union improvements applications

Full and Complete Binary Trees

Why Use Binary Trees?

A Comparison of Dictionary Implementations

CSE373: Data Structures and Algorithms Lecture 3: Math Review; Algorithm Analysis. Linda Shapiro Winter 2015

Algorithms and Data Structures Written Exam Proposed SOLUTION

PES Institute of Technology-BSC QUESTION BANK

Analysis of a Search Algorithm

S. Muthusundari. Research Scholar, Dept of CSE, Sathyabama University Chennai, India Dr. R. M.

Exam study sheet for CS2711. List of topics

Class Overview. CSE 326: Data Structures. Goals. Goals. Data Structures. Goals. Introduction

Fundamental Algorithms

Binary Heaps. CSE 373 Data Structures

ECE 250 Data Structures and Algorithms MIDTERM EXAMINATION /5:15-6:45 REC-200, EVI-350, RCH-106, HH-139

Persistent Binary Search Trees

Data Structures and Algorithm Analysis (CSC317) Intro/Review of Data Structures Focus on dynamic sets

CS711008Z Algorithm Design and Analysis

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team

Motivation Suppose we have a database of people We want to gure out who is related to whom Initially, we only have a list of people, and information a

Outline. Introduction Linear Search. Transpose sequential search Interpolation search Binary search Fibonacci search Other search techniques

Data Structures and Algorithms Written Examination

Section IV.1: Recursive Algorithms and Recursion Trees

An Evaluation of Self-adjusting Binary Search Tree Techniques

Chapter 13: Query Processing. Basic Steps in Query Processing

Data Structure [Question Bank]

Java Software Structures

Optimal Binary Search Trees Meet Object Oriented Programming

schema binary search tree schema binary search trees data structures and algorithms lecture 7 AVL-trees material

Algorithms. Margaret M. Fleck. 18 October 2010

Data Structures Using C++ 2E. Chapter 5 Linked Lists

Transcription:

Binary Search Trees basic implementations randomized BSTs deletion in BSTs eferences: Algorithms in Java, Chapter 12 Intro to Programming, Section 4.4 http://www.cs.princeton.edu/introalgsds/43bst 1

Elementary implementations: summary implementation worst case average case ordered iteration? search insert search insert operations on keys unordered array N N N/2 N/2 no equals() ordered array lg N N lg N N/2 yes compareto() unordered list N N N/2 N no equals() ordered list N N N/2 N/2 yes compareto() Challenge: Efficient implementations of get() and put() and ordered iteration. 2

basic implementations randomized BSTs deletion in BSTs 3

Binary Search Trees (BSTs) Def. A BINAY SEACH TEE is a binary tree in symmetric order. it A binary tree is either: empty a key-value pair and two binary trees [neither of which contain that key] best of the was times equal keys ruled out to facilitate associative array implementations Symmetric order means that: every node has a key every node s key is larger than all keys in its left subtree smaller than all keys in its right subtree node x subtrees smaller larger 4

BST representation A BST is a reference to a Node. A Node is comprised of four fields: A key and a value. A reference to the left and right subtree. smaller keys larger keys root private class Node Key key; Value val; Node left, right; best 1 it 2 was 2 Key and Value are generic types; Key is Comparable the 1 of 1 times 1 5

BST implementation (skeleton) public class BST<Key extends Comparable<Key>, Value> implements Iterable<Key> private Node root; instance variable private class Node Key key; Value val; Node left, right; Node(Key key, Value val) this.key = key; this.val = val; inner class public void put(key key, Value val) // see next slides public Val get(key key) // see next slides 6

BST implementation (search) public Value get(key key) Node x = root; while (x!= null) int cmp = key.compareto(x.key); if (cmp == 0) return x.val; else if (cmp < 0) x = x.left; else if (cmp > 0) x = x.right; return null; root best 1 it 2 was 2 the 1 get( the ) returns 1 of 1 times 1 get( worst ) returns null 7

BST implementation (insert) public void put(key key, Value val) root = put(root, key, val); root it 2 best 1 was 2 put( the, 2) overwrites the 1 the 1 worst 1 Caution: tricky recursive code. ead carefully! of 1 times 1 put( worst, 1) adds a new entry private Node put(node x, Key key, Value val) if (x == null) return new Node(key, val); int cmp = key.compareto(x.key); if (cmp == 0) x.val = val; else if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); return x; 8

BST: Construction Insert the following keys into BST. A S E C H I N G X M P L 9

Tree Shape Tree shape. Many BSTs correspond to same input data. Cost of search/insert is proportional to depth of node. typical A S E C H I worst Tree shape depends on order of insertion best H C A E I S A C E H I S 10

BST implementation: iterator? public Iterator<Key> iterator() return new BSTIterator(); E private class BSTIterator implements Iterator<Key> A S BSTIterator() public boolean hasnext() C H I N public Key next() 11

BST implementation: iterator? Approach: mimic recursive inorder traversal E public void visit(node x) if (x == null) return; visit(x.left) StdOut.println(x.key); visit(x.right); A Stack contents C H I N S visit(e) visit(a) print A visit(c) print C print E visit(s) visit(i) visit(h) print H print I visit() visit(n) print N print print S A C E H I N S E A E E C E E S I S H I S I S S S N S S S To process a node follow left links until empty (pushing onto stack) pop and process process node at right link 12

BST implementation: iterator public Iterator<Key> iterator() return new BSTIterator(); E private class BSTIterator implements Iterator<Key> private Stack<Node> stack = new Stack<Node>(); A C I S private void pushleft(node x) while (x!= null) stack.push(x); x = x.left; A E H N BSTIterator() pushleft(root); A C E C E public boolean hasnext() return!stack.isempty(); public Key next() Node x = stack.pop(); pushleft(x.right); return x.key; E H I S H I S I N S N S S S 13

1-1 correspondence between BSTs and Quicksort partitioning C E I K Q S T A E M X no equal keys L O P U 14

BSTs: analysis Theorem. If keys are inserted in random order, the expected number of comparisons for a search/insert is about 2 ln N. 1.38 lg N, variance = O(1) Proof: 1-1 correspondence with quicksort partitioning Theorem. If keys are inserted in random order, height of tree is proportional to lg N, except with exponentially small probability. mean 6.22 lg N, variance = O(1) But Worst-case for search/insert/height is N. e.g., keys inserted in ascending order 15

Searching challenge 3 (revisited): Problem: Frequency counts in Tale of Two Cities Assumptions: book has 135,000+ words about 10,000 distinct words Which searching method to use? 1) unordered array 2) unordered linked list 3) ordered array with binary search 4) need better method, all too slow 5) doesn t matter much, all fast enough 6) BSTs insertion cost < 10000 * 1.38 * lg 10000 <.2 million lookup cost < 135000 * 1.38 * lg 10000 < 2.5 million 16

Elementary implementations: summary implementation guarantee average case ordered iteration? search insert search insert operations on keys unordered array N N N/2 N/2 no equals() ordered array lg N N lg N N/2 yes compareto() unordered list N N N/2 N no equals() ordered list N N N/2 N/2 yes compareto() BST N N 1.38 lg N 1.38 lg N yes compareto() Next challenge: Guaranteed efficiency for get() and put() and ordered iteration. 17

basic implementations randomized BSTs deletion in BSTs 18

otation in BSTs Two fundamental operations to rearrange nodes in a tree. maintain symmetric order. local transformations (change just 3 pointers). basis for advanced BST algorithms Strategy: use rotations on insert to adjust tree shape to be more balanced h u h v v h = rotl(u) u A h = rot(v) C B C A B Key point: no change in search code (!) 19

otation Fundamental operation to rearrange nodes in a tree. easier done than said raise some nodes, lowers some others root = rotl(a) private Node rotl(node h) Node v = h.r; h.r = v.l; v.l = h; return v; A.left = rot(s) private Node rot(node h) Node u = h.l; h.l = u.r; u.r = h; return u; 20

ecursive BST oot Insertion insert G oot insertion: insert a node and make it the new root. Insert as in standard BST. otate inserted node to the root. Easy recursive implementation Caution: very tricky recursive code. ead very carefully! private Node putoot(node x, Key key, Val val) if (x == null) return new Node(key, val); int cmp = key.compareto(x.key); if (cmp == 0) x.val = val; else if (cmp < 0) x.left = putoot(x.left, key, val); x = rot(x); else if (cmp > 0) x.right = putoot(x.right, key, val); x = rotl(x); return x; 21

Constructing a BST with root insertion Ex. A S E C H I N G X M P L Why bother? ecently inserted keys are near the top (better for some clients). Basis for advanced algorithms. 22

andomized BSTs (oura, 1996) Intuition. If tree is random, height is logarithmic. Fact. Each node in a random tree is equally likely to be the root. Idea. Since new node should be the root with probability 1/(N+1), make it the root (via root insertion) with probability 1/(N+1). private Node put(node x, Key key, Value val) if (x == null) return new Node(key, val); int cmp = key.compareto(x.key); if (cmp == 0) x.val = val; return x; if (Stdandom.bernoulli(1.0 / (x.n + 1.0)) return putoot(h, key, val); if (cmp < 0) x.left = put(x.left, key, val); else if (cmp > 0) x.right = put(x.right, key, val); x.n++; return x; need to maintain count of nodes in tree rooted at x 23

Constructing a randomized BST Ex: Insert distinct keys in ascending order. Surprising fact: Tree has same shape as if keys were inserted in random order. andom trees result from any insert order Note: to maintain associative array abstraction need to check whether key is in table and replace value without rotations if that is the case. 24

andomized BST Property. andomized BSTs have the same distribution as BSTs under random insertion order, no matter in what order keys are inserted. Expected height is ~6.22 lg N Average search cost is ~1.38 lg N. Exponentially small chance of bad balance. Implementation cost. Need to maintain subtree size in each node. 25

Summary of symbol-table implementations implementation guarantee average case ordered iteration? search insert search insert operations on keys unordered array N N N/2 N/2 no equals() ordered array lg N N lg N N/2 yes compareto() unordered list N N N/2 N no equals() ordered list N N N/2 N/2 yes compareto() BST N N 1.38 lg N 1.38 lg N yes compareto() randomized BST 7 lg N 7 lg N 1.38 lg N 1.38 lg N yes compareto() andomized BSTs provide the desired guarantee probabilistic, with exponentially small chance of quadratic time Bonus (next): andomized BSTs also support delete (!) 26

basic implementations randomized BSTs deletion in BSTs 27

BST delete: lazy approach To remove a node with a given key set its value to null leave key in tree to guide searches [but do not consider it equal to any search key] A C E H I N S remove I A C E a tombstone H I N S Cost. O(log N') per insert, search, and delete, where N' is the number of elements ever inserted in the BST. Unsatisfactory solution: Can get overloaded with tombstones. 28

BST delete: first approach To remove a node from a BST. [Hibbard, 1960s] Zero children: just remove it. One child: pass the child up. Two children: find the next largest node using right-left* swap with next largest remove as above. zero children one child two children Unsatisfactory solution. Not symmetric, code is clumsy. Surprising consequence. Trees not random (!) sqrt(n) per op. Longstanding open problem: simple and efficient delete for BSTs 29

Deletion in randomized BSTs To delete a node containing a given key remove the node join the two remaining subtrees to make a tree Ex. Delete S in A E S C H I N X 30

Deletion in randomized BSTs To delete a node containing a given key remove the node join its two subtrees Ex. Delete S in A E join these two subtrees private Node remove(node x, Key key) if (x == null) return new Node(key, val); int cmp = key.compareto(x.key); if (cmp == 0) return join(x.left, x.right); else if (cmp < 0) x.left = remove(x.left, key); else if (cmp > 0) x.right = remove(x.right, key); return x; C H I N X 31

Join in randomized BSTs To join two subtrees with all keys in one less than all keys in the other maintain counts of nodes in subtrees (L and ) with probability L/(L+) make the root of the left the root make its left subtree the left subtree of the root join its right subtree to to make the right subtree of the root with probability L/(L+) do the symmetric moves on the right to join these two subtrees I make I the root with probability 4/5 need to join these two subtrees I X H X H N N 32

Join in randomized BSTs To join two subtrees with all keys in one less than all keys in the other maintain counts of nodes in subtrees (L and ) with probability L/(L+) make the root of the left the root make its left subtree the left subtree of the root join its right subtree to to make the right subtree of the root with probability L/(L+) do the symmetric moves on the right private Node join(node a, Node b) if (a == null) return a; if (b == null) return b; int cmp = key.compareto(x.key); if (Stdandom.bernoulli((double)*a.N / (a.n + b.n)) a.right = join(a.right, b); return a; else b.left = join(a, b.left ); return b; N to join these two subtrees X make the root with probability 2/3 N X 33

Deletion in randomized BSTs To delete a node containing a given key remove the node join its two subtrees Ex. Delete S in E E A S A I C H I X C H N X N Theorem. Tree still random after delete (!) Bottom line. Logarithmic guarantee for search/insert/delete 34

Summary of symbol-table implementations implementation guarantee average case ordered search insert delete search insert delete iteration? unordered array N N N N/2 N/2 N/2 no ordered array lg N N N lg N N/2 N/2 yes unordered list N N N N/2 N N/2 no ordered list N N N N/2 N/2 N/2 yes BST N N N 1.38 lg N 1.38 lg N? yes randomized BST 7 lg N 7 lg N 7 lg N 1.38 lg N 1.38 lg N 1.38 lg N yes andomized BSTs provide the desired guarantees probabilistic, with exponentially small chance of error Next lecture: Can we do better? 35