Collision Resolution. Ananda Gunawardena

Similar documents
Review of Hashing: Integer Keys

CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions. Linda Shapiro Spring 2016

Universal hashing. In other words, the probability of a collision for two different keys x and y given a hash function randomly chosen from H is 1/m.

Fundamental Algorithms

Tables so far. set() get() delete() BST Average O(lg n) O(lg n) O(lg n) Worst O(n) O(n) O(n) RB Tree Average O(lg n) O(lg n) O(lg n)

Record Storage and Primary File Organization

Hash Tables. Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited

Chapter Objectives. Chapter 9. Sequential Search. Search Algorithms. Search Algorithms. Binary Search

Data Structures in Java. Session 15 Instructor: Bert Huang

Practical Survey on Hash Tables. Aurelian Țuțuianu

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) Total 92.

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

DATA STRUCTURES USING C

BM307 File Organization

Lecture 2 February 12, 2003

INTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium.

B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers

Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing

Sample Questions Csci 1112 A. Bellaachia

A COOL AND PRACTICAL ALTERNATIVE TO TRADITIONAL HASH TABLES

How Caching Affects Hashing

Lecture 1: Data Storage & Index

A Survey on Efficient Hashing Techniques in Software Configuration Management

Storage and File Structure

Part III Storage Management. Chapter 11: File System Implementation

Multi-dimensional index structures Part I: motivation

A Comparison of Dictionary Implementations

Data Structures Fibonacci Heaps, Amortized Analysis

Data Structures in the Java API

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1

Partial Fractions Decomposition

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING LESSON PLAN

DATABASE DESIGN - 1DL400

Chapter 13. Disk Storage, Basic File Structures, and Hashing

Chapter 8: Structures for Files. Truong Quynh Chi Spring- 2013

1.2 Linear Equations and Rational Equations

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key, and:

Binary Search Tree Intro to Algorithms Recitation 03 February 9, 2011

DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT


Part 5: More Data Structures for Relations

CHAPTER 13: DISK STORAGE, BASIC FILE STRUCTURES, AND HASHING

1 Shapes of Cubic Functions

recursion, O(n), linked lists 6/14

Common Data Structures

Cuckoo Filter: Practically Better Than Bloom

CPSC 211 Data Structures & Implementations (c) Texas A&M University [ 221] edge. parent

H/wk 13, Solutions to selected problems

Database Systems. National Chiao Tung University Chun-Jen Tsai 05/30/2012

Anti-persistence: History Independent Data Structures

File System Management

CIS 631 Database Management Systems Sample Final Exam

Java Software Structures

Chapter 13: Query Processing. Basic Steps in Query Processing

Algorithmic Aspects of Big Data. Nikhil Bansal (TU Eindhoven)

Linear and quadratic Taylor polynomials for functions of several variables.

Summary of important mathematical operations and formulas (from first tutorial):

CSE 326: Data Structures B-Trees and B+ Trees

Black Problems - Prime Factorization, Greatest Common Factor and Simplifying Fractions

Partial Fractions. (x 1)(x 2 + 1)

Algorithms. Algorithms GEOMETRIC APPLICATIONS OF BSTS. 1d range search line segment intersection kd trees interval search trees rectangle intersection

QUEUES. Primitive Queue operations. enqueue (q, x): inserts item x at the rear of the queue q

Factoring. Factoring Polynomial Equations. Special Factoring Patterns. Factoring. Special Factoring Patterns. Special Factoring Patterns

Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child

Class Notes CS Creating and Using a Huffman Code. Ref: Weiss, page 433

Arrays. Atul Prakash Readings: Chapter 10, Downey Sun s Java tutorial on Arrays:

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

2) What is the structure of an organization? Explain how IT support at different organizational levels.

Chapter 11: File System Implementation. Chapter 11: File System Implementation. Objectives. File-System Structure

Integrals of Rational Functions

Queues and Stacks. Atul Prakash Downey: Chapter 15 and 16

Greatest Common Factor and Least Common Multiple

CSE373: Data Structures and Algorithms Lecture 1: Introduction; ADTs; Stacks/Queues. Linda Shapiro Spring 2016

Discrete Math in Computer Science Homework 7 Solutions (Max Points: 80)

Live Agent for Support Agents

Lab Experience 17. Programming Language Translation

Four strategies to help move beyond lean and build a competitive advantage

Lecture 2: Data Structures Steven Skiena. skiena

From Last Time: Remove (Delete) Operation

BCS2B02: OOP Concepts and Data Structures Using C++

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Data Structures and Algorithms

- Easy to insert & delete in O(1) time - Don t need to estimate total memory needed. - Hard to search in less than O(n) time

An Introduction to the RSA Encryption Method

March 29, S4.4 Theorems about Zeros of Polynomial Functions

Ordered Lists and Binary Trees

Structure for String Keys

TREE BASIC TERMINOLOGIES

Data Structures and Algorithms!

CompSci-61B, Data Structures Final Exam

Bitmap Algorithms for Counting Active Flows on High Speed Links. Elisa Jasinska

CSE 2123 Collections: Sets and Iterators (Hash functions and Trees) Jeremy Morris

Data Structures and Algorithms!

Factors Galore C: Prime Factorization

Speeding Up CNC Machining with ZW3D Tool Path Editor. ZW3D CAD/CAM White Paper

Working with Spreadsheets

Solutions of Linear Equations in One Variable

Lecture 2 Mathcad Basics

Transcription:

Collision Resolution Ananda Gunawardena

Hashing

Big Idea in Hashing Let S={a 1,a 2, a m } be a set of objects that we need to map into a table of size N. Find a function such that H:S [1 n] Ideally we d like to have a 1-1 map But it is not easy to find one Also function must be easy to compute Also picking a primeas the table size can help to have a better distribution of values

Collisions Two keys mapping to the same location in the hash table is called Collision Collisions can be reduced with a selection of a good hash function But it is not possible to avoid collisions altogether Unless we can find a perfect hash function Which is hard to do

Collision Resolving strategies Few Collision Resolution ideas Separate chaining Some Open addressing techniques Linear Probing Quadratic Probing

Separate Chaining

Separate Chaining Collisions can be resolved by creating a list of keys that map to the same value

Separate Chaining Use an array of linked lists LinkedList[ ] Table; Table = new LinkedList(N), where N is the table size Define Load Factorof Table as λ = number of keys/size of the table (λcan be more than 1) Still need a good hash function to distribute keys evenly For search and updates

Linear Probing

Linear Probing The idea: Table remains a simple array of size N On insert(x), compute f(x) mod N, if the cell is full, find another by sequentially searching for the next available slot Go to f(x)+1, f(x)+2 etc.. On find(x), compute f(x) mod N, if the cell doesn t match, look elsewhere. Linear probing function can be given by h(x, i) = (f(x) + i) mod N (i=1,2,.)

Figure 20.4 Linear probing hash table after each insertion Data Structures & Problem Solving using JAVA/2E Mark Allen Weiss 2002 Addison Wesley

Linear Probing Example Consider H(key) = key Mod 6 (assume N=6) H(11)=5, H(10)=4, H(17)=5, H(16)=4,H(23)=5 Draw the Hash table 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5 0 1 2 3 4 5

Clustering Problem Clustering is a significant problem in linear probing. Why? Illustration of primary clustering in linear probing (b) versus no clustering (a) and the less significant secondary clustering in quadratic probing (c). Long lines represent occupied cells, and the load factor is 0.7. Data Structures & Problem Solving using JAVA/2E Mark Allen Weiss 2002 Addison Wesley

Deleting items from a hash table

Deleting items How about deleting items from Hash table? Item in a hash table connectsto others in the table(eg: BST). Deleting items will affect finding the others Lazy Delete Just mark the items as inactive rather than removing it.

Lazy Delete Naïve removal can leave gaps! 0 a 2 b 3 c 3 e 5 d Insert f 0 a 2 b 3 c 3 e 5 d 3 f Remove e 0 a 2 b 3 c 5 d 3 f Find f 0 a 2 b 3 c 5 d 3 f 8 j 8 u 10 g 8 s 8 j 8 u 10 g 8 s 8 j 8 u 10 g 8 s 8 j 8 u 10 g 8 s 3 f means search key f and hash key 3

Lazy Delete Cleverremoval 0 a 2 b 3 c 3 e 5 d Insert f 0 a 2 b 3 c 3 e 5 d 3 f Remove e 0 a 2 b 3 c gone 5 d 3 f Find f 0 a 2 b 3 c gone 5 d 3 f 8 j 8 u 10 g 8 s 8 j 8 u 10 g 8 s 8 j 8 u 10 g 8 s 8 j 8 u 10 g 8 s 3 f means search key f and hash key 3

Load Factor (open addressing) definition: The load factor λof a probing hash table is the fraction of the table that is full. The load factor ranges from 0 (empty) to 1 (completely full). It is better to keep the load factorunder 0.7 Doublethe table size and rehash if load factor gets high Cost of Hash function f(x) must be minimized When collisions occur, linear probing can always find an empty cell But clustering can be a problem

Quadratic Probing

Quadratic probing Another open addressing method Resolve collisions by examining certain cells (1,4,9, ) away from the original probe point Collision policy: Define h 0 (k), h 1 (k), h 2 (k), h 3 (k), where h i (k) = (hash(k) + i 2 ) mod size Caveat: May not find a vacant cell! Table must be less than half full (λ < ½) (Linear probing always finds a cell.)

Quadratic probing Another issue Suppose the table size is 16. Probe offsets that will be tried: 1 mod 16 = 1 4mod 16 = 4 9mod 16 = 9 16mod 16 = 0 25mod 16 = 9 36 mod 16 = 4 49 mod 16 = 1 64 mod 16 = 0 81mod 16 = 1 only four different values!

Figure 20.6 A quadratic probing hash table after each insertion (note that the table size was poorly chosen because it is not a prime number). Data Structures & Problem Solving using JAVA/2E Mark Allen Weiss 2002 Addison Wesley