Data Structures in Java. Session 15 Instructor: Bert Huang http://www1.cs.columbia.edu/~bert/courses/3134



Similar documents
Review of Hashing: Integer Keys

Sample Questions Csci 1112 A. Bellaachia

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) Total 92.

Hash Tables. Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited

Data Structures and Algorithms Written Examination

Dynamic Programming. Lecture Overview Introduction

DATA ANALYSIS II. Matrix Algorithms

CS 2112 Spring Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

DATA STRUCTURES USING C

Java Software Structures

CSE373: Data Structures & Algorithms Lecture 14: Hash Collisions. Linda Shapiro Spring 2016

CSE 2123 Collections: Sets and Iterators (Hash functions and Trees) Jeremy Morris

Exam study sheet for CS2711. List of topics

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

CompSci-61B, Data Structures Final Exam

Network (Tree) Topology Inference Based on Prüfer Sequence

Seminar. Path planning using Voronoi diagrams and B-Splines. Stefano Martina

Class Notes CS Creating and Using a Huffman Code. Ref: Weiss, page 433

3.1 Solving Systems Using Tables and Graphs

Efficient Data Structures for Decision Diagrams

Lecture 1: Course overview, circuits, and formulas

2. (a) Explain the strassen s matrix multiplication. (b) Write deletion algorithm, of Binary search tree. [8+8]

Universal hashing. In other words, the probability of a collision for two different keys x and y given a hash function randomly chosen from H is 1/m.

Data Structures in the Java API

Chapter Objectives. Chapter 9. Sequential Search. Search Algorithms. Search Algorithms. Binary Search

Data Structure [Question Bank]

B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers

Part 2: Community Detection

Discrete Mathematics & Mathematical Reasoning Chapter 10: Graphs

Discuss the size of the instance for the minimum spanning tree problem.

SCAN: A Structural Clustering Algorithm for Networks

Introduction to Data Structures

Ordered Lists and Binary Trees

Big O and Limits Abstract Data Types Data Structure Grand Tour.

Discrete Mathematics Problems

Zeros of Polynomial Functions

Outline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits

B490 Mining the Big Data. 2 Clustering

Mining Social Network Graphs

Lecture 2: Data Structures Steven Skiena. skiena

Data Structures and Algorithms

Why? A central concept in Computer Science. Algorithms are ubiquitous.

Computer Algorithms. NP-Complete Problems. CISC 4080 Yanjun Li

Distance Degree Sequences for Network Analysis

Basic Programming and PC Skills: Basic Programming and PC Skills:

Lecture 7: NP-Complete Problems

Algorithmic Aspects of Big Data. Nikhil Bansal (TU Eindhoven)

Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Analysis of Algorithms, I

Math Common Core Sampler Test

1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.

COMPSCI 105 S2 C - Assignment 2 Due date: Friday, 23 rd October 7pm

Visualising Java Data Structures as Graphs

Lecture 18: Applications of Dynamic Programming Steven Skiena. Department of Computer Science State University of New York Stony Brook, NY

Arithmetic and Algebra of Matrices

Factoring A Quadratic Polynomial

Scheduling Shop Scheduling. Tim Nieberg

Problem Set 7 Solutions

ALGEBRA. sequence, term, nth term, consecutive, rule, relationship, generate, predict, continue increase, decrease finite, infinite

Introduction to Programming System Design. CSCI 455x (4 Units)

Mathematics for Algorithm and System Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

Data Mining. Cluster Analysis: Advanced Concepts and Algorithms

SECTIONS NOTES ON GRAPH THEORY NOTATION AND ITS USE IN THE STUDY OF SPARSE SYMMETRIC MATRICES

Zero: If P is a polynomial and if c is a number such that P (c) = 0 then c is a zero of P.

IE 680 Special Topics in Production Systems: Networks, Routing and Logistics*

Algorithms and data structures

Guessing Game: NP-Complete?

GCE Computing. COMP3 Problem Solving, Programming, Operating Systems, Databases and Networking Report on the Examination.

Practical Survey on Hash Tables. Aurelian Țuțuianu

Fundamental Algorithms

Lexical Analysis and Scanning. Honors Compilers Feb 5 th 2001 Robert Dewar

THE NUMBER OF GRAPHS AND A RANDOM GRAPH WITH A GIVEN DEGREE SEQUENCE. Alexander Barvinok

Foundations of Programming

Discrete Mathematics. Hans Cuypers. October 11, 2007

Applied Algorithm Design Lecture 5

1 Shapes of Cubic Functions

CSE 4351/5351 Notes 7: Task Scheduling & Load Balancing

Merkle Hash Trees for Distributed Audit Logs

Social Media Mining. Graph Essentials

POLYNOMIAL FUNCTIONS

Course on Social Network Analysis Graphs and Networks

KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS

2) Write in detail the issues in the design of code generator.

1 Review of Least Squares Solutions to Overdetermined Systems

Approximation Algorithms

Matrix Multiplication

A COOL AND PRACTICAL ALTERNATIVE TO TRADITIONAL HASH TABLES

A Fast Algorithm For Finding Hamilton Cycles

Tables so far. set() get() delete() BST Average O(lg n) O(lg n) O(lg n) Worst O(n) O(n) O(n) RB Tree Average O(lg n) O(lg n) O(lg n)

Solutions to Homework 6

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.

1. Define: (a) Variable, (b) Constant, (c) Type, (d) Enumerated Type, (e) Identifier.

CSE373: Data Structures and Algorithms Lecture 1: Introduction; ADTs; Stacks/Queues. Linda Shapiro Spring 2016

DATABASE DESIGN - 1DL400

Home Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit

Transcription:

Data Structures in Java Session 15 Instructor: Bert Huang http://www1.cs.columbia.edu/~bert/courses/3134

Announcements Homework 4 on website No class on Tuesday Midterm grades almost done

Review Indexing by the key needs too much memory Index into smaller size array, pray you donʼt get collisions If collisions occur, separate chaining, lists in array probing, try different array locations

Todayʼs Plan Rehashing Hash functions Graphs introduction

Rehashing Like ArrayLists, we have to guess the number of elements we need to insert into a hash table Whatever our collision policy is, the hash table becomes inefficient when load factor is too high. To alleviate load, rehash: create larger table, scan current table, insert items into new table using new hash function

When to Rehash For quadratic probing, insert may fail if load > 1/2 We can rehash as soon as load > 1/2 Or, we can rehash only when insert fails Heuristically choose a load factor threshold, rehash when threshold breached

Rehash Example Current Table: quad. probing with h(x) = (x mod 7) 8, 0, 25, 17, 7 New table h(x) = (x mod 17) 0 8 7 17 25 0 17 7 8 25 0 1 2 3 4 5 6 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Rehash Cost No profound algorithm: re-insert each item Linear time If you rehash, inserting N items costs O(1)*N + O(N) = O(N) Insert still costs O(1) amortized

Hash function design Spread the output as much as possible Consider function h(x) = x mod 5 What if our keys are always in tens? Less obvious collision-causing patterns can occur i.e., hashing images by the intensity of the first pixel if images have border

Hashing a String Simple but bad h(x) add up all the character codes (ASCII/ Unicode) ASCII 'a' is 97 If keys are lowercase 5 character words, h(x) > 485

Hashing a String II Weiss: Treat first 3 characters of a string as a 3 digit, base 27 number Once again, ʻaʼ is 97, ʻAʼ is 65

String.hashCode() Java's built in String hashcode() method s[0]*31^(n-1) + s[1]*31^(n-2) +... + s[n-1] nth degree polynomial of base 31 String characters are coefficients

Hash Function Demo

Built-in Java HashSet HashSet stores a set of objects, all hashed by their hashcode() method HashSet<String> table = new HashSet<String>(); table.add( Hello ); table.contains( Hello ); // returns true

Built-in Java HashMap HashMap stores set of pairs of objects, First object is the key, second is the value. Hashed by keyʼs hashcode() HashMap<String,Integer> table = new HashMap<String,Integer>(); table.set( hello, 42); // pairs hello to 42 if hello is not already in the table, creates new pair. Otherwise, overwrites old Integer table.get( hello ); // returns 42

Hashed File Systems Gmail and Dropbox (for example) use a hashed file system All files are stored in a hash table, so attachments are not stored redundantly Saves server storage space and speeds up transactions

Graphs Trees Linked Lists Graphs

Graphs Linked List Tree Graph

Graph Terminology A graph is a set of nodes and edges nodes aka vertices edges aka arcs, links Edges exist between pairs of nodes if nodes x and y share an edge, they are adjacent

Graph Terminology Edges may have weights associated with them Edges may be directed or undirected A path is a series of adjacent vertices the length of a path is the sum of the edge weights along the path (1 if unweighted) A cycle is a path that starts and ends on a node

Graph Properties An undirected graph with no cycles is a tree A directed graph with no cycles is a special class called a directed acyclic graph (DAG) In a connected graph, a path exists between every pair of vertices A complete graph has an edge between every pair of vertices

Graph Applications: A few examples Computer networks The World Wide Web Social networks Public transportation Probabilistic Inference Flow Charts

Implementation Option 1: Store all nodes in an indexed list Represent edges with adjacency matrix Option 2: Explicitly store adjacency lists

Adjacency Matrices 2d-array A of boolean variables A[i][j] is true when node i is adjacent to node j If graph is undirected, A is symmetric 1 1 2 3 4 5 1 2 3 4 5 0 1 1 0 0 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 0 0 0 1 0 2 3 4 5

Adjacency Lists Each node stores references to its neighbors 1 2 3 2 1 4 3 1 4 4 2 3 5 5 4 1 2 3 4 5

Reading Weiss Section 5 (Hashing) Weiss Section 9.1