CSE 326: Data Structures. Introduction




CSE 326: Data Structures
Introduction

Class Overview
- Introduction to many of the basic data structures used in computer software
  - Understand the data structures
  - Analyze the algorithms that use them
  - Know when to apply them
- Practice design and analysis of data structures.
- Practice using these data structures by writing programs.
- Make the transformation from programmer to computer scientist

Goals
You will understand
- what the tools are for storing and processing common data types
- which tools are appropriate for which need
So that you can make good design choices as a developer, project manager, or system customer.
You will be able to
- Justify your design decisions via formal reasoning
- Communicate ideas about programs clearly and precisely

Goals
"I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
Linus Torvalds, 2006

Goals
"Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won't usually need your flowcharts; they'll be obvious."
Fred Brooks, 1975

Data Structures
"Clever" ways to organize information in order to enable "efficient" computation
- What do we mean by clever?
- What do we mean by efficient?

Picking the best Data Structure for the job
- The data structure you pick needs to support the operations you need
- Ideally it supports the operations you will use most often in an efficient manner
- Examples of operations:
  - A List with operations insert and delete
  - A Stack with operations push and pop

Terminology
- Abstract Data Type (ADT): Mathematical description of an object with a set of operations on the object. Useful building block.
- Algorithm: A high-level, language-independent description of a step-by-step process.
- Data structure: A specific family of algorithms for implementing an abstract data type.
- Implementation of data structure: A specific implementation in a specific language.

Terminology examples
- A stack is an abstract data type supporting push, pop and isEmpty operations
- A stack data structure could use an array, a linked list, or anything that can hold data
- One stack implementation is java.util.Stack; another is java.util.LinkedList

Concepts vs. Mechanisms
Concepts (abstract):
- Pseudocode
- Algorithm: A sequence of high-level, language-independent operations, which may act upon an abstracted view of data.
- Abstract Data Type (ADT): A mathematical description of an object and the set of operations on the object.
Mechanisms (concrete):
- Specific programming language
- Program: A sequence of operations in a specific programming language, which may act upon real data in the form of numbers, images, sound, etc.
- Data structure: A specific way in which a program's data is represented, which reflects the programmer's design choices/goals.

Why So Many Data Structures?
Ideal data structure: "fast", "elegant", memory efficient
Generates tensions:
- time vs. space
- performance vs. elegance
- generality vs. simplicity
- one operation's performance vs. another's
The study of data structures is the study of tradeoffs. That's why we have so many of them!

Today's Outline
- Introductions
- Administrative Info
- What is this course about?
- Review: Queues and stacks
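The distinction between an ADT and the data structures that implement it can be sketched in C++. This is an illustration only, not code from the course: the names `IntStack`, `ArrayStack` and `ListStack` are hypothetical, and the element type is fixed to `int` to keep the sketch short.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical ADT: only the operations, no storage decisions.
class IntStack {
public:
    virtual ~IntStack() = default;
    virtual void push(int x) = 0;
    virtual int pop() = 0;              // removes and returns the top item
    virtual bool is_empty() const = 0;
};

// Data structure 1: a growable array; the top is the last element.
class ArrayStack : public IntStack {
    std::vector<int> items_;
public:
    void push(int x) override { items_.push_back(x); }
    int pop() override {
        if (items_.empty()) throw std::out_of_range("pop on empty stack");
        int x = items_.back();
        items_.pop_back();
        return x;
    }
    bool is_empty() const override { return items_.empty(); }
};

// Data structure 2: a singly linked list; the top is the head node.
class ListStack : public IntStack {
    struct Node { int data; Node* next; };
    Node* head_ = nullptr;
public:
    ListStack() = default;
    ListStack(const ListStack&) = delete;            // copying is out of scope here
    ListStack& operator=(const ListStack&) = delete;
    ~ListStack() override {
        while (head_) { Node* t = head_; head_ = head_->next; delete t; }
    }
    void push(int x) override { head_ = new Node{x, head_}; }
    int pop() override {
        if (!head_) throw std::out_of_range("pop on empty stack");
        Node* t = head_;
        int x = t->data;
        head_ = t->next;
        delete t;
        return x;
    }
    bool is_empty() const override { return head_ == nullptr; }
};
```

Code written against `IntStack` works unchanged with either class, which is the sense in which the ADT is a "useful building block": the tradeoffs (contiguous memory vs. per-node allocation) stay hidden behind the same three operations.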

First Example: Queue ADT
FIFO: First In First Out
Queue operations:
- create
- destroy
- enqueue
- dequeue
- is_empty
[Figure: a queue holding F E D C B; G is being enqueued at the back and A dequeued from the front.]

Circular Array Queue Data Structure
[Figure: array Q indexed 0 to size - 1 holding b c d e f, with front and back markers.]

    enqueue(Object x) {
        Q[back] = x;
        back = (back + 1) % size;
    }

    dequeue() {
        x = Q[front];
        front = (front + 1) % size;
        return x;
    }

- How to test for an empty list?
- How to find the K-th element in the queue?
- What is the complexity of these operations?
- Limitations of this structure?

Linked List Queue Data Structure
[Figure: linked nodes b c d e f, with front and back pointers.]

    void enqueue(Object x) {
        if (is_empty()) {
            front = back = new Node(x);
        } else {
            back->next = new Node(x);
            back = back->next;
        }
    }

    bool is_empty() {
        return front == null;
    }

    Object dequeue() {
        assert(!is_empty());
        return_data = front->data;
        temp = front;
        front = front->next;
        delete temp;
        return return_data;
    }

Circular Array vs. Linked List
Circular array:
- Too much space
- Kth element accessed easily
- Not as complex
- Could make array more robust
Linked list:
- Can grow as needed
- Can keep growing
- No back looping around to front
- Linked list code more complex

Second Example: Stack ADT
LIFO: Last In First Out
Stack operations:
- create
- destroy
- push
- pop
- top
- is_empty
[Figure: items A B C D E F pushed onto a stack and popped in the order F E D C B A.]

Stacks in Practice
- Function call stack
- Removing recursion
- Balancing symbols (parentheses)
- Evaluating Reverse Polish Notation
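One possible answer to the circular-array questions above, as a minimal C++ sketch. The class name `CircularQueue`, the `count_` field, and the `is_full` and `kth` operations are assumptions added for illustration; the slides only give `enqueue` and `dequeue`. Tracking the element count separately is one common way to tell an empty queue from a full one, since in both cases `front == back`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-capacity circular array queue of ints (illustrative sketch).
class CircularQueue {
    std::vector<int> q_;
    std::size_t front_ = 0;   // index of the next item to dequeue
    std::size_t back_ = 0;    // index of the next free slot
    std::size_t count_ = 0;   // number of stored items (assumed extra field)
public:
    explicit CircularQueue(std::size_t capacity) : q_(capacity) {}

    bool is_empty() const { return count_ == 0; }
    bool is_full() const { return count_ == q_.size(); }

    void enqueue(int x) {
        assert(!is_full());               // limitation: cannot grow
        q_[back_] = x;
        back_ = (back_ + 1) % q_.size();  // wrap around to slot 0
        ++count_;
    }

    int dequeue() {
        assert(!is_empty());
        int x = q_[front_];
        front_ = (front_ + 1) % q_.size();
        --count_;
        return x;
    }

    // k-th item from the front (k = 0 is the next to be dequeued).
    int kth(std::size_t k) const {
        assert(k < count_);
        return q_[(front_ + k) % q_.size()];
    }
};
```

Every operation here is a constant number of steps, including `kth`, which is the "Kth element accessed easily" advantage. The fixed capacity is the main limitation; the checks use `assert`, so they disappear when compiled with `NDEBUG`.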

Data Structures
Asymptotic Analysis

Algorithm Analysis: Why?
- Correctness: Does the algorithm do what is intended?
- Performance:
  - What is the running time of the algorithm?
  - How much storage does it consume?
- Different algorithms may be correct. Which should I use?

Recursive algorithm for sum
Write a recursive function to find the sum of the first n integers stored in array v.

    sum(integer array v, integer n) returns integer
        if n = 0 then
            sum = 0
        else
            sum = nth number + sum of first n-1 numbers
        return sum

Proof by Induction
- Basis Step: The algorithm is correct for a base case or two by inspection.
- Inductive Hypothesis (n=k): Assume that the algorithm works correctly for the first k cases.
- Inductive Step (n=k+1): Given the hypothesis above, show that the k+1 case will be calculated correctly.

Program Correctness by Induction
- Basis Step: sum(v,0) = 0.
- Inductive Hypothesis (n=k): Assume sum(v,k) correctly returns the sum of the first k elements of v, i.e. v[0]+v[1]+...+v[k-1]
- Inductive Step (n=k+1): sum(v,k+1) returns
  v[k]+sum(v,k) = (by inductive hyp.)
  v[k]+(v[0]+v[1]+...+v[k-1]) =
  v[0]+v[1]+...+v[k-1]+v[k]

Algorithms vs Programs
- Proving correctness of an algorithm is very important:
  - a well designed algorithm is guaranteed to work correctly, and
  - its performance can be estimated
- Proving correctness of a program (an implementation) is fraught with weird bugs
- Abstract Data Types are a way to bridge the gap between mathematical algorithms and programs
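The recursive sum pseudocode above maps directly onto C++. This is a sketch under two assumptions not stated in the slides: the array is a `std::vector<int>`, and the caller guarantees `n <= v.size()`.

```cpp
#include <cstddef>
#include <vector>

// Sum of the first n elements of v. Assumes n <= v.size().
long long sum(const std::vector<int>& v, std::size_t n) {
    if (n == 0) return 0;             // basis step: empty prefix sums to 0
    return v[n - 1] + sum(v, n - 1);  // nth number + sum of first n-1 numbers
}
```

The two branches line up with the two parts of the induction proof: the `n == 0` branch is the basis step, and the recursive branch is correct for n = k+1 whenever `sum(v, k)` is correct.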

Comparing Two Algorithms
GOAL: Sort a list of names
- "I'll buy a faster CPU"
- "I'll use C++ instead of Java, wicked fast!"
- "Ooh look, the -O4 flag!"
- "Who cares how I do it, I'll add more memory!"
- "Can't I just get the data pre-sorted??"

Comparing Two Algorithms
What we want:
- Rough Estimate
- Ignores Details
Really, independent of details:
- Coding tricks, CPU speed, compiler optimizations, ...
- These would help any algorithms equally
- Don't just care about running time: it is not a good enough measure

Big-O Analysis
Ignores "details". What details?
- CPU speed
- Programming language used
- Amount of memory
- Compiler
- Order of input
- Size of input ... sorta.

Analysis of Algorithms
Efficiency measure:
- how long the program runs: time complexity
- how much memory it uses: space complexity
Why analyze at all?
- Decide what algorithm to implement before actually doing it
- Given code, get a sense for where bottlenecks must be, without actually measuring it

Asymptotic Analysis
One "detail" we won't ignore: problem size, # of input elements
Complexity as a function of input size n:
    T(n) = 4n + 5
    T(n) = 0.5 n log n - 2n + 7
    T(n) = 2^n + n^3 + 3n
What happens as n grows?

Why Asymptotic Analysis?
- Most algorithms are fast for small n
  - Time difference too small to be noticeable
  - External things dominate (OS, disk I/O, ...)
- BUT n is often large in practice
  - Databases, internet, graphics, ...
- Difference really shows up as n grows!

Exercise - Searching
    2 3 5 16 37 50 73 75 126

    bool ArrayFind(int array[], int n, int key) {
        // Insert your algorithm here
    }

What algorithm would you choose to implement this code snippet?

Analyzing Code
    Basic Java operations     Constant time
    Consecutive statements    Sum of times
    Conditionals              Larger branch plus test
    Loops                     Sum of iterations
    Function calls            Cost of function body
    Recursive functions       Solve recurrence relation

Linear Search Analysis

    bool LinearArrayFind(int array[], int n, int key) {
        for (int i = 0; i < n; i++) {
            if (array[i] == key)
                // Found it!
                return true;
        }
        return false;
    }

Best Case: 3
Worst Case: 2n+1

Binary Search Analysis

    bool BinArrayFind(int array[], int low, int high, int key) {
        // The subarray is empty
        if (low > high) return false;

        // Search this subarray recursively
        int mid = (high + low) / 2;
        if (key == array[mid]) {
            return true;
        } else if (key < array[mid]) {
            return BinArrayFind(array, low, mid-1, key);
        } else {
            return BinArrayFind(array, mid+1, high, key);
        }
    }

Best case: 4
Worst case: log n?

Solving Recurrence Relations
1. Determine the recurrence relation. What is/are the base case(s)?
2. Expand the original relation to find an equivalent general expression in terms of the number of expansions.
3. Find a closed-form expression by setting the number of expansions to a value which reduces the problem to a base case.

Data Structures
Asymptotic Analysis
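As a worked illustration of the three steps for solving recurrence relations, applied to binary search: the derivation below assumes that each call does 4 units of work before recursing on half the range, that a range of size 1 also costs 4, and that n is a power of two. These constants are assumptions chosen for the sketch; only the shape of the argument matters.

```latex
\begin{align*}
\text{1. Recurrence:}\quad & T(1) = 4, \qquad T(n) = T(n/2) + 4 \quad (n > 1) \\
\text{2. Expand:}\quad & T(n) = T(n/4) + 2 \cdot 4 = T(n/8) + 3 \cdot 4 = \dots = T(n/2^{k}) + 4k \\
\text{3. Set } k = \log_2 n\text{:}\quad & T(n) = T(1) + 4 \log_2 n = 4 \log_2 n + 4
\end{align*}
```

Step 3 picks k so that the argument n/2^k reaches the base case; a different per-call constant changes the coefficients but not the logarithmic growth.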

Linear Search vs Binary Search
                  Linear Search    Binary Search
    Best Case     4 at [0]         4 at [middle]
    Worst Case    3n+2             4 log n + 4
So which algorithm is better? What tradeoffs can you make?

[Figures: Fast Computer vs. Slow Computer; Fast Computer vs. Smart Programmer (round 1); Fast Computer vs. Smart Programmer (round 2)]

Asymptotic Analysis
- Asymptotic analysis looks at the order of the running time of the algorithm
  - A valuable tool when the input gets "large"
  - Ignores the effects of different machines or different implementations of an algorithm
- Intuitively, to find the asymptotic runtime, throw away the constants and low-order terms
  - Linear search is T(n) = 3n + 2 ∈ O(n)
  - Binary search is T(n) = 4 log_2 n + 4 ∈ O(log n)

Asymptotic Analysis
Eliminate low order terms:
    4n + 5                  ⇒ 4n
    0.5 n log n + 2n + 7    ⇒ 0.5 n log n
    n^3 + 2^n + 3n          ⇒ 2^n
Eliminate coefficients:
    4n                      ⇒ n
    0.5 n log n             ⇒ n log n
    n log n^2 = 2 n log n   ⇒ n log n

Remember: the fastest algorithm has the slowest growing function for its runtime
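To see the linear vs. logarithmic gap concretely, the sketch below counts how many array elements each search examines. It is an illustration, not the slides' step-counting model: the function names `linear_probes` and `binary_probes` are hypothetical, only element comparisons are counted, and the binary search is written iteratively over a half-open range rather than recursively.

```cpp
#include <cstddef>
#include <vector>

// Number of elements examined by a linear search for key.
std::size_t linear_probes(const std::vector<int>& a, int key) {
    std::size_t probes = 0;
    for (int x : a) {
        ++probes;
        if (x == key) break;   // found it
    }
    return probes;
}

// Number of elements examined by a binary search for key.
// Assumes a is sorted in increasing order.
std::size_t binary_probes(const std::vector<int>& a, int key) {
    std::size_t probes = 0;
    std::size_t low = 0, high = a.size();          // half-open range [low, high)
    while (low < high) {
        std::size_t mid = low + (high - low) / 2;  // avoids overflow of low + high
        ++probes;
        if (a[mid] == key) break;                  // found it
        if (key < a[mid]) high = mid;
        else low = mid + 1;
    }
    return probes;
}
```

For a sorted array of 1024 elements and a key that is not present, the linear search examines all 1024 elements while the binary search examines at most 11, and doubling the array adds only one more probe to the binary search.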

Properties of Logs
log AB = log A + log B
Proof:
    A = 2^(log_2 A), B = 2^(log_2 B)
    AB = 2^(log_2 A) · 2^(log_2 B) = 2^(log_2 A + log_2 B)
    ∴ log AB = log A + log B
Similarly:
    log(A/B) = log A - log B
    log(A^B) = B log A
Any log is equivalent to log-base-2 (up to a constant factor)

Order Notation: Intuition
    f(n) = n^3 + 2n^2
    g(n) = 100n^2 + 1000
Although not yet apparent, as n gets "sufficiently large", f(n) will be "greater than or equal to" g(n)

Definition of Order Notation
- Upper bound: T(n) = O(f(n)) (Big-O)
  There exist positive constants c and n_0 such that T(n) ≤ c f(n) for all n ≥ n_0
- Lower bound: T(n) = Ω(g(n)) (Omega)
  There exist positive constants c and n_0 such that T(n) ≥ c g(n) for all n ≥ n_0
- Tight bound: T(n) = θ(f(n)) (Theta)
  When both hold: T(n) = O(f(n)) and T(n) = Ω(f(n))

Definition of Order Notation
O( f(n) ): a set or class of functions
g(n) ∈ O( f(n) ) iff there exist positive constants c and n_0 such that:
    g(n) ≤ c f(n) for all n ≥ n_0
Example:
    100n^2 + 1000 ≤ 5 (n^3 + 2n^2) for all n ≥ 19
So g(n) ∈ O( f(n) )

Order Notation: Example
    100n^2 + 1000 ≤ 5 (n^3 + 2n^2) for all n ≥ 19
So g(n) ∈ O( f(n) )

Some Notes on Notation
- Sometimes you'll see g(n) = O( f(n) )
- This is equivalent to g(n) ∈ O( f(n) )
- What about the reverse? O( f(n) ) = g(n)
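The example constants c = 5 and n_0 = 19 can be checked numerically. The sketch below is a finite check over a range, not a proof: it scans n up to a limit and reports the smallest n_0 from which g(n) ≤ c f(n) holds throughout the scanned range. The function name `smallest_n0` and its `limit` parameter are hypothetical.

```cpp
// f(n) = n^3 + 2n^2 and g(n) = 100n^2 + 1000, as in the example above.
long long f(long long n) { return n * n * n + 2 * n * n; }
long long g(long long n) { return 100 * n * n + 1000; }

// Smallest n0 >= 1 such that g(n) <= c * f(n) for every n in [n0, limit].
// Returns limit + 1 if the bound is still violated at n = limit.
long long smallest_n0(long long c, long long limit) {
    long long n0 = 1;
    for (long long n = 1; n <= limit; ++n) {
        if (g(n) > c * f(n)) n0 = n + 1;   // bound violated at n
    }
    return n0;
}
```

With c = 5 and a scan up to 1000 this returns 19, matching the slide: at n = 18 the left side is 33400 and the right side 32400, while at n = 19 they are 37100 and 37905. A larger c gives a smaller n_0, which is why the definition only asks that some pair (c, n_0) exists.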

Big-O: Common Names
- constant: O(1)
- logarithmic: O(log n) (log_k n, log n^2 ∈ O(log n))
- linear: O(n)
- log-linear: O(n log n)
- quadratic: O(n^2)
- cubic: O(n^3)
- polynomial: O(n^k) (k is a constant)
- exponential: O(c^n) (c is a constant > 1)

Meet the Family
- O( f(n) ) is the set of all functions asymptotically less than or equal to f(n)
  - o( f(n) ) is the set of all functions asymptotically strictly less than f(n)
- Ω( f(n) ) is the set of all functions asymptotically greater than or equal to f(n)
  - ω( f(n) ) is the set of all functions asymptotically strictly greater than f(n)
- θ( f(n) ) is the set of all functions asymptotically equal to f(n)

Meet the Family, Formally
- g(n) ∈ O( f(n) ) iff there exist positive constants c and n_0 such that g(n) ≤ c f(n) for all n ≥ n_0
  - g(n) ∈ o( f(n) ) iff for every constant c > 0 there exists an n_0 such that g(n) < c f(n) for all n ≥ n_0
    Equivalent to: lim n→∞ g(n)/f(n) = 0
- g(n) ∈ Ω( f(n) ) iff there exist positive constants c and n_0 such that g(n) ≥ c f(n) for all n ≥ n_0
  - g(n) ∈ ω( f(n) ) iff for every constant c > 0 there exists an n_0 such that g(n) > c f(n) for all n ≥ n_0
    Equivalent to: lim n→∞ g(n)/f(n) = ∞
- g(n) ∈ θ( f(n) ) iff g(n) ∈ O( f(n) ) and g(n) ∈ Ω( f(n) )

Big-Omega et al. Intuitively
    Asymptotic Notation    Mathematics Relation
    O                      ≤
    Ω                      ≥
    θ                      =
    o                      <
    ω                      >

Pros and Cons of Asymptotic Analysis

Perspective: Kinds of Analysis
- Running time may depend on actual data input, not just length of input
- Distinguish:
  - Worst Case: Your worst enemy is choosing input
  - Best Case
  - Average Case: Assumes some probabilistic distribution of inputs
  - Amortized: Average time over many operations
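The limit form of the definitions can be tried on the pair of functions used earlier in these slides, f(n) = n^3 + 2n^2 and g(n) = 100n^2 + 1000. The derivation below is a worked illustration added for clarity:

```latex
\[
\lim_{n \to \infty} \frac{g(n)}{f(n)}
  = \lim_{n \to \infty} \frac{100n^{2} + 1000}{n^{3} + 2n^{2}}
  = \lim_{n \to \infty} \frac{100/n + 1000/n^{3}}{1 + 2/n}
  = \frac{0}{1} = 0
\]
```

Since the limit is 0, g(n) ∈ o( f(n) ), which also gives g(n) ∈ O( f(n) ). It follows that g(n) ∉ Ω( f(n) ) and therefore g(n) ∉ θ( f(n) ): g grows strictly more slowly than f.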

Types of Analysis
Two orthogonal axes:
- Bound Flavor
  - Upper bound (O, o)
  - Lower bound (Ω, ω)
  - Asymptotically tight (θ)
- Analysis Case
  - Worst Case (Adversary)
  - Average Case
  - Best Case
  - Amortized

16n^3 log_8(10n^2) + 100n^2 = O(n^3 log n)
Eliminate low-order terms; eliminate constant coefficients:
    16n^3 log_8(10n^2) + 100n^2
    ⇒ 16n^3 log_8(10n^2)
    ⇒ n^3 log_8(10n^2)
    = n^3 (log_8(10) + log_8(n^2))
    = n^3 log_8(10) + n^3 log_8(n^2)
    ⇒ n^3 log_8(n^2)
    = 2n^3 log_8(n)
    ⇒ n^3 log_8(n)
    = n^3 log_8(2) log(n)
    = n^3 log(n)/3
    ⇒ n^3 log(n)