Efficiency of algorithms. Binary search and linear search. Best, worst and average case.



Overview

- Efficiency of algorithms
- Computational resources: time and space
- Best, worst and average case performance
- How to compare algorithms: a machine-independent measure of efficiency
- Growth rate
- Complexity measure O( )

Algorithms

An algorithm is a well-defined sequence of steps which leads to solving a certain problem. Steps should be:
- concrete
- unambiguous
- finitely many

Efficiency of algorithms

- How much time does it need?
- How much memory (space) does it use?

Binary search and linear search

One seems faster than the other. Can we characterise the difference more precisely?

Best, worst and average case

Linear search:
- Best performance: the item we search for is in the first position; examines one position.
- Worst performance: the item is not in the array, or is in the last position; examines all positions.
- Average performance (given that the item is in the array): examines half of the array.

Binary search:
- Best case: the item is in the middle; one check.
- Worst case: the item is found at the last possible division; the maximal number of times an array of length N can be divided is log2 N.
- Average case: the item is found after performing half of the possible number of divisions; ½ log2 N.
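The worst-case counts above can be illustrated with a short sketch (not from the lecture; the class and method names are invented for illustration). An unsuccessful search is the worst case for both algorithms: linear search examines all N positions, while binary search halves the range on each probe.

```java
// Illustrative sketch: count how many positions each search examines
// in its worst case, an unsuccessful search of a sorted array.
public class ProbeCount {
    // Linear search: examines every position when the item is absent.
    static int linearProbes(int[] arr, int value) {
        int probes = 0;
        for (int i = 0; i < arr.length; i++) {
            probes++;
            if (arr[i] == value) break;
        }
        return probes;
    }

    // Binary search: halves the remaining range on each probe.
    static int binaryProbes(int[] arr, int value) {
        int left = 0, right = arr.length - 1, probes = 0;
        while (right >= left) {
            int middle = (left + right) / 2;
            probes++;
            if (value == arr[middle]) break;
            if (value < arr[middle]) right = middle - 1;
            else left = middle + 1;
        }
        return probes;
    }

    public static void main(String[] args) {
        int n = 1024;
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) arr[i] = i;    // sorted, as binary search requires
        System.out.println(linearProbes(arr, -1)); // prints 1024: all N positions
        System.out.println(binaryProbes(arr, -1)); // prints 10: log2(1024) probes
    }
}
```

For N = 1024 the gap is already 1024 probes versus 10, which is exactly the linear-versus-logarithmic difference the lecture characterises below.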

Which is more useful?

- For real-time programming: the worst case.
- For getting a general idea of running time: the average case; however, it is often difficult to establish.
- For choosing between several available algorithms: it helps to know the best case (maybe your data are in the best-case format, for example random).

How to compare

Suppose we settle on comparing the worst-case performance of linear and binary search. Where do we start? Timing...

Machine independence

The evaluation of efficiency should be as machine-independent as possible. For the time complexity of an algorithm:
- we count the number of basic operations the algorithm performs;
- we calculate how this number depends on the size of the input.

Space complexity: how much extra space is needed, in terms of the space used to represent the input.

"Basic operations"? "Size of input"? Some clarifications follow.

Size of input

It is up to us to decide what is a useful parameter to vary when we measure the efficiency of an algorithm. For algorithms searching a linear collection, the natural choice for size of input is the number of items in the collection we are searching (e.g. the length of the array).

For graph search, we may be interested in how time grows when the number of nodes increases, or in how time grows when the number of edges increases. Sometimes we measure efficiency as a function of several parameters: e.g. the number of nodes and the number of edges in a graph.

Basic operations

Basic operations are operations which take constant time (at most time C for some constant C). In other words, the time required to perform the operation does not grow with the size of the operands, or is bounded by a constant.

If we are not sure how operations are implemented, we have to exercise judgement: can something in principle be implemented as a constant-time operation? For example, adding two 32-bit integers can be, but adding a list of 32-bit integers cannot: it is going to take longer for a longer list.

Example: linear search

    boolean linearsearch(int[] arr, int value) {
        for (int i = 0; i < arr.length; i++)
            if (arr[i] == value) return true;
        return false;
    }

Basic operations: comparing two integers; incrementing i. Constant time (at most some C) is spent at each iteration. Size of input: the length of the array, N. Time usage in the worst case: t(N) = C * N.

Binary search

    boolean binarysearch(int[] arr, int value) {
        int left = 0;
        int right = arr.length - 1;
        int middle;
        while (right >= left) {
            middle = (left + right) / 2;
            if (value == arr[middle]) return true;
            if (value < arr[middle]) right = middle - 1;
            else left = middle + 1;
        }
        return false;
    }

Analysis of binary search

Size of input = size of the array, say N. Basic operations: assignments and comparisons. Total number of steps: 3 assignments, plus a block of assignment, check and assignment repeated log2 N times. Assume the 3 initial assignments take at most time C1 and at each iteration we spend at most time C2. Total time = C1 + C2 * log2 N.

Rate of growth

We don't know how long the steps actually take; we only know each takes some constant time. We can lump all the constants together and forget about them. What we are left with is the fact that the time of sequential search grows linearly with the input, while that of binary search grows logarithmically, which is much slower growth.
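The difference in growth rate can be seen by evaluating the two cost formulas above as N doubles (an illustrative sketch; the constants C, C1, C2 are arbitrary values chosen for the example, not measured):

```java
// Illustrative sketch: evaluate t(N) = C * N and C1 + C2 * log2 N as N doubles.
// The constants are arbitrary; only the growth pattern matters.
public class Growth {
    static final double C = 1, C1 = 3, C2 = 1; // invented constants for illustration

    static double linearCost(int n) { return C * n; }                        // t(N) = C * N
    static double binaryCost(int n) { return C1 + C2 * Math.log(n) / Math.log(2); } // C1 + C2 * log2 N

    public static void main(String[] args) {
        for (int n = 1024; n <= 4096; n *= 2) {
            System.out.println(n + ": linear=" + linearCost(n) + ", binary=" + binaryCost(n));
        }
        // Doubling N doubles the linear cost, but adds only C2 to the logarithmic one.
    }
}
```

This is the "lump the constants together" idea in action: whatever C, C1 and C2 are, doubling N doubles the linear cost and adds a single constant to the logarithmic cost.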

O() complexity measure

Big O notation gives an asymptotic upper bound on the actual function which describes the time/memory usage of the algorithm. The complexity of an algorithm is O(f(N)) if there exist a constant factor K and an input size N0 such that the actual usage of time/memory by the algorithm on inputs larger than N0 is always less than K * f(N).

Upper bound example

t(N) = 3 + N is in O(N), because for all N > 3 we have 2N > 3 + N. Here f(N) = N, N0 = 3 and K = 2.

In other words

If an algorithm actually makes g(N) steps, for example g(N) = C1 + C2 * log2 N, and there is an input size N' and a constant K such that for all N > N', g(N) <= K * f(N), then the algorithm is in O(f(N)).

Binary search is O(log N): C1 + C2 * log2 N <= (C1 + C2) * log2 N for N > 2.

Comments

Obviously lots of functions form an upper bound; we try to find the closest one. We also want it to be a simple function, such as:
- constant O(1)
- logarithmic O(log N)
- linear O(N)
- quadratic, cubic, exponential...

Typical complexity classes

Algorithms which have the same O( ) complexity belong to the same complexity class. Common complexity classes:
- O(1), constant time: independent of the input length.
- O(log N), logarithmic: usually results from splitting the task into smaller tasks, where the size of the task is reduced by a constant fraction.
- O(N), linear: usually results when a given constant amount of processing is carried out on each element of the input.
- O(N log N): splitting into subtasks and combining the results later.
- O(N^2), quadratic: usually arises when all pairs of input elements need to be processed.
- O(2^N), exponential: usually emerges from a brute-force solution to a problem.
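The witness pair in the upper-bound example above, K = 2 and N0 = 3, can be checked mechanically over a large range of inputs (an illustrative sketch; the class name is invented):

```java
// Illustrative sketch: check that t(N) = 3 + N is bounded by K * f(N) = 2N
// for all N above the threshold N0 = 3, as the definition of O(N) requires.
public class BigOWitness {
    // Does 3 + N <= 2 * N hold for this N?
    static boolean boundHolds(long n) {
        return 3 + n <= 2 * n;
    }

    public static void main(String[] args) {
        for (long n = 4; n <= 1_000_000; n++) {
            if (!boundHolds(n)) {
                System.out.println("bound fails at N = " + n);
                return;
            }
        }
        System.out.println("3 + N <= 2N holds for all 3 < N <= 1,000,000");
    }
}
```

Note that the bound is allowed to fail below N0 (it does fail at N = 2); the definition only requires it for inputs larger than N0.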

Practical hints

- Find the actual function which shows how the time/memory usage grows depending on the input size N.
- Omit all constant factors.
- If the function contains different powers of N (e.g. N^4 + N^3 + N^2), leave only the highest power (N^4). Similarly, an exponential (2^N) eventually outgrows any polynomial in N.

Warning about O-notation

O-notation only gives sensible comparisons of algorithms when N is large. Consider two algorithms for the same task:
- Linear: g(N) = 1000 * N is in O(N)
- Quadratic: g'(N) = N^2 / 1000 is in O(N^2)

The quadratic one is faster for N < 1,000,000. Some constant factors are machine-dependent, but others are a property of the algorithm itself.

Summary

- Big O notation is a rough measure of how the time/memory usage grows as the input size increases.
- Big O notation gives a machine-independent measure of efficiency which allows comparison of algorithms.
- It makes more sense for large input sizes.
- It disregards all constant factors, even those intrinsic to the algorithm.

Recommended reading

Shaffer, Chapter 3 (note that we are not going to use Ω and Θ notation in this course, only the upper bound O()).

Informal coursework

Which statements below are true?
- If an algorithm has time complexity O(N^2), it always makes precisely N^2 steps, where N is the size of the input.
- An algorithm with time complexity O(N) always runs slower than an algorithm with time complexity O(log2 N), for any input.
- An algorithm which makes C1 * log2 N steps and an algorithm which makes C2 * log4 N steps belong to the same complexity class (C1 and C2 are constants).
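The crossover point in the warning above can be checked numerically (an illustrative sketch; the class and method names are invented):

```java
// Illustrative sketch: the O(N^2) algorithm with a small constant beats the
// O(N) algorithm with a large constant until N reaches 1,000,000.
public class Crossover {
    static double linearCost(double n)    { return 1000 * n; }     // g(N)  = 1000 * N
    static double quadraticCost(double n) { return n * n / 1000; } // g'(N) = N^2 / 1000

    public static void main(String[] args) {
        System.out.println(quadraticCost(1000) < linearCost(1000));             // quadratic wins
        System.out.println(quadraticCost(999_999) < linearCost(999_999));       // quadratic still wins
        System.out.println(quadraticCost(1_000_000) == linearCost(1_000_000));  // equal at the crossover
        System.out.println(quadraticCost(2_000_000) > linearCost(2_000_000));   // linear wins from here on
    }
}
```

This is why the summary stresses that Big O comparisons only make sense for large input sizes: below the crossover, the asymptotically worse algorithm is the faster one.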