Sorting algorithms. CS 127 Fall 2003


CS 127 Fall 2003
Sorting algorithms
November 3 and 7, 2003

A brief diversion: What's on the midterm
  Same format as last time
    5 questions, 20 points each, 100 points total
    15% of your grade
  2 topics:
    trees, trees, and more trees
    graphs

Binary trees: What to know
  Characteristics
  Searching and traversing
  Inserting and deleting
  Balancing a tree (definition of "balanced")
    AVL trees
    Self-adjusting trees/splaying
  Implementations
  Complexity

Heaps: what to know
  What is a heap
  Relationship between heap and array
  Making heap from array and array from heap
  Insertion and deletion (reorganizing a heap)
  Applications: what heaps are used for (and how), e.g., priority queue

B-trees and tries: what to know
  Characteristics
  In general, how to insert/delete from a B-tree
  Applications

Graphs: what to know
  Characteristics (digraphs and simple graphs) and definitions
  Spanning tree: how to find, what it is
  Shortest path (Dijkstra and Bellman-Ford)
  Max flow (Ford-Fulkerson method)

Sorting algorithms
  Insertion sort, Selection sort, Bubble sort, Shell sort, Heap sort, Quick sort, Merge sort, Radix sort

Complexity
  Number of comparisons
  Number of swaps
  Storage requirements

Decision tree
  A generic way to express any sorting algorithm
    gives a lower bound on the number of comparisons
  Binary tree
    Non-leaf nodes are possible comparisons
    Arcs are Y/N (results of comparisons)
    Number of leaves is >= n!
  Number of comparisons (worst and average cases) is at least lg(n!), which grows as n lg n
    O(n lg n) is the best we can do for sorting an array, comparison-wise

Simple (inefficient) sorting algorithms
  Insertion sort, Selection sort, Bubble sort, Shell sort, Heap sort, Quick sort, Merge sort, Radix sort

Insertion sort
  Compare first two elements, put in proper order
  Compare first three elements, put in proper order
  ...

Insertion sort: implementation
  public void insertionSort(Comparable[] data) {
      Comparable temp;
      for (int i = 1; i < data.length; i++) {
          temp = data[i];               // store the element in question
          int j;                        // declared here because it is used after the inner loop
          for (j = i; j > 0 && temp.compareTo(data[j-1]) < 0; j--) {
              data[j] = data[j-1];      // move data elements up
          }
          data[j] = temp;               // move temp into place
      }
  }

Insertion sort: complexity
  Best case: data already in order
    n-1 comparisons, 2(n-1) moves
  Worst case: data in reverse order
    comparisons: i comparisons in iteration i, n-1 iterations, ... n(n-1)/2 = O(n^2)
    moves: temp is loaded/unloaded 2(n-1) times, i moves in iteration i, ... (n^2+3n-4)/2 = O(n^2)
  Average case: data in random order

Insertion sort: average complexity
  In iteration i, there are up to i comparisons
    average: (i+1)/2 (if every position is equally likely)
    average # comparisons = sum of the averages = (n^2+n-2)/4 = O(n^2)
  In iteration i, there are up to i-1 movements
    average: (i-1)/2
    plus 2 movements (into and out of temp)
    average # movements = sum of the averages = (n^2+5n-6)/4 = O(n^2)

Selection sort
  Start at the first element in the array
  Find the minimum element in the array; save its index
  Swap the first element with the minimum element in the array
  Repeat for the remaining elements in the array
  In each pass, at least one element will be moved to its proper position in the array

Selection sort: implementation
  public void selectionSort(Comparable[] data) {
      int min;
      for (int i = 0; i < data.length - 1; i++) {
          min = i;                      // index of the smallest element seen so far
          for (int j = i + 1; j < data.length; j++) {
              if (data[j].compareTo(data[min]) < 0)
                  min = j;
          }
          swap(data, min, i);           // put the minimum into position i
      }
  }

Implementation of swap
  private void swap(Comparable[] data, int pos1, int pos2) {
      Comparable temp = data[pos1];
      data[pos1] = data[pos2];
      data[pos2] = temp;
  }

Complexity of selection sort: comparisons
  Same for all three cases
    Outer loop executes n-1 times
    Inner loop executes (n-1) - i times on the i-th iteration
    n(n-1)/2 = O(n^2)

Complexity of selection sort: movements
  Best case: data sorted
    0 swaps
  Worst case: largest element in first position, rest of data ordered
    swap largest with next element n-1 times
    each swap involves three steps
    O(n)

Bubble sort
  Idea: make more data swaps each pass through the array
  During each pass, compare pairs of array elements, and swap if they are out of order
  In each pass, at least one element will be moved to its proper position in the array

Bubble sort: implementation
  public void bubbleSort(Comparable[] data) {
      for (int i = 0; i < data.length - 1; i++) {
          for (int j = 1; j < data.length - i; j++) {
              if (data[j].compareTo(data[j-1]) < 0)
                  swap(data, j, j-1);   // adjacent pair out of order
          }
      }
  }

Complexity of bubble sort: comparisons
  Same for all three cases
    Outer loop executes n-1 times
    Inner loop executes (n-1) - i times on the i-th iteration
    n(n-1)/2 = O(n^2)

Complexity of bubble sort: movements
  Best case: data sorted
    0 swaps
  Worst case: data in reverse order
    3*(number of comparisons) = O(n^2)
  Average case
    up to (n-1-i) swaps per subarray: average is (n-1-i)/2
    n-1 iterations
    ... 3n(n-1)/4 = O(n^2)

Efficient sorting algorithms
  Insertion sort, Selection sort, Bubble sort, Shell sort, Heap sort, Quick sort, Merge sort, Radix sort
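The comparison counts quoted for the simple sorts can be checked empirically. The following is a minimal sketch, assuming int keys instead of Comparable objects; the class and method names (SimpleSortCounts, insertionSortComparisons, selectionSortComparisons, ascending, descending) are hypothetical and do not come from the slides.

```java
class SimpleSortCounts {
    static long comparisons;   // number of key comparisons in the current run

    // Every key comparison goes through this helper so it can be counted.
    static boolean less(int a, int b) {
        comparisons++;
        return a < b;
    }

    // Insertion sort as on the slides; returns the number of comparisons made.
    static long insertionSortComparisons(int[] data) {
        comparisons = 0;
        for (int i = 1; i < data.length; i++) {
            int temp = data[i];
            int j;
            for (j = i; j > 0 && less(temp, data[j - 1]); j--) {
                data[j] = data[j - 1];
            }
            data[j] = temp;
        }
        return comparisons;
    }

    // Selection sort as on the slides; returns the number of comparisons made.
    static long selectionSortComparisons(int[] data) {
        comparisons = 0;
        for (int i = 0; i < data.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < data.length; j++) {
                if (less(data[j], data[min])) min = j;
            }
            int temp = data[min];
            data[min] = data[i];
            data[i] = temp;
        }
        return comparisons;
    }

    static int[] ascending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        return a;
    }

    static int[] descending(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = n - i;
        return a;
    }

    public static void main(String[] args) {
        int n = 8;
        System.out.println(insertionSortComparisons(ascending(n)));   // n-1 = 7
        System.out.println(insertionSortComparisons(descending(n)));  // n(n-1)/2 = 28
        System.out.println(selectionSortComparisons(ascending(n)));   // n(n-1)/2 = 28
    }
}
```

With n = 8, insertion sort needs n-1 comparisons on sorted input and n(n-1)/2 on reversed input, while selection sort needs n(n-1)/2 regardless of the input order.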

Shell sort
  Idea: keep sorting subarrays until the entire array is sorted
    split the array into n1 subarrays, sort each
    split the array into n2 < n1 subarrays, sort each
    repeat until the increment reaches 1 (the whole array is one subarray); the array is then sorted
  Any sorting method can be used for the subarrays
    typically insertion sort
    some implementations use a different sort for the last pass

Choosing subarrays
  Take every h-th element from the array
  Decrease h each time so that you get different subarrays each time
  Difficult problem: picking the h's!
    original algorithm: use powers of 2
    experimental results: select h values so that
      h_1 = 1
      h_(i+1) = 3 h_i + 1
      stop with h_i for which h_(i+2) >= n

Complexity
  Depends on how the increments are chosen
  Better than O(n^2) but worse than O(n lg n)
  Some experimental measurements: O(n^1.25), O(n^(5/3)), O(n lg^2 n)

Heap sort
  Idea:
    start with a heap and an empty array
    remove the largest element and place it at the end of the array
    restructure the heap
    repeat until all elements have been removed from the heap and the array is full (and in order)
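The Shell sort and heap sort ideas above can be sketched as follows. This is an illustrative sketch, not the course's implementation: it assumes int keys, picks the largest increment of the form 3h + 1 that is below n (a simplification of the stopping rule on the slide), and does heap sort in place by keeping the heap in the front of the same array rather than in a separate one. The names ShellAndHeapSort, shellSort, heapSort and siftDown are hypothetical.

```java
class ShellAndHeapSort {
    // Shell sort using the increments 1, 4, 13, 40, ... (h = 3h + 1).
    static void shellSort(int[] data) {
        int n = data.length;
        int h = 1;
        while (3 * h + 1 < n) h = 3 * h + 1;      // largest increment below n
        for (; h >= 1; h /= 3) {                  // 13 -> 4 -> 1 -> stop
            // Insertion sort on every subarray of elements h apart.
            for (int i = h; i < n; i++) {
                int temp = data[i];
                int j;
                for (j = i; j >= h && temp < data[j - h]; j -= h) {
                    data[j] = data[j - h];
                }
                data[j] = temp;
            }
        }
    }

    // Heap sort: build a max-heap, then repeatedly move the largest
    // element to the end of the array and restore the heap.
    static void heapSort(int[] data) {
        int n = data.length;
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(data, i, n);
        for (int end = n - 1; end > 0; end--) {
            swap(data, 0, end);       // largest element goes to its final place
            siftDown(data, 0, end);   // heap now occupies data[0..end-1]
        }
    }

    // Move data[root] down until both children are no larger than it.
    static void siftDown(int[] data, int root, int size) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= size) return;
            if (child + 1 < size && data[child + 1] > data[child]) child++;
            if (data[root] >= data[child]) return;
            swap(data, root, child);
            root = child;
        }
    }

    static void swap(int[] data, int a, int b) {
        int temp = data[a];
        data[a] = data[b];
        data[b] = temp;
    }
}
```

The array-based heap relies on the usual index arithmetic: the children of the node at index i sit at indices 2i + 1 and 2i + 2.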

Complexity of heap sort
  Creating the heap: O(n)
  Sorting the heap: worst case
    Swap root with element to be removed: n-1 times
    Restore heap: occurs n-1 times, each iteration takes lg i steps --> O(n lg n)
  Total complexity = heap + swap + restore = O(n lg n)
  Sorting the heap: best case
    all elements are identical: O(n)
    distinct elements already in order: O(n lg n) - O(n), which is still O(n lg n)

Quicksort
  Very fast, recursive sorting algorithm
  Idea: divide array into 2 subarrays, using a pivot
    first half contains elements less than pivot
    second half contains elements greater than pivot
    repeat this division/partitioning until size of subarrays = 1
    by preparing to sort, you have sorted the array

Key concept: selecting the pivot
  Need to pick a pivot that will approximately split our current array
  Common strategies:
    pick the first element
    pick the middle element
    pick a random element
    preprocess the array

Complexity of quicksort
  Worst case: select pivot that's the largest or smallest element in the subarray
    O(n^2)
  Best case: select pivot so that subarrays are roughly equal
    O(n lg n)
  Average case is closer to the best case than the worst case --> O(n lg n)
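The partitioning idea above can be sketched as follows. This is one possible version, assuming int keys and the "pick the middle element" pivot strategy from the list; the slides do not say which partitioning scheme the course used, and the names QuickSortSketch and quickSort are hypothetical.

```java
class QuickSortSketch {
    static void quickSort(int[] data) {
        quickSort(data, 0, data.length - 1);
    }

    // Sorts data[lo..hi] (both bounds inclusive).
    static void quickSort(int[] data, int lo, int hi) {
        if (lo >= hi) return;                     // subarray of size 0 or 1
        int pivot = data[lo + (hi - lo) / 2];     // middle element as pivot
        int i = lo, j = hi;
        while (i <= j) {
            while (data[i] < pivot) i++;          // skip elements already on the left side
            while (data[j] > pivot) j--;          // skip elements already on the right side
            if (i <= j) {
                int temp = data[i];
                data[i] = data[j];
                data[j] = temp;
                i++;
                j--;
            }
        }
        // Everything in data[lo..j] is <= pivot, everything in data[i..hi] is >= pivot.
        quickSort(data, lo, j);
        quickSort(data, i, hi);
    }
}
```

No separate merge step is needed: once both partitions are sorted in place, the whole subarray is sorted, which is what "by preparing to sort, you have sorted the array" refers to.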

Merge sort
  Idea:
    divide array in half
    divide these arrays in half...
    repeat until subarrays are size 1
    start merging subarrays
    repeat until array is restored and sorted

Complexity of merge sort
  Worst case: last element of one half is less than only the last element of the other half (in any iteration)
  Swaps: O(n lg n)
    Number of swaps into and out of temp array = the length of the subarray (for each subarray)
    Number of swaps for subarray of size 1 = 0
  Comparisons: O(n lg n)
    Number of comparisons = the length of the subarray (for each subarray)
    Number of comparisons for subarray of size 1 = 0

Complexity of merge sort
  Best case: (sub)array is in order, or elements in one half precede all elements in the other half
    number of comparisons = (i_first + i_last)/2, for each iteration i
    number of swaps = number of swaps into and out of temp array = the length of the subarray (for each subarray)

Radix sort
  Idea: Sort by looking at one letter or digit at a time, then move to the next letter or digit, etc.
  Words: put elements into bins by first letter. Recombine. Put elements into bins by second letter...
    bins = letters of the alphabet
  Numbers: put elements into bins by least significant digit. Recombine. Put elements into bins by second least significant digit...
    bins = digits 0-9
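The merge sort idea described above (halve until the subarrays have size 1, then merge back through a temporary array) can be sketched as follows. This is a minimal top-down version assuming int keys; the names MergeSortSketch, mergeSort and merge are hypothetical and not taken from the slides.

```java
class MergeSortSketch {
    static void mergeSort(int[] data) {
        if (data.length < 2) return;
        mergeSort(data, new int[data.length], 0, data.length - 1);
    }

    // Sorts data[first..last] (inclusive), using temp as scratch space.
    static void mergeSort(int[] data, int[] temp, int first, int last) {
        if (first >= last) return;                // subarray of size 1: nothing to do
        int mid = first + (last - first) / 2;
        mergeSort(data, temp, first, mid);        // sort left half
        mergeSort(data, temp, mid + 1, last);     // sort right half
        merge(data, temp, first, mid, last);      // merge the two sorted halves
    }

    // Merges the sorted runs data[first..mid] and data[mid+1..last].
    static void merge(int[] data, int[] temp, int first, int mid, int last) {
        int i = first, j = mid + 1, k = first;
        while (i <= mid && j <= last) {
            // Take from the right run only when it is strictly smaller,
            // so equal keys keep their original order.
            temp[k++] = (data[j] < data[i]) ? data[j++] : data[i++];
        }
        while (i <= mid) temp[k++] = data[i++];   // leftovers from the left run
        while (j <= last) temp[k++] = data[j++];  // leftovers from the right run
        System.arraycopy(temp, first, data, first, last - first + 1);
    }
}
```

Each merge moves every element of the subarray into the temporary array and back out, which is where the "length of the subarray" swap count on the slide comes from.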

Notes on radix sort
  No sorting necessary!
    data gets sorted via the binning process
  Bins can be implemented as queues
    retain ordering with each pass

Complexity of radix sort
  No data comparison necessary
  2 operations per data element:
    divide by factor (ignore less significant digits)
    modulo division (ignore more significant digits)
    O(n)
  Number of swaps = 2n = O(n)
  Queue operations: O(n)

Another implementation
  What if we handle the elements on a bit level?
    only two bins per pass
    no division required! do bitwise-and instead (apply mask to element)
    less storage (only 2 queues)
  Drawback: more iterations of loop!
    bitwise radix sort turns out to be much slower

Sorting in Java
  Arrays: method sort in Arrays class
    algorithm is quicksort (for arrays of primitives; arrays of objects are sorted with a merge sort)
  Lists: method sort in Collections class
    algorithm is merge sort
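The decimal radix sort with queue bins described above can be sketched as follows. This is an illustrative version assuming non-negative int keys and ten bins (digits 0-9); the names RadixSortSketch and radixSort are hypothetical. Each pass uses the two operations from the complexity slide: a division by the current factor and a modulo by 10.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class RadixSortSketch {
    // Least-significant-digit radix sort for non-negative ints.
    static void radixSort(int[] data) {
        List<Queue<Integer>> bins = new ArrayList<>();
        for (int d = 0; d < 10; d++) bins.add(new ArrayDeque<>());

        int max = 0;
        for (int x : data) {
            if (x < 0) throw new IllegalArgumentException("non-negative keys only");
            max = Math.max(max, x);
        }

        // One pass per decimal digit of the largest key; factor is a long
        // so that it cannot overflow for keys near Integer.MAX_VALUE.
        for (long factor = 1; max / factor > 0; factor *= 10) {
            for (int x : data) {
                int digit = (int) ((x / factor) % 10);   // divide, then modulo
                bins.get(digit).add(x);                  // enqueue keeps arrival order
            }
            int k = 0;
            for (Queue<Integer> bin : bins) {            // recombine bins 0..9 in order
                while (!bin.isEmpty()) data[k++] = bin.remove();
            }
        }
    }
}
```

Because each bin is a first-in, first-out queue, elements with the same digit leave a pass in the order they entered it, which is what lets the ordering from earlier passes survive later ones.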

Summary of sorting algorithms

Algorithm        Comparisons                     Swaps        Other
Insertion sort   O(n^2)                          O(n^2)
Selection sort   O(n^2)                          O(n)
Bubble sort      O(n^2)                          O(n^2)
Shell sort       Between O(n^2) and O(n lg n)
Heap sort        O(n lg n)
Quicksort        O(n lg n)                       O(n lg n)    Can be O(n^2) in worst case
Merge sort       O(n lg n)                       O(n lg n)
Radix sort       none                            O(n)         O(n) divisions, O(n) queue operations