Algorithms and Data Structures



Similar documents
Sequential Data Structures

Analysis of a Search Algorithm

Common Data Structures

Universidad Carlos III de Madrid

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team

Introduction to Data Structures and Algorithms

Linked Lists, Stacks, Queues, Deques. It s time for a chainge!

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

Linked Lists: Implementation Sequences in the C++ STL

10CS35: Data Structures Using C

Programming with Data Structures

Abstract Data Type. EECS 281: Data Structures and Algorithms. The Foundation: Data Structures and Abstract Data Types

Node-Based Structures Linked Lists: Implementation

Sequences in the C++ STL

Data Structures Fibonacci Heaps, Amortized Analysis

1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.

Data Structures Using C++ 2E. Chapter 5 Linked Lists

Course: Programming II - Abstract Data Types. The ADT Stack. A stack. The ADT Stack and Recursion Slide Number 1

Queue Implementations

Queues Outline and Required Reading: Queues ( 4.2 except 4.2.4) COSC 2011, Fall 2003, Section A Instructor: N. Vlajic

Queues and Stacks. Atul Prakash Downey: Chapter 15 and 16

recursion, O(n), linked lists 6/14

1.00 Lecture 35. Data Structures: Introduction Stacks, Queues. Reading for next time: Big Java: Data Structures

STACKS,QUEUES, AND LINKED LISTS

Module 2 Stacks and Queues: Abstract Data Types

14.1 Rent-or-buy problem

PES Institute of Technology-BSC QUESTION BANK

Data Structures and Algorithms Stacks and Queues

Lecture Notes on Binary Search Trees

Stacks. Linear data structures

Data Structure [Question Bank]

Outline. Computer Science 331. Stack ADT. Definition of a Stack ADT. Stacks. Parenthesis Matching. Mike Jacobson

5. A full binary tree with n leaves contains [A] n nodes. [B] log n 2 nodes. [C] 2n 1 nodes. [D] n 2 nodes.

ECE 250 Data Structures and Algorithms MIDTERM EXAMINATION /5:15-6:45 REC-200, EVI-350, RCH-106, HH-139

MAX = 5 Current = 0 'This will declare an array with 5 elements. Inserting a Value onto the Stack (Push)

Algorithms and Data Structures

Last not not Last Last Next! Next! Line Line Forms Forms Here Here Last In, First Out Last In, First Out not Last Next! Call stack: Worst line ever!

Quiz 4 Solutions EECS 211: FUNDAMENTALS OF COMPUTER PROGRAMMING II. 1 Q u i z 4 S o l u t i o n s

22c:31 Algorithms. Ch3: Data Structures. Hantao Zhang Computer Science Department

Cpt S 223. School of EECS, WSU

Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R

What is a Stack? Stacks and Queues. Stack Abstract Data Type. Java Interface for Stack ADT. Array-based Implementation

Stacks. Data Structures and Data Types. Collections

DATA STRUCTURES USING C

CmpSci 187: Programming with Data Structures Spring 2015

Data Structures and Data Manipulation

Data Structures and Algorithm Analysis (CSC317) Intro/Review of Data Structures Focus on dynamic sets

Lecture Notes on Linear Search

This lecture. Abstract data types Stacks Queues. ADTs, Stacks, Queues Goodrich, Tamassia

Lecture 12 Doubly Linked Lists (with Recursion)

Stack Allocation. Run-Time Data Structures. Static Structures

Algorithms and Data S tructures Structures Stack, Queues, and Applications Applications Ulf Leser

Class Overview. CSE 326: Data Structures. Goals. Goals. Data Structures. Goals. Introduction

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

Introduction to Data Structures

Math 319 Problem Set #3 Solution 21 February 2002

Home Page. Data Structures. Title Page. Page 1 of 24. Go Back. Full Screen. Close. Quit

Lecture Notes on Binary Search Trees

Questions 1 through 25 are worth 2 points each. Choose one best answer for each.

DATA STRUCTURE - STACK

CHAPTER 4 ESSENTIAL DATA STRUCTRURES

APP INVENTOR. Test Review

7.1 Our Current Model

ADTs,, Arrays, Linked Lists

CompSci-61B, Data Structures Final Exam

Recursion. Definition: o A procedure or function that calls itself, directly or indirectly, is said to be recursive.

Symbol Tables. Introduction

Data Structures UNIT III. Model Question Answer

Chapter 8: Bags and Sets

Lecture 3: Finding integer solutions to systems of linear equations

2) Write in detail the issues in the design of code generator.

Singly linked lists (continued...)

Chapter 3: Restricted Structures Page 1

Linear ADTs. Restricted Lists. Stacks, Queues. ES 103: Data Structures and Algorithms 2012 Instructor Dr Atul Gupta

Section IV.1: Recursive Algorithms and Recursion Trees

CSE373: Data Structures and Algorithms Lecture 1: Introduction; ADTs; Stacks/Queues. Linda Shapiro Spring 2016

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) Total 92.

1 Review of Newton Polynomials

Intelligent Agents. Based on An Introduction to MultiAgent Systems and slides by Michael Wooldridge

Lecture 13 - Basic Number Theory.

5.1 Radical Notation and Rational Exponents

Near Optimal Solutions

D06 PROGRAMMING with JAVA

What is Linear Programming?

Short Notes on Dynamic Memory Allocation, Pointer and Data Structure

The Tower of Hanoi. Recursion Solution. Recursive Function. Time Complexity. Recursive Thinking. Why Recursion? n! = n* (n-1)!

Unordered Linked Lists

Chapter 13: Query Processing. Basic Steps in Query Processing

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

LINKED DATA STRUCTURES

Introduction to Stacks

SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89. by Joseph Collison

Linear Programming. March 14, 2014

Data Structure and Algorithm I Midterm Examination 120 points Time: 9:10am-12:10pm (180 minutes), Friday, November 12, 2010

In mathematics, it is often important to get a handle on the error term of an approximation. For instance, people will write

COMP 250 Fall 2012 lecture 2 binary representations Sept. 11, 2012

Week 1: Introduction to Online Learning

x64 Servers: Do you want 64 or 32 bit apps with that server?

Transcription:

Algorithms and Data Structures Lists and Arrays Marcin Sydow Web Mining Lab PJWSTK Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 1 / 35

Topics covered by this lecture: Linked Lists Singly Linked Lists Doubly Linked Lists The Concept of Abstract Data Structure Stack Queue Deque The Concept of Amortised Analysis of Complexity Potential function method total cost and accounting methods Examples on Stack with multipop Unbounded Arrays Comparison of Various Representations of Sequences Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 2 / 35

Introduction Sequences Sequences of elements are the most common data structures. Two kinds of basic operations on sequences: absolute access (place indentied by index), fast on arrays relative access (place indentied as a successor, predecessor of a given element.), slow on arrays The simplest implementation of sequences: arrays, support fast (constant time) absolute access, however relative access is very slow on arrays (time Θ(n)) (Assume a sequence has n elements, and assignment and value check are dominating operations) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 3 / 35

Introduction Operations on ends of sequence It is important to realise that in many practical applications, the operations on sequence concern only the ends of the sequence (e.g. removefirst, addlast, etc.). Any insert operation on array has pessimistic linear time (slow). Thus, some other than arrays data structures can be more ecient for implementing them. Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 4 / 35

Linked Lists Linked Lists Alternative implementation that supports fast relative access operations like: return/remove rst/last element insert/remove an element after/before given element insert a list after/before an element isempty, size, etc. Linked list consists of nodes that are linked. singly linked lists doubly linked lists cyclic lists (singly or doubly linked), etc. Nodes contain: one element of sequence link(s) to other node(s) (implemented as pointers). Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 5 / 35

Linked Lists Singly Linked Lists (we use C-style notation here to explicitly show pointers) Class SLNode<Type>{ Type element *SLNode<Type> next } Class SList<Type>{ *SLNode<Type> head //points to the first element, the only access to list } head-> (2)-> (3)-> (5)-> (8)-> null Last element points to null (if empty, head points to null) Example: printing the contents of a list: print(slist l){ node = l.head while(node not null) print node->element node = node->next } Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 6 / 35

Linked Lists Doubly Linked Lists Class DLNode<Type>{ Type element *DLNode<Type> next *DLNode<Type> prev } Class DLList<Type>{ *DLNode<Type> head //points to the first element, the only access to list } Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 7 / 35

Linked Lists Cyclic Lists and Cyclic Arrays In cyclic list variant the last node is linked to the rst one It can concern singly or doubly linked lists In doubly linked case the following invariant holds for each node: (next->prev) == (prev->next) == this In some cases cyclic arrays are also useful. (an array of size n, first and last are kept to point to the ends of the sequence and they move modulo n) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 8 / 35

Linked Lists Operations on lists Examples: isempty rst last insertafter (insertbefore) moveafter (movebefore) removeafter (removebefore) pushback ( pushfront ) popback ( popfront ) concat splice size ndnext Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 9 / 35

Linked Lists Implementation of operations on linked lists Most modier list operations can be implemented with a technical general operation splice. INPUT: a,b,t - pointers to nodes into list; a anb b are in the same list, t is not between a and b OUTPUT: cut out sublist (a,...,b) and insert it after t Example implementation of splice in doubly linked lists: (notice its constant time complexity, even if it concerns arbitrarily large subsequences!) Splice(a,b,t){ // cut out (a,...,b): a' = a->prev; b' = b->next; a'->next = b'; b'->prev = a' // insert (a,...,b) after t: t'= t->next; b->next = t'; a->prev = t; t->next = a; t'->prev = b } E.g. moveafter(a,b){ splice(a,a,b)} Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 10 / 35

Linked Lists Extensions Examples of additional cheap and useful attributes to the linked list structures: size (updated in constant time after each operation (except inter-list splice) last (useful in singly-linked lists, e.g. for fast pushback) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 11 / 35

Linked Lists Linked Lists vs Arrays Linked Lists (compared with Arrays): positive thing: fast relative operations, like insert after, etc. (most of them in constant time!) positive thing: unbounded size negative thing: additional memory for pointers Remarks: Pointer size can be small compared to the element size, though. Arrays have bounded size. Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 12 / 35

Abstract Data Structure Abstract Data Structures Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 13 / 35

Abstract Data Structure Abstract Data Structure A very general and important concept ADS is dened by operations which can be executed on it (or, in other words, by its interface). ADS is not dened by the implementation (however, implementation matters in terms of time and space complexity of the operations). Abstract Data Structure can be opposed to concrete data structure (as array or linked list) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 14 / 35

Abstract Data Structure Stack, Queue, Deque Stack (of elements of type T): push(t) T pop() (modier) T top() (non-modier) LIFO (last in - rst out) Applications of stack: undo, function calls, back button in web browser, parsing, etc. Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 15 / 35

Abstract Data Structure Stack, Queue, Deque Queue (of elements of type T): inject(t) T out() (modier) T front() (non-modier) FIFO (rst in - rst out) Applications of queue: music les on itune list, shared printer, network buer, etc. Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 16 / 35

Abstract Data Structure Stack, Queue, Deque Deque Double Ended Queue (pronounced like deck). (of elements of type T): T rst() T last() pushfront(t) pushback(t) popfront() popback() Can be viewed as a generalisation of both stack and queue Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 17 / 35

Abstract Data Structure Stack, Queue, Deque Examples Abstract Data Structure can be implemented in many ways (and using various concrete data structures), for example: ADS Stack Queue Deque fast implementation possible with: SList, Array (how?) SList, CArray (how?),(why not with Array?) DList, CArray (how?), (why not with SList?, why not with Array?) all the operations in constant time Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 18 / 35

Amortised Analysis Amortised Complexity Analysis Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 19 / 35

Amortised Analysis Amortised Complexity Analysis Data structures are usually used in algorithms. A typical usage of a data structure is a sequence of m operation calls s = (o 1, o 2,..., om) on it. Denote the cost of the operation oi by ti (for 1 i m). Usually, the total cost of the sequence of operations t = 1 i m t i is more important in analysis than the costs of separate operations. Sometimes, it may be dicult to exactly compute t (total cost), especially if some operations are cheap and some expensive (we do not know the sequence in advance) In this case, an approach of amortised analysis may be useful. Each operation o i is assigned an amortised cost a i so that: t = O( 1 i m a i) (i.e. t is upper bounded by the sum of amortised costs) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 20 / 35

Amortised Analysis Methods for Amortised Analysis The most general method for computing the amortised cost of a sequence of operations on a datastructure is the method of potential function (a non-negative function that has value dependent on the current state of the data structure under study). Some less general (and possibly simpler) methods can be derived from the potential method: total cost method (we compute the total cost of m operations) accounting method (objects in data structure are assigned credits to pay for further operations in the sequence) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 21 / 35

Amortised Analysis Potential Function Method After each operation o i assign a potential function Φ i to the state i of the data structure, so that Φ 0 == 0, and it is always non-negative. dene a i = t i + Φ i Φ i 1 thus we have: 1 i m a i = 1 i m (t i + Φ i Φ i 1 ) = 1 i m t i + (Φ m Φ 0 ) thus, t = 1 i m t i 1 i m a i Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 22 / 35

Amortised Analysis Example Consider an abstract data structure StackM that supports additional multipop(int k) operation. Assume it is implemented in a standard way with a bounded array (of suciently large size). push(t e), pop(): real cost is O(1) multipop(int k): real cost is O(k) Question: What is the pessimistic cost of sequence of any m operations on initially empty stack? Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 23 / 35

Amortised Analysis Example: potential method on stackm Dene the potential function in our example as the current size of the stack: Φ(stackM) = sizeof (stackm) thus amortisedcost(push) = 1 + 1 = 2, amortisedcost(pop) = 1 1 = 0, amortisedcost(multipop(k)) = k + ( k) = 0 Thus, m operations of push, pop or multipop on initially empty stack, have total amortised cost 2m = O(m), so that amortised cost of each operation is constant (O(m)/m). Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 24 / 35

Amortised Analysis Example, cont.: Total cost method The problem is that the cost of multipop(k) depends on the number of elements currently on the stack. The real cost of multipop(k) is min(k,n), which is the number of pop() operations executed. Each element can be popped only once, so that the total number of pop() operations (also those used inside multipop) cannot be higher than number of push() operations that is not higher than m. Thus all the operations have constant amortised time. Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 25 / 35

Amortised Analysis Example, cont.: Accounting Method We pay for some operations in advance. Amortised cost of operation is a i = t i + credit i Put a coin on each element pushed to the stack. (that is cost of push is: 1 (real cost) + 1 (credit)) Then, because the real cost of any pop() is 1, we always have enough money for paying any other sequence of operations Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 26 / 35

Unbounded Arrays Indexable growing sequences Consider an abstract data structure that supports: [.] (indexing) push(t element) (add an element to the end of sequence) And additionally does not have a limit on size. How to implement it with amortised constant time complexity of both operations? Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 27 / 35

Unbounded Arrays Dynamically Growing Arrays If full, allocate 2 times bigger and copy. Now consider a sequence of n push operations (indexing has constant cost) What is the pessimistic cost of push? Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 28 / 35

Unbounded Arrays Dynamically Growing Arrays If full, allocate 2 times bigger and copy. Now consider a sequence of n push operations (indexing has constant cost) What is the pessimistic cost of push? What is amortised cost of push? Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 28 / 35

Unbounded Arrays Dynamically Growing Arrays If full, allocate 2 times bigger and copy. Now consider a sequence of n push operations (indexing has constant cost) What is the pessimistic cost of push? What is amortised cost of push?(lets use the global cost method) t i == i if i 1 is a power of 2 (else t i == 1) n i=1 t i n + lg(n) j=0 2 j < n + 2n = 3n Thus, the total cost of n operations is bounded by 3n so that amoritsed cost is 3n/n = O(1) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 28 / 35

Unbounded Arrays Dynamically Growing Arrays If full, allocate 2 times bigger and copy. Now consider a sequence of n push operations (indexing has constant cost) What is the pessimistic cost of push? What is amortised cost of push?(lets use the global cost method) t i == i if i 1 is a power of 2 (else t i == 1) n i=1 t i n + lg(n) j=0 2 j < n + 2n = 3n Thus, the total cost of n operations is bounded by 3n so that amoritsed cost is 3n/n = O(1) Exercise: What happens if the array grows by constant number k of cells instead of becoming twice bigger? Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 28 / 35

Unbounded Arrays Example: Analysis of Growing Arrays, cont. We can also use accounting method. Each push pays 3 units to account: 1 for putting it, 1 for potential copying it in future, 1 for potential future copying of one of the previous half of elements already in the array. After each re-allocate, the credit is 0. We can also use the potential method: Φ i = 2n w (where n is the current number of elements and w is the current size) Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 29 / 35

Unbounded Arrays Dynamically Growing and Shrinking Arrays Now, assume we want to extend the interface: [.] (indexing) push(t element) (add an element to the end of sequence) popback() (take the last element in the sequence) And wish that if there is too much unused space in the array it automatically shrinks Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 30 / 35

Unbounded Arrays Unbounded Arrays An unbound array u containing currently n elements, is emulated with w-element (w n) static bounded array b with the following approach: rst n positions of b are used to keep the elements, last w n are not used if n reaches w, a larger (say α = 2 times larger) bounded array b is allocated and elements copied to b if n is to small (say β = 4 times smaller than w) a smaller (say α = 2 smaller) array b is reallocated and elements copied to b What is the (worst?,average?) time complexity of: index, pushback, popback in such implementation? Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 31 / 35

Unbounded Arrays Example of Amortised Analysis on UArrays pushback and popback on unbounded array with n elements have either O(1) (constant) or O(n) (linear) cost, depending on current size w of underlying bounded array b. Lemma. Any sequence of m operations on (initially empty) unbounded array (with α = 2 and β = 4) has O(m) total cost, i.e. the amortised cost of operations of unbounded array is constant (O(m)/m). Corollary. pushback and popback operations on unbounded array have amortised constant time complexity. Exercise : Prove the Lemma. Hint: dene the potential Φ(u) = max(3n w, w/2) and use the potential method Exercise: Show that if β = α = 2, it is possible to construct a sequence of m operations that have O(m 2 ) total cost. Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 32 / 35

Unbounded Arrays Comparison of complexity of sequence operations Operation SList DList UArray CArray meaning of ' [.] n n 1 1 size 1' 1' 1 1 without external splice rst 1 1 1 1 last 1 1 1 1 insert 1 1' n n only insertafter remove 1 1' n n only removeafter pushback 1 1 1' 1' amortised pushfront 1 1 n 1' amortised popback n 1 1' 1' amortised popfront 1 1 n 1' amortised concat 1 1 n n splice 1 1 n n ndnext n n n' n' cache-ecient (all the values are surrounded by O(), n is the number of elements in the sequence) source: K.Mehlhorn, P.Sanders Algorithms and Data Structures. The Basic Toolbox, Springer 2008 Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 33 / 35

Summary Summary Linked Lists Singly Linked Lists Doubly Linked Lists The Concept of Abstract Data Structure Stack Queue Deque The Concept of Amortised Complexity Potential function method total cost and accounting methods Examples on Stack with multipop Unbounded Arrays Comparison of Various Representations of Sequences Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 34 / 35

Summary Thank you for your attention Marcin Sydow (Web Mining Lab PJWSTK) Algorithms and Data Structures 35 / 35