1 Basic Data Structures Page 1 BFHTI: Softwareschule Schweiz Basic Data Structures Dr. CAS SD01
Outline
- Data Structures and Abstract Data Types
- Linear Data Structures
- Implementing Linear Data Structures
- Implementation with Arrays
- Implementation with Growable Arrays
- Implementation with Linked Lists
- Iterators
Section: Data Structures and Abstract Data Types
Data Structures
- A data structure is a way of storing complex data in a computer so that it can be used efficiently
- Carefully chosen data structures are crucial for building efficient algorithms
- Therefore, the quality and performance of large systems depend heavily on choosing a suitable data structure
- Different data structures are suited to different kinds of applications, and some are highly specialized for certain tasks
- Many basic data structures are included in the standard libraries of modern programming languages (e.g. the Java Collection API)
- The fundamental building blocks of most data structures are arrays, records, and references
Abstract Data Types
- An abstract data type (ADT) is an abstraction of a data structure
- An ADT specifies
  - Data stored
  - Operations on the data
  - Error conditions associated with the data
- The object-oriented programming paradigm supports the creation of complex ADTs
  - ADTs are specified as interfaces
  - ADTs are implemented as classes (which themselves implement the ADT interface)
  - Concrete data structures are wrapped into objects of corresponding classes
Properties of Well-Designed ADTs
- Universality: The same ADT can be used in different programs
- Encapsulation: The interface provides an impenetrable barrier
- Simplicity: The implementation details are entirely hidden
- Integrity: Internal data is protected against improper use or bugs
- Flexibility: The internal implementation can be changed without affecting the main application(s)
- Modularity: Important subproblems are solved independently
Section: Linear Data Structures
Linear Data Structures
- A linear data structure is a collection of linearly arranged elements with various ways to access its elements
  - Stack: Insert/remove elements at one end of the collection
  - Queue: Insert/remove elements at different ends of the collection
  - Vector: Access elements w.r.t. the rank within the collection
  - List: Access elements w.r.t. the position within the collection
  - Sequence: Access elements w.r.t. both ranks and positions
- The elements of a linear data structure all have the same rights (no priorities, no hierarchy)
The Stack ADT
- The stack ADT is a linear data structure that stores arbitrary elements according to the last-in-first-out (LIFO) scheme
- Thus insertions and deletions take place at the same end of the data structure
- Think of a spring-loaded plate dispenser
- Applications of stacks
  - Visited-page history in a web browser
  - Undo sequence in a text editor
  - Towers of Hanoi problem
  - Chain of method calls in the Java Virtual Machine (JVM)
  - Parsing of arithmetic expressions
The Queue ADT
- The queue ADT is a linear data structure that stores arbitrary elements according to the first-in-first-out (FIFO) scheme
- Thus insertions and deletions take place at opposite ends of the data structure
- Think of the queue at the airport security check
- Applications of queues
  - Waiting lists
  - Access to shared resources (e.g. a printer)
  - Multiprogramming
The Vector ADT
- The vector ADT extends the notion of an array by storing a sequence of arbitrary elements
- An element can be accessed, inserted, or removed by specifying its rank r ∈ {0, 1, 2, ...}, i.e. the number of elements preceding it
- The size (number of stored elements) of a vector changes when elements are inserted or deleted
- Proper vectors have no fixed maximal size
- The ranks of some elements may change when elements are inserted or deleted
- Think of a ranking in ski racing events
- An exception is thrown if an incorrect rank is specified (e.g. a negative rank)
The List ADT
- The list ADT models a linear sequence of positions
- Each position stores an arbitrary element
- List manipulations are always performed relative to some given positions
- Special positions are the first and the last position in the list
- To be as general as possible, we need a position ADT with two simple operations:
  - getelement(): returns the element stored at the position
  - setelement(e): sets the stored element to e
- The position ADT gives a unified view of diverse ways of storing data (cell of an array, node of a linked list, etc.)
- A list establishes a before/after relation between positions
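The "unified view" offered by the position ADT can be illustrated with a minimal Java sketch. The interface follows the two operations named above; the classes ArrayCell and ListNode are hypothetical names chosen for this illustration and are not part of the slides.

```java
// Position ADT with the two operations named on the slide.
interface Position {
    Object getelement();
    void setelement(Object e);
}

// Hypothetical example: a position backed by one cell of an array.
class ArrayCell implements Position {
    private final Object[] array;
    private final int index;

    ArrayCell(Object[] array, int index) {
        this.array = array;
        this.index = index;
    }

    public Object getelement() { return array[index]; }
    public void setelement(Object e) { array[index] = e; }
}

// Hypothetical example: a position backed by a node of a singly linked list.
class ListNode implements Position {
    private Object element;
    ListNode next; // reference to the next node (null at the end)

    ListNode(Object element) { this.element = element; }

    public Object getelement() { return element; }
    public void setelement(Object e) { element = e; }
}
```

Code that only needs to read or replace the stored element can work with a Position without knowing which of the two storage schemes is behind it.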
The Sequence ADT
- The sequence ADT is the union of the vector and the list ADTs
- Elements can be accessed by their rank and/or their position
- To transform ranks into positions and vice versa, two bridge operators are needed
  - atrank(r): returns the position at rank r
  - rankof(p): returns the rank of a position p
- The sequence ADT is thus a general-purpose data structure for storing linearly ordered collections of elements
- Stacks, queues, vectors, and lists are included as special cases
Operations for Linear Structures I
- General Operations
  - All: isempty(), size()
- Accessing Elements/Positions
  - Stack: top()
  - Queue: front()
  - Vector: elematrank(r)
  - List: first(), last(), before(p), after(p)
- Inserting Elements
  - Stack: push(e)
  - Queue: enqueue(e)
  - Vector: insertatrank(r,e)
Operations for Linear Structures II
- Inserting Elements (continued)
  - List: insertfirst(e), insertlast(e), insertbefore(p,e), insertafter(p,e)
- Removing Elements
  - Stack: pop()
  - Queue: dequeue()
  - Vector: removeatrank(r)
  - List: removeelement(p)
- Replacing/Swapping Elements
  - Vector: replaceatrank(r,e), swapatranks(r,q)
  - List: replaceelement(p,e), swapelements(p,q)
- Rank/position conversion
  - Sequence: atrank(r), rankof(p)
UML Diagram
- <<interface>> BasicSequence: size(), isempty()
- <<interface>> Position: getelement(), setelement(e)
- <<interface>> Stack: push(e), pop(), top()
- <<interface>> Queue: enqueue(e), dequeue(), front()
- <<interface>> Vector: elematrank(r), insertatrank(r,e), removeatrank(r), replaceatrank(r,e), swapatranks(r,q)
- <<interface>> List: first(), last(), before(p), after(p), isfirst(p), islast(p), insertfirst(e), insertlast(e), insertbefore(p,e), insertafter(p,e), removeelement(p), replaceelement(p,e), swapelements(p,q)
- <<interface>> Sequence: atrank(r), rankof(p)
BasicSequence Interface in Java

    public interface BasicSequence {
        public int size();
        public boolean isempty();
    }

- For more information on Java interfaces, see 6.10 in Java ist auch eine Insel
Stack Interface in Java

    public interface Stack extends BasicSequence {
        public Object top() throws EmptyStackException;
        public void push(Object e);
        public Object pop() throws EmptyStackException;
    }

- Stack inherits general methods from BasicSequence
- Requires the definition of a class EmptyStackException
- Generic stacks of a particular type can be defined using a technique called Java Generics (see 6.12 in Java ist auch eine Insel)
Sequence Interface in Java

    public interface Sequence extends Vector, List {
        public Position atrank(int r);  // position at rank r
        public int rankof(Position p);  // rank of position p
    }

- Example of multiple inheritance of interfaces (not allowed for classes)
The Collection Interface
- The java.util package provides some predefined interfaces and classes
- [Figure: class diagram showing the interfaces Iterable, Collection, Queue, and List and the classes AbstractCollection, AbstractList, AbstractSequentialList, Vector, ArrayList, LinkedList, Stack, AttributeList, RoleList, and RoleUnsolvedList]
- See 12 in Java ist auch eine Insel
Section: Implementing Linear Data Structures
Implementing Linear Data Structures
- Linear data structures can be implemented in multiple ways:
  - Arrays
  - Records and references (singly/doubly-linked lists)
  - Combinations of arrays and linked lists
- The choice of the implementation determines the running times and space requirements of the basic operations
- Choosing the right implementation
  - depends on the intended application
  - is a tradeoff between running time, memory space, and simplicity
Implementing Linear Data Structures in Java
- In Java, implementing an ADT means writing a class that implements the interface
- You may have several implementations of the same interface
- [Figure: UML diagram with the following elements]
  - <<interface>> BasicSequence: size(), isempty()
  - <<interface>> Stack: push(e), pop(), top()
  - ArrayStack (fields n, S): size(), isempty(), push(e), pop(), top()
  - LinkedListStack (fields n, top): size(), isempty(), push(e), pop(), top()
Arrays
- Most programming languages provide arrays as a simple linear data structure
- Arrays have a fixed size N
- The elements are usually indexed by i ∈ {0, ..., N − 1}
  - A[0] = first element
  - A[i] = (i+1)-th element
  - A[N − 1] = last element
- The running time for accessing elements is usually O(1) (random access)
- Arrays are similar to vectors (but not identical)
Linked Lists
- A linked list is another fundamental data structure which is easy to implement in most programming languages
- It consists of a linked sequence of nodes
- Each node is a record (or object) which contains:
  - A data field to store an element (number, string, object, etc.)
  - One or two references (links, pointers) pointing to the next and/or the previous nodes
- Unlike an array, a linked list decouples the order of the list elements from the order in which they are stored in memory or on disk
- It allows only sequential (no random) access to its elements
- Nodes should implement the interface of the position ADT
Singly Linked Lists
- A singly linked list is the simplest form of a linked list
- Each node contains a reference to the next node
- A single reference to the first node is kept in memory
- Sometimes it is useful to keep a reference to the last node
- [Figure: singly linked list with nodes A, B, C; each node record holds an element and a next reference; a main reference points to the first node and an optional reference to the last node]
Doubly-Linked Lists
- A doubly-linked list is another simple form of a linked list
- Each node contains two references, one to the next node and one to the previous node
- Usually references to both ends are kept
- For simplicity, special header and trailer nodes are often added
- [Figure: doubly-linked list with nodes A, B, C; each node record holds prev, element, and next; reference 1 and reference 2 point to the two ends]
Section: Implementation with Arrays
Array-Based Stacks
- The elements are added from left to right and removed from right to left
- A variable n keeps track of the stack size (= next available location in the array)
- [Figure: array S with indices 0 to N − 1; n marks the next available location]
- When the array becomes full, i.e. for n = N, a push operation will throw a FullStackException
- This exception is implementation-specific (not intrinsic to the stack ADT)
Implementing Array-Based Stacks I

    Algorithm size()       // runs in O(1) time
        return n

    Algorithm isempty()    // runs in O(1) time
        return (n = 0)

    Algorithm top()        // runs in O(1) time
        if isempty() then
            throw EmptyStackException
        else
            return S[n - 1]
Implementing Array-Based Stacks II

    Algorithm push(e)      // runs in O(1) time
        if n = N then
            throw FullStackException
        else
            S[n] ← e
            n ← n + 1

    Algorithm pop()        // runs in O(1) time
        if isempty() then
            throw EmptyStackException
        else
            n ← n - 1
            return S[n]
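The array-based stack algorithms translate almost line by line into Java. The following is a minimal sketch, not the course's reference implementation: making the two exceptions unchecked, passing the capacity N to the constructor, and clearing the freed slot in pop() are assumptions added here.

```java
// Exception names are taken from the slides; making them unchecked is an assumption.
class EmptyStackException extends RuntimeException {}
class FullStackException extends RuntimeException {}

class ArrayStack {
    private final Object[] S; // storage array of fixed capacity N = S.length
    private int n = 0;        // stack size = next available location

    ArrayStack(int capacity) { S = new Object[capacity]; }

    public int size() { return n; }

    public boolean isempty() { return n == 0; }

    public Object top() {
        if (isempty()) throw new EmptyStackException();
        return S[n - 1];
    }

    public void push(Object e) {
        if (n == S.length) throw new FullStackException();
        S[n] = e;
        n = n + 1;
    }

    public Object pop() {
        if (isempty()) throw new EmptyStackException();
        n = n - 1;
        Object e = S[n];
        S[n] = null; // not in the pseudocode: drop the reference so it can be garbage collected
        return e;
    }

    public static void main(String[] args) {
        ArrayStack s = new ArrayStack(3);
        s.push(1);
        s.push(2);
        s.push(3);
        System.out.println(s.pop()); // prints 3 (last in, first out)
    }
}
```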
Array-Based Queues
- To implement the queue ADT, the array should be used in a circular fashion
- The elements are added and removed from left to right
- Two variables f and r keep track of the indices of the front and the rear of the queue (the location r is kept empty)
- [Figure: array Q with indices 0 to N − 1 in two configurations: normal (f before r) and wrapped around (r before f)]
- When the array becomes full, an enqueue operation will throw a FullQueueException (implementation-specific exception)
Implementing Array-Based Queues I

    Algorithm size()       // runs in O(1) time
        return (N + r - f) mod N

    Algorithm isempty()    // runs in O(1) time
        return (f = r)

    Algorithm front()      // runs in O(1) time
        if isempty() then
            throw EmptyQueueException
        else
            return Q[f]
Implementing Array-Based Queues II

    Algorithm enqueue(e)   // runs in O(1) time
        if size() = N - 1 then   // the capacity is only N - 1!
            throw FullQueueException
        else
            Q[r] ← e
            r ← (r + 1) mod N

    Algorithm dequeue()    // runs in O(1) time
        if isempty() then
            throw EmptyQueueException
        else
            e ← Q[f]
            f ← (f + 1) mod N
            return e
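A Java sketch of the circular array-based queue may help to see the wrap-around at work. It follows the pseudocode for size(), isempty(), front(), enqueue(e), and dequeue(); the unchecked exceptions, the constructor argument N (assumed to be at least 2), and the clearing of the freed cell are assumptions of this sketch.

```java
// Exception names are taken from the slides; making them unchecked is an assumption.
class EmptyQueueException extends RuntimeException {}
class FullQueueException extends RuntimeException {}

class ArrayQueue {
    private final Object[] Q; // circular storage array of length N
    private int f = 0;        // index of the front element
    private int r = 0;        // index of the next free cell (always kept empty)

    ArrayQueue(int N) { Q = new Object[N]; } // assumes N >= 2

    public int size() { return (Q.length + r - f) % Q.length; }

    public boolean isempty() { return f == r; }

    public Object front() {
        if (isempty()) throw new EmptyQueueException();
        return Q[f];
    }

    public void enqueue(Object e) {
        if (size() == Q.length - 1) throw new FullQueueException(); // capacity is N - 1
        Q[r] = e;
        r = (r + 1) % Q.length;
    }

    public Object dequeue() {
        if (isempty()) throw new EmptyQueueException();
        Object e = Q[f];
        Q[f] = null; // not in the pseudocode: drop the reference
        f = (f + 1) % Q.length;
        return e;
    }
}
```

Keeping one cell empty is what makes f = r an unambiguous test for the empty queue; without it, a completely full array would also satisfy f = r.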
Array-Based Vectors
- Vectors are most naturally implemented with arrays (i.e. ranks = array indices)
- A variable n keeps track of the size of the vector (number of elements stored)
- [Figure: array V with indices 0 to N − 1; r marks a rank and n the size]
- Operation elematrank(r) is implemented in O(1) time by returning V[r]
- When the array becomes full, an insertatrank operation will throw a FullVectorException (implementation-specific exception)
Inserting Elements
- In the operation insertatrank(r,e), we need to make room for the new element
- Shift forward the n − r elements V[r], ..., V[n − 1]
- [Figure: array V before the shift, after the shift, and with the new element e stored at rank r]
- In the worst case, i.e. for r = 0, this takes O(n) time
Removing Elements
- In the operation removeatrank(r), we need to fill the hole left by the removed element
- Shift backward the n − r − 1 elements V[r + 1], ..., V[n − 1]
- [Figure: array V before the removal, with the hole at rank r, and after the shift]
- In the worst case, i.e. for r = 0, this takes O(n) time
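The two shifting loops can be sketched in Java as follows. This is an illustration under a few assumptions that the slides leave open: an invalid rank raises Java's standard IndexOutOfBoundsException, removeatrank(r) returns the removed element, and FullVectorException is unchecked.

```java
// Exception name is taken from the slides; making it unchecked is an assumption.
class FullVectorException extends RuntimeException {}

class ArrayVector {
    private final Object[] V; // storage array of fixed capacity N
    private int n = 0;        // number of stored elements

    ArrayVector(int N) { V = new Object[N]; }

    public int size() { return n; }

    public Object elematrank(int r) {
        if (r < 0 || r >= n) throw new IndexOutOfBoundsException("rank " + r);
        return V[r];
    }

    public void insertatrank(int r, Object e) {
        if (r < 0 || r > n) throw new IndexOutOfBoundsException("rank " + r);
        if (n == V.length) throw new FullVectorException();
        // Shift the n - r elements V[r], ..., V[n-1] one cell forward, last one first.
        for (int i = n - 1; i >= r; i--) {
            V[i + 1] = V[i];
        }
        V[r] = e;
        n++;
    }

    public Object removeatrank(int r) {
        if (r < 0 || r >= n) throw new IndexOutOfBoundsException("rank " + r);
        Object e = V[r];
        // Shift the n - r - 1 elements V[r+1], ..., V[n-1] one cell backward.
        for (int i = r; i < n - 1; i++) {
            V[i] = V[i + 1];
        }
        n--;
        V[n] = null; // clear the cell that is no longer used
        return e;
    }
}
```

The insertion loop runs from the right end so that no element is overwritten before it has been copied; the removal loop runs from the left for the same reason.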
Section: Implementation with Growable Arrays
Growable Array-Based Stack
- To solve the FullStackException problem, replace the array with a larger one when necessary
  - Incremental strategy: increase the size by a constant c
  - Doubling strategy: double the size

    Algorithm push(e)
        if size() = N then
            A ← new array of size ...
            for i ← 0 to n - 1 do
                A[i] ← S[i]
            S ← A
        S[n] ← e
        n ← n + 1
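As an illustration, here is a minimal Java sketch of the growable stack using the doubling strategy. The initial capacity of 1, the helper method capacity(), and the use of java.lang.IllegalStateException for an empty stack are assumptions made for this example; the copy loop mirrors the pseudocode instead of calling a library routine.

```java
class GrowableArrayStack {
    private Object[] S = new Object[1]; // assumed initial capacity of 1
    private int n = 0;                  // stack size = next available location

    public int size() { return n; }

    public boolean isempty() { return n == 0; }

    // Hypothetical helper, only here to make the growth observable.
    public int capacity() { return S.length; }

    public void push(Object e) {
        if (n == S.length) {
            // Doubling strategy: replace the full array with one of twice the size.
            Object[] A = new Object[2 * S.length];
            for (int i = 0; i < n; i++) {
                A[i] = S[i];
            }
            S = A;
        }
        S[n] = e;
        n = n + 1;
    }

    public Object pop() {
        if (isempty()) throw new IllegalStateException("stack is empty");
        n = n - 1;
        Object e = S[n];
        S[n] = null; // drop the reference
        return e;
    }
}
```

For the incremental strategy, only the size of the new array changes: S.length + c instead of 2 * S.length.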
Running Times of Strategies
- [Figure: two plots, one for the incremental strategy and one for the doubling strategy; x-axis: current number of elements; y-axis: running time of push(e)]
Comparison of the Strategies
- Consider the total time T(n) needed to perform a series of n push operations
- We assume that we start with an empty stack represented by an array of size
  - c, for the incremental strategy
  - 1, for the doubling strategy
- We call T(n)/n, the average time taken by a push operation over the series of operations, the amortized running time
- We only consider one primitive operation: storing an object in the array
Incremental Strategy
- Let n be a multiple of c, i.e. $n = kc$
- The array needs to be replaced $k - 1$ times
- For the total running time $T(n)$ we get

  $$T(n) = n + c + 2c + 3c + \dots + (k-1)c = n + c\,\bigl(1 + 2 + \dots + (k-1)\bigr) = n + c\,\frac{(k-1)k}{2} = n + \frac{(n-c)\,n}{2c} = \frac{1}{2c}\,n^2 + \frac{1}{2}\,n$$

- The total running time for n push operations is $O(n^2)$
- The amortized running time for a push operation is $O(n)$
Doubling Strategy
- Let n be a power of 2, i.e. $n = 2^k$
- The array needs to be replaced $k = \log_2 n$ times
- For the total running time $T(n)$ we get

  $$T(n) = n + 1 + 2 + 4 + \dots + 2^{k-1} = n + 2^k - 1 = 2n - 1$$

- The total running time for n push operations is $O(n)$
- The amortized running time for a push operation is $O(1)$
- As a consequence, the doubling strategy outperforms the incremental strategy for large stacks
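The two totals can be checked numerically. The sketch below simulates n push operations and counts only the primitive operation considered in the analysis, storing an object in the array (one store per push plus one store per element copied when the array is replaced). The class and method names are hypothetical. Under the stated starting sizes, the counts should agree with n²/(2c) + n/2 for the incremental strategy and 2n − 1 for the doubling strategy.

```java
class GrowthCost {
    // Number of array stores for n pushes when the array grows by a constant c.
    static long incremental(int n, int c) {
        long stores = 0;
        int capacity = c; // start with an array of size c
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                stores += size; // copy all elements into the new array
                capacity += c;
            }
            stores++;           // store the pushed element
            size++;
        }
        return stores;
    }

    // Number of array stores for n pushes when the array size is doubled.
    static long doubling(int n) {
        long stores = 0;
        int capacity = 1; // start with an array of size 1
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                stores += size; // copy all elements into the new array
                capacity *= 2;
            }
            stores++;           // store the pushed element
            size++;
        }
        return stores;
    }

    public static void main(String[] args) {
        System.out.println(incremental(12, 3)); // prints 30 = 12*12/(2*3) + 12/2
        System.out.println(doubling(16));       // prints 31 = 2*16 - 1
    }
}
```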
Performance
- Growable arrays can also be used to implement queues and vectors (and lists, but this is not very natural)

    Operation                 Doubling Strategy   Incremental Strategy
    size                      O(1)                O(1)
    isempty                   O(1)                O(1)
    top, front, elematrank    O(1)                O(1)
    push, enqueue             O(1)                O(n)
    insertatrank              O(n)                O(n)
    pop, dequeue              O(1)                O(1)
    removeatrank              O(n)                O(n)
Section: Implementation with Linked Lists
Stacks and Queues with Singly Linked Lists
- Stacks and queues are often implemented as singly linked lists
- Nodes implement the position ADT by storing:
  - Element
  - Reference next to the next node
- The top/front element is stored at the first node
- For stacks, only a reference to the first node needs to be kept (called top)
- For queues, references to both ends of the list need to be kept (called front and rear)
- Keep track of the current size n of the stack/queue
- All operations run in O(1) time, memory grows in O(n)
Implementing Stack with Singly Linked Lists

    Algorithm push(e)      // runs in O(1) time
        new ← new Node(e)
        new.next ← top
        top ← new
        n ← n + 1

    Algorithm pop()        // runs in O(1) time
        if isempty() then
            throw EmptyStackException
        else
            e ← top.getelement()
            top ← top.next
            n ← n - 1
            return e
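The linked stack can be sketched in Java as shown below. Two choices here go beyond the slides and are assumptions of this sketch: the class is generic in its element type (using Java Generics), and EmptyStackException is unchecked. The local variable is called node because new is a reserved word in Java.

```java
// Exception name is taken from the slides; making it unchecked is an assumption.
class EmptyStackException extends RuntimeException {}

class LinkedListStack<E> {
    // Node record: one element plus a reference to the next node.
    private static final class Node<E> {
        final E element;
        Node<E> next;

        Node(E element) { this.element = element; }
    }

    private Node<E> top = null; // reference to the first node
    private int n = 0;          // current size of the stack

    public int size() { return n; }

    public boolean isempty() { return n == 0; }

    public E top() {
        if (isempty()) throw new EmptyStackException();
        return top.element;
    }

    public void push(E e) {
        Node<E> node = new Node<>(e);
        node.next = top; // the new node points to the old first node
        top = node;      // and becomes the new first node
        n = n + 1;
    }

    public E pop() {
        if (isempty()) throw new EmptyStackException();
        E e = top.element;
        top = top.next;  // unlink the first node
        n = n - 1;
        return e;
    }
}
```

No FullStackException is needed here, since the list grows by one node per push.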
Doubly-Linked List Implementation
- A doubly-linked list provides a natural implementation of the list ADT
- Nodes implement the position ADT by storing
  - Element
  - Reference prev to the previous node
  - Reference next to the next node
- The list itself stores two references first and last
- [Figure: doubly-linked list with nodes A, B, C, D, E and the references first and last]
Inserting Elements
- All four insertion operations only need to create a few new links and redirect a few existing ones, so they run in O(1) time
- [Figure: a new node q with element X is inserted next to position p in the list A B C D E, giving A B C X D E]
Removing Elements
- The operation removeelement(p), which needs to redirect some existing links, runs in O(1) time
- [Figure: the node at position p with element X is removed from the list A B C X D E, giving A B C D E]
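The link manipulations behind the insertion and removal operations can be sketched in Java. This sketch makes several assumptions that the slides do not fix: it uses special header and trailer nodes instead of plain first and last references, first() and after(p) return null when there is no such position, removeelement(p) returns the removed element, and a Position passed in is trusted to be a node of this list. The names DNode and insertBetween are hypothetical.

```java
interface Position {
    Object getelement();
    void setelement(Object e);
}

// Node of a doubly-linked list; it implements the position ADT.
class DNode implements Position {
    Object element;
    DNode prev, next;

    DNode(Object element) { this.element = element; }

    public Object getelement() { return element; }
    public void setelement(Object e) { element = e; }
}

class DoublyLinkedList {
    private final DNode header = new DNode(null);  // sentinel before the first node
    private final DNode trailer = new DNode(null); // sentinel after the last node
    private int n = 0;

    DoublyLinkedList() {
        header.next = trailer;
        trailer.prev = header;
    }

    public int size() { return n; }

    public Position first() { return n == 0 ? null : header.next; }

    public Position after(Position p) {
        DNode v = ((DNode) p).next;
        return v == trailer ? null : v;
    }

    // Creates two new links and redirects two existing ones: O(1) time.
    private Position insertBetween(Object e, DNode before, DNode after) {
        DNode q = new DNode(e);
        q.prev = before;
        q.next = after;
        before.next = q;
        after.prev = q;
        n++;
        return q;
    }

    public Position insertfirst(Object e) { return insertBetween(e, header, header.next); }
    public Position insertlast(Object e) { return insertBetween(e, trailer.prev, trailer); }

    public Position insertafter(Position p, Object e) {
        DNode v = (DNode) p;
        return insertBetween(e, v, v.next);
    }

    public Position insertbefore(Position p, Object e) {
        DNode v = (DNode) p;
        return insertBetween(e, v.prev, v);
    }

    // Redirects the two links that pointed to the removed node: O(1) time.
    public Object removeelement(Position p) {
        DNode v = (DNode) p;
        v.prev.next = v.next;
        v.next.prev = v.prev;
        v.prev = null;
        v.next = null;
        n--;
        return v.element;
    }
}
```

With the two sentinel nodes, every real node has both a predecessor and a successor, so none of the operations needs a special case for the ends of the list.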
Running Times Overview
- Entries give the order of the running time: 1 stands for O(1), n for O(n)

    Sequence Operation                     Array (circular, growable)   Doubly Linked List   Singly Linked List
    size, isempty
    insertfirst, insertlast
    replaceelement, swapelements
    first, last, isfirst, islast, after
    before                                 1                            1                    n
    insertafter                            n                            1                    1
    insertbefore, removeelement            n                            1                    n
    atrank, rankof, elematrank             1                            n                    n
    replaceatrank, swapatranks             1                            n                    n
    insertatrank, removeatrank             n                            n                    n
Section: Iterators
The Iterator ADT
- An iterator abstracts the process of scanning through a sequence by keeping a pointer to the current element
- Methods of an iterator ADT
  - element(): returns the current element
  - hasnext(): checks whether there is a next element, i.e. whether the iteration has not yet completed
  - nextelement(): advances the pointer to the next element
  - reset(): resets the iterator
- Can be realized for array or linked-list implementations
- The order in which the elements are traversed is not necessarily the normal rank-based or position-based order
- We can use several iterators for the same sequence
- See in Java ist auch eine Insel (interface Iterable)
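A small Java sketch shows one possible realization of the iterator ADT for an array. The slides leave the exact semantics open, so the following points are assumptions: the pointer starts before the first element, nextelement() returns the element it advances to, and misuse is reported with the standard exceptions java.util.NoSuchElementException and java.lang.IllegalStateException. The names SequenceIterator and ArrayIterator are hypothetical.

```java
// Iterator ADT with the four methods named on the slide.
interface SequenceIterator {
    Object element();     // current element
    boolean hasnext();    // is there a next element?
    Object nextelement(); // advance to the next element (returning it is an assumption)
    void reset();         // start over
}

// Hypothetical array-based realization; it traverses the array in index order.
class ArrayIterator implements SequenceIterator {
    private final Object[] data;
    private int current = -1; // assumed start: before the first element

    ArrayIterator(Object[] data) { this.data = data; }

    public boolean hasnext() { return current + 1 < data.length; }

    public Object nextelement() {
        if (!hasnext()) throw new java.util.NoSuchElementException();
        current++;
        return data[current];
    }

    public Object element() {
        if (current < 0) throw new IllegalStateException("call nextelement() first");
        return data[current];
    }

    public void reset() { current = -1; }

    public static void main(String[] args) {
        Object[] data = {"A", "B", "C"};
        SequenceIterator it = new ArrayIterator(data);
        while (it.hasnext()) {
            System.out.println(it.nextelement()); // prints A, B, C on separate lines
        }
    }
}
```

Each iterator keeps its own pointer, so several iterators over the same array can be at different places at the same time.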