DATABASE DESIGN - 1DL400

Size: px
Start display at page:

Download "DATABASE DESIGN - 1DL400"

Transcription

1 DATABASE DESIGN - 1DL400 Spring 2015 A course on modern database systems!! Kjell Orsborn! Uppsala Database Laboratory! Department of Information Technology, Uppsala University,! Uppsala, Sweden 02/03/15 1

2 Introduction to Indexing Elmasri/Navathe ch 16 and 17 Padron-McCarthy/Risch ch 21 and 22 Kjell Orsborn! Uppsala Database Laboratory! Department of Information Technology, Uppsala University,! Uppsala, Sweden 02/03/15 2

3 Content Files of records Operations on files Scans Operations on scans Ordered and unordered files Index sequential and hash files Index concepts Types of indexes Clustering indexes Ordered indexes, B-trees, B+ trees Unordered indexes, hash indexes Index properties and evaluation metrics 02/03/15 3

4 Files of records A file managed by a DBMS is a sequence of records, where each record is a collection of data values representing a tuple. A file descriptor (or file header) includes Meta-information that describes the file and the records it stores, such as the attribute names and data types of the fields in the records. The records in a file are usually uniform, i.e. they are of the same size and contain the same kind of data values in each position. File records are grouped into blocks of records. The file descriptor contains disk addresses of the file blocks. The blocking factor bfr (or block size) for a file is the (average) number of file records stored in a disk block. 02/03/15 4

5 Typical file operations include:! Operations on disk files OPEN: Readies the file for access, and associates a pointer called a cursor that will refer to a current file record representing a current tuple. FIND: Searches for the first record in a file that satisfies a certain condition, and makes it the cursor position. FINDNEXT: Searches for the next record from the current cursor position that satisfies a certain condition, and makes it the current cursor position. READ: Reads the record in the current cursor position into program variables. DELETE: Removes the record at the current cursor position from the file, usually by marking the record to indicate that it is no longer valid. MODIFY: Changes the values of some fields in the record of the current cursor position, i.e. copies values from program variables to the current record. INSERT: Inserts a new record into the file and makes it the current cursor position., i.e. copies values from program variables into a new record and inserts it at the current cursor position. CLOSE: Terminates access to the file. REORGANIZE: Reorganizes the file records. For example, the records marked deleted are physically removed from the file or batches of new records are merged into the file. 02/03/15 5

6 Streamed access to database operators The result of internal database operators (e.g. a join) is usually represented as scans (~streams) of tuples The cursors represent positions in scans. Cursors can be moved iteratively forward over the scans until end-of-scan is reached! SQL queries are translated by the query optimizer into programs called execution plans.! Execution plans call physical relational algebra operators that are programs that iterate over scans of tuples and iteratively produce new scans of tuples as results.! Scans can be defined over file records index records reverse scans over files and indexes are also possible where scans move backwards records produced (emitted) by some physical relational algebra operator. 02/03/15 6

7 Streamed access to database operators SCAN operators The following basic operators are defined over scans: OPEN: opens the scan for reading tuples and sets the cursor to the first tuple. NEXT: reads the next tuple in the scan into some program variables and makes it the current cursors position of the scan EOS: true if there are no more tuples in the scan CLOSE: closes the scan and releases all its resources. Scans over physical files is represented by file pointers where a new record is read for each NEXT call.! Scans over the result of a physical operator is usually represented as iterator objects with next, eos, and close methods.! Intermediate results may either be materialized as a list of blocks of tuples or generated by some code (e.g. performing a join) when next is called 02/03/15 7

8 Scan-based database operators Examples of physical algebra operators (code) using and producing scans:! FINDALL: emit all tuples in the result of a query. Such a scan operator call is e.g. produced by the query processor to iteratively emit the result of a query. Execution plans usually contain many FINDALL operators.! STOPAFTER n.: iteratively emit the first n tuples in another scan! DISTINCT: iteratively emit the tuples of another scan where duplicate tuples are removed! SORT: emit the tuples of another in sorted orders! INDEXSCAN: Iterative emitting the tuples matching the key of an index! REVESEINDEXSCAN: Iteratively emitting matching index tuples in reverse order! MATERIALIZE: Store each tuple in a scan in a file. 02/03/15 8

9 Unordered files Also called a heap file. New records are inserted at the end of the file. A linear search through the file records is necessary to search for a record. This requires reading and searching half the file blocks on the average, and hence does not scale as the file grows. Record insertion is quite efficient. Reading the records in order of a particular field requires sorting the file records, which does not scale. 02/03/15 9

10 Ordered files Also called a index sequential (ISAM) file. File records are kept sorted by the values of an ordering field. Insertion is rather expensive: records must be inserted in the correct order. It is common to keep a separate unordered overflow (or transaction) file for batching new records to improve insertion efficiency; this is periodically merged with the main ordered file. A binary search can be used to search for a record on its ordering field value. This requires reading and searching log 2 of the file blocks on the average, an improvement over linear search. Reading the records in order of the ordering field is quite efficient. Efficiency is measured in number of read disk blocks. 02/03/15 10

11 Hash files Hashing for disk files is called External Hashing The file blocks are divided into M equal-sized buckets, numbered bucket 0, bucket 1,..., bucket M-1. Typically, a bucket corresponds to one disk block in a file. Each bucket represents one or several records (i.e. tuples) One of the fields of the records is designated to be the hash key of the file. The record with hash key value K is stored in bucket i, where i = h(k), and h is the hashing function. Search and update is very efficient on equality of the hash key. Collisions occur when a new record hashes to a bucket that is already full. An overflow file is kept for storing such records. Overflow records that hash to each bucket can be linked together. Main disadvantages of static external hashing: Fixed number of buckets M is a problem if the number of records in the file grows or shrinks. Ordered access on the hash key is quite inefficient (requires sorting the records). 02/03/15 11

12 Hash files (contd.) 02/03/15 12

13 Hash files - overflow handling using chaining Overflow locations are kept, usually, by extending the array with a number of overflow positions and by adding a pointer to the chain of overflow records for each bucket. In addition, a pointer field is added to each overflow record. A collision is resolved by placing the new record and its key in an unused overflow location and setting the pointer of the occupied hash address location to the address of that overflow location. 02/03/15 13

14 ! Basic index concepts Indexes are data structures used to speed up access to sets of records stored in a database (on disk or in main memory). E.g., author catalog in library An index (file) typically require much less storage space than the original data (file) An index consists of records, called index entries, of the form! key The (search) key of an index is the attribute(s) of the indexed set of tuples used to look up records, e.g. SSN, ISBN. The key is a record of one or several key values, which are usually stored directly in the index entry (e.g. SSN + ACCOUNT#). The value is a tuple that stores the corresponding data values The value field is usually a pointer to a data record storing the values Two basic kinds of indices: value Ordered indices: search keys are stored in sorted order Hash indices: search keys are distributed uniformly across buckets using a hash function. 02/03/15 14

15 Types of indexes Primary Index Defined on an ordered data file The data file is ordered on one ore several key field(s) Includes one index entry for each block in the data file; the index entry has the key field value for the first record in the block, which is called the block anchor This makes primary indexes very compact Clustering Index Defined on an ordered data file The data file is ordered on one or several non-key field(s) unlike the primary index, which requires that the ordering field of the data file has a distinct value for each distinct value of the index. The index value for each distinct value of the search key points to the first data block that contains records with that field value. For example, think on an index of four character string used to index a files ordered on long strings. 02/03/15 15

16 Types of indexes (cont.) Secondary Index A secondary index provides a secondary means of accessing a file for which some primary access already exists. It is a non-clustering index since the indexed records are not ordered by the index keys Retrieving all records pointed to from a secondary index can be very slow if the table is large. Unique index A unique index contains key thus having a single unique value for each key A non unique index, indexes a non-key position and has a set of values for each key value. 02/03/15 16

17 Clustered and unclustered indexes clustered index unclustered (or non-clustered) index 02/03/15 17

18 More index concepts The index is often specified on one attribute of the indexed tuples, but can also be specified on several attributes. The index is sometimes called an access path to the indexed attribute. A search over an index yields a scan whose cursor points to file records Scans over ordered indexes are usually represented data structures representing upper and lower limits of key values in a search along with the cursor Indexes can also be characterized as dense or sparse: A dense index has an index entry for every search key value (and hence every record) in the data file. A secondary index is usually dense. A sparse (or nondense) index, on the other hand, has index entries for only some of the search values. A primary index is usually sparse. 02/03/15 18

19 Ordered indexes - search trees A node in a search tree of order n (q n) with pointers to subtrees below Problems with general search trees: Insertion/deletion of data file records result in changes to the structure of the search tree. There are algorithms that guaranties that the search-tree conditions continue to hold also after modifications. After an update of a search tree one can get the following problems: 1. inbalance in the search tree that usually result in slower search. 2. empty nodes (after deletions) (Figure 18.8 Elmasri/Navathe 6 ed US) To avoid these types of problem one can use different types of B-trees (balanced trees) that fulfil stricter requirements than general search trees. 02/03/15 19

20 Ordered indexes using B-trees and B + -trees Most ordered multilevel indexes use a variation of the highly scalable B-tree data structure (Bayer, Acta Informatica 1(2), 1972) B-trees and B + -trees are automatically rebalanced trees with many children for each node (large fan-out) B-trees and B + -trees are variations of search trees that allow efficient insertion and deletion of new search values. In B-tree and B + -tree, each node occupies one disk block One disk block at the time is read into main memory by the DBMS The DBMS maintains a pool of disk blocks in main memory When pool is full disk blocks are flushed to disk In main memory each B-tree node should have a size close to the cache line size used, to avoid memory cache misses. B-trees also shown excellent in modern main memories with big differences in speed between data in caches and in the rest of the memory (G. Graefe & P-Å. Larsson, B-tree Indexes and CPU Caches, ICDE 2001). 02/03/15 20

21 B-trees and B + -trees indexes Each node is kept between half-full and completely full On the average the nodes are 63% full An insertion into a node that is not full is quite efficient Just fill an empty key/value slot in the node If a node is full the insertion causes a split into two nodes Splitting may propagate to neighboring tree levels A deletion is quite efficient if a node does not become less than half full If a deletion causes a node to become less than half full, it must be merged with neighboring nodes Differences between B-tree and B + -tree In a B-tree, pointers to data records exist at all levels of the tree In a B + -tree, all pointers to data records exists at the leaf-level nodes A B + -tree can have less levels (or higher capacity of search values) than the corresponding B-tree 02/03/15 21

22 B + -tree index files... A B + -tree of order n is a rooted tree satisfying the following properties: All paths from root to leaf are of the same length. Each node that is not a root or a leaf has between!n / 2" and n children. A leaf node has between!(n - 1) / 2" and n - 1 values. Special cases: if the root is not a leaf, it has at least 2 children. If the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and (n - 1) values. Advantage of B + -tree index files: automatically reorganizes itself with small, local, changes, in the face of insertions and deletions. Reorganization of entire file is not required to maintain performance. 02/03/15 22

23 ! B + -tree node structure A typical node! P 1 K 1 P 2... P n-1 K n-1 P n K i are the search-key values P i are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes). The search-keys in a node are ordered K 1 < K 2 < K 3 <...< K n-1 02/03/15 23

24 Leaf nodes in B + -trees All leaf nodes are at the same level and each node has between between!(n 1) / 2" and n - 1 values For i = 1, 2,..., n-1, pointer P i either points to a file record with search-key value K i, or to a bucket of pointers to file records, each record having search-key value K i. Only need bucket structure if search-key does not form a primary key. If L i, L j are leaf nodes and i < j, L i s search-key values are less than L j s search-key values P n points to next leaf node in search-key order P i-1 Brighton P i Downtown P n leaf node Brighton A Downtown A Downtown A account file /03/15 24

25 Non-leaf nodes in B + -trees Non leaf nodes form a multi-level sparse index on the leaf nodes. For a non-leaf node with n pointers: Each internal node (except the root) has between!n / 2" and n children (tree pointers) If the root is an internal node it as at least two children. An internal node with q pointers, q n, has q-1 search keys (values) All the search-keys in the subtree to which P 1 points are less than K 1 For 2 i n-1, all the search-keys in the subtree to which P i points have values greater than or equal to K i-1 and less than K i All the search-keys in the subtree to which P m points are greater than or equal to K m-1 P 1 K 1 P 2... P n-1 K n-1 P n 02/03/15 25

26 Example of a B + -tree Perryridge Mianus Redwood Brighton Downtown Mianus Perryridge Redwood Round Hill B + -tree for account file (n = 3) 02/03/15 26

27 Example of a B + -tree Perryridge Brighton Downtown Mianus Perryridge Redwood Round Hill B + -tree for account file (n = 5) Leaf nodes must have between 2 and 4 values (!(n - 1) / 2" and n - 1, with n = 5 ). Non-leaf nodes other than root must have between 3 and 5 children (!(n / 2" and n with n = 5 ). Root must have at least 2 children. 02/03/15 27

28 Observations about B + -trees Since the inter-node connections are done by pointers, there is no assumption that in the B + -tree, the logically close blocks are physically close. The non-leaf levels of the B + -tree form a hierarchy of sparse indexes. The B + -tree contains a relatively small number of levels (logarithmic in the size of the main file), thus searches can be conducted efficiently. Insertions and deletions to the main file can be handled efficiently, as the index can be restructured in logarithmic time. 02/03/15 28

29 Queries on B + -trees In processing a query, a path is traversed in the tree from the root to some leaf node. Find all records with a search-key value of k. Start with the root node. Examine the node for the smallest search-key value > k. If such a value exists, assume it is K i. Then follow P i to the child node. Otherwise k K m-1, where there are m pointers in the node. Then follow P m to the child node. If the node reached by following the pointer above is not a leaf node, repeat the above procedure on the node, and follow the corresponding pointer. Eventually reach a leaf node. If key K i = k, follow pointer P i to the desired record or bucket. Else no record with search-key value k exists. 02/03/15 29

30 Updates on B + -trees: insertion Find the leaf node in which the search-key value would appear. If the search-key value is already there in the leaf node, record is added to file and if necessary pointer is inserted into bucket. If the search-key value is not there, then add the record to the main file and create bucket if necessary. Then: if there is room in the leaf node, insert (search-key value, record/bucket pointer) pair into leaf node at appropriate position. if there is no room in the leaf node, split it and insert (search-key value, record/ bucket pointer) pair as discussed in the next slide. 02/03/15 30

31 Updates on B + -trees: insertion... Splitting a node: take the n (search-key value, pointer) pairs (including the one being inserted) in sorted order. Place the first!n/2" in the original node, and the rest in a new node. let the new node be p, and let k be the least key value in p. Insert (k,p) in the parent of the node being split. If the parent is full, split it and propagate the split further up. The splitting of nodes proceeds upwards till a node that is not full is found. In the worst case the root node may be split increasing the height of the tree by 1. 02/03/15 31

32 Updates on B + -trees: insertion... Perryridge Mianus Redwood Brighton Downtown Mianus Perryridge Redwood Round Hill Perryridge Downtown Mianus Redwood Brighton Clearview Downtown Mianus Perryridge Redwood Round Hill B + -tree before and after insertion of Clearview 02/03/15 32

33 Updates on B + -trees: deletion Find the record to be deleted, and remove it from the main file and from the bucket (if present). Remove (search-key value, pointer) from the leaf node if there is no bucket or if the bucket has become empty. If the node has too few entries due to the removal, and the entries in the node and a sibling fit into a single node, then Insert all the search-key values in the two nodes into a single node (the one on the left), and delete the other node. Delete the pair (K i-1, P i ), where P i is the pointer to the deleted node, from its parent, recursively using the above procedure. 02/03/15 33

34 Updates on B + -trees: deletion Otherwise, if the node has too few entries due to the removal, and the entries in the node and a sibling do not fit into a single node, then Redistribute the pointers between the node and a sibling such that both have more than the minimum number of entries. Update the corresponding search-key value in the parent of the node. The node deletions may cascade upwards till a node which has!n/2" or more pointers is found. If the root node has only one pointer after deletion, it is deleted and the sole child becomes the root. 02/03/15 34

35 Examples of B + -tree deletion Perryridge Downtown Mianus Redwood Brighton Clearview Downtown Mianus Perryridge Redwood Round Hill Perryridge Mianus Redwood Brighton Clearview Mianus Perryridge Redwood Round Hill Result after deleting Downtown from account The removal of the leaf node containing Downtown did not result in its parent having too few pointers. So the cascaded deletions stopped with the deleted leaf node s parent. 02/03/15 35

36 Examples of B + -tree deletion... Perryridge Downtown Mianus Redwood Brighton Clearview Downtown Mianus Perryridge Redwood Round Hill Mianus Downtown Redwood Brighton Clearview Downtown Mianus Redwood Round Hill Deletion of Perryridge instead of Downtown The deleted Perryridge node s parent became too small, but its sibling did not have space to accept one more pointer. So redistribution is performed. Observe that the root node s search-key value changes as a result. 02/03/15 36

37 B-tree index files Similar to B + -tree, but B-tree allows search-key values to appear only once; eliminates redundant storage of search keys. Search keys in nonleaf nodes appear nowhere else in the B-tree; an additional pointer field for each search key in a nonleaf node must be included. Generalized B-tree leaf node:!! P 1 K 1 P 2... P n-1 K n-1 P n Nonleaf node in B-trees where extra pointers B i are the bucket or file record pointers.! P 1 B 1 K 1 P 2 B 2 K 2... P m-1 B m-1 K m-1 P m 02/03/15 37

38 B-trees example Nodes in a B-tree are uniform: ( (key, value, subtree) ) Each node in a B-tree contains an array of key/value pairs and pointers to subtrees. There is no difference between leaf nodes and intermediate nodes. Assume there are 10 8 tuples in the index Assume a block size of 8192 and that each B-tree node is on the average 63% full. Assume each (key, value, subtree) triple uses 12 bytes. 8192*0.63/12 = 430 = triples per node The average depth of the B-tree will be log 430 (10 8 ) = 3.0 A key/value pair can be accessed with 3 block reads (disk accesses) on the average Root node kept in main memory => 2 blocks to read 02/03/15 38

39 B + -trees example In a B + -tree, intermediate nodes contain keys + pointers to subtrees (no values) Intermediate nodes: ((key, subtree) ) In a B + -tree, only the leaf-level nodes contain pointers to data records: Leaf nodes: ((key, value).). The leaf nodes are linked to enable fast scans. Because there are fewer data pointers in B+-trees, the fan-out (average number of children) becomes larger with B + -trees than with B-trees. Assume a block size of 8192 and that each B-tree node is on the average 63% full. Assume each (key, value) or (key, subtree) pairs uses 8 bytes. 8192*0.63/8 = 645 = pairs per node The average depth of the B-tree will be log 645 (10 8 ) = 2.8 A key/value pair can be accessed with = 1.8 block reads on the average 02/03/15 39

40 Hash indexes Hash indexes organizes the keys with their associated value pointers, into a hash table structure. Hash indexes are very fast for accessing a value (or a set of values) for a given key. Hash indexes do not support range searches. Hash indexes require dynamic hash tables that gracefully grow or shrink without significant delays as the database is updated Regular hashing problematic when files grow and shrink dynamically Dynamic and graceful scale-up and scale-down of tables needed in DBMSs Special hashing techniques have been developed to allow in incremental and dynamic growth and shrinking of the number of file records in hash files. The most used one is called LH, Linear Hashing (Litwin, VLDB 1980) Linear Hashing is also shown to be preferable when storing dynamic hash tables in main memory (P-Å Larson, Dynamic Hash Tables, CACM 31(4), 1988). 02/03/15 40

41 Example of a hash index 02/03/15 41

42 Outline of Linear Hashing (LH) No directory is created One starts with M buckets [0, 1,, M-1],! and the hash function h 0 =k mod 2 0 M,! and the number of bucket divisions n = 0! M 0 M-1 M M If the file load factor becomes to high a new bucket is created M+n and bucket number n is split up between bucket n and M+n via a new hash function h 1 =k mod 2 1 M! Thus we can keep the file load factor, l, within wanted interval by letting it trig possible splits and combinations.! l = r / bfr * N, where l = file load factor, r = no of records, N = no of buckets 02/03/15 42

43 LH is a dynamic hash algorithm Linear Hashing (LH) A hash file has a number of buckets with some capacity b >> 1! Not one hash functions but a series of hash functions h i (k) of keys k as the hash file evolves.! Typically hash by division h i (k) = k mod 2 i, i = 0, 1, 2, 3, 4,.! Buckets split through the replacement of h i with h i+1 ; i = 0, 1, as the file grows! As the file grows and the load (number of keys) of the hash table thereby increases beyond some threshold, buckets are successively split. For split buckets b/2 keys move towards new buckets. Shrinking the files is similarly possible by joining buckets when the table load decreases below some threshold. 02/03/15 43

44 LH example - file evolution b = 4 i = 0 p=0 h 0 (k)=k mod h 0 02/03/15 44

45 LH file evolution b = 4 i = 1 p = 0 h 1 (k)=k mod h 1 h 1 02/03/15 45

46 LH file evolution b = 4 i = 1 p = 0 h 1 (k)=k mod h 1 h 1 02/03/15 46

47 LH file evolution b = 4 i = 1 p = 1 h 2 (k)=k mod h 2 h 1 h 2 02/03/15 47

48 LH file evolution b = 4 i = 1 p = 1 h 2 (k)=k mod h 2 h 1 h 2 02/03/15 48

49 LH file evolution b = 4 i = 1 p = 1 h 2 (k)=k mod h 2 h 2 h 2 h 2 02/03/15 49

50 LH file evolution b = 4 i = 2 p = 0 h 2 (k)=k mod h 2 h 2 h 2 h 2 02/03/15 50

51 Main-memory LH LH shown excellent also for main memory hash tables (P-Å Larson 1988).! No need for specifying size when hash table allocated! Dynamic grow and shrink! No buckets but just overflow chains possible in main memory (i.e. b = 1). As for B-tree a bucket size close to cache line size is preferred.! When say 200% full (twice as many keys as size of table) after insert move p forward and add bucket! When say 50% full after delete move p backwards and delete bucket! Problem: Need dynamically extensible array to hold hash table 02/03/15 51

DATABASDESIGN FÖR INGENJÖRER - 1DL124

DATABASDESIGN FÖR INGENJÖRER - 1DL124 1 DATABASDESIGN FÖR INGENJÖRER - 1DL124 Sommar 2005 En introduktionskurs i databassystem http://user.it.uu.se/~udbl/dbt-sommar05/ alt. http://www.it.uu.se/edu/course/homepage/dbdesign/st05/ Kjell Orsborn

More information

Chapter 8: Structures for Files. Truong Quynh Chi tqchi@cse.hcmut.edu.vn. Spring- 2013

Chapter 8: Structures for Files. Truong Quynh Chi tqchi@cse.hcmut.edu.vn. Spring- 2013 Chapter 8: Data Storage, Indexing Structures for Files Truong Quynh Chi tqchi@cse.hcmut.edu.vn Spring- 2013 Overview of Database Design Process 2 Outline Data Storage Disk Storage Devices Files of Records

More information

Chapter 13. Disk Storage, Basic File Structures, and Hashing

Chapter 13. Disk Storage, Basic File Structures, and Hashing Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible Hashing

More information

Physical Data Organization

Physical Data Organization Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1 Slide 13-1 Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible

More information

Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing

Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing Chapter 13 Disk Storage, Basic File Structures, and Hashing Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files

More information

B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers

B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers B+ Tree and Hashing B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers B+ Tree Properties Balanced Tree Same height for paths

More information

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

Chapter 13 Disk Storage, Basic File Structures, and Hashing. Chapter 13 Disk Storage, Basic File Structures, and Hashing. Copyright 2004 Pearson Education, Inc. Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files

More information

Database Systems. Session 8 Main Theme. Physical Database Design, Query Execution Concepts and Database Programming Techniques

Database Systems. Session 8 Main Theme. Physical Database Design, Query Execution Concepts and Database Programming Techniques Database Systems Session 8 Main Theme Physical Database Design, Query Execution Concepts and Database Programming Techniques Dr. Jean-Claude Franchitti New York University Computer Science Department Courant

More information

Record Storage and Primary File Organization

Record Storage and Primary File Organization Record Storage and Primary File Organization 1 C H A P T E R 4 Contents Introduction Secondary Storage Devices Buffering of Blocks Placing File Records on Disk Operations on Files Files of Unordered Records

More information

INTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium.

INTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium. Chapter 4: Record Storage and Primary File Organization 1 Record Storage and Primary File Organization INTRODUCTION The collection of data that makes up a computerized database must be stored physically

More information

Lecture 1: Data Storage & Index

Lecture 1: Data Storage & Index Lecture 1: Data Storage & Index R&G Chapter 8-11 Concurrency control Query Execution and Optimization Relational Operators File & Access Methods Buffer Management Disk Space Management Recovery Manager

More information

Chapter 13: Query Processing. Basic Steps in Query Processing

Chapter 13: Query Processing. Basic Steps in Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Overview of Storage and Indexing

Overview of Storage and Indexing Overview of Storage and Indexing Chapter 8 How index-learning turns no student pale Yet holds the eel of science by the tail. -- Alexander Pope (1688-1744) Database Management Systems 3ed, R. Ramakrishnan

More information

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C Tutorial#1 Q 1:- Explain the terms data, elementary item, entity, primary key, domain, attribute and information? Also give examples in support of your answer? Q 2:- What is a Data Type? Differentiate

More information

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:

More information

CSE 326: Data Structures B-Trees and B+ Trees

CSE 326: Data Structures B-Trees and B+ Trees Announcements (4//08) CSE 26: Data Structures B-Trees and B+ Trees Brian Curless Spring 2008 Midterm on Friday Special office hour: 4:-5: Thursday in Jaech Gallery (6 th floor of CSE building) This is

More information

B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees

B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:

More information

Binary Heap Algorithms

Binary Heap Algorithms CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005 2009 Glenn G. Chappell

More information

Unit 4.3 - Storage Structures 1. Storage Structures. Unit 4.3

Unit 4.3 - Storage Structures 1. Storage Structures. Unit 4.3 Storage Structures Unit 4.3 Unit 4.3 - Storage Structures 1 The Physical Store Storage Capacity Medium Transfer Rate Seek Time Main Memory 800 MB/s 500 MB Instant Hard Drive 10 MB/s 120 GB 10 ms CD-ROM

More information

Big Data and Scripting. Part 4: Memory Hierarchies

Big Data and Scripting. Part 4: Memory Hierarchies 1, Big Data and Scripting Part 4: Memory Hierarchies 2, Model and Definitions memory size: M machine words total storage (on disk) of N elements (N is very large) disk size unlimited (for our considerations)

More information

Symbol Tables. Introduction

Symbol Tables. Introduction Symbol Tables Introduction A compiler needs to collect and use information about the names appearing in the source program. This information is entered into a data structure called a symbol table. The

More information

Databases and Information Systems 1 Part 3: Storage Structures and Indices

Databases and Information Systems 1 Part 3: Storage Structures and Indices bases and Information Systems 1 Part 3: Storage Structures and Indices Prof. Dr. Stefan Böttcher Fakultät EIM, Institut für Informatik Universität Paderborn WS 2009 / 2010 Contents: - database buffer -

More information

Data storage Tree indexes

Data storage Tree indexes Data storage Tree indexes Rasmus Pagh February 7 lecture 1 Access paths For many database queries and updates, only a small fraction of the data needs to be accessed. Extreme examples are looking or updating

More information

Heaps & Priority Queues in the C++ STL 2-3 Trees

Heaps & Priority Queues in the C++ STL 2-3 Trees Heaps & Priority Queues in the C++ STL 2-3 Trees CS 3 Data Structures and Algorithms Lecture Slides Friday, April 7, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks

More information

Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb

Outline BST Operations Worst case Average case Balancing AVL Red-black B-trees. Binary Search Trees. Lecturer: Georgy Gimel farb Binary Search Trees Lecturer: Georgy Gimel farb COMPSCI 220 Algorithms and Data Structures 1 / 27 1 Properties of Binary Search Trees 2 Basic BST operations The worst-case time complexity of BST operations

More information

A Comparison of Dictionary Implementations

A Comparison of Dictionary Implementations A Comparison of Dictionary Implementations Mark P Neyer April 10, 2009 1 Introduction A common problem in computer science is the representation of a mapping between two sets. A mapping f : A B is a function

More information

Data Warehousing und Data Mining

Data Warehousing und Data Mining Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data

More information

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++ Answer the following 1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++ 2) Which data structure is needed to convert infix notations to postfix notations? Stack 3) The

More information

Overview of Storage and Indexing. Data on External Storage. Alternative File Organizations. Chapter 8

Overview of Storage and Indexing. Data on External Storage. Alternative File Organizations. Chapter 8 Overview of Storage and Indexing Chapter 8 How index-learning turns no student pale Yet holds the eel of science by the tail. -- Alexander Pope (1688-1744) Database Management Systems 3ed, R. Ramakrishnan

More information

CHAPTER 13: DISK STORAGE, BASIC FILE STRUCTURES, AND HASHING

CHAPTER 13: DISK STORAGE, BASIC FILE STRUCTURES, AND HASHING Chapter 13: Disk Storage, Basic File Structures, and Hashing 1 CHAPTER 13: DISK STORAGE, BASIC FILE STRUCTURES, AND HASHING Answers to Selected Exercises 13.23 Consider a disk with the following characteristics

More information

R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants

R-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions

More information

External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13

External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13 External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing

More information

Data Structures Fibonacci Heaps, Amortized Analysis

Data Structures Fibonacci Heaps, Amortized Analysis Chapter 4 Data Structures Fibonacci Heaps, Amortized Analysis Algorithm Theory WS 2012/13 Fabian Kuhn Fibonacci Heaps Lacy merge variant of binomial heaps: Do not merge trees as long as possible Structure:

More information

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Edward Bortnikov & Ronny Lempel Yahoo Labs, Haifa Indexing in Search Engines Information Retrieval s two main stages: Indexing process

More information

Raima Database Manager Version 14.0 In-memory Database Engine

Raima Database Manager Version 14.0 In-memory Database Engine + Raima Database Manager Version 14.0 In-memory Database Engine By Jeffrey R. Parsons, Senior Engineer January 2016 Abstract Raima Database Manager (RDM) v14.0 contains an all new data storage engine optimized

More information

CIS 631 Database Management Systems Sample Final Exam

CIS 631 Database Management Systems Sample Final Exam CIS 631 Database Management Systems Sample Final Exam 1. (25 points) Match the items from the left column with those in the right and place the letters in the empty slots. k 1. Single-level index files

More information

File Management. Chapter 12

File Management. Chapter 12 Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution

More information

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

More information

Review of Hashing: Integer Keys

Review of Hashing: Integer Keys CSE 326 Lecture 13: Much ado about Hashing Today s munchies to munch on: Review of Hashing Collision Resolution by: Separate Chaining Open Addressing $ Linear/Quadratic Probing $ Double Hashing Rehashing

More information

Storage and File Structure

Storage and File Structure Storage and File Structure Chapter 10: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files

More information

Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University

Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University Operating Systems CSE 410, Spring 2004 File Management Stephen Wagner Michigan State University File Management File management system has traditionally been considered part of the operating system. Applications

More information

External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing

More information

SQL Query Evaluation. Winter 2006-2007 Lecture 23

SQL Query Evaluation. Winter 2006-2007 Lecture 23 SQL Query Evaluation Winter 2006-2007 Lecture 23 SQL Query Processing Databases go through three steps: Parse SQL into an execution plan Optimize the execution plan Evaluate the optimized plan Execution

More information

File System Management

File System Management Lecture 7: Storage Management File System Management Contents Non volatile memory Tape, HDD, SSD Files & File System Interface Directories & their Organization File System Implementation Disk Space Allocation

More information

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92. Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure

More information

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level) Comp 5311 Database Management Systems 16. Review 2 (Physical Level) 1 Main Topics Indexing Join Algorithms Query Processing and Optimization Transactions and Concurrency Control 2 Indexing Used for faster

More information

Performance Evaluation of Natural and Surrogate Key Database Architectures

Performance Evaluation of Natural and Surrogate Key Database Architectures Performance Evaluation of Natural and Surrogate Key Database Architectures Sebastian Link 1, Ivan Luković 2, Pavle ogin *)1 1 Victoria University of Wellington, Wellington, P.O. Box 600, New Zealand sebastian.link@vuw.ac.nz

More information

DISTRIBUTED AND PARALLELL DATABASE

DISTRIBUTED AND PARALLELL DATABASE DISTRIBUTED AND PARALLELL DATABASE SYSTEMS Tore Risch Uppsala Database Laboratory Department of Information Technology Uppsala University Sweden http://user.it.uu.se/~torer PAGE 1 What is a Distributed

More information

MS SQL Performance (Tuning) Best Practices:

MS SQL Performance (Tuning) Best Practices: MS SQL Performance (Tuning) Best Practices: 1. Don t share the SQL server hardware with other services If other workloads are running on the same server where SQL Server is running, memory and other hardware

More information

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 23 Database

More information

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key, and:

A binary search tree or BST is a binary tree that is either empty or in which the data element of each node has a key, and: Binary Search Trees 1 The general binary tree shown in the previous chapter is not terribly useful in practice. The chief use of binary trees is for providing rapid access to data (indexing, if you will)

More information

DATABASDESIGN FÖR INGENJÖRER - 1DL124

DATABASDESIGN FÖR INGENJÖRER - 1DL124 1 DATABASDESIGN FÖR INGENJÖRER - 1DL124 Sommar 2007 En introduktionskurs i databassystem http://user.it.uu.se/~udbl/dbt-sommar07/ alt. http://www.it.uu.se/edu/course/homepage/dbdesign/st07/ Kjell Orsborn

More information

Tables so far. set() get() delete() BST Average O(lg n) O(lg n) O(lg n) Worst O(n) O(n) O(n) RB Tree Average O(lg n) O(lg n) O(lg n)

Tables so far. set() get() delete() BST Average O(lg n) O(lg n) O(lg n) Worst O(n) O(n) O(n) RB Tree Average O(lg n) O(lg n) O(lg n) Hash Tables Tables so far set() get() delete() BST Average O(lg n) O(lg n) O(lg n) Worst O(n) O(n) O(n) RB Tree Average O(lg n) O(lg n) O(lg n) Worst O(lg n) O(lg n) O(lg n) Table naïve array implementation

More information

6. Storage and File Structures

6. Storage and File Structures ECS-165A WQ 11 110 6. Storage and File Structures Goals Understand the basic concepts underlying different storage media, buffer management, files structures, and organization of records in files. Contents

More information

Ordered Lists and Binary Trees

Ordered Lists and Binary Trees Data Structures and Algorithms Ordered Lists and Binary Trees Chris Brooks Department of Computer Science University of San Francisco Department of Computer Science University of San Francisco p.1/62 6-0:

More information

Questions 1 through 25 are worth 2 points each. Choose one best answer for each.

Questions 1 through 25 are worth 2 points each. Choose one best answer for each. Questions 1 through 25 are worth 2 points each. Choose one best answer for each. 1. For the singly linked list implementation of the queue, where are the enqueues and dequeues performed? c a. Enqueue in

More information

Algorithms and Data Structures

Algorithms and Data Structures Algorithms and Data Structures Part 2: Data Structures PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering (CiE) Summer Term 2016 Overview general linked lists stacks queues trees 2 2

More information

DATA STRUCTURES USING C

DATA STRUCTURES USING C DATA STRUCTURES USING C QUESTION BANK UNIT I 1. Define data. 2. Define Entity. 3. Define information. 4. Define Array. 5. Define data structure. 6. Give any two applications of data structures. 7. Give

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

Chapter 13 File and Database Systems

Chapter 13 File and Database Systems Chapter 13 File and Database Systems Outline 13.1 Introduction 13.2 Data Hierarchy 13.3 Files 13.4 File Systems 13.4.1 Directories 13.4. Metadata 13.4. Mounting 13.5 File Organization 13.6 File Allocation

More information

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao CMPSCI 445 Midterm Practice Questions NAME: LOGIN: Write all of your answers directly on this paper. Be sure to clearly

More information

Database 2 Lecture I. Alessandro Artale

Database 2 Lecture I. Alessandro Artale Free University of Bolzano Database 2. Lecture I, 2003/2004 A.Artale (1) Database 2 Lecture I Alessandro Artale Faculty of Computer Science Free University of Bolzano Room: 221 artale@inf.unibz.it http://www.inf.unibz.it/

More information

Data Structure with C

Data Structure with C Subject: Data Structure with C Topic : Tree Tree A tree is a set of nodes that either:is empty or has a designated node, called the root, from which hierarchically descend zero or more subtrees, which

More information

1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D.

1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D. 1. The memory address of the first element of an array is called A. floor address B. foundation addressc. first address D. base address 2. The memory address of fifth element of an array can be calculated

More information

The Advantages and Disadvantages of Network Computing Nodes

The Advantages and Disadvantages of Network Computing Nodes Big Data & Scripting storage networks and distributed file systems 1, 2, in the remainder we use networks of computing nodes to enable computations on even larger datasets for a computation, each node

More information

Converting a Number from Decimal to Binary

Converting a Number from Decimal to Binary Converting a Number from Decimal to Binary Convert nonnegative integer in decimal format (base 10) into equivalent binary number (base 2) Rightmost bit of x Remainder of x after division by two Recursive

More information

Binary Trees and Huffman Encoding Binary Search Trees

Binary Trees and Huffman Encoding Binary Search Trees Binary Trees and Huffman Encoding Binary Search Trees Computer Science E119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Motivation: Maintaining a Sorted Collection of Data A data dictionary

More information

Structure for String Keys

Structure for String Keys Burst Tries: A Fast, Efficient Data Structure for String Keys Steen Heinz Justin Zobel Hugh E. Williams School of Computer Science and Information Technology, RMIT University Presented by Margot Schips

More information

Sample Questions Csci 1112 A. Bellaachia

Sample Questions Csci 1112 A. Bellaachia Sample Questions Csci 1112 A. Bellaachia Important Series : o S( N) 1 2 N N i N(1 N) / 2 i 1 o Sum of squares: N 2 N( N 1)(2N 1) N i for large N i 1 6 o Sum of exponents: N k 1 k N i for large N and k

More information

Chapter 12 File Management

Chapter 12 File Management Operating Systems: Internals and Design Principles Chapter 12 File Management Eighth Edition By William Stallings Files Data collections created by users The File System is one of the most important parts

More information

Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child

Binary Search Trees. Data in each node. Larger than the data in its left child Smaller than the data in its right child Binary Search Trees Data in each node Larger than the data in its left child Smaller than the data in its right child FIGURE 11-6 Arbitrary binary tree FIGURE 11-7 Binary search tree Data Structures Using

More information

Query Processing C H A P T E R12. Practice Exercises

Query Processing C H A P T E R12. Practice Exercises C H A P T E R12 Query Processing Practice Exercises 12.1 Assume (for simplicity in this exercise) that only one tuple fits in a block and memory holds at most 3 blocks. Show the runs created on each pass

More information

External Memory Geometric Data Structures

External Memory Geometric Data Structures External Memory Geometric Data Structures Lars Arge Department of Computer Science University of Aarhus and Duke University Augues 24, 2005 1 Introduction Many modern applications store and process datasets

More information

Persistent Binary Search Trees

Persistent Binary Search Trees Persistent Binary Search Trees Datastructures, UvA. May 30, 2008 0440949, Andreas van Cranenburgh Abstract A persistent binary tree allows access to all previous versions of the tree. This paper presents

More information

Binary search tree with SIMD bandwidth optimization using SSE

Binary search tree with SIMD bandwidth optimization using SSE Binary search tree with SIMD bandwidth optimization using SSE Bowen Zhang, Xinwei Li 1.ABSTRACT In-memory tree structured index search is a fundamental database operation. Modern processors provide tremendous

More information

In-Memory Databases MemSQL

In-Memory Databases MemSQL IT4BI - Université Libre de Bruxelles In-Memory Databases MemSQL Gabby Nikolova Thao Ha Contents I. In-memory Databases...4 1. Concept:...4 2. Indexing:...4 a. b. c. d. AVL Tree:...4 B-Tree and B+ Tree:...5

More information

Merkle Hash Trees for Distributed Audit Logs

Merkle Hash Trees for Distributed Audit Logs Merkle Hash Trees for Distributed Audit Logs Subject proposed by Karthikeyan Bhargavan Karthikeyan.Bhargavan@inria.fr April 7, 2015 Modern distributed systems spread their databases across a large number

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms CS245-2016S-06 Binary Search Trees David Galles Department of Computer Science University of San Francisco 06-0: Ordered List ADT Operations: Insert an element in the list

More information

DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT

DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT DNS LOOKUP SYSTEM DATA STRUCTURES AND ALGORITHMS PROJECT REPORT By GROUP Avadhut Gurjar Mohsin Patel Shraddha Pandhe Page 1 Contents 1. Introduction... 3 2. DNS Recursive Query Mechanism:...5 2.1. Client

More information

M-way Trees and B-Trees

M-way Trees and B-Trees Carlos Moreno cmoreno @ uwaterloo.ca EIT-4103 https://ece.uwaterloo.ca/~cmoreno/ece250 Standard reminder to set phones to silent/vibrate mode, please! Once upon a time... in a course that we all like to

More information

Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2)

Sorting revisited. Build the binary search tree: O(n^2) Traverse the binary tree: O(n) Total: O(n^2) + O(n) = O(n^2) Sorting revisited How did we use a binary search tree to sort an array of elements? Tree Sort Algorithm Given: An array of elements to sort 1. Build a binary search tree out of the elements 2. Traverse

More information

Couchbase Server Under the Hood

Couchbase Server Under the Hood Couchbase Server Under the Hood An Architectural Overview Couchbase Server is an open-source distributed NoSQL document-oriented database for interactive applications, uniquely suited for those needing

More information

Hash Tables. Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited

Hash Tables. Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited Hash Tables Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited We ve considered several data structures that allow us to store and search for data

More information

How To Create A Tree From A Tree In Runtime (For A Tree)

How To Create A Tree From A Tree In Runtime (For A Tree) Binary Search Trees < 6 2 > = 1 4 8 9 Binary Search Trees 1 Binary Search Trees A binary search tree is a binary tree storing keyvalue entries at its internal nodes and satisfying the following property:

More information

Vector storage and access; algorithms in GIS. This is lecture 6

Vector storage and access; algorithms in GIS. This is lecture 6 Vector storage and access; algorithms in GIS This is lecture 6 Vector data storage and access Vectors are built from points, line and areas. (x,y) Surface: (x,y,z) Vector data access Access to vector

More information

Full and Complete Binary Trees

Full and Complete Binary Trees Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full

More information

Lecture Notes on Binary Search Trees

Lecture Notes on Binary Search Trees Lecture Notes on Binary Search Trees 15-122: Principles of Imperative Computation Frank Pfenning André Platzer Lecture 17 October 23, 2014 1 Introduction In this lecture, we will continue considering associative

More information

Chapter 1 File Organization 1.0 OBJECTIVES 1.1 INTRODUCTION 1.2 STORAGE DEVICES CHARACTERISTICS

Chapter 1 File Organization 1.0 OBJECTIVES 1.1 INTRODUCTION 1.2 STORAGE DEVICES CHARACTERISTICS Chapter 1 File Organization 1.0 Objectives 1.1 Introduction 1.2 Storage Devices Characteristics 1.3 File Organization 1.3.1 Sequential Files 1.3.2 Indexing and Methods of Indexing 1.3.3 Hash Files 1.4

More information

A COOL AND PRACTICAL ALTERNATIVE TO TRADITIONAL HASH TABLES

A COOL AND PRACTICAL ALTERNATIVE TO TRADITIONAL HASH TABLES A COOL AND PRACTICAL ALTERNATIVE TO TRADITIONAL HASH TABLES ULFAR ERLINGSSON, MARK MANASSE, FRANK MCSHERRY MICROSOFT RESEARCH SILICON VALLEY MOUNTAIN VIEW, CALIFORNIA, USA ABSTRACT Recent advances in the

More information

Lecture Notes on Binary Search Trees

Lecture Notes on Binary Search Trees Lecture Notes on Binary Search Trees 15-122: Principles of Imperative Computation Frank Pfenning Lecture 17 March 17, 2010 1 Introduction In the previous two lectures we have seen how to exploit the structure

More information

Chapter Objectives. Chapter 9. Sequential Search. Search Algorithms. Search Algorithms. Binary Search

Chapter Objectives. Chapter 9. Sequential Search. Search Algorithms. Search Algorithms. Binary Search Chapter Objectives Chapter 9 Search Algorithms Data Structures Using C++ 1 Learn the various search algorithms Explore how to implement the sequential and binary search algorithms Discover how the sequential

More information

Project Group High- performance Flexible File System 2010 / 2011

Project Group High- performance Flexible File System 2010 / 2011 Project Group High- performance Flexible File System 2010 / 2011 Lecture 1 File Systems André Brinkmann Task Use disk drives to store huge amounts of data Files as logical resources A file can contain

More information

Memory Allocation. Static Allocation. Dynamic Allocation. Memory Management. Dynamic Allocation. Dynamic Storage Allocation

Memory Allocation. Static Allocation. Dynamic Allocation. Memory Management. Dynamic Allocation. Dynamic Storage Allocation Dynamic Storage Allocation CS 44 Operating Systems Fall 5 Presented By Vibha Prasad Memory Allocation Static Allocation (fixed in size) Sometimes we create data structures that are fixed and don t need

More information

Fundamental Algorithms

Fundamental Algorithms Fundamental Algorithms Chapter 7: Hash Tables Michael Bader Winter 2014/15 Chapter 7: Hash Tables, Winter 2014/15 1 Generalised Search Problem Definition (Search Problem) Input: a sequence or set A of

More information

Chapter 10: Storage and File Structure

Chapter 10: Storage and File Structure Chapter 10: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files Data-Dictionary Storage

More information

Original-page small file oriented EXT3 file storage system

Original-page small file oriented EXT3 file storage system Original-page small file oriented EXT3 file storage system Zhang Weizhe, Hui He, Zhang Qizhen School of Computer Science and Technology, Harbin Institute of Technology, Harbin E-mail: wzzhang@hit.edu.cn

More information

Binary Search Trees 3/20/14

Binary Search Trees 3/20/14 Binary Search Trees 3/0/4 Presentation for use ith the textbook Data Structures and Algorithms in Java, th edition, by M. T. Goodrich, R. Tamassia, and M. H. Goldasser, Wiley, 04 Binary Search Trees 4

More information

Physical Database Design Process. Physical Database Design Process. Major Inputs to Physical Database. Components of Physical Database Design

Physical Database Design Process. Physical Database Design Process. Major Inputs to Physical Database. Components of Physical Database Design Physical Database Design Process Physical Database Design Process The last stage of the database design process. A process of mapping the logical database structure developed in previous stages into internal

More information