How To Optimize A Trie Structure

Size: px
Start display at page:

Download "How To Optimize A Trie Structure"

Transcription

1 , October, 2014, San Francisco, USA Evaluating Alternative Structures for Prefix Trees Feras Hanandeh, Izzat Alsmadi and Muhammad M. Kwafha Abstract Prefix trees or tries are data structures that are used to store data or index of data. The goal is to be able to store and retrieve data by executing queries in quick and reliable manners. In principle, the structure of the trie depends on having letters in nodes at the different levels to point to the actual words in the leafs. However, the exact structure of the trie may vary based on several aspects. In this paper, we evaluated different structures for building tries. Using datasets of words of different sizes, we evaluated the different forms of trie structures. Results showed that some characteristics may impact significantly, positively or negatively, the size and the performance of the trie. We investigated different forms and structures for the trie. Results showed that using an array of pointers in each level to represent the different alphabet letters is the best choice. it is a leaf). This can dynamically change if more words are added. Finding a word in a trie depends on the size of the ree or the number of words along with its structure. The depth of the nodes that a query can go depends on the number of words in the searched for word. If the word does not exist in the tree, the longest node sequence is performed. Figure 1 show a simple example of trie structure where each node stores an array of alphabet keys or pointers. Index Terms data structures, indexing, tree structure, trie, information retrieval. I. INTRODUCTION Data structures are used to save and retrieve a large amount of aggregated data. They can vary in structure based on the nature or the purpose for having or using such data structure. Performance or the speed of storing and accessing data in those data structures are the main characteristics that can judge the quality of any data structure. Any current natural language such as: English, French, Arabic, etc. can have number of words up to one million words although dictionaries may not contain all such words especially as Languages continuously grow to add new words or borrow words from other languages. Current versions of Oxford English Dictionary may have up to half million words. As such, a software product or web application that needs or use a dictionary should have efficient data structures for effective: storage, access and expansion of data or words. Prefix trees or tries are data structures that are usually used to store dictionaries or words. Their nature of structure can facilitate retrieving queried words quickly. In each trie, nodes form the children that can further be parents for lower nodes. Nodes contain letters that represent keys or pointers to words (or the rest of the words) at the lowest leaf levels. In principle, each node can contain the searched for word (if Feras Hanandeh, Dept. of Computer Information Systems, Faculty of Prince Al-Hussein Bin Abdallah II For Information Technology, Hashemite University, Zarqa, Jordan, feras@hu.edu.jo Izzat Alsmadi, Dept. of Computer Information Systems, IT & CS Faculty, Prince Sultan University, Riyadh, KSA, ialsmadi@cis.psu.edu.sa Muhammad M. Kwafha,Dept. of Computer & Information Technology, Al- Balqa Applied University, Irbid, Jordan, mohkofahi@hotmail.com. Figure 1: A trie structure for example 1 Figure 2 shows another example of a trie where each node contains the array of alphabets and an underlined letter indicates that this letter is a leaf for at least one word. As we will describe later, using array of pointers can optimize size in many cases. Figure 2: A trie structure for example 2 There are several factors that should be considered when designing a trie. For example: The memory size that each node should have affects the total trie size. The size of the whole trie is the aggregated size of all its nodes. Nodes can be empty, store one letter as

2 , October, 2014, San Francisco, USA pointer, or store up to certain number of letters (say 20) to represent words. A good trie structure can store the same number of words with minimum size. Of course taking a small number of words is not enough as a good trie structure. It should not be customized based on a limited number of words (for example 10 words that all start with the letters: AB). Such trie may be better in comparison with others based only on this specific set of words. The node level specifies its location in words. For example, a node with letter E in the second level may represent all words that their second letter is E (e.g. tea, sea, key). If the nodes represent single letters, then examples of match words will only be for example: sea, set, sell, seat, etc. While in the current large amount of available memories and disk sizes, size may not be a critical factor, however, if such trie can achieve lowering the overall size without impacting other aspects such as speed of storage or retrieval, then this can be still a major comparison factor. Trie structure nature: certain structures can facilitate not only quicker access and retrieval of information; they can further facilitate smooth expansion of such structures. In many cases, it is necessary for a dictionary to accept the addition of new words and hence the structure should facilitate this expansion smoothly and dynamically. This can be for either fragmentation or for allocation/reallocation. In some other cases, it maybe necessary to compress, encode or encrypt data in those tree structures. The performance of the structure can be highly dependent on the dataset of words, the distribution of words, or the number of letters in the words. It can be also language dependent especially if the number of letters in the language alphabet is small or large. Each node of a tree store two kinds of data: user data and trie-related data. Let say alphabet has at most 256 letters, thus letter could be saved in a byte. For space optimization, it is best to store nodes in a pair form (e.g. KeyValuePair<char, TrieNode>). The first part which will take only one charachter stores the key part of the node which is one charachter size. The second part is a node, an object which further can be represented by charachters and nodes. As we will show later on, this is shown to be the best optimized approach to use for building a trie. This can be represented by a template in C++ or generic in Java and C#. While the subject of prefix trees or tries is not new, new techniques to improve the structure of the trie showed significant improvements. This is largely due to the creation of new programming language components or libraries. The rest of this paper is organized as follows: Section two presents' related studies to the paper subject, section three shows the proposed methodology, section four presents implementation and experimental results, Last section presents the conclusions and future work. II. RELATED WORK Data structures such as trees evolved from the early 19th century. Tries evolved from trees. They were called different names such as: Radix tree, compact trie, bucket trie, prefix tree, Crit bit tree, and PATRICIA (Practical Algorithm to Retrieve Information Coded in Alphanumeric). Some current papers argue that new data structures such as Hash tables or Linked structures can be better alternatives for tries in terms of flexibility and performance. In the current general form, it is believed that tries were first proposed by Morrison [1].Those different names may have some differences in the detail structure. For example, unlike PATRICIA tree nodes that store keys and words, with the exception of leaf nodes, nodes in the trie work merely as pointers to words [9, 10]. Before Morrison, Fredkin published a paper titled (Trie Memory) [2] describing the trie structure. Fredkin wrote in 2008 As defined by me, nearly 50 years ago, it is properly pronounced "tree" as in the word "retrieval". At least that was my intent when I gave it the name "Trie". The idea behind the name was to combine reference to both the structure (a tree structure) and a major purpose (data storage and retrieval). Looking at the size issue, it is estimated that using PATRICIA trie for 100,000 words, in current typical desktop, trie size can be around 5 MBytes (assuming average size of english keywords of 5 letters. In [3], Heinz claimed that Burst-trie version of prefix trees is the fastest. This trie tried to further reduce the number of nodes by collapsing similar nodes that share same prefixes. The Buckets or nodes were represented using linked lists. Later papers claimed that Bursts can be further reduced using caches. The main goal or enhancement of Burst trie over traditional trie is in reducing the number of required search cycles to retrieve subject or query. Tanhermhong et al in [12] proposed a new tri-based structure called (structure-shared trie) with the goal of data compression to reduce size. The main idea is to utilize unused space within the trie structure. Using tries for approximate string matching discussed by Shang and Merrett in [11]. The proposed approach claimed to have a search process that is independent of the subject document size. The search depends on an approximate match between query and searched for text. Such algorithms can be used for recommendations systems where the information system can suggest alternative approximate terms for users queries. Askitis and Sinha proposed HAT-trie as an alternative better trie representation [4]. This is based on the previous approach: Bucket-trie where Buckets are divided using B- tree splitting. The paper tried to improve Burst-tries through caching and using Hash tables. Authors assembled several datasets of texts for comparison for their data structure with some known ones (i.e. known design structures for tries) and showed improvement in performance and memory size.

3 , October, 2014, San Francisco, USA Askitis and Sinha [13] proposed an approach to combine a trie and a Hashtable (Hat trie). The goal is to optimize in memory data usage in terms of efficiency and speed. Askitis and Zobel in [15] discussed optimizing efficient data structures such as Hash tables and Burst tries based on caching. Authors claimed that storage can be significantly reduced. be used in a particular trie instance. This is why many recent trie structure approaches tried to optimize space through some compression techniques. In [6], Knuth proposed enhancing performance in tries through flexible size pointers of array lists in comparison with original fixed size pointers. This requires non-root nodes to be highly occupied in order for such alternative to be competitive. Nodes can share keys with neighbors rather than split when get full. Behdadfar and Saidi used trie search for IP routing table search optimization where they tried to scale tries to deal with large size data structures such as those of IP addresses [7]. They tried to sort nodes through encoding their addresses with numbers which can be reached faster based on encoded numbers. They found out that some methods can be optimized for search or query as methods maybe optimized for an addition or update process to the trie table. Bando and Chao tried also to use trie compression for enhancing IP lookup [8]. The proposed Flash trie data structure used compression techniques to reduce the overall size of the trie. A recent paper, Leis et al in 2013 [14], discussed an approach to enhance Radix trees for efficient indexing in main memory. The approach is compared in terms of performance with Hash-tables. Several other recent papers such as Böhm et al in [15] discussed recent issues related to in-memory trie structures optimization in terms of size, efficiency, performance, etc. III. TRIE STRUCTURE IMPLEMENTATION We tried several alternative structures for trie or prefix trees. The different alternatives are based either on using different variable or data types (e.g. strings, array of strings, characters, array of characters, templates, generics, etc.) or based on the way the trie nodes are formed. The general agreed upon structure for the trie assume nodes in the different levels with alphabet letters in those nodes as prefixes to the words. Each node can hold only one letter from each word. However, if the letter is a leaf, it can hold more than one letter if no split is required. Later on, an addition of a word may require this node of several letters to be further divided. In other words, in a complete full structure, a word of three letters will span over three nodes letter in each node. That does not prevent a single node to be an array of pointers of several different letters as it can be still seen as a single node given the ability of some data types representation to hold an array of pointers. Figure 3 shows a sample trie structure. Not all node elements should Figure 3: A trie structure New technologies and data structures impact the usage of older data structures such as trees. As such, we tried to evaluate the impact of utilizing new programming or data type elements in the structure of the trie in terms of size or memory usage and space utilization. In order to make a fair comparison, we implemented all different alternative implementation in one programming language (.NET C#). For each program, three classes are created, Trie, Node and Main class. In the main class, for each experiment a trie is created, experimental text files are parsed and save in a string array word by word. After that, words from the string array are inserted in the trie nodes through a loop. Size metrics are then collected at the end of the insertion process. IV. EXPERIMENT RESULTS To investigate the efficiency of the different evaluated trie structures, several text files of words from English dictionary are assembled. They are selected randomly from the dictionary with only size or number of words as the main concern. The small sets are selected all from the start of the dictionary. This means that all those below a certain size (e.g words) will be from the first English letter (A). As tries or prefix trees structure depends on the first letters of the words or prefixes, selecting all words that start from one letter may indicate different results from selecting a set of words that span over all alphabet letters. Later random selection of words from all dictionary locations will be implemented to ensure a fair distribution of words. In the largest tested size, we tried to a dictionary file of 349,900 words or MByte memory size. Table 1 below summarizes files selected for testing, number of words and approximate memory size. Memory size is approximate since we selected each test set based on the number of words and repeat the process several times where some selections maybe slightly different from others based on the selected words.

4 , October, 2014, San Francisco, USA TABLE 1 NO OF WORDS AND SIZE OF EVALUATED DICTIONARY FILES No. Of words Size (MByte) No. of selections , , , , , Size of tries and nodes is measured based on debugging source code and also using memory profilers. In the first experiment three different alternative trie structures (prog1, prog2, and prog3) were experimented on small data files. For each selected size (i.e. 100, 500, 2,000, 5,000, and 10,000 words), 5 different selections are taken and average values of those five selections is considered. The first two columns (size of created nodes, and number of nodes) were collected from memory profilers while the last two (root direct children and trie size) were collected from code debugging. No. Of words (program) 100(1) TABLE 2 SUMMARY OF RESULTS FOR THE FIRST EXPERIMENT Size of created Nodes (Kbyte) No. of nodes in each level (max) Root direct children Trie size: children/nodes Results in Table 2 shows that trie structure of program two is better in terms of size utilization for all text file sizes. Number of nodes in each level column has the same value for all program alternatives as this has to do with the general trie-node structure which is similar. However, the first column on the right (trie size) shows significantly why program two is the best in terms of size utilization. Trie size is a method we defined (Figure 4) to show the total number of actually utilized nodes. Notice that this number changes based on stored data. This value is the same for programs 1 and 3. The value of actual trie size is significantly reduced in program 2 when size of text file is large. The key to the difference between program 2 at one side from programs 1 and 3 at another side can be seen looking the second column from the right (number of root direct children). Program two creates for each trie level nodes of the number of alphabets regardless of the size or the nature of the data. However, for programs 1 and 3 and since all selected data in Table 2 are from the letter (A) in the dictionary, those two programs have one child node from the root which is the node of the (A) letter. Program 2 created 27 nodes even if one node or letter is utilized. However, each node stores two elements: a key of type character for the key letter in the node and 27 elements of type Node. The utilization of this structure can be further seen if we selected words from all letters of the dictionary. In the second experiment three different alternative trie structures (prog1, prog2, and prog3) were experimented on large files. We used the two large files (i.e. 72,858 and 349,900 words) to evaluate the difference between the three different alternative structures. Table 3 below shows the summary of results. 100(2) 100(3) 500(1) 500(2) 500(3) 2000(1) 2000(2) 2000(3) 5000(1) 5000(2) 5000(3) 10000(1) 10000(2) 10000(3) , , , , , , , , Figure 4: Trie-size: Number of utilized nodes Figure 5 and 6 show snapshots from ANTS memory profiler. Two of the column variables are collected from this profiler: Nodes size and node or live instances. This is repeated for every tested file. In Figure 5 TrieNode [] is an array of nodes that was implemented in one version of the programs [program 2].

5 , October, 2014, San Francisco, USA Figure 5: A snapshot view from ANTS memory profiler: array nodes Figure 6: A snapshot view from ANTS memory profiler: single nodes Table 3 shows a summary of results for the second experiment with the large size text files. The experiment is conducted using the three developed versions of Trie. Table 3 shows summary of results. We focused only on code data since used memory profiler showed inconsistent results for large size data structures. TABLE 3 SUMMARY OF RESULTS FOR THE SECOND EXPERIMENT No. Of words (program) 72,858(1) 72,858(2) 72,858(3) 349,900(1) 349,900(2) 349,900(3) Root direct children Trie size: children/nodes 27 14, , , , , ,931 Table 3 shows that all alternative programs used the 27 first level nodes due to the large size used datasets. Size in program 2 is significantly less than the size of the other two programs in most cases. As mentioned earlier, results can vary based on the nature of data and the hardware or computer components used in the experiments or evaluation. V. CONCLUSIONS AND FUTURE WORK The continuous development and production of new hardware and programming components and data structures have impacts on several quality attributes related to software products. In this paper, we evaluated different alternative structures for prefix trees or tries. We used three different versions to implement the structure of tries. We then assembled several testing documents with different sizes that include words from English dictionary. Metrics related to size and usage of memory and computer resources are collected for all investigated test files. Results showed while all different evaluated versions share the same common trie structure, yet using some of the new programming components can significantly improve the trie structure and optimize its memory usage. While in this paper, we tried to focus only on one aspect related to size, several other aspects should be evaluated related to the speed of trie data insertion and update. The speed of query retrieval is also important especially for some applications such as library or web indexers. In the current large size applications and data, using a proper data structure that can optimize memory and resources usage and accelerate the speed of adding and retrieving data is always important and necessary. REFERENCES [1] Morrison, D. (1968) ' PATRICIA Practical Algorithm To Retrieve Information Coded in Alphanumeric', Journal of the ACM (JACM), Vol. 15, No. 4, pp [2] Fredkin, E. (1960) 'Trie Memory', CACM, Vol. 3, No. 9, pp [3] Heinz, S., Zobel, J. and Williams, H. (2002) 'Burst tries: a fast, efficient data structure for string keys', ACM Transactions on Information Systems (TOIS), Vol. 20, No. 2, pp [4] Askitis, N. and Sinha, R. (2007) 'HAT-trie: a cache-conscious triebased data structure for strings', ACSC '07 Proceedings of the thirtieth Australasian conference on Computer science, Vol. 62, pp [5] Askitis,N. and Zobel,J(2011).' Redesigning the String Hash Table, Burst Trie, and BST to Exploit Cache', ACM Journal of Experimental Algorithmics, Vol. 15, No. 1. [6] Knuth, D. (1998) The art of computer programming, 2nd ed., Redwood City, CA, USA. [7] Behdadfar, M. and Saidi, H.(2008) ' The CPBT: a method for searching the prefixes using coded prefixes in B-tree', NETWORKING'08 Proceedings of the 7th international IFIP-TC6 networking conference on AdHoc and sensor networks, wireless networks, next generation internet, pp [8] Bando, M. and Chao, J.(2010)' FlashTrie: Hash-based Prefix- Compressed Trie for IP Route Lookup Beyond 100Gbps', IEEE Communications Society subject matter experts for publication in the IEEE INFOCOM 2010 proceedings, pp [9] Knizhnik, K. (2008) 'Patricia Tries: A Better Index For Prefix Searches', Dr. Dobb's Journal. [10] Knessl,C. and Szpankowski,W.(1999) ' Limit laws for heights in generalized tries and PATRICIA tries'. [11] Shang, H. and Merrettal, T. (1996) 'Tries for Approximate String Matching', IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 4, pp [12] Tanhermhong, T., Theeramunkong, T. and Chinnan, W. (2001) 'A Structure-Shared Trie Compression Method', Proceedings of The 15th Pacific Asia Conference.

6 , October, 2014, San Francisco, USA [13] Askitis, N. and Sinha, R. (2010) 'cache and space efficient tries for strings', The VLDB Journal The International Journal on Very Large Data Bases, Vol. 19, No. 5, pp [14] Leis, V., Kemper, A. and Neumann, T.(2013) ' The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases', 29th IEEE International Conference on Data Engineering (ICDE 2013)., Brisbane, Australia, pp [15] Böhm, M., Schlegel, B., Volk, P.B., Fischer, U., Habich, D. and Lehner, W., (2011) 'Efficient in-memory indexing with generalized prefix trees', In BTW, pp

A Comparison of Dictionary Implementations

A Comparison of Dictionary Implementations A Comparison of Dictionary Implementations Mark P Neyer April 10, 2009 1 Introduction A common problem in computer science is the representation of a mapping between two sets. A mapping f : A B is a function

More information

Structure for String Keys

Structure for String Keys Burst Tries: A Fast, Efficient Data Structure for String Keys Steen Heinz Justin Zobel Hugh E. Williams School of Computer Science and Information Technology, RMIT University Presented by Margot Schips

More information

Physical Data Organization

Physical Data Organization Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

More information

DATABASE DESIGN - 1DL400

DATABASE DESIGN - 1DL400 DATABASE DESIGN - 1DL400 Spring 2015 A course on modern database systems!! http://www.it.uu.se/research/group/udbl/kurser/dbii_vt15/ Kjell Orsborn! Uppsala Database Laboratory! Department of Information

More information

Data Structures For IP Lookup With Bursty Access Patterns

Data Structures For IP Lookup With Bursty Access Patterns Data Structures For IP Lookup With Bursty Access Patterns Sartaj Sahni & Kun Suk Kim sahni, kskim @cise.ufl.edu Department of Computer and Information Science and Engineering University of Florida, Gainesville,

More information

SMALL INDEX LARGE INDEX (SILT)

SMALL INDEX LARGE INDEX (SILT) Wayne State University ECE 7650: Scalable and Secure Internet Services and Architecture SMALL INDEX LARGE INDEX (SILT) A Memory Efficient High Performance Key Value Store QA REPORT Instructor: Dr. Song

More information

Storage Management for Files of Dynamic Records

Storage Management for Files of Dynamic Records Storage Management for Files of Dynamic Records Justin Zobel Department of Computer Science, RMIT, GPO Box 2476V, Melbourne 3001, Australia. jz@cs.rmit.edu.au Alistair Moffat Department of Computer Science

More information

CSE 326: Data Structures B-Trees and B+ Trees

CSE 326: Data Structures B-Trees and B+ Trees Announcements (4//08) CSE 26: Data Structures B-Trees and B+ Trees Brian Curless Spring 2008 Midterm on Friday Special office hour: 4:-5: Thursday in Jaech Gallery (6 th floor of CSE building) This is

More information

B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers

B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers B+ Tree and Hashing B+ Tree Properties B+ Tree Searching B+ Tree Insertion B+ Tree Deletion Static Hashing Extendable Hashing Questions in pass papers B+ Tree Properties Balanced Tree Same height for paths

More information

Heaps & Priority Queues in the C++ STL 2-3 Trees

Heaps & Priority Queues in the C++ STL 2-3 Trees Heaps & Priority Queues in the C++ STL 2-3 Trees CS 3 Data Structures and Algorithms Lecture Slides Friday, April 7, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks

More information

Original-page small file oriented EXT3 file storage system

Original-page small file oriented EXT3 file storage system Original-page small file oriented EXT3 file storage system Zhang Weizhe, Hui He, Zhang Qizhen School of Computer Science and Technology, Harbin Institute of Technology, Harbin E-mail: wzzhang@hit.edu.cn

More information

Electronic Document Management Using Inverted Files System

Electronic Document Management Using Inverted Files System EPJ Web of Conferences 68, 0 00 04 (2014) DOI: 10.1051/ epjconf/ 20146800004 C Owned by the authors, published by EDP Sciences, 2014 Electronic Document Management Using Inverted Files System Derwin Suhartono,

More information

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92.

CSE 326, Data Structures. Sample Final Exam. Problem Max Points Score 1 14 (2x7) 2 18 (3x6) 3 4 4 7 5 9 6 16 7 8 8 4 9 8 10 4 Total 92. Name: Email ID: CSE 326, Data Structures Section: Sample Final Exam Instructions: The exam is closed book, closed notes. Unless otherwise stated, N denotes the number of elements in the data structure

More information

Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm

Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm Lempel-Ziv Coding Adaptive Dictionary Compression Algorithm 1. LZ77:Sliding Window Lempel-Ziv Algorithm [gzip, pkzip] Encode a string by finding the longest match anywhere within a window of past symbols

More information

CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions

CS 2112 Spring 2014. 0 Instructions. Assignment 3 Data Structures and Web Filtering. 0.1 Grading. 0.2 Partners. 0.3 Restrictions CS 2112 Spring 2014 Assignment 3 Data Structures and Web Filtering Due: March 4, 2014 11:59 PM Implementing spam blacklists and web filters requires matching candidate domain names and URLs very rapidly

More information

Chapter 8: Structures for Files. Truong Quynh Chi tqchi@cse.hcmut.edu.vn. Spring- 2013

Chapter 8: Structures for Files. Truong Quynh Chi tqchi@cse.hcmut.edu.vn. Spring- 2013 Chapter 8: Data Storage, Indexing Structures for Files Truong Quynh Chi tqchi@cse.hcmut.edu.vn Spring- 2013 Overview of Database Design Process 2 Outline Data Storage Disk Storage Devices Files of Records

More information

Symbol Tables. Introduction

Symbol Tables. Introduction Symbol Tables Introduction A compiler needs to collect and use information about the names appearing in the source program. This information is entered into a data structure called a symbol table. The

More information

S. Muthusundari. Research Scholar, Dept of CSE, Sathyabama University Chennai, India e-mail: nellailath@yahoo.co.in. Dr. R. M.

S. Muthusundari. Research Scholar, Dept of CSE, Sathyabama University Chennai, India e-mail: nellailath@yahoo.co.in. Dr. R. M. A Sorting based Algorithm for the Construction of Balanced Search Tree Automatically for smaller elements and with minimum of one Rotation for Greater Elements from BST S. Muthusundari Research Scholar,

More information

Chapter 13: Query Processing. Basic Steps in Query Processing

Chapter 13: Query Processing. Basic Steps in Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Efficient IP-Address Lookup with a Shared Forwarding Table for Multiple Virtual Routers

Efficient IP-Address Lookup with a Shared Forwarding Table for Multiple Virtual Routers Efficient IP-Address Lookup with a Shared Forwarding Table for Multiple Virtual Routers ABSTRACT Jing Fu KTH, Royal Institute of Technology Stockholm, Sweden jing@kth.se Virtual routers are a promising

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 13-1 Slide 13-1 Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible

More information

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team

CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team CS104: Data Structures and Object-Oriented Design (Fall 2013) October 24, 2013: Priority Queues Scribes: CS 104 Teaching Team Lecture Summary In this lecture, we learned about the ADT Priority Queue. A

More information

Raima Database Manager Version 14.0 In-memory Database Engine

Raima Database Manager Version 14.0 In-memory Database Engine + Raima Database Manager Version 14.0 In-memory Database Engine By Jeffrey R. Parsons, Senior Engineer January 2016 Abstract Raima Database Manager (RDM) v14.0 contains an all new data storage engine optimized

More information

Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm

Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Journal of Al-Nahrain University Vol.15 (2), June, 2012, pp.161-168 Science Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Manal F. Younis Computer Department, College

More information

DiskTrie: An Efficient Data Structure Using Flash Memory for Mobile Devices

DiskTrie: An Efficient Data Structure Using Flash Memory for Mobile Devices DiskTrie: An Efficient Data Structure Using Flash Memory for Mobile Devices (Extended Abstract) N.M. Mosharaf Kabir Chowdhury, Md. Mostofa Akbar, and M. Kaykobad Department of Computer Science & Engineering

More information

IP Address Lookup Using A Dynamic Hash Function

IP Address Lookup Using A Dynamic Hash Function IP Address Lookup Using A Dynamic Hash Function Xiaojun Nie David J. Wilson Jerome Cornet Gerard Damm Yiqiang Zhao Carleton University Alcatel Alcatel Alcatel Carleton University xnie@math.carleton.ca

More information

Analysis of Compression Algorithms for Program Data

Analysis of Compression Algorithms for Program Data Analysis of Compression Algorithms for Program Data Matthew Simpson, Clemson University with Dr. Rajeev Barua and Surupa Biswas, University of Maryland 12 August 3 Abstract Insufficient available memory

More information

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

Chapter 13 Disk Storage, Basic File Structures, and Hashing. Chapter 13 Disk Storage, Basic File Structures, and Hashing. Copyright 2004 Pearson Education, Inc. Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files

More information

Efficient Multi-Feature Index Structures for Music Data Retrieval

Efficient Multi-Feature Index Structures for Music Data Retrieval header for SPIE use Efficient Multi-Feature Index Structures for Music Data Retrieval Wegin Lee and Arbee L.P. Chen 1 Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan 300,

More information

Chapter 13. Disk Storage, Basic File Structures, and Hashing

Chapter 13. Disk Storage, Basic File Structures, and Hashing Chapter 13 Disk Storage, Basic File Structures, and Hashing Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files Hashed Files Dynamic and Extendible Hashing

More information

An overview of FAT12

An overview of FAT12 An overview of FAT12 The File Allocation Table (FAT) is a table stored on a hard disk or floppy disk that indicates the status and location of all data clusters that are on the disk. The File Allocation

More information

Rethinking SIMD Vectorization for In-Memory Databases

Rethinking SIMD Vectorization for In-Memory Databases SIGMOD 215, Melbourne, Victoria, Australia Rethinking SIMD Vectorization for In-Memory Databases Orestis Polychroniou Columbia University Arun Raghavan Oracle Labs Kenneth A. Ross Columbia University Latest

More information

Big Data and Scripting. Part 4: Memory Hierarchies

Big Data and Scripting. Part 4: Memory Hierarchies 1, Big Data and Scripting Part 4: Memory Hierarchies 2, Model and Definitions memory size: M machine words total storage (on disk) of N elements (N is very large) disk size unlimited (for our considerations)

More information

Databases and Information Systems 1 Part 3: Storage Structures and Indices

Databases and Information Systems 1 Part 3: Storage Structures and Indices bases and Information Systems 1 Part 3: Storage Structures and Indices Prof. Dr. Stefan Böttcher Fakultät EIM, Institut für Informatik Universität Paderborn WS 2009 / 2010 Contents: - database buffer -

More information

FPGA-based Multithreading for In-Memory Hash Joins

FPGA-based Multithreading for In-Memory Hash Joins FPGA-based Multithreading for In-Memory Hash Joins Robert J. Halstead, Ildar Absalyamov, Walid A. Najjar, Vassilis J. Tsotras University of California, Riverside Outline Background What are FPGAs Multithreaded

More information

International Journal of Advanced Research in Computer Science and Software Engineering

International Journal of Advanced Research in Computer Science and Software Engineering Volume 3, Issue 7, July 23 ISSN: 2277 28X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Greedy Algorithm:

More information

Ordered Lists and Binary Trees

Ordered Lists and Binary Trees Data Structures and Algorithms Ordered Lists and Binary Trees Chris Brooks Department of Computer Science University of San Francisco Department of Computer Science University of San Francisco p.1/62 6-0:

More information

Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R

Binary Search Trees. A Generic Tree. Binary Trees. Nodes in a binary search tree ( B-S-T) are of the form. P parent. Key. Satellite data L R Binary Search Trees A Generic Tree Nodes in a binary search tree ( B-S-T) are of the form P parent Key A Satellite data L R B C D E F G H I J The B-S-T has a root node which is the only node whose parent

More information

Data Warehousing und Data Mining

Data Warehousing und Data Mining Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data

More information

Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing

Chapter 13. Chapter Outline. Disk Storage, Basic File Structures, and Hashing Chapter 13 Disk Storage, Basic File Structures, and Hashing Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files

More information

Scalable Prefix Matching for Internet Packet Forwarding

Scalable Prefix Matching for Internet Packet Forwarding Scalable Prefix Matching for Internet Packet Forwarding Marcel Waldvogel Computer Engineering and Networks Laboratory Institut für Technische Informatik und Kommunikationsnetze Background Internet growth

More information

B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees

B-Trees. Algorithms and data structures for external memory as opposed to the main memory B-Trees. B -trees B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:

More information

New Hash Function Construction for Textual and Geometric Data Retrieval

New Hash Function Construction for Textual and Geometric Data Retrieval Latest Trends on Computers, Vol., pp.483-489, ISBN 978-96-474-3-4, ISSN 79-45, CSCC conference, Corfu, Greece, New Hash Function Construction for Textual and Geometric Data Retrieval Václav Skala, Jan

More information

Persistent Binary Search Trees

Persistent Binary Search Trees Persistent Binary Search Trees Datastructures, UvA. May 30, 2008 0440949, Andreas van Cranenburgh Abstract A persistent binary tree allows access to all previous versions of the tree. This paper presents

More information

Lecture 1: Data Storage & Index

Lecture 1: Data Storage & Index Lecture 1: Data Storage & Index R&G Chapter 8-11 Concurrency control Query Execution and Optimization Relational Operators File & Access Methods Buffer Management Disk Space Management Recovery Manager

More information

Bitmap Index as Effective Indexing for Low Cardinality Column in Data Warehouse

Bitmap Index as Effective Indexing for Low Cardinality Column in Data Warehouse Bitmap Index as Effective Indexing for Low Cardinality Column in Data Warehouse Zainab Qays Abdulhadi* * Ministry of Higher Education & Scientific Research Baghdad, Iraq Zhang Zuping Hamed Ibrahim Housien**

More information

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles

Previous Lectures. B-Trees. External storage. Two types of memory. B-trees. Main principles B-Trees Algorithms and data structures for external memory as opposed to the main memory B-Trees Previous Lectures Height balanced binary search trees: AVL trees, red-black trees. Multiway search trees:

More information

Dynamic Programming. Lecture 11. 11.1 Overview. 11.2 Introduction

Dynamic Programming. Lecture 11. 11.1 Overview. 11.2 Introduction Lecture 11 Dynamic Programming 11.1 Overview Dynamic Programming is a powerful technique that allows one to solve many different types of problems in time O(n 2 ) or O(n 3 ) for which a naive approach

More information

Limitations of Packet Measurement

Limitations of Packet Measurement Limitations of Packet Measurement Collect and process less information: Only collect packet headers, not payload Ignore single packets (aggregate) Ignore some packets (sampling) Make collection and processing

More information

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer Computers CMPT 125: Lecture 1: Understanding the Computer Tamara Smyth, tamaras@cs.sfu.ca School of Computing Science, Simon Fraser University January 3, 2009 A computer performs 2 basic functions: 1.

More information

Stateful Inspection Firewall Session Table Processing

Stateful Inspection Firewall Session Table Processing International Journal of Information Technology, Vol. 11 No. 2 Xin Li, ZhenZhou Ji, and MingZeng Hu School of Computer Science and Technology Harbin Institute of Technology 92 West Da Zhi St. Harbin, China

More information

Reducer Load Balancing and Lazy Initialization in Map Reduce Environment S.Mohanapriya, P.Natesan

Reducer Load Balancing and Lazy Initialization in Map Reduce Environment S.Mohanapriya, P.Natesan Reducer Load Balancing and Lazy Initialization in Map Reduce Environment S.Mohanapriya, P.Natesan Abstract Big Data is revolutionizing 21st-century with increasingly huge amounts of data to store and be

More information

ENHANCEMENTS TO SQL SERVER COLUMN STORES. Anuhya Mallempati #2610771

ENHANCEMENTS TO SQL SERVER COLUMN STORES. Anuhya Mallempati #2610771 ENHANCEMENTS TO SQL SERVER COLUMN STORES Anuhya Mallempati #2610771 CONTENTS Abstract Introduction Column store indexes Batch mode processing Other Enhancements Conclusion ABSTRACT SQL server introduced

More information

INTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium.

INTRODUCTION The collection of data that makes up a computerized database must be stored physically on some computer storage medium. Chapter 4: Record Storage and Primary File Organization 1 Record Storage and Primary File Organization INTRODUCTION The collection of data that makes up a computerized database must be stored physically

More information

Record Storage and Primary File Organization

Record Storage and Primary File Organization Record Storage and Primary File Organization 1 C H A P T E R 4 Contents Introduction Secondary Storage Devices Buffering of Blocks Placing File Records on Disk Operations on Files Files of Unordered Records

More information

DATA STRUCTURES USING C

DATA STRUCTURES USING C DATA STRUCTURES USING C QUESTION BANK UNIT I 1. Define data. 2. Define Entity. 3. Define information. 4. Define Array. 5. Define data structure. 6. Give any two applications of data structures. 7. Give

More information

Longest Common Extensions via Fingerprinting

Longest Common Extensions via Fingerprinting Longest Common Extensions via Fingerprinting Philip Bille, Inge Li Gørtz, and Jesper Kristensen Technical University of Denmark, DTU Informatics, Copenhagen, Denmark Abstract. The longest common extension

More information

Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding

Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding Lossless Grey-scale Image Compression using Source Symbols Reduction and Huffman Coding C. SARAVANAN cs@cc.nitdgp.ac.in Assistant Professor, Computer Centre, National Institute of Technology, Durgapur,WestBengal,

More information

APP INVENTOR. Test Review

APP INVENTOR. Test Review APP INVENTOR Test Review Main Concepts App Inventor Lists Creating Random Numbers Variables Searching and Sorting Data Linear Search Binary Search Selection Sort Quick Sort Abstraction Modulus Division

More information

A High-Throughput In-Memory Index, Durable on Flash-based SSD

A High-Throughput In-Memory Index, Durable on Flash-based SSD A High-Throughput In-Memory Index, Durable on Flash-based SSD Insights into the Winning Solution of the SIGMOD Programming Contest 2011 Thomas Kissinger, Benjamin Schlegel, Matthias Boehm, Dirk Habich,

More information

A Comparative Study on Vega-HTTP & Popular Open-source Web-servers

A Comparative Study on Vega-HTTP & Popular Open-source Web-servers A Comparative Study on Vega-HTTP & Popular Open-source Web-servers Happiest People. Happiest Customers Contents Abstract... 3 Introduction... 3 Performance Comparison... 4 Architecture... 5 Diagram...

More information

DNA Mapping/Alignment. Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky

DNA Mapping/Alignment. Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky DNA Mapping/Alignment Team: I Thought You GNU? Lars Olsen, Venkata Aditya Kovuri, Nick Merowsky Overview Summary Research Paper 1 Research Paper 2 Research Paper 3 Current Progress Software Designs to

More information

Overview of Storage and Indexing

Overview of Storage and Indexing Overview of Storage and Indexing Chapter 8 How index-learning turns no student pale Yet holds the eel of science by the tail. -- Alexander Pope (1688-1744) Database Management Systems 3ed, R. Ramakrishnan

More information

Review of Hashing: Integer Keys

Review of Hashing: Integer Keys CSE 326 Lecture 13: Much ado about Hashing Today s munchies to munch on: Review of Hashing Collision Resolution by: Separate Chaining Open Addressing $ Linear/Quadratic Probing $ Double Hashing Rehashing

More information

Database 2 Lecture I. Alessandro Artale

Database 2 Lecture I. Alessandro Artale Free University of Bolzano Database 2. Lecture I, 2003/2004 A.Artale (1) Database 2 Lecture I Alessandro Artale Faculty of Computer Science Free University of Bolzano Room: 221 artale@inf.unibz.it http://www.inf.unibz.it/

More information

A Survey on Efficient Hashing Techniques in Software Configuration Management

A Survey on Efficient Hashing Techniques in Software Configuration Management A Survey on Efficient Hashing Techniques in Software Configuration Management Bernhard Grill Vienna University of Technology Vienna, Austria Email: e1028282@student.tuwien.ac.at Abstract This paper presents

More information

Class Notes CS 3137. 1 Creating and Using a Huffman Code. Ref: Weiss, page 433

Class Notes CS 3137. 1 Creating and Using a Huffman Code. Ref: Weiss, page 433 Class Notes CS 3137 1 Creating and Using a Huffman Code. Ref: Weiss, page 433 1. FIXED LENGTH CODES: Codes are used to transmit characters over data links. You are probably aware of the ASCII code, a fixed-length

More information

Scalable Bloom Filters

Scalable Bloom Filters Scalable Bloom Filters Paulo Sérgio Almeida Carlos Baquero Nuno Preguiça CCTC/Departamento de Informática Universidade do Minho CITI/Departamento de Informática FCT, Universidade Nova de Lisboa David Hutchison

More information

Load Balancing on a Grid Using Data Characteristics

Load Balancing on a Grid Using Data Characteristics Load Balancing on a Grid Using Data Characteristics Jonathan White and Dale R. Thompson Computer Science and Computer Engineering Department University of Arkansas Fayetteville, AR 72701, USA {jlw09, drt}@uark.edu

More information

To convert an arbitrary power of 2 into its English equivalent, remember the rules of exponential arithmetic:

To convert an arbitrary power of 2 into its English equivalent, remember the rules of exponential arithmetic: Binary Numbers In computer science we deal almost exclusively with binary numbers. it will be very helpful to memorize some binary constants and their decimal and English equivalents. By English equivalents

More information

Database Systems. Session 8 Main Theme. Physical Database Design, Query Execution Concepts and Database Programming Techniques

Database Systems. Session 8 Main Theme. Physical Database Design, Query Execution Concepts and Database Programming Techniques Database Systems Session 8 Main Theme Physical Database Design, Query Execution Concepts and Database Programming Techniques Dr. Jean-Claude Franchitti New York University Computer Science Department Courant

More information

Enhanced Algorithm for Efficient Retrieval of Data from a Secure Cloud

Enhanced Algorithm for Efficient Retrieval of Data from a Secure Cloud OPEN JOURNAL OF MOBILE COMPUTING AND CLOUD COMPUTING Volume 1, Number 2, November 2014 OPEN JOURNAL OF MOBILE COMPUTING AND CLOUD COMPUTING Enhanced Algorithm for Efficient Retrieval of Data from a Secure

More information

10CS35: Data Structures Using C

10CS35: Data Structures Using C CS35: Data Structures Using C QUESTION BANK REVIEW OF STRUCTURES AND POINTERS, INTRODUCTION TO SPECIAL FEATURES OF C OBJECTIVE: Learn : Usage of structures, unions - a conventional tool for handling a

More information

Hash Tables. Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited

Hash Tables. Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited Hash Tables Computer Science E-119 Harvard Extension School Fall 2012 David G. Sullivan, Ph.D. Data Dictionary Revisited We ve considered several data structures that allow us to store and search for data

More information

Number Representation

Number Representation Number Representation CS10001: Programming & Data Structures Pallab Dasgupta Professor, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur Topics to be Discussed How are numeric data

More information

Chapter 12 File Management

Chapter 12 File Management Operating Systems: Internals and Design Principles Chapter 12 File Management Eighth Edition By William Stallings Files Data collections created by users The File System is one of the most important parts

More information

Data Structures and Algorithms

Data Structures and Algorithms Data Structures and Algorithms CS245-2016S-06 Binary Search Trees David Galles Department of Computer Science University of San Francisco 06-0: Ordered List ADT Operations: Insert an element in the list

More information

Analysis of Algorithms I: Binary Search Trees

Analysis of Algorithms I: Binary Search Trees Analysis of Algorithms I: Binary Search Trees Xi Chen Columbia University Hash table: A data structure that maintains a subset of keys from a universe set U = {0, 1,..., p 1} and supports all three dictionary

More information

Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework

Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Lossless Data Compression Standard Applications and the MapReduce Web Computing Framework Sergio De Agostino Computer Science Department Sapienza University of Rome Internet as a Distributed System Modern

More information

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen

CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen CS 464/564 Introduction to Database Management System Instructor: Abdullah Mueen LECTURE 14: DATA STORAGE AND REPRESENTATION Data Storage Memory Hierarchy Disks Fields, Records, Blocks Variable-length

More information

In-memory databases and innovations in Business Intelligence

In-memory databases and innovations in Business Intelligence Database Systems Journal vol. VI, no. 1/2015 59 In-memory databases and innovations in Business Intelligence Ruxandra BĂBEANU, Marian CIOBANU University of Economic Studies, Bucharest, Romania babeanu.ruxandra@gmail.com,

More information

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++ Answer the following 1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++ 2) Which data structure is needed to convert infix notations to postfix notations? Stack 3) The

More information

M-way Trees and B-Trees

M-way Trees and B-Trees Carlos Moreno cmoreno @ uwaterloo.ca EIT-4103 https://ece.uwaterloo.ca/~cmoreno/ece250 Standard reminder to set phones to silent/vibrate mode, please! Once upon a time... in a course that we all like to

More information

Binary Heap Algorithms

Binary Heap Algorithms CS Data Structures and Algorithms Lecture Slides Wednesday, April 5, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005 2009 Glenn G. Chappell

More information

Data Structures. Topic #12

Data Structures. Topic #12 Data Structures Topic #12 Today s Agenda Sorting Algorithms insertion sort selection sort exchange sort shell sort radix sort As we learn about each sorting algorithm, we will discuss its efficiency Sorting

More information

A 10-Gbps High-Speed Single-Chip Network Intrusion Detection and Prevention System

A 10-Gbps High-Speed Single-Chip Network Intrusion Detection and Prevention System A 0-Gbps High-Speed Single-Chip Network Intrusion Detection and Prevention System N. Sertac Artan, Rajdip Ghosh, Yanchuan Guo, and H. Jonathan Chao Department of Electrical and Computer Engineering Polytechnic

More information

THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM

THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM THE SECURITY AND PRIVACY ISSUES OF RFID SYSTEM Iuon Chang Lin Department of Management Information Systems, National Chung Hsing University, Taiwan, Department of Photonics and Communication Engineering,

More information

Data storage Tree indexes

Data storage Tree indexes Data storage Tree indexes Rasmus Pagh February 7 lecture 1 Access paths For many database queries and updates, only a small fraction of the data needs to be accessed. Extreme examples are looking or updating

More information

Streaming Lossless Data Compression Algorithm (SLDC)

Streaming Lossless Data Compression Algorithm (SLDC) Standard ECMA-321 June 2001 Standardizing Information and Communication Systems Streaming Lossless Data Compression Algorithm (SLDC) Phone: +41 22 849.60.00 - Fax: +41 22 849.60.01 - URL: http://www.ecma.ch

More information

Storing Measurement Data

Storing Measurement Data Storing Measurement Data File I/O records or reads data in a file. A typical file I/O operation involves the following process. 1. Create or open a file. Indicate where an existing file resides or where

More information

Object Request Reduction in Home Nodes and Load Balancing of Object Request in Hybrid Decentralized Web Caching

Object Request Reduction in Home Nodes and Load Balancing of Object Request in Hybrid Decentralized Web Caching 2012 2 nd International Conference on Information Communication and Management (ICICM 2012) IPCSIT vol. 55 (2012) (2012) IACSIT Press, Singapore DOI: 10.7763/IPCSIT.2012.V55.5 Object Request Reduction

More information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information

Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University sekine@cs.nyu.edu Kapil Dalwani Computer Science Department

More information

CIS 631 Database Management Systems Sample Final Exam

CIS 631 Database Management Systems Sample Final Exam CIS 631 Database Management Systems Sample Final Exam 1. (25 points) Match the items from the left column with those in the right and place the letters in the empty slots. k 1. Single-level index files

More information

Chapter 4: Computer Codes

Chapter 4: Computer Codes Slide 1/30 Learning Objectives In this chapter you will learn about: Computer data Computer codes: representation of data in binary Most commonly used computer codes Collating sequence 36 Slide 2/30 Data

More information

Tune That SQL for Supercharged DB2 Performance! Craig S. Mullins, Corporate Technologist, NEON Enterprise Software, Inc.

Tune That SQL for Supercharged DB2 Performance! Craig S. Mullins, Corporate Technologist, NEON Enterprise Software, Inc. Tune That SQL for Supercharged DB2 Performance! Craig S. Mullins, Corporate Technologist, NEON Enterprise Software, Inc. Table of Contents Overview...................................................................................

More information

Index Internals Heikki Linnakangas / Pivotal

Index Internals Heikki Linnakangas / Pivotal Index Internals Heikki Linnakangas / Pivotal Index Access Methods in PostgreSQL 9.5 B-tree GiST GIN SP-GiST BRIN (Hash) but first, the Heap Heap Stores all tuples in table Unordered Copenhagen Amsterdam

More information

Part III Storage Management. Chapter 11: File System Implementation

Part III Storage Management. Chapter 11: File System Implementation Part III Storage Management Chapter 11: File System Implementation 1 Layered File System 2 Overview: 1/4 A file system has on-disk and in-memory information. A disk may contain the following for implementing

More information

Image Compression through DCT and Huffman Coding Technique

Image Compression through DCT and Huffman Coding Technique International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

More information

Binary Coded Web Access Pattern Tree in Education Domain

Binary Coded Web Access Pattern Tree in Education Domain Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi

More information