CSCE-608 Database Systems COURSE PROJECT #2
CSCE-608 Database Systems, Fall 2015

Instructor: Dr. Jianer Chen
Office: HRBB 315C
Phone:
Office Hours: T,Th 10:50am-12:00noon

Teaching Assistant: Yi Cui
Office: HRBB 501C
Phone:
Office Hours: W,F 4:00pm-5:00pm

COURSE PROJECT #2 (Due December 8, 2015)

Remark. This is a project for teams of no more than two students.

This course project is to design and implement an interpreter for a simple SQL dialect called Tiny-SQL. The grammar of Tiny-SQL is given in Appendix 1. Your interpreter should accept SQL queries that are valid according to the Tiny-SQL grammar, execute them, and output the results of the execution. Your interpreter should include the following components:

- A parser: the parser accepts an input Tiny-SQL query and converts it into a parse tree.
- A logical query plan generator: this generator converts a parse tree into a logical query plan. This phase also includes any possible logical query plan optimization.
- A physical query plan generator: this generator converts an optimized logical query plan into an executable physical query plan. This phase also includes any possible physical query plan optimization.
- A set of subroutines that implement the data query operations needed to execute Tiny-SQL queries. The subroutines should use the released StorageManager library, which simulates computer disks and memory.

The StorageManager library is provided to support physical execution of the Tiny-SQL interpreter. The library simulates a fictitious disk containing 100 tracks of unlimited length and a small fictitious (main) memory of 10 data blocks. The StorageManager is configured to simulate the speed of a 50M memory and a 300M relation on a Megatron 747 disk when the memory capacity is 10 blocks and the relation holds 60 tuples. Under these hardware limits, the Tiny-SQL interpreter must be implemented wisely, with well-chosen query plans and algorithms; otherwise it will not be able to handle even data as small as two relations of 60 tuples or six relations of only 5 tuples. Further descriptions of the library are given in Appendix 2.

Specific instructions and requirements of the project are as follows.

Interface: Your Tiny-SQL interpreter should have an interface. A single-user, text-based interface is sufficient. The interface accepts one Tiny-SQL statement per line. In addition, the interface should be able to read a file containing many Tiny-SQL statements, one statement per line, and be able to output the query results to a file. A minimal sketch of such an interface loop is shown below.
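The following sketch reads statements from standard input or from a file named on the command line and writes results to standard output or to a second file. The executeStatement() function is a hypothetical placeholder for the rest of the interpreter, not part of the released library.

    #include <fstream>
    #include <iostream>
    #include <string>

    // Hypothetical entry point into the interpreter; the real signature is up to you.
    // Stub for illustration: a real interpreter would parse and execute the statement.
    std::string executeStatement(const std::string& statement) {
        return "executed: " + statement;
    }

    int main(int argc, char* argv[]) {
        std::istream* in = &std::cin;
        std::ostream* out = &std::cout;
        std::ifstream fin;
        std::ofstream fout;
        if (argc >= 2) { fin.open(argv[1]); in = &fin; }    // batch mode: statements, one per line
        if (argc >= 3) { fout.open(argv[2]); out = &fout; } // optional output file
        std::string line;
        while (std::getline(*in, line)) {
            if (line.empty()) continue;                      // skip blank lines
            *out << executeStatement(line) << std::endl;     // run one Tiny-SQL statement
        }
        return 0;
    }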
Parser: You can develop the parser either by writing your own procedures (which should be feasible because Tiny-SQL has a very simple grammar) or by using a parser generator such as LEX, YACC, or JavaCC, if you are familiar with them. We recommend that you write your own parser, in particular if you do not have extensive experience in compiler construction. Please make sure that you follow the Tiny-SQL grammar: be careful to allow spaces between words, and do not allow a semicolon at the end of a statement.

Logical query plan generator: You will need a tree data structure (a minimal node sketch is given at the end of this section). You should at least optimize the selection operations in the tree.

Physical query plan generator: You should at least optimize the join operations.

Implementation of physical operators: You should implement the one-pass algorithms for projection, selection, product, join, duplicate elimination, and sorting, as well as the two-pass algorithms for join, duplicate elimination, and sorting. Hints: (1) based on the StorageManager, you can use at most 10 memory blocks; (2) among the join operations, cross-join is easier to implement but will not work for large relations; and (3) to handle WHERE conditions, it is suggested that you do not decompose the condition further into a subtree; instead, keep the entire condition as a single node in your parse tree.

Read Appendix 3 on how to start your project.

Testing will be done in two ways:
- A set of data and queries is provided to test the correctness of your interpreter. The data include 11 tables, with as many as 60 tuples in one table. A file is provided for you to record the number of disk I/Os used by the interpreter. You do not have to worry about Tiny-SQL statements outside the test set.
- Use your 200-tuple data from Project #1. Design experiments to show the running time and the number of disk I/Os required to execute sample queries. Collect several measurements of running time versus data size, and of disk I/Os versus data size.

Any extra optimization techniques for Tiny-SQL statements implemented in your project will be considered for bonus points.

The StorageManager library and sample Tiny-SQL statements are available on the TA's web page.

Write a report containing at least the following components:
- A description of the software architecture or the program flow.
- An explanation of each major part of your project, including data structures and algorithms. Please do not repeat the details of algorithms from the textbook.
- Experiments and results demonstrating all the queries your interpreter is capable of executing, and the performance of your interpreter.
- A discussion of any optimization techniques implemented in your project and their effects.

You will submit your source code, a README note on how to install your software, and the report via CSNET. Also submit a hard copy of the report.
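For the logical query plan generator above, the tree can be as simple as the following C++ node sketch. The node kinds, field names, and the convention of keeping the entire WHERE condition as a single string in one node (per hint (3)) are illustrative assumptions only; your own design may differ.

    #include <memory>
    #include <string>
    #include <vector>

    // Kinds of operators that a logical query plan tree may contain.
    enum class NodeType { TABLE, SELECT, PROJECT, PRODUCT, JOIN, DEDUP, SORT };

    struct QueryNode {
        NodeType type;
        std::string tableName;                           // set only for TABLE leaves
        std::vector<std::string> attributes;             // projection list or sort key
        std::string condition;                           // entire WHERE condition, kept as one node
        std::vector<std::unique_ptr<QueryNode>> children;
    };

    // Convenience helper for building a leaf node for a base table.
    std::unique_ptr<QueryNode> makeTableLeaf(const std::string& name) {
        auto node = std::make_unique<QueryNode>();
        node->type = NodeType::TABLE;
        node->tableName = name;
        return node;
    }

Optimizing selections then amounts to rewriting this tree, for example pushing a SELECT node below a PRODUCT node when its condition mentions only one of the operand tables.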
Appendix 1. Tiny-SQL Grammar

letter ::= a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
integer ::= digit+
comp-op ::= < | > | =
table-name ::= letter [digit | letter]*
attribute-name ::= letter [digit | letter]*
column-name ::= [table-name.] attribute-name
literal ::= " whatever-except-(") "

statement ::= create-table-statement | drop-table-statement | select-statement | delete-statement | insert-statement

create-table-statement ::= CREATE TABLE table-name (attribute-type-list)
data-type ::= INT | STR20
attribute-type-list ::= attribute-name data-type | attribute-name data-type, attribute-type-list

drop-table-statement ::= DROP TABLE table-name

select-statement ::= SELECT [DISTINCT] select-list FROM table-list [WHERE search-condition] [ORDER BY column-name]
select-list ::= * | select-sublist
select-sublist ::= column-name | column-name, select-sublist
table-list ::= table-name | table-name, table-list

delete-statement ::= DELETE FROM table-name [WHERE search-condition]

insert-statement ::= INSERT INTO table-name (attribute-list) insert-tuples
insert-tuples ::= VALUES (value-list) | select-statement
attribute-list ::= attribute-name | attribute-name, attribute-list
value ::= literal | integer | NULL
value-list ::= value | value, value-list

search-condition ::= boolean-term | boolean-term OR search-condition
boolean-term ::= boolean-factor | boolean-factor AND boolean-term
boolean-factor ::= [NOT] boolean-primary
boolean-primary ::= comparison-predicate | "[" search-condition "]"
comparison-predicate ::= expression comp-op expression
expression ::= term | term + expression | term - expression
term ::= factor | factor * term | factor / term
factor ::= column-name | literal | integer | ( expression )
Appendix 2. Additional Descriptions of the StorageManager Library

The StorageManager library supports physical execution of the Tiny-SQL interpreter. The library simulates a fictitious disk containing 100 tracks of unlimited length and a small fictitious (main) memory containing 10 data blocks. To simplify the database system, every relation is stored separately on one disk track. Each of the disk tracks and the memory is partitioned into blocks of the same size. The blocks are numbered in sequence, and data are transferred between the disk and the memory block by block.

The StorageManager is configured to simulate the speed of a 50M memory and a 300M relation on a Megatron 747 disk, where the memory capacity is 10 blocks and the relation holds 60 tuples. To achieve this effect, each block is configured to hold only 8 data fields, and the disk latency is set according to the Megatron 747 disk but with a per-block transfer time that is 320 times slower. The performance of a Tiny-SQL interpreter varies when it is implemented in different ways; additional library functions are provided to time the performance of your interpreter.

In addition to hardware simulation, the StorageManager supports creation and deletion of a relation with a specified schema, so you do not need to worry about how to store a relation on the fictitious disk. When executing queries, the Tiny-SQL interpreter only needs to specify which relation block to transfer to which memory block, or in the opposite direction. However, in order to locate a tuple, you must know, for each particular relation, how many tuples are stored in one relation block. Every relation/memory block stores a fixed number of 8 fields. Because different relations have different schemas and different numbers of fields, the number of tuples a block can hold varies from one relation to another.

Your Tiny-SQL interpreter will always start with an empty disk and an empty memory, because the StorageManager library simulates the fictitious disk and memory only in program memory. When the interpreter terminates, the data in the simulated storage are lost.

The data structure of the StorageManager is summarized below in a bottom-up fashion:

Field.h: A field type can be an integer INT or a string STR20.

Class Tuple: A tuple (i.e., a record/row in a relation/table) contains at most MAX_NUM_OF_FIELDS_IN_RELATION = 8 fields. The fields in a tuple have offsets 0, 1, 2, ..., respectively; the order is defined in the schema. You can access a field by its offset or by its field name.

Class Block: A disk/relation or memory block contains a number of records/tuples that belong to the same relation. A tuple cannot be split across more than one block. Each block is defined to hold at most FIELDS_PER_BLOCK = 8 fields, but the number of tuples varies. The maximum number of tuples held in a block can be calculated from the size of a tuple, i.e., the number of fields in a tuple:

    max # of tuples per block = floor(FIELDS_PER_BLOCK / number of fields in a tuple)

For example, a relation whose tuples have 3 fields stores floor(8/3) = 2 tuples per block. You can also get this number by simply calling Schema::getTuplesPerBlock().

Class Relation: Each relation is assumed to be stored in consecutive disk blocks on a single track of the disk (i.e., in a clustered way). The disk blocks on the track are numbered 0, 1, 2, .... The tuples in a relation cannot be read directly: you have to copy disk blocks of the relation to memory blocks before accessing the tuples inside the blocks.
For example, to delete a tuple in a relation block, you should first copy the block to memory, invalidate the selected tuple inside the memory block, and finally copy the modified block back to the relation. Please NOTE that a hole is left in the block after the deletion. Be sure to deal with the holes in every SQL operation; you can decide whether to remove trailing holes from a relation. The Relation class also provides functions to create a new Tuple.

Class Schema: A schema specifies what a tuple of a particular relation contains, including field names and field types, in a defined order. The field names and types are given offsets according to the defined order. Every schema specifies at most MAX_NUM_OF_FIELDS_IN_RELATION = 8 fields in total. The size of a tuple is the total number of fields specified in the schema; the tuple size affects the number of tuples held in one relation block or memory block.

Class SchemaManager: A schema manager stores relations and schemas, and maps a relation name to a relation and the corresponding schema. You will always create a relation through the schema manager by specifying a relation name and a schema, and you will get access to relations from the SchemaManager.

Class Disk: Simplified assumptions are made for the disk. A disk contains 100 tracks, and each relation is assumed to reside on one single track of blocks on the disk. Reading or writing blocks of a relation takes time

    avg seek time + avg rotational latency + (avg transfer time per block) x (number of consecutive blocks)

The number of disk I/Os is counted as the number of blocks read or written.

Class MainMemory: The simulated memory holds NUM_OF_BLOCKS_IN_MEMORY = 10 blocks, numbered 0, 1, 2, .... You can get the total number of blocks in the memory by calling MainMemory::getMemorySize(). When accessing data of a relation, you have to copy the disk blocks of the relation to the simulated main memory and then access the tuples there; in the other direction, you copy memory blocks to disk blocks of a relation when writing data to a relation. Because the size of memory is limited, you have to implement the database operations wisely. We assume there is no latency in accessing memory.

Below is a short description of how to use the library.

At the beginning of your program, you have to create a MainMemory, a Disk, and a SchemaManager. These three objects will be used throughout the whole program.

Create a relation: You should always start by creating a Schema object. Then create a Relation from the SchemaManager by specifying a name, say Student, and a Schema. The SchemaManager will return a handle/pointer to the created Relation. NOTE: You have to manage the addresses of disk and memory blocks yourself; the library does not decide for you which block stores the data on a disk track or in memory.

Create a tuple: Create a Tuple of the Relation using the Relation pointer. It may sound odd, but you have to store the newly created Tuple in the simulated memory to complete this step. First get a pointer to an empty memory Block, say block 7, using the MainMemory object. Store the created Tuple by appending it to the empty memory block 7.

Store a tuple to a relation: Copy the memory block containing the tuple to a block of the created Relation, for example copying memory block 7 to block 0 of the Relation Student. A minimal code sketch of these three steps is given below.
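Putting the three steps above together, a sketch might look like the following. The class names and constants come from the library description; the constructor and method names (SchemaManager(&mem, &disk), createRelation, createTuple, setField, getBlock, clear, appendTuple, setBlock) are assumptions drawn from the pattern of TestStorageManager.cpp and should be verified against the released .h files.

    // Assumed headers from the released StorageManager library; adjust to the actual file names.
    #include "Block.h"
    #include "Disk.h"
    #include "Field.h"
    #include "MainMemory.h"
    #include "Relation.h"
    #include "Schema.h"
    #include "SchemaManager.h"
    #include "Tuple.h"
    #include <string>
    #include <vector>

    int main() {
        // The three objects used throughout the program (see above).
        MainMemory mem;
        Disk disk;
        SchemaManager schema_manager(&mem, &disk);

        // Create a relation: define a schema, then create the relation through the SchemaManager.
        std::vector<std::string> field_names {"sid", "name"};
        std::vector<enum FIELD_TYPE> field_types {INT, STR20};
        Schema schema(field_names, field_types);
        Relation* student = schema_manager.createRelation("Student", schema);

        // Create a tuple: build it from the relation, then stage it in an empty memory block (block 7 here).
        Tuple tuple = student->createTuple();
        tuple.setField("sid", 1);
        tuple.setField("name", "Alice");
        Block* block = mem.getBlock(7);
        block->clear();                 // make sure the memory block is empty
        block->appendTuple(tuple);      // stage the tuple in memory

        // Store the tuple to the relation: copy memory block 7 to relation block 0.
        student->setBlock(0, 7);        // (relation block index, memory block index)
        return 0;
    }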
Browse a relation: Each relation is stored in a dense/clustered/contiguous way on one disk track, i.e., the data are stored in consecutive disk blocks 0, 1, 2, ..., and there is no size limit. To browse the data of a Relation, copy relation blocks to memory blocks and access the tuples in the Blocks of the MainMemory. There are two ways to access tuples in the MainMemory: you can get a pointer to a specific Block of the MainMemory and access the Tuples inside the Block, or you can get the Tuples stored in a range of consecutive memory blocks by specifying that range; the Tuples are returned in a vector, which makes it convenient to sort or build a heap on them.

Append a tuple to a relation (and maintain clustered disk blocks): Every time before you append a tuple, copy the last relation block, say block R, to a memory block M. If block M is not full, append the new tuple to memory block M and copy block M back to relation block R. If memory block M is full, which means the last relation block R is full, you have to append the tuple to the next relation block R+1: get an empty memory block M, insert the tuple at the beginning of memory block M, and copy that memory block to relation block R+1. A minimal sketch of this procedure appears at the end of this appendix.

Delete a tuple from a relation: Assume we want to delete a Tuple from block R of a relation. The simplest way is to copy block R to memory, null the Tuple, and write the modified memory block back to relation block R. In this way there might be holes in a Block and in a Relation, so be sure to consider the holes in every SQL operation. You can certainly manage tuple deletion in other ways; bear in mind that reading and writing disk/relation blocks takes time.

Temporary relations: You will need to create temporary relations on the simulated disk to store intermediate query results. Be aware that field names from joining relations might be identical, which can cause problems for the temporary relations; you will have to take care of this.

You are encouraged to read the usage of each class in the .h files and then trace the test program TestStorageManager.cpp, which demonstrates how to use the library. You do not need to read the implementation of the library in StorageManager.cpp. If you have questions about how a function is implemented and used, please contact the TA.
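The append procedure just described might look like the following sketch, again assuming the method names (getNumOfBlocks, getBlock, setBlock, isFull, clear, appendTuple) used in the previous sketch; verify them against the header files before relying on them.

    // Uses the same assumed StorageManager headers as the previous sketch.
    #include "Block.h"
    #include "MainMemory.h"
    #include "Relation.h"
    #include "Tuple.h"

    // Append one tuple to a relation while keeping its disk blocks clustered.
    void appendTupleToRelation(Relation* relation, MainMemory& mem,
                               int memory_block_index, const Tuple& tuple) {
        Block* block = mem.getBlock(memory_block_index);
        if (relation->getNumOfBlocks() == 0) {
            // Relation is empty: start a fresh relation block 0.
            block->clear();
            block->appendTuple(tuple);
            relation->setBlock(0, memory_block_index);
            return;
        }
        int last = relation->getNumOfBlocks() - 1;
        relation->getBlock(last, memory_block_index);    // copy last relation block R to memory block M
        if (!block->isFull()) {
            block->appendTuple(tuple);                   // room left: append and write back to block R
            relation->setBlock(last, memory_block_index);
        } else {
            block->clear();                              // block R is full: start relation block R+1
            block->appendTuple(tuple);
            relation->setBlock(last + 1, memory_block_index);
        }
    }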
Appendix 3. How to Start the Project

1. You should have preliminary C++/Java knowledge: object-oriented programming; the standard vector/ArrayList data structure; and how to build a heap on a vector using C++ (Java users must implement heaps themselves).

2. You should know how to define and use a tree data structure.

3. Read the textbook chapters (Ed. 2009):
   Data storage: 13.2 (Disks), 13.3 (Using secondary storage effectively), (Representing data elements)
   Logical query: (Logical query languages), 16 (Query compilers)
   Query execution:

4. Once you understand the data storage, read the StorageManager description and trace the object-oriented test program TestStorageManager.cpp. Basically every function available from the StorageManager appears in the test program.

5. Because of the simple structure of Tiny-SQL, you may not have to construct a formal parse tree, as long as you can correctly identify the necessary components of the input query. In particular, we recommend that you keep the entire condition of a WHERE clause intact instead of decomposing it further into a parse subtree. The condition can be converted into an expression in postfix form, which can then be evaluated easily with a stack (a minimal evaluation sketch is given at the end of this appendix).

6. Possible project time-line (you might need to adjust the order of implementation; if you fall behind, seek help soon):

Week 1: Read about data storage and the StorageManager library. Start your implementation from both ends, top down and bottom up. At the top end, write the interface and the parser. At the bottom end, implement procedures for CREATE TABLE, DROP TABLE, INSERT, DELETE, and single-table SELECT *. NOTE: If you are using a parser generator, you should aim at finishing the parser in one week.

Week 2: At the parser end, implement evaluation of WHERE clauses. Start with a single-table version, but bear in mind that this needs to be expanded to multiple tables. At the physical layer, read about query execution. Implement product operations for multiple tables. Integrate the product procedure with the parser that handles WHERE clauses. Now your interpreter should be able to handle most of the test queries for small tables.

Testing: When running test queries, first use 6 and 3 tuples for tables course and course2, respectively; 5, 3, and 1 tuples for tables r, s, and t, respectively; and 2, 2, 2, 1, 1, and 1 tuples for tables t1, t2, t3, t4, t5, and t6, respectively. If the results are correct, continue with larger tables: use 60 and 16 tuples for course and course2, respectively; 41, 3, and 3 tuples for r, s, and t, respectively; and 5, 4, 4, 2, 2, and 1 tuples for t1, t2, t3, t4, t5, and t6, respectively. You should have no problem handling this amount of data at this stage.
Week 3: From the top down, read about logical queries and start implementing the logical query tree. The tree may contain operators such as projection, selection, product, join, duplicate elimination, and sorting. You will need to modify the whole project to integrate the logical query tree. From the bottom up, implement the two-pass natural-join algorithm, creating temporary relations if necessary, and optimize the join tree.

Testing: Use all the tuples provided, that is, 60 and 24 tuples for course and course2; 41, 41, and 41 tuples for r, s, and t; and 5, 4, 4, 2, 2, and 1 tuples for t1, t2, t3, t4, t5, and t6.

Week 4: Implement the logical query tree and its optimization. Implement operations such as projection, duplicate elimination, and sorting for a single table, then expand to multiple tables.

Week 5: Project integration.

Week 6: Performance evaluation and report writing.

Week 7: Testing and bug fixing.
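As suggested in item 5 of this appendix, a WHERE condition can be converted to postfix form and evaluated with a stack. The sketch below evaluates a purely numeric postfix expression, treating the comparison operators <, >, and = as yielding 1 or 0, so an entire condition reduces to a single truth value; in a real interpreter, operand tokens naming columns would be replaced by values from the current tuple. The tokenization and function name are illustrative assumptions.

    #include <iostream>
    #include <stack>
    #include <string>
    #include <vector>

    // Evaluate a postfix (reverse Polish) expression over integers.
    int evalPostfix(const std::vector<std::string>& tokens) {
        std::stack<int> st;
        for (const std::string& tok : tokens) {
            if (tok == "+" || tok == "-" || tok == "*" || tok == "/" ||
                tok == "<" || tok == ">" || tok == "=") {
                int right = st.top(); st.pop();   // operands come off the stack in reverse order
                int left = st.top(); st.pop();
                int result = 0;
                if (tok == "+") result = left + right;
                else if (tok == "-") result = left - right;
                else if (tok == "*") result = left * right;
                else if (tok == "/") result = left / right;
                else if (tok == "<") result = (left < right) ? 1 : 0;
                else if (tok == ">") result = (left > right) ? 1 : 0;
                else result = (left == right) ? 1 : 0;
                st.push(result);
            } else {
                st.push(std::stoi(tok));          // an operand; a column reference would be
            }                                     // looked up in the current tuple here
        }
        return st.top();
    }

    int main() {
        // Postfix form of the infix condition (a + 2) > 5, with a = 4.
        std::vector<std::string> tokens {"4", "2", "+", "5", ">"};
        std::cout << evalPostfix(tokens) << std::endl;   // prints 1 (true)
        return 0;
    }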