# Query Processing C H A P T E R12. Practice Exercises

Save this PDF as:

Size: px
Start display at page:

Download "Query Processing C H A P T E R12. Practice Exercises"

## Transcription

1 C H A P T E R12 Query Processing Practice Exercises 12.1 Assume (for simplicity in this exercise) that only one tuple fits in a block and memory holds at most 3 blocks. Show the runs created on each pass of the sort-merge algorithm, when applied to sort the following tuples on the first attribute: (kangaroo, 17), (wallaby, 21), (emu, 1), (wombat, 13), (platypus, 3), (lion, 8), (warthog, 4), (zebra, 11), (meerkat, 6), (hyena, 9), (hornbill, 2), (baboon, 12). We will refer to the tuples (kangaroo, 17) through (baboon, 12) using tuple numbers t 1 through t 12. We refer to the j th run used by the i th pass, as r i j. The initial sorted runs have three blocks each. They are: r 11 = {t 3, t 1, t 2 } r 12 = {t 6, t 5, t 4 } r 13 = {t 9, t 7, t 8 } r 14 = {t 12, t 11, t 10 } Each pass merges three runs. Therefore the runs after the of the first pass are: r 21 = {t 3, t 1, t 6, t 9, t 5, t 2, t 7, t 4, t 8 } r 22 = {t 12, t 11, t 10 } At the of the second pass, the tuples are completely sorted into one run: r 31 = {t 12, t 3, t 11, t 10, t 1, t 6, t 9, t 5, t 2, t 7, t 4, t 8 } 12.2 Consider the bank database of Figure 12.13, where the primary keys are underlined, and the following SQL query: 1

2 2 Chapter 12 Query Processing select T.branch name from branch T, branch S where T.assets > S.assets and S.branch city = Brooklyn Write an efficient relational-algebra expression that is equivalent to this query. Justify your choice. Query: T.branch name (( branch name, assets ( T (branch))) T.assets > S.assets ( assets ( (branch city= Brooklyn ) ( S (branch))))) This expression performs the theta join on the smallest amount of data possible. It does this by restricting the right hand side operand of the join to only those branches in Brooklyn, and also eliminating the unneeded attributes from both the operands Let relations r 1 (A, B, C) and r 2 (C, D, E) have the following properties: r 1 has 20,000 tuples, r 2 has 45,000 tuples, 25 tuples of r 1 fit on one block, and 30 tuples of r 2 fit on one block. Estimate the number of block transfers and seeks required, using each of the following join strategies for r 1 r 2 : a. Nested-loop join. b. Block nested-loop join. c. Merge join. d. Hash join. r 1 needs 800 blocks, and r 2 needs 1500 blocks. Let us assume M pages of memory. If M > 800, the join can easily be done in disk accesses, using even plain nested-loop join. So we consider only the case where M 800 pages. a. Nested-loop join: Using r 1 as the outer relation we need = 30, 000, 800 disk accesses, if r 2 is the outer relation we need = 36, 001, 500 disk accesses. b. Block nested-loop join: If r 1 is the outer relation, we need disk accesses, M 1 if r 2 is the outer relation we need disk accesses. M 1 c. Merge-join: Assuming that r 1 and r 2 are not initially sorted on the join key, the total sorting cost inclusive of the output is B s = 1500(2 log M 1 (1500/M) +

3 Exercises 3 2) + 800(2 log M 1 (800/M) + 2) disk accesses. Assuming all tuples with the same value for the join attributes fit in memory, the total cost is B s disk accesses. d. Hash-join: We assume no overflow occurs. Since r 1 is smaller, we use it as the build relation and r 2 as the probe relation. If M > 800/M, i.e. no need for recursive partitioning, then the cost is 3( ) = 6900 disk accesses, else the cost is 2( ) log M 1 (800) disk accesses The indexed nested-loop join algorithm described in Section can be inefficient if the index is a secondary index, and there are multiple tuples with the same value for the join attributes. Why is it inefficient? Describe a way, using sorting, to reduce the cost of retrieving tuples of the inner relation. Under what conditions would this algorithm be more efficient than hybrid merge join? If there are multiple tuples in the inner relation with the same value for the join attributes, we may have to access that many blocks of the inner relation for each tuple of the outer relation. That is why it is inefficient. To reduce this cost we can perform a join of the outer relation tuples with just the secondary index leaf entries, postponing the inner relation tuple retrieval. The result file obtained is then sorted on the inner relation addresses, allowing an efficient physical order scan to complete the join. Hybrid merge join requires the outer relation to be sorted. The above algorithm does not have this requirement, but for each tuple in the outer relation it needs to perform an index lookup on the inner relation. If the outer relation is much larger than the inner relation, this index lookup cost will be less than the sorting cost, thus this algorithm will be more efficient Let r and s be relations with no indices, and assume that the relations are not sorted. Assuming infinite memory, what is the lowest-cost way (in terms of I/O operations) to compute r s? What is the amount of memory required for this algorithm? We can store the entire smaller relation in memory, read the larger relation block by block and perform nested loop join using the larger one as the outer relation. The number of I/O operations is equal to b r + b s, and memory requirement is min(b r, b s ) + 2 pages Consider the bank database of Figure 12.13, where the primary keys are underlined. Suppose that a B + -tree index on branch city is available on relation branch, and that no other index is available. List different ways to handle the following selections that involve negation: a. (branch city< Brooklyn ) (branch)

4 4 Chapter 12 Query Processing b. (branch city= Brooklyn ) (branch) c. (branch city< Brooklyn assets<5000) (branch) a. Use the index to locate the first tuple whose branch city field has value Brooklyn. From this tuple, follow the pointer chains till the, retrieving all the tuples. b. For this query, the index serves no purpose. We can scan the file sequentially and select all tuples whose branch city field is anything other than Brooklyn. c. This query is equivalent to the query (branch city Brooklyn assets<5000)(branch) Using the branch-city index, we can retrieve all tuples with branch-city value greater than or equal to Brooklyn by following the pointer chains from the first Brooklyn tuple. We also apply the additional criteria of assets < 5000 on every tuple Write pseudocode for an iterator that implements indexed nested-loop join, where the outer relation is pipelined. Your pseudocode must define the standard iterator functions open(), next(), and close(). Show what state information the iterator must maintain between calls. Let outer be the iterator which returns successive tuples from the pipelined outer relation. Let inner be the iterator which returns successive tuples of the inner relation having a given value at the join attributes. The inner iterator returns these tuples by performing an index lookup. The functions IndexedNLJoin::open, IndexedNLJoin::close and IndexedNLJoin::next to implement the indexed nested-loop join iterator are given below. The two iterators outer and inner, the value of the last read outer relation tuple t r and a flag done r indicating whether the of the outer relation scan has been reached are the state information which need to be remembered by IndexedNLJoin between calls. IndexedNLJoin::open() outer.open(); inner.open(); done r := false; if(outer.next() false) move tuple from outer s output buffer to t r ; else done r := true;

5 Exercises 5 IndexedNLJoin::close() outer.close(); inner.close(); boolean IndexedNLJoin::next() while( done r ) if(inner.next(t r [JoinAttrs]) false) move tuple from inner s output buffer to t s ; compute t r t s and place it in output buffer; return true; else if(outer.next() false) move tuple from outer s output buffer to t r ; rewind inner to first tuple of s; else done r := true; return false; 12.8 Design sort-based and hash-based algorithms for computing the relational division operation (see Practise Exercises of Chapter 6 for a definition of the division operation). Suppose r(t S) and s(s) be two relations and r s has to be computed. For sorting based algorithm, sort relation s on S. Sort relation r on (T, S). Now, start scanning r and look at the T attribute values of the first tuple. Scan r till tuples have same value of T. Also scan s simultaneously and check whether every tuple of s also occurs as the S attribute of r, in a fashion similar to merge join. If this is the case, output that value of T and proceed with the next value of T. Relation s may have to be scanned multiple times but r will only be scanned once. Total disk accesses, after

6 6 Chapter 12 Query Processing sorting both the relations, will be r + N s, where N is the number of distinct values of T in r. We assume that for any value of T, all tuples in r with that T value fit in memory, and consider the general case at the. Partition the relation r on attributes in T such that each partition fits in memory (always possible because of our assumption). Consider partitions one at a time. Build a hash table on the tuples, at the same time collecting all distinct T values in a separate hash table. For each value of T, Now, for each value V T of T, each value s of S, probe the hash table on (V T, s). If any of the values is absent, discard the value V T, else output the value V T. In the case that not all r tuples with one value for T fit in memory, partition r and s on the S attributes such that the condition is satisfied, run the algorithm on each corresponding pair of partitions r i and s i. Output the intersection of the T values generated in each partition What is the effect on the cost of merging runs if the number of buffer blocks per run is increased, while keeping overall memory available for buffering runs fixed? Seek overhead is reduced, but the the number of runs that can be merged in a pass decreases potentially leading to more passes. A value of b b that minimizes overall cost should be chosen.

### Chapter 13: Query Processing. Basic Steps in Query Processing

Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

### 8. Query Processing. Query Processing & Optimization

ECS-165A WQ 11 136 8. Query Processing Goals: Understand the basic concepts underlying the steps in query processing and optimization and estimating query processing cost; apply query optimization techniques;

### Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

Comp 5311 Database Management Systems 16. Review 2 (Physical Level) 1 Main Topics Indexing Join Algorithms Query Processing and Optimization Transactions and Concurrency Control 2 Indexing Used for faster

### Datenbanksysteme II: Implementation of Database Systems Implementing Joins

Datenbanksysteme II: Implementation of Database Systems Implementing Joins Material von Prof. Johann Christoph Freytag Prof. Kai-Uwe Sattler Prof. Alfons Kemper, Dr. Eickler Prof. Hector Garcia-Molina

### Query Processing + Optimization: Outline

Query Processing + Optimization: Outline Operator Evaluation Strategies Query processing in general Selection Join Query Optimization Heuristic query optimization Cost-based query optimization Query Tuning

### SQL Query Evaluation. Winter 2006-2007 Lecture 23

SQL Query Evaluation Winter 2006-2007 Lecture 23 SQL Query Processing Databases go through three steps: Parse SQL into an execution plan Optimize the execution plan Evaluate the optimized plan Execution

### Lecture 1: Data Storage & Index

Lecture 1: Data Storage & Index R&G Chapter 8-11 Concurrency control Query Execution and Optimization Relational Operators File & Access Methods Buffer Management Disk Space Management Recovery Manager

### Evaluation of Expressions

Query Optimization Evaluation of Expressions Materialization: one operation at a time, materialize intermediate results for subsequent use Good for all situations Sum of costs of individual operations

### CSE 444 Practice Problems. Query Optimization

CSE 444 Practice Problems Query Optimization 1. Query Optimization Given the following SQL query: Student (sid, name, age, address) Book(bid, title, author) Checkout(sid, bid, date) SELECT S.name FROM

### Chapter 14: Query Optimization

Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog

### Storage and Indexing. DBS Database Systems Implementing and Optimising Query Languages. Differences between disk and main memory

DBS Database Systems Implementing and Optimising Query Languages Peter Buneman 9 November 2010 Reading: R&G Chapters 8, 9 & 10.1 Storage and Indexing We typically store data in external (secondary) storage.

### Efficient Processing of Joins on Set-valued Attributes

Efficient Processing of Joins on Set-valued Attributes Nikos Mamoulis Department of Computer Science and Information Systems University of Hong Kong Pokfulam Road Hong Kong nikos@csis.hku.hk Abstract Object-oriented

### Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.

Phases of database design Application requirements Conceptual design Database Management Systems Conceptual schema Logical design ER or UML Physical Design Relational tables Logical schema Physical design

### Databases and Information Systems 1 Part 3: Storage Structures and Indices

bases and Information Systems 1 Part 3: Storage Structures and Indices Prof. Dr. Stefan Böttcher Fakultät EIM, Institut für Informatik Universität Paderborn WS 2009 / 2010 Contents: - database buffer -

### Overview of Storage and Indexing

Overview of Storage and Indexing Chapter 8 How index-learning turns no student pale Yet holds the eel of science by the tail. -- Alexander Pope (1688-1744) Database Management Systems 3ed, R. Ramakrishnan

### Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division

Answer Key UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division CS186 Fall 2003 Eben Haber Midterm Midterm Exam: Introduction to Database Systems This exam has

### File Management. Chapter 12

Chapter 12 File Management File is the basic element of most of the applications, since the input to an application, as well as its output, is usually a file. They also typically outlive the execution

### Overview of Storage and Indexing. Data on External Storage. Alternative File Organizations. Chapter 8

Overview of Storage and Indexing Chapter 8 How index-learning turns no student pale Yet holds the eel of science by the tail. -- Alexander Pope (1688-1744) Database Management Systems 3ed, R. Ramakrishnan

### In-Memory Database: Query Optimisation. S S Kausik (110050003) Aamod Kore (110050004) Mehul Goyal (110050017) Nisheeth Lahoti (110050027)

In-Memory Database: Query Optimisation S S Kausik (110050003) Aamod Kore (110050004) Mehul Goyal (110050017) Nisheeth Lahoti (110050027) Introduction Basic Idea Database Design Data Types Indexing Query

### External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing

### SQL QUERY EVALUATION. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 12

SQL QUERY EVALUATION CS121: Introduction to Relational Database Systems Fall 2015 Lecture 12 Query Evaluation 2 Last time: Began looking at database implementation details How data is stored and accessed

### File Management. Chapter 12

File Management Chapter 12 File Management File management system is considered part of the operating system Input to applications is by means of a file Output is saved in a file for long-term storage

### Analysis of Binary Search algorithm and Selection Sort algorithm

Analysis of Binary Search algorithm and Selection Sort algorithm In this section we shall take up two representative problems in computer science, work out the algorithms based on the best strategy to

### OVERVIEW OF QUERY EVALUATION

12 OVERVIEW OF QUERY EVALUATION Exercise 12.1 Briefly answer the following questions: 1. Describe three techniques commonly used when developing algorithms for relational operators. Explain how these techniques

### DATABASE DESIGN - 1DL400

DATABASE DESIGN - 1DL400 Spring 2015 A course on modern database systems!! http://www.it.uu.se/research/group/udbl/kurser/dbii_vt15/ Kjell Orsborn! Uppsala Database Laboratory! Department of Information

### External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13

External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing

### Physical Data Organization

Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

### Hash joins and hash teams in Microsoft

Hash joins and hash teams in Microsoft Goetz Graefe, Ross Bunker, Shaun Cooper SQL Server Abstract The query execution engine in Microsoft SQL Server employs hash-based algorithms for inner and outer joins,

### Database System Architecture and Implementation

Database System Architecture and Implementation Kristin Tufte Execution Costs 1 Web Forms Orientation Applications SQL Interface SQL Commands Executor Operator Evaluator Parser Optimizer DBMS Transaction

### Query Optimization in a Memory-Resident Domain Relational Calculus Database System

Query Optimization in a Memory-Resident Domain Relational Calculus Database System KYU-YOUNG WHANG and RAVI KRISHNAMURTHY IBM Thomas J. Watson Research Center We present techniques for optimizing queries

### University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao CMPSCI 445 Midterm Practice Questions NAME: LOGIN: Write all of your answers directly on this paper. Be sure to clearly

### We study algorithms for computing the equijoin of two relations in B system with a standard

Join Processing in Database with Large Main Memories Systems LEONARD D. SHAPIRO North Dakota State University We study algorithms for computing the equijoin of two relations in B system with a standard

### Symbol Tables. Introduction

Symbol Tables Introduction A compiler needs to collect and use information about the names appearing in the source program. This information is entered into a data structure called a symbol table. The

### University of Aarhus. Databases 2009. 2009 IBM Corporation

University of Aarhus Databases 2009 Kirsten Ann Larsen What is good performance? Elapsed time End-to-end In DB2 Resource consumption CPU I/O Memory Locks Elapsed time = Sync. I/O + CPU + wait time I/O

### Chapter 2 Data Storage

Chapter 2 22 CHAPTER 2. DATA STORAGE 2.1. THE MEMORY HIERARCHY 23 26 CHAPTER 2. DATA STORAGE main memory, yet is essentially random-access, with relatively small differences Figure 2.4: A typical

### EFFICIENT EXTERNAL SORTING ON FLASH MEMORY EMBEDDED DEVICES

ABSTRACT EFFICIENT EXTERNAL SORTING ON FLASH MEMORY EMBEDDED DEVICES Tyler Cossentine and Ramon Lawrence Department of Computer Science, University of British Columbia Okanagan Kelowna, BC, Canada tcossentine@gmail.com

### Unit 4.3 - Storage Structures 1. Storage Structures. Unit 4.3

Storage Structures Unit 4.3 Unit 4.3 - Storage Structures 1 The Physical Store Storage Capacity Medium Transfer Rate Seek Time Main Memory 800 MB/s 500 MB Instant Hard Drive 10 MB/s 120 GB 10 ms CD-ROM

### Inside the PostgreSQL Query Optimizer

Inside the PostgreSQL Query Optimizer Neil Conway neilc@samurai.com Fujitsu Australia Software Technology PostgreSQL Query Optimizer Internals p. 1 Outline Introduction to query optimization Outline of

### Storage and File Structure

Storage and File Structure Chapter 10: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files

### Why? A central concept in Computer Science. Algorithms are ubiquitous.

Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online

### Chapter 13: Query Optimization

Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Transformation of Relational Expressions Catalog

### Operating Systems CSE 410, Spring 2004. File Management. Stephen Wagner Michigan State University

Operating Systems CSE 410, Spring 2004 File Management Stephen Wagner Michigan State University File Management File management system has traditionally been considered part of the operating system. Applications

### Big Data Technology Map-Reduce Motivation: Indexing in Search Engines

Big Data Technology Map-Reduce Motivation: Indexing in Search Engines Edward Bortnikov & Ronny Lempel Yahoo Labs, Haifa Indexing in Search Engines Information Retrieval s two main stages: Indexing process

### What Is Recursion? Recursion. Binary search example postponed to end of lecture

Recursion Binary search example postponed to end of lecture What Is Recursion? Recursive call A method call in which the method being called is the same as the one making the call Direct recursion Recursion

### Oracle Database 11g: SQL Tuning Workshop Release 2

Oracle University Contact Us: 1 800 005 453 Oracle Database 11g: SQL Tuning Workshop Release 2 Duration: 3 Days What you will learn This course assists database developers, DBAs, and SQL developers to

Advanced Oracle SQL Tuning Seminar content technical details 1) Understanding Execution Plans In this part you will learn how exactly Oracle executes SQL execution plans. Instead of describing on PowerPoint

### SQL Server Query Tuning

SQL Server Query Tuning Klaus Aschenbrenner Independent SQL Server Consultant SQLpassion.at Twitter: @Aschenbrenner About me Independent SQL Server Consultant International Speaker, Author Pro SQL Server

### Performance Tuning for the Teradata Database

Performance Tuning for the Teradata Database Matthew W Froemsdorf Teradata Partner Engineering and Technical Consulting - i - Document Changes Rev. Date Section Comment 1.0 2010-10-26 All Initial document

### Architecture Sensitive Database Design: Examples from the Columbia Group

Architecture Sensitive Database Design: Examples from the Columbia Group Kenneth A. Ross Columbia University John Cieslewicz Columbia University Jun Rao IBM Research Jingren Zhou Microsoft Research In

### Understanding Query Processing and Query Plans in SQL Server. Craig Freedman Software Design Engineer Microsoft SQL Server

Understanding Query Processing and Query Plans in SQL Server Craig Freedman Software Design Engineer Microsoft SQL Server Outline SQL Server engine architecture Query execution overview Showplan Common

### Big Data and Scripting. Part 4: Memory Hierarchies

1, Big Data and Scripting Part 4: Memory Hierarchies 2, Model and Definitions memory size: M machine words total storage (on disk) of N elements (N is very large) disk size unlimited (for our considerations)

### Generating code for holistic query evaluation Konstantinos Krikellas 1, Stratis D. Viglas 2, Marcelo Cintra 3

Generating code for holistic query evaluation Konstantinos Krikellas 1, Stratis D. Viglas 2, Marcelo Cintra 3 School of Informatics, University of Edinburgh, UK 1 k.krikellas@sms.ed.ac.uk 2 sviglas@inf.ed.ac.uk

### Oracle Database 11g: SQL Tuning Workshop

Oracle University Contact Us: + 38516306373 Oracle Database 11g: SQL Tuning Workshop Duration: 3 Days What you will learn This Oracle Database 11g: SQL Tuning Workshop Release 2 training assists database

### CIS 631 Database Management Systems Sample Final Exam

CIS 631 Database Management Systems Sample Final Exam 1. (25 points) Match the items from the left column with those in the right and place the letters in the empty slots. k 1. Single-level index files

### Evaluation of view maintenance with complex joins in a data warehouse environment (HS-IDA-MD-02-301)

Evaluation of view maintenance with complex joins in a data warehouse environment (HS-IDA-MD-02-301) Kjartan Asthorsson (kjarri@kjarri.net) Department of Computer Science Högskolan i Skövde, Box 408 SE-54128

### Comp 5311 Database Management Systems. 3. Structured Query Language 1

Comp 5311 Database Management Systems 3. Structured Query Language 1 1 Aspects of SQL Most common Query Language used in all commercial systems Discussion is based on the SQL92 Standard. Commercial products

### Bloom Filters. Christian Antognini Trivadis AG Zürich, Switzerland

Bloom Filters Christian Antognini Trivadis AG Zürich, Switzerland Oracle Database uses bloom filters in various situations. Unfortunately, no information about their usage is available in Oracle documentation.

### Adaptive Join Operators for Result Rate Optimization on Streaming Inputs

1 Adaptive Join Operators for Result Rate Optimization on Streaming Inputs Mihaela A. Bornea, Vasilis Vassalos, Yannis Kotidis #1, Antonios Deligiannakis 2 # Department of Informatics, Department of Electronic

### Why Query Optimization? Access Path Selection in a Relational Database Management System. How to come up with the right query plan?

Why Query Optimization? Access Path Selection in a Relational Database Management System P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, T. Price Peyman Talebifard Queries must be executed and execution

### 6. Standard Algorithms

6. Standard Algorithms The algorithms we will examine perform Searching and Sorting. 6.1 Searching Algorithms Two algorithms will be studied. These are: 6.1.1. inear Search The inear Search The Binary

### Operating Systems: Internals and Design Principles. Chapter 12 File Management Seventh Edition By William Stallings

Operating Systems: Internals and Design Principles Chapter 12 File Management Seventh Edition By William Stallings Operating Systems: Internals and Design Principles If there is one singular characteristic

### Operating Systems. Virtual Memory

Operating Systems Virtual Memory Virtual Memory Topics. Memory Hierarchy. Why Virtual Memory. Virtual Memory Issues. Virtual Memory Solutions. Locality of Reference. Virtual Memory with Segmentation. Page

### RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems CLOUD COMPUTING GROUP - LITAO DENG

1 RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems CLOUD COMPUTING GROUP - LITAO DENG Background 2 Hive is a data warehouse system for Hadoop that facilitates

### Understanding SQL Server Execution Plans. Klaus Aschenbrenner Independent SQL Server Consultant SQLpassion.at Twitter: @Aschenbrenner

Understanding SQL Server Execution Plans Klaus Aschenbrenner Independent SQL Server Consultant SQLpassion.at Twitter: @Aschenbrenner About me Independent SQL Server Consultant International Speaker, Author

### Semi-Streamed Index Join for Near-Real Time Execution of ETL Transformations

Semi-Streamed Index Join for Near-Real Time Execution of ETL Transformations Mihaela A. Bornea #1, Antonios Deligiannakis 2, Yannis Kotidis #1,Vasilis Vassalos #1 # Athens U. of Econ and Business, Technical

### Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries

### Physical Database Design Process. Physical Database Design Process. Major Inputs to Physical Database. Components of Physical Database Design

Physical Database Design Process Physical Database Design Process The last stage of the database design process. A process of mapping the logical database structure developed in previous stages into internal

### Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr.

Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr. Dan Olteanu Submitted as part of Master of Computer Science Computing Laboratory

### File Management. File Management

File Management 1 File Management File management system consists of system utility programs that run as privileged applications Input to applications is by means of a file Output is saved in a file for

### Chapter 9 Joining Data from Multiple Tables. Oracle 10g: SQL

Chapter 9 Joining Data from Multiple Tables Oracle 10g: SQL Objectives Identify a Cartesian join Create an equality join using the WHERE clause Create an equality join using the JOIN keyword Create a non-equality

### SMALL INDEX LARGE INDEX (SILT)

Wayne State University ECE 7650: Scalable and Secure Internet Services and Architecture SMALL INDEX LARGE INDEX (SILT) A Memory Efficient High Performance Key Value Store QA REPORT Instructor: Dr. Song

### Quiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)?

Database Indexes How costly is this operation (naive solution)? course per weekday hour room TDA356 2 VR Monday 13:15 TDA356 2 VR Thursday 08:00 TDA356 4 HB1 Tuesday 08:00 TDA356 4 HB1 Friday 13:15 TIN090

### Loop Invariants and Binary Search

Loop Invariants and Binary Search Chapter 4.3.3 and 9.3.1-1 - Outline Ø Iterative Algorithms, Assertions and Proofs of Correctness Ø Binary Search: A Case Study - 2 - Outline Ø Iterative Algorithms, Assertions

### Data Warehousing und Data Mining

Data Warehousing und Data Mining Multidimensionale Indexstrukturen Ulf Leser Wissensmanagement in der Bioinformatik Content of this Lecture Multidimensional Indexing Grid-Files Kd-trees Ulf Leser: Data

### ENHANCEMENTS TO SQL SERVER COLUMN STORES. Anuhya Mallempati #2610771

ENHANCEMENTS TO SQL SERVER COLUMN STORES Anuhya Mallempati #2610771 CONTENTS Abstract Introduction Column store indexes Batch mode processing Other Enhancements Conclusion ABSTRACT SQL server introduced

### Chapter 12 File Management

Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Roadmap Overview File organisation and Access

### Chapter 12 File Management. Roadmap

Operating Systems: Internals and Design Principles, 6/E William Stallings Chapter 12 File Management Dave Bremer Otago Polytechnic, N.Z. 2008, Prentice Hall Overview Roadmap File organisation and Access

### In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR

1 2 2 3 In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR The uniqueness of the primary key ensures that

### Outline. Distributed DBMS

Outline Introduction Background Architecture Distributed Database Design Semantic Data Control Distributed Query Processing Distributed Transaction Management Data server approach Parallel architectures

### Execution Strategies for SQL Subqueries

Execution Strategies for SQL Subqueries Mostafa Elhemali, César Galindo- Legaria, Torsten Grabs, Milind Joshi Microsoft Corp With additional slides from material in paper, added by S. Sudarshan 1 Motivation

### Query tuning by eliminating throwaway

Query tuning by eliminating throwaway This paper deals with optimizing non optimal performing queries. Abstract Martin Berg (martin.berg@oracle.com) Server Technology System Management & Performance Oracle

### CS 245 Final Exam Winter 2013

CS 245 Final Exam Winter 2013 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have 140 minutes

### Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 23 Database

### Nearest Neighbor Join with Groups and Predicates

Nearest Neighbor Join with Groups and Predicates F. Cafagna, M. Boehlen, A. Bracher Department of Computer Science Agroscope University of Zürich Switzerland DOLAP, Melbourne 23.0.20 Francesco Cafagna

### Roadmap. Caching for Client/Server Database Architectures. Client/Server Database Architectures. Client-side caching. Page caching.

Roadmap Caching for Client/Server Database Architectures CPS 296.1 Topics in Database Systems!Semantic (query) caching Dar et al. Semantic Data Caching and Replacement. VLDB, 1996 Object caching (piggyback

With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials

### Recovery System C H A P T E R16. Practice Exercises

C H A P T E R16 Recovery System Practice Exercises 16.1 Explain why log records for transactions on the undo-list must be processed in reverse order, whereas redo is performed in a forward direction. Answer:

### Question 1. Relational Data Model [17 marks] Question 2. SQL and Relational Algebra [31 marks]

EXAMINATIONS 2005 MID-YEAR COMP 302 Database Systems Time allowed: Instructions: 3 Hours Answer all questions. Make sure that your answers are clear and to the point. Write your answers in the spaces provided.

### The Tower of Hanoi. Recursion Solution. Recursive Function. Time Complexity. Recursive Thinking. Why Recursion? n! = n* (n-1)!

The Tower of Hanoi Recursion Solution recursion recursion recursion Recursive Thinking: ignore everything but the bottom disk. 1 2 Recursive Function Time Complexity Hanoi (n, src, dest, temp): If (n >

### Chapter 8: Bags and Sets

Chapter 8: Bags and Sets In the stack and the queue abstractions, the order that elements are placed into the container is important, because the order elements are removed is related to the order in which

### Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel

Parallel Databases Increase performance by performing operations in parallel Parallel Architectures Shared memory Shared disk Shared nothing closely coupled loosely coupled Parallelism Terminology Speedup:

### Zabin Visram Room CS115 CS126 Searching. Binary Search

Zabin Visram Room CS115 CS126 Searching Binary Search Binary Search Sequential search is not efficient for large lists as it searches half the list, on average Another search algorithm Binary search Very

### Multi-dimensional index structures Part I: motivation

Multi-dimensional index structures Part I: motivation 144 Motivation: Data Warehouse A definition A data warehouse is a repository of integrated enterprise data. A data warehouse is used specifically for

### 2) What is the structure of an organization? Explain how IT support at different organizational levels.

(PGDIT 01) Paper - I : BASICS OF INFORMATION TECHNOLOGY 1) What is an information technology? Why you need to know about IT. 2) What is the structure of an organization? Explain how IT support at different

### AS-2261 M.Sc.(First Semester) Examination-2013 Paper -fourth Subject-Data structure with algorithm

AS-2261 M.Sc.(First Semester) Examination-2013 Paper -fourth Subject-Data structure with algorithm Time: Three Hours] [Maximum Marks: 60 Note Attempts all the questions. All carry equal marks Section A

### 1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++

Answer the following 1) The postfix expression for the infix expression A+B*(C+D)/F+D*E is ABCD+*F/DE*++ 2) Which data structure is needed to convert infix notations to postfix notations? Stack 3) The

### Relational Databases

Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 18 Relational data model Domain domain: predefined set of atomic values: integers, strings,... every attribute

### DATA STRUCTURES USING C

DATA STRUCTURES USING C QUESTION BANK UNIT I 1. Define data. 2. Define Entity. 3. Define information. 4. Define Array. 5. Define data structure. 6. Give any two applications of data structures. 7. Give