Query Evaluation Techniques for Large Databases. GOETZ GRAEFE. Presented by Long Zhang, Dept. of Computer Science, UBC


Query Evaluation Techniques for Large Databases. GOETZ GRAEFE. Presented by Long Zhang, Dept. of Computer Science, UBC. zhlong73@cs.ubc.ca

Purpose: to survey efficient algorithms and the software architecture of query execution engines for executing complex queries over large databases.

Restrictions. Restrictive data definition and manipulation languages can make application development and maintenance unbearably cumbersome. Data volumes might be so large or complex that the real or perceived performance advantage of file systems is considered more important than all other criteria, e.g., the higher levels of abstraction and programmer productivity typically achieved with database management systems.

Topics: algorithms and their costs; sorting versus hashing; parallelism; resource allocation and scheduling; general performance-enhancing techniques.

Which parts are we concerned with? The DBMS stack: User Interface, Database Query Language, Query Optimizer, Query Execution Engine, Files and Indices, I/O Buffer, Disk.

Query Execution Engine. What is it? A collection of query execution operators and mechanisms for operator communication and synchronization. The query execution engine defines the space of possible plans that can be chosen by the query optimizer.

The SQL query is translated into a query tree in the logical algebra. The logical query tree is translated into a physical plan: the optimizer expands the search space and finds the best plan. The optimal physical plan is copied out of the optimizer's memory structure and sent to the query execution engine. The query execution engine executes the plan using relations in the database as input and produces the output.

Discussion. Fact: on the first page, the author states that DBMSs have not been used in many areas for two reasons. (1) Application development and maintenance are difficult: restrictive data definition and manipulation languages can make development cumbersome. (2) The data in those areas is so big that speed trumps all, and people would rather hand-code.

The questions are: Do you think that in today's world databases are popular for most applications? If no, go to (1); if yes, go to (2). (1) Why do you think databases aren't used more? Why don't you use them on your data? (2) How do database systems' needs for dealing with large amounts of data differ from those of other applications or systems, such as an OS? Should we depend only on databases? If yes, what should be changed in the architecture of databases or in data abstraction?

1. Architecture of Query Execution Engines. Focus on useful mechanisms for processing sets of items: records, tuples, entities, objects.

Logical algebra vs. physical algebra. Logical algebra, e.g., relational algebra, is more closely related to the data model and defines what queries can be expressed in the data model. Physical algebra is system-specific: for example, nested-loops join, merge-join, hash join, etc. Systems may implement the same data model and logical algebra with different physical algebras.

Logical algebra vs. physical algebra. Specific algorithms, and therefore cost functions, are associated only with physical operators, not with logical algebra operators. Mapping logical to physical operators is non-trivial: it involves algorithm choices. Logical and physical operators are not mapped one-to-one; some operators in the physical algebra may implement multiple logical operators, etc.

To sum up: query optimization is the mapping from logical to physical operations, and the query execution engine is the implementation of operations on physical representation types and of mechanisms for coordination and cooperation among multiple such operations in complex queries.

Implementation issues. Each operator provides three procedures: Open (prepare the operator for producing data), Next (produce one item), and Close (perform final housekeeping).

Iterators. Two important features of operators: they can be combined into arbitrarily complex evaluation plans, and any number of operators can schedule and execute each other in a single process without assistance from the underlying OS.

Observations. The entire query plan is executed within a single process. Operators produce one item at a time, on request. Items never wait in a temporary file or buffer (pipelining). Efficient in time-space-product memory cost. Iterators can schedule any shape of tree, including bushy trees. No operator is affected by the complexity of the whole plan.
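The open/next/close protocol described on the previous slides can be sketched as follows. This is a minimal illustration, not Graefe's implementation; the operator names (TableScan, Filter) and the in-memory "relation" are made up for the example.

```python
# Sketch of the open/next/close iterator protocol: each operator pulls
# one item at a time from its input, so the whole plan runs in one
# process with no temporary files (pipelining).

class TableScan:
    """Leaf operator: produces rows from an in-memory 'relation'."""
    def __init__(self, rows):
        self.rows = rows
    def open(self):
        self.pos = 0
    def next(self):
        if self.pos >= len(self.rows):
            return None                    # end of input
        row = self.rows[self.pos]
        self.pos += 1
        return row
    def close(self):
        pass

class Filter:
    """Applies a predicate, pulling one row at a time from its child."""
    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate
    def open(self):
        self.child.open()
    def next(self):
        while True:
            row = self.child.next()
            if row is None:
                return None
            if self.predicate(row):
                return row
    def close(self):
        self.child.close()

def run(plan):
    """Drive a plan to completion: open, pull rows until None, close."""
    plan.open()
    out = []
    while (row := plan.next()) is not None:
        out.append(row)
    plan.close()
    return out

plan = Filter(TableScan([1, 5, 2, 8, 3]), lambda r: r > 2)
```

Because every operator exposes the same three procedures, operators compose into arbitrarily deep plans, and no operator needs to know the complexity of the plan around it.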

Sorting & Hashing. The purpose of many query-processing algorithms is to perform some kind of matching, i.e., bringing items that are alike together and performing some operation on them. There are two basic approaches used for this purpose: sorting and hashing. These are the basis for many join algorithms.

Design Issues. Sorting should be implemented as an iterator so that the sort module interfaces well with the other operators (e.g., file scan or merge-join). The input to the sort module must be an iterator, and sort uses the open, next, and close procedures to request its input; therefore, sort input can come from a scan or a complex query plan, and the sort operator can be inserted into a query plan at any place, or at several places.

Out of Memory. If the input is larger than main memory, the open-sort procedure creates sorted runs and merges them until only one final merge phase is left: dividing, then combining.

More on Sorting. For sorting large data sets there are two distinct sub-algorithms: one for sorting within main memory, and one for managing subsets of the data set on disk. For practical reasons, e.g., ensuring that a run fits into main memory, the disk-management algorithm typically uses physical dividing and logical combining (merging). A point of practical importance is the fan-in, or degree of merging, but this is a parameter rather than a defining algorithm property.
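The division into memory-sized runs and their subsequent merging can be sketched as follows. This is a toy illustration of the structure only: runs are kept as in-memory lists, whereas a real system would write each run to a file on disk.

```python
# Minimal external-sort sketch: physical division into memory-sized
# sorted runs (level 0), then logical combination by merging up to
# fan_in runs at a time until a single run remains.
import heapq

def create_runs(items, memory_size):
    """Level-0 runs: load a memory-sized chunk, sort it, emit it as a run."""
    return [sorted(items[i:i + memory_size])
            for i in range(0, len(items), memory_size)]

def merge_runs(runs, fan_in):
    """Repeatedly merge groups of fan_in runs until one run remains."""
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
    return runs[0] if runs else []

data = [9, 3, 7, 1, 8, 2, 6, 5, 4]
runs = create_runs(data, memory_size=3)   # three sorted runs of size 3
result = merge_runs(runs, fan_in=2)       # two merge levels
```

The fan-in is exactly the parameter the slide mentions: raising it reduces the number of merge levels but requires more input buffers in memory.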

Creating level-0 runs. In-memory sort, e.g., quicksort: each run has the size of the allocated memory. Replacement selection: uses a priority heap, which supports insert and remove-smallest operations; the main purpose is to generate the longest possible initial ordered sequences.
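Replacement selection can be sketched with a priority heap as follows. This is a simplified sketch, not the paper's algorithm verbatim: an input item smaller than the key just written cannot extend the current run, so it is tagged for the next run. With the numbers from the slide's example and a memory of three items, a single run longer than memory is produced.

```python
# Replacement-selection sketch: keys are held in a priority heap tagged
# with a run number. The smallest key of the current run is removed and
# written; a new input key smaller than it must wait for the next run.
import heapq

def replacement_selection(stream, memory_size):
    """Yield sorted runs; on random input, runs average ~2x memory_size."""
    heap = [(0, x) for x in stream[:memory_size]]   # (run_no, key) pairs
    heapq.heapify(heap)
    rest = iter(stream[memory_size:])
    runs, current, cur_run = [], [], 0
    while heap:
        run_no, key = heapq.heappop(heap)
        if run_no != cur_run:            # current run is finished
            runs.append(current)
            current, cur_run = [], run_no
        current.append(key)              # "write" key to the current run
        nxt = next(rest, None)
        if nxt is not None:
            # A key smaller than the one just written cannot join the
            # current run; defer it to the next run.
            heapq.heappush(heap, (run_no if nxt >= key else run_no + 1, nxt))
    if current:
        runs.append(current)
    return runs

runs = replacement_selection([12, 20, 10, 19, 31, 25, 21, 56, 40], 3)
```

This shows why replacement selection is attractive: fewer, longer initial runs mean fewer merge levels later.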

Replacement Selection Sort. [Diagram: data flows Input File -> Input Buffer -> RAM -> Output Buffer -> Output File; the input buffer, output buffer, and RAM are the same size.]

Replacement Selection Sort. [Worked example: 12 is in memory; the next inputs are 20 10 19 31 25 21 56 40. We do not need to care when a run grows larger than the memory.]

Merging. [Diagram: runs 1, 2, and 3 on disk are merged in memory into an output buffer.]

Merging. Scans are faster if read-ahead and write-behind are used. Using large cluster sizes for the run files is very beneficial. The number of runs W is typically not a power of the fan-in.

Merge as many runs as possible instead of always merging runs from the same level.

Quicksort vs. replacement selection. Run files in RS are typically larger than memory, as opposed to QS, where they are the size of the memory. QS results in bursts of reads and writes of entire memory loads from the input file to the initial run files, while RS alternates between individual reads and writes. In RS, memory management is more complex. The advantage of having fewer runs must be balanced against the different I/O pattern and the disadvantage of more complex memory management.

2.2 Hashing. An alternative to sorting: the expected complexity of hashing algorithms is O(N), rather than O(N log N) as for sorting. Hash-based query-processing algorithms use an in-memory hash table of database objects to perform their matching task.

Hash Table Overflow. When the hash table is larger than memory, hash table overflow occurs and must be dealt with. The input is divided into multiple partition files such that the partitions can be processed independently of one another. The concatenation of the results of all partitions is the result of the entire operation.
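The overflow scheme above can be sketched with duplicate removal as the matching task. This is an illustrative toy, not the paper's algorithm: partition "files" are plain lists, keys are assumed to be non-negative integers, and each recursion level partitions on a different bit field of the key so that recursive partitioning always makes progress.

```python
# Sketch of hash-table overflow handling: if the input fits in "memory",
# match it with one in-memory hash table; otherwise partition it, solve
# each partition independently, and concatenate the results.

NUM_PARTITIONS = 4

def partition(items, level):
    """Divide the input so each partition can be processed independently."""
    parts = [[] for _ in range(NUM_PARTITIONS)]
    for x in items:
        parts[(x >> (2 * level)) % NUM_PARTITIONS].append(x)
    return parts

def distinct(items, memory_limit=4, level=0):
    if len(items) <= memory_limit:       # fits: one in-memory hash table
        seen, out = set(), []
        for x in items:
            if x not in seen:
                seen.add(x)
                out.append(x)
        return out
    # Overflow: partition, recurse on each partition, concatenate.
    result = []
    for part in partition(items, level):
        result.extend(distinct(part, memory_limit, level + 1))
    return result

unique = distinct([3, 1, 3, 2, 5, 1, 2, 5, 3, 7])
```

Because equal keys always hash to the same partition, no matches are lost by processing partitions independently, which is exactly why concatenating the per-partition results is correct.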

Discussion 2. Fact 1: nowadays, data is much larger than before. Fact 2: memory overflow was a serious problem for both sorting and hashing at the time the paper was written. Fact 3: machines are more powerful than at the time of this paper, e.g., they have larger memories.

The question is: for query execution in large database systems, do you think that in the time since then the issues have gotten better (because memories have gotten larger) or worse (because there is a bigger gap between the time it takes to access memory and the time it takes to access things on disk)? Do you think the increasing power of machines can handle large data processing, or is it still worth making a great effort in this field, as this paper did? Are there any other new issues worth considering?

3. Disk Access: file scans; associative access using indices; buffer management.

3.1 File Scans. To make them fast, use read-ahead, which requires contiguous file allocation.

Associative Access Using Indices Goal: To reduce the number of accesses to secondary storage How? By employing associative search techniques in the form of indices Indices map key or attribute values to locator information with which database objects can be retrieved.

Some Index Structures. Clustered & un-clustered. Clustered: the order or organization of index entries determines the order of items on disk. Sparse & dense. Sparse: the index does not contain an entry for each data item in the primary file, but only one entry for each page of the primary file. Dense: there are the same number of entries in the index as there are items in the primary file. Non-clustering indices must always be dense.
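A sparse index only works when the file itself is sorted (clustered), which is why non-clustering indices must be dense. A minimal sketch, with made-up pages and keys:

```python
# Sketch of a sparse index over a sorted, clustered file: one index
# entry per page, holding the page's lowest key. A lookup binary-searches
# the index, then scans only the one page that can contain the key.
import bisect

file_pages = [[2, 5, 7, 9], [12, 15, 18, 20], [23, 27, 31, 35]]
sparse_index = [page[0] for page in file_pages]   # lowest key of each page

def lookup(key):
    """Return key if present in the file, else None."""
    i = bisect.bisect_right(sparse_index, key) - 1
    if i < 0:
        return None                  # key precedes the first page
    return key if key in file_pages[i] else None
```

The index here has 3 entries for 12 items; a dense index would need all 12, but would also work on an unsorted file, since it points at every item individually.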

Buffer Management. Goal: reduce I/O cost by caching data in an I/O buffer. Design issues: recovery; replacement policy; performance effects of buffer allocation; interactions of index retrieval and buffer management. Implementation issues: the interface provided (fixing and unfixing pages); intermediate results kept in a separate buffer.

5. Binary Matching Operations. A number of database operations combine information from two inputs, files, or sets. A group of operators that all perform basically the same task is called the one-to-one match operations.

5.1 Nested-Loops Join. For each item in one input (called the outer input), scan the entire other input (called the inner input) and find matches. A temporary file is used to store the inner input. Shortcoming: the inner input is scanned very often.

Improvements for Nested-Loops. If a single match carries all necessary information (semi-join), the scan of the inner input can be terminated after the first match. Instead of scanning the inner input for each item, scan it once for each page (block nested-loops join), trading more in-memory computation for fewer inner scans. Scan the inner input alternately forward and backward, reusing the last page.
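The block nested-loops idea can be sketched as follows. This is a minimal illustration with made-up relations; "blocks" are list slices standing in for memory loads of the outer input.

```python
# Block nested-loops join sketch: load a memory-sized block of the outer
# input, then scan the inner input once per block instead of once per
# outer item, cutting the number of inner scans by the block size.

def block_nested_loops_join(outer, inner, outer_key, inner_key, block_size):
    result = []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]   # one "memory load"
        for inner_row in inner:                   # one inner scan per block
            for outer_row in block:               # in-memory comparisons
                if outer_key(outer_row) == inner_key(inner_row):
                    result.append((outer_row, inner_row))
    return result

emps = [("alice", 10), ("bob", 20), ("carol", 10)]
depts = [(10, "db"), (20, "os")]
joined = block_nested_loops_join(emps, depts,
                                 outer_key=lambda e: e[1],
                                 inner_key=lambda d: d[0],
                                 block_size=2)
```

With a block size of B outer items, the inner input is scanned len(outer)/B times instead of len(outer) times, which is the whole point of the improvement.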

5.2 Merge-Join Algorithms. Require that both inputs be sorted on the join attribute; similar to the merge process used in sorting. Heap-filter merge-join: a combination of nested-loops join and merge-join. Hybrid join (used by IBM in DB2): uses elements from index nested-loops join and merge-join, plus techniques for joining sorted lists of index leaf entries.
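The basic merge step can be sketched as follows. This is a simplified sketch assuming the join keys are unique in both inputs; a real merge-join needs mark/restore logic to handle duplicate keys on both sides.

```python
# Merge-join sketch for inputs sorted on the join key: walk both inputs
# in step, always advancing the side with the smaller key, emitting a
# pair on equality. Assumes unique join keys in both inputs.

def merge_join(left, right, left_key, right_key):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = left_key(left[i]), right_key(right[j])
        if kl < kr:
            i += 1                 # left key too small: advance left
        elif kl > kr:
            j += 1                 # right key too small: advance right
        else:                      # match: emit, advance both (unique keys)
            result.append((left[i], right[j]))
            i += 1
            j += 1
    return result

left = [(1, "a"), (3, "b"), (5, "c")]     # both sorted on the first field
right = [(3, "x"), (4, "y"), (5, "z")]
joined = merge_join(left, right, lambda r: r[0], lambda r: r[0])
```

Each input is read exactly once, which is why merge-join is attractive when sorted inputs are already available, e.g., from indices or earlier sort operators.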

5.3 Hash Join Algorithms. Based on the idea of building an in-memory hash table on one input and then probing this hash table with items from the other input. Good news: very fast, with no temporary files, if the build input does indeed fit into memory.

5.3 Hash Join Algorithms. The build and probe inputs are partitioned using the same partitioning function. The final join result is the concatenation of the join results of pairs of partition files. Recursive partitioning is used when a build partition file is larger than memory.
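The partitioned scheme above can be sketched as follows. This is an illustrative sketch, not a production hash join: partition "files" are in-memory lists, the relations are made up, and the recursive-partitioning case is omitted for brevity.

```python
# Partitioned hash join sketch: both inputs are split with the same
# partitioning function, then each pair of partitions is joined with an
# in-memory build-and-probe pass; the results are concatenated.

def hash_join(build, probe, build_key, probe_key, num_partitions=4):
    build_parts = [[] for _ in range(num_partitions)]
    probe_parts = [[] for _ in range(num_partitions)]
    for row in build:                    # same partitioning function...
        build_parts[hash(build_key(row)) % num_partitions].append(row)
    for row in probe:                    # ...applied to both inputs
        probe_parts[hash(probe_key(row)) % num_partitions].append(row)
    result = []
    for bp, pp in zip(build_parts, probe_parts):
        table = {}                       # in-memory hash table per partition
        for row in bp:
            table.setdefault(build_key(row), []).append(row)
        for row in pp:                   # matches can only be in the
            for match in table.get(probe_key(row), []):   # same partition
                result.append((match, row))
    return result

depts = [(10, "db"), (20, "os")]         # smaller input as the build side
emps = [("alice", 10), ("bob", 20), ("carol", 10)]
joined = hash_join(depts, emps,
                   build_key=lambda d: d[0], probe_key=lambda e: e[1])
```

Because equal keys land in the same partition on both sides, joining partition pairs independently loses no matches; a build partition still too large for memory would simply be partitioned again with a different function (recursive partitioning).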

Discussion The author mentions that many commercial database systems use only the nested loop and merge joins due to the results of the System R project, which did not consider hash join algorithms. He states that hash join is today regarded as the most efficient in many cases. Should hash joins have been considered for use in the System R project? Why? What factors might have made the System R team not consider hash join?

7. Duality of Sort- and Hash-Based Query Processing Algorithms. Sort-based algorithms: a large data set is divided into subsets using a physical rule, namely into chunks as large as memory. Hash-based algorithms: large inputs are cut into subsets by hash value.

Several dualities: I/O capabilities (random vs. sequential); memory limitations (fan-in vs. fan-out); read-ahead in merging vs. write-behind in partitioning; etc. There exist many dualities between sorting using multilevel merging and recursive hash-table overflow management.

Two special cases. If the two join inputs are of different sizes (and the query optimizer can reliably predict this difference), hybrid hash join outperforms merge-join. If the hash function is very poor, merge-join performs better than hash join.

8. Execution of Complex Query Plans. Optimal scheduling of multiple operators and the division and allocation of resources in a complex plan are important issues: operators share physical resources, i.e., memory and disk bandwidth.

Earlier relational execution engines vs. today. Earlier: concurrent execution of multiple subplans in a single query was not possible; writing temporary files impeded concurrency and created the need for resource allocation. Now: engines consider more join algorithms and more complex query plans, support more concurrent users, and use parallel-processing capabilities.

For bushy plans, several issues arise: stop points (all data in temporary files and no intermediate result data in memory) can be used to switch between subplans; which subplan is initiated first; how memory is partitioned among active concurrent operators; and recursion levels should be executed level by level in recursive hybrid hash join.

Other I/O issues, e.g., disk bandwidth and disk arms for seeking in partitioning and merging, need to be considered. Scheduling bushy trees in multiprocessor systems is not entirely understood yet.

Discussion 4. When tuning your database system for performance, how much weight would you give to the common queries versus the computationally complex queries? To the queries that touch a large amount of data versus the queries that touch a small amount of data?

It's over.