QUERY OPTIMIZATION. CS 564- Fall ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan

Similar documents
University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Query optimization. DBMS Architecture. Query optimizer. Query optimizer.

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.

Chapter 13: Query Processing. Basic Steps in Query Processing

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

CS 525 Advanced Database Organization - Spring 2013 Mon + Wed 3:15-4:30 PM, Room: Wishnick Hall 113

Computer Security: Principles and Practice

Ch.5 Database Security. Ch.5 Database Security Review

Access Path Selection in a Relational Database Management System

There are five fields or columns, with names and types as shown above.

From Databases to Natural Language: The Unusual Direction

SQL Query Evaluation. Winter Lecture 23

Why Query Optimization? Access Path Selection in a Relational Database Management System. How to come up with the right query plan?

Evaluation of Expressions

Lecture 6: Query optimization, query tuning. Rasmus Pagh

Inside the PostgreSQL Query Optimizer

Advanced Oracle SQL Tuning

Chapter 5: Overview of Query Processing

Overview of Storage and Indexing

Relational Algebra. Query Languages Review. Operators. Select (σ), Project (π), Union ( ), Difference (-), Join: Natural (*) and Theta ( )

Assignment 3: SQL Queries. Questions 1. The following relations keep track of airline flight information:

Overview of Storage and Indexing. Data on External Storage. Alternative File Organizations. Chapter 8

Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : Hertford College Supervisor: Dr.

Chapter 14: Query Optimization

System R: Relational Approach to Database Management

B2.2-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

Chapter 13: Query Optimization

Run$me Query Op$miza$on

SQL: QUERIES, CONSTRAINTS, TRIGGERS

Query Processing C H A P T E R12. Practice Exercises

Introduction to tuple calculus Tore Risch

Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division

The SQL Query Language. Creating Relations in SQL. Referential Integrity in SQL. Basic SQL Query. Primary and Candidate Keys in SQL

University of Aarhus. Databases IBM Corporation

Oracle Database 11g: SQL Tuning Workshop

ASTERIX: An Open Source System for Big Data Management and Analysis (Demo) :: Presenter :: Yassmeen Abu Hasson

Access Path Selection in a Relational Database Management System. P. Griffiths Selinger M. M. Astrahan D. D. Chamberlin,: It. A. Lorie.

Reducing the Braking Distance of an SQL Query Engine

Access Path Selection in a Relational Database Management System. P. Griffiths Selinger M. M. Astrahan D. D. Chamberlin,: It. A. Lorie.

Pattern Insight Clone Detection

Query Optimization in a Memory-Resident Domain Relational Calculus Database System

Scheme G. Sample Test Paper-I

MapReduce examples. CSE 344 section 8 worksheet. May 19, 2011

Data Models and Database Management Systems (DBMSs) Dr. Philip Cannata

Object-Relational Query Processing

Aggregating Data Using Group Functions

University of Wisconsin. Imagine yourself standing in front of an exquisite buæet ælled with numerous delicacies.

Fallacies of the Cost Based Optimizer

Distributed Databases in a Nutshell

SQL Tuning Proven Methodologies

IBM DB2: LUW Performance Tuning and Monitoring for Single and Multiple Partition DBs

DB2 for Linux, UNIX, and Windows Performance Tuning and Monitoring Workshop

Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel

Parquet. Columnar storage for the people

SQL Server Query Tuning

Part A: Data Definition Language (DDL) Schema and Catalog CREAT TABLE. Referential Triggered Actions. CSC 742 Database Management Systems

Physical Database Design and Tuning

CSE 562 Database Systems

1. Physical Database Design in Relational Databases (1)

APP INVENTOR. Test Review

Query Processing. Q Query Plan. Example: Select B,D From R,S Where R.A = c S.E = 2 R.C=S.C. Himanshu Gupta CSE 532 Query Proc. 1

Index Selection Techniques in Data Warehouse Systems

Visual Data Mining. Motivation. Why Visual Data Mining. Integration of visualization and data mining : Chidroop Madhavarapu CSE 591:Visual Analytics

Relational Algebra 2. Chapter 5.2 V3.0. Napier University Dr Gordon Russell

CS54100: Database Systems

Adaptive Virtual Partitioning for OLAP Query Processing in a Database Cluster

Oracle Database 11g: SQL Tuning Workshop Release 2

Efficient Processing of Joins on Set-valued Attributes

Storage in Database Systems. CMPSCI 445 Fall 2010

Distributed Database Design (Chapter 5)

Query Optimization Over Web Services Using A Mixed Approach

Technical Challenges for Big Health Care Data. Donald Kossmann Systems Group Department of Computer Science ETH Zurich

First, here are our three tables in a fictional sales application:

2 Associating Facts with Time

Data Management in the Cloud

Rethinking SIMD Vectorization for In-Memory Databases

DB2 LUW Performance Tuning and Monitoring for Single and Multiple Partition DBs

In-Memory Database: Query Optimisation. S S Kausik ( ) Aamod Kore ( ) Mehul Goyal ( ) Nisheeth Lahoti ( )

Introducing Cost Based Optimizer to Apache Hive

Topics in basic DBMS course

Query Performance Tuning: Start to Finish. Grant Fritchey

DBMS / Business Intelligence, SQL Server

Graham Kemp (telephone , room 6475 EDIT) The examiner will visit the exam room at 15:00 and 17:00.

Database Internals (Overview)

Recovering Business Rules from Legacy Source Code for System Modernization

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic

CS 564: DATABASE MANAGEMENT SYSTEMS

Preparing Data Sets for the Data Mining Analysis using the Most Efficient Horizontal Aggregation Method in SQL

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1

DATABASE DESIGN - 1DL400

Inside the SQL Server Query Optimizer

VIEWS virtual relation data duplication consistency problems

Search on Encrypted Data

Query Optimization Approach in SQL to prepare Data Sets for Data Mining Analysis

Bitmap Index an Efficient Approach to Improve Performance of Data Warehouse Queries

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Active database systems. Triggers. Triggers. Active database systems.

Introduction to Parallel Programming and MapReduce

Principles of Database Management Systems. Overview. Principles of Data Layout. Topic for today. "Executive Summary": here.

Web Database Integration

Transcription:

QUERY OPTIMIZATION CS 564- Fall 2015 ACKs: Jeff Naughton, Jignesh Patel, AnHai Doan

EXAMPLE QUERY EMP(ssn, ename, addr, sal, did) 10000 tuples, 1000 pages DEPT(did, dname, floor, mgr) 500 tuples, 50 pages SELECT DISTINCT ename FROM Emp E, Dept D WHERE E.did = D.did AND D.dname = Toy ; 2

EVALUATION PLAN (1) SELECT DISTINCT ename FROM Emp E, Dept D WHERE E.did = D.did AND D.dname = Toy ; π ename σ dname = Toy σ EMP.did = DEPT.did X EMP DEPT 3

EVALUATION PLAN (2) SELECT DISTINCT ename FROM Emp E, Dept D WHERE E.did = D.did AND D.dname = Toy ; π ename σ dname = Toy Nested Loop Join EMP DEPT 4

EVALUATION PLAN (3) SELECT DISTINCT ename FROM Emp E, Dept D WHERE E.did = D.did AND D.dname = Toy ; π ename σ dname = Toy buffer size B= 50 Sort Merge Join EMP DEPT 5

PIPELINED EVALUATION Instead of materializing the temporary relation to disk, we can instead pipeline to the next operator How much cost does it save? When can pipelining be used? 6

EVALUATION PLAN (4) SELECT DISTINCT ename FROM Emp E, Dept D WHERE E.did = D.did AND D.dname = Toy ; π ename Sort Merge Join buffer size B= 50 index on dname σ dname = Toy DEPT EMP 7

QUERY OPTIMIZATION identify candidate equivalent trees for each candidate find best annotated version (using available indexes) choose the best overall by estimating the cost of each plan In practice we choose from a subset of possible plans 8

ARCHITECTURE OF AN OPTIMIZER query Query Parser parsed query Query Optimizer Plan generator Plan cost estimator System Catalog evaluation plan 9

QUERY OPTIMIZATION Plan: Annotated RA Tree operator interface: open/getnext/close pipelined or materialized Two main issues: What plans are considered? How is the cost of a plan estimated? Ideally: best plan! Practically: avoid worst plans! Look at a subset of all plans 10

COST ESTIMATION Estimate the cost of each operation in the plan depends on input cardinalities algorithm cost (we know this!) Estimate the size of result statistics about input relations for selections and joins, we assume independence of predicates 11

COST ESTIMATION Statistics stored in the catalogs cardinality size in pages # distinct keys (for index) range (for numeric values) Catalogs update periodically Commercial systems use histograms, which give more accurate estimates 12

EVALUATION PLANS There is a huge space of plans to navigate through Relational algebra equivalences help to construct many alternative plans 13

EQUIVALENCE (1) Commutativity of σ σ "# (σ "& (R)) σ "& (σ "# (R)) Cascading of σ σ "# " & ", (R) σ "# (σ "& ( σ ", (R))) Cascading of π π /# (R) π /# (π /& ( π /, (R) )) when a 1 a 134 This means that we can evaluate selections in any order! 14

EQUIVALENCE (2) Commutativity of join R S S R Associativity of join R S T R (S T) This means that we can reorder the computation of joins in any way! 15

EQUIVALENCE (3) Selections + Projections σ 8 (π 9 (R)) π 9 (σ " (R)) (if the selection involves attributes that remain after projection) Selections + Joins σ 8 R S σ 8 (R) S (if the selection involves attributes only in S) This means that we can push selections down the plan tree! 16

EVALUATION PLANS Single relation plan (no joins) file scan index scan(s): clustered or non- clustered More than one index may match predicates Choose the one with the least estimated cost Merge/pipeline selection and projection (and aggregate) Index aggregate evaluation 17

EVALUATION PLANS Multiple relation plan selections can be combined into joins joins can be reordered selections and projections can be pushed down the plan tree 18

JOIN REORDERING Consider the following join: R S T U Left- deep join plans Allow for fully pipelined evaluation U T R S 19