Query Processing. Query Processing Components

Size: px
Start display at page:

Download "Query Processing. Query Processing Components"

Transcription

1 Query Processing high level user query (SQL) Query Compiler Query Processor Plan Plan Cost Generator Estimator low level data manipulation commands (execution plan) Plan Evaluator 7-1 Query Processing Components Query language that is used SQL: intergalactic dataspeak Query execution methodology The steps that one goes through in executing high-level (declarative) user queries. Query optimization How do we determine a good execution plan? 7-2

2 What are we trying to do? Consider query For each project whose budget is greater than $ and which employs more than two employees, list the names and titles of employees. In SQL SELECT Ename, Title FROM Emp, Project, Works WHERE Budget > AND Emp.Eno=Works.Eno AND Project.Pno=Works.Pno AND Project.Pno IN (SELECT w.pno FROM Works w GROUP BY w.pno HAVING SUM(*) > 2) How to execute this query? 7-3 A Possible Execution Plan 1. T 1 Scan Project table and select all tuples with Budget value > T 2 Join T 1 with the Works relation 3. T 3 Join T 2 with the Emp relation 4. T 4 Group tuples of T 3 over Pno 5. Scan tuples in each group of T 4 and for groups that have more than 2 tuples, Project over Ename, Title Note: Overly simplified we ll detail later. 7-4

3 Pictorial Representation Π Ename, Title T 1 σ Budget> Project T 2 T 4 Group by T 3 Works Emp 1. How do we get this plan? 2. How do we execute each of the nodes? Π Ename, Title (Group Pno,Eno (Emp(σ Budget> ProjectWorks))) 7-5 Query Processing Methodology SQL Queries Normalization Normalization Analysis Analysis Simplification Simplification System Catalog Restructuring Restructuring Optimization Optimization Optimal Execution Plan 7-6

4 Query Normalization Lexical and syntactic analysis check validity (similar to compilers) check for attributes and relations type checking on the qualification Put into (query normal form Conjunctive normal form (p 11 p 12 p 1n ) (p m1 p m2 p mn ) Disjunctive normal form (p 11 p 12 p 1n ) (p m1 p m2 p mn ) OR's mapped into union AND's mapped into join or selection 7-7 Analysis Refute incorrect queries Type incorrect If any of its attribute or relation names are not defined in the global schema If operations are applied to attributes of the wrong type Semantically incorrect Components do not contribute in any way to the generation of the result Only a subset of relational calculus queries can be tested for correctness Those that do not contain disjunction and negation To detect connection graph (query graph) join graph 7-8

5 Emp.Eno=Works.Eno Title = Programmer Ename Analysis Example SELECT Ename,Resp FROM Emp, Works, Project WHERE Emp.Eno = Works.Eno AND Works.Pno = Project.Pno AND Pname = CAD/CAM AND Dur > 36 AND Title = Programmer Query graph Works Resp RESULT Dur>36 Works.Pno=Project.Pno Emp.Eno=Works.Eno Pname= CAD/CAM Join graph Works Works.Pno=Project.Pno Emp Project Emp Project 7-9 Analysis If the query graph is not connected, the query may be wrong. SELECT Ename,Resp FROM Emp, Works, Project WHERE Emp.Eno = Works.Eno AND Pname = CAD/CAM AND Dur > 36 AND Title = Programmer Works Emp Resp Project Ename RESULT Pname= CAD/CAM 7-10

6 Why simplify? The simpler the query, the easier (and more efficient) it is to execute it How? Use transformation rules elimination of redundancy idempotency rules p 1 ( p 1 ) false p 1 (p 1 p 2 ) p 1 Simplification p 1 false p 1 application of transitivity use of integrity rules 7-11 Simplification Example SELECT FROM WHERE OR AND OR AND SELECT FROM WHERE Title Emp Ename = J. Doe (NOT(Title = Programmer ) (Title = Programmer Title = Elect. Eng. ) NOT(Title = Elect. Eng. )) Title Emp Ename = J. Doe 7-12

7 Convert SQL to relational algebra Make use of query trees Example Restructuring SELECT Ename FROM Emp, Works, Project WHERE Emp.Eno = Works.Eno AND Works.Pno = Project.Pno AND Ename <> J. Doe AND Pname = CAD/CAM AND (Dur = 12 OR Dur = 24) Π ENAME σ DUR=12 OR DUR=24 σ PNAME= CAD/CAM σ ENAME J. DOE PNO ENO Project Select Join Project Works Emp 7-13 How to implement operators Selection (assume R has n pages) Scan without an index O(n) Scan with index B + index O(logn) Hash index O(1) Projection Without duplicate elimination O(n) With duplicate elimination Sorting-based O(nlogn) Hash-based O(n+t) where t is the result of hashing phase 7-14

8 Join How to implement operators (cont d) Nested loop join: RS foreach tuple r R do foreach tuple s S do if r==s then add <r,s> to result O(n*m) Improvements possible by page-oriented nested loop join block-oriented nested loop join 7-15 Join How to implement operators (cont d) Index nested loop join: RS foreach tuple r R do use index on join attr. to find tuples of S foreach such tuple s S do add <r,s> to result Sort-merge join Sort R and S on the join attribute Merge the sorted relations Hash join Hash R and S using a common hash function Within each bucket, find tuples where r=s 7-16

9 Index Selection Guidelines Hash vs tree index Hash index on inner is very good for Index Nested Loops. Should be clustered if join column is not key for inner, and inner tuples need to be retrieved. Clustered B+ tree on join column(s) good for Sort- Merge Example 1 SELECT e.ename, w.dur FROM Emp e, Works w WHERE w.resp= Mgr AND e.eno=w.eno Hash index on w.resp supports Mgr selection. Hash index on w.eno allows us to get matching (inner) Emp tuples for each selected (outer) Works tuple. What if WHERE included: AND e.title=`programmer? Could retrieve Emp tuples using index on e.title, then join with Works tuples satisfying Resp selection. 7-18

10 Example 2 SELECT e.ename, w.resp FROM Emp e, Works w WHERE e.age BETWEEN 45 AND 60 AND e.title= Programmer AND e.eno=w.eno Clearly, Emp should be the outer relation. Suggests that we build a hash index on w.eno. What index should we build on Emp? B+ tree on e.age could be used, OR an index on e.title could be used. Only one of these is needed, and which is better depends upon the selectivity of the conditions. As a rule of thumb, equality selections more selective than range selections. As both examples indicate, our choice of indexes is guided by the plan(s) that we expect an optimizer to consider for a query. Have to understand optimizers! 7-19 Examples of Clustering SELECT e.title FROM Emp e WHERE e.age > 40 B+ tree index on e.age can be used to get qualifying tuples. How selective is the condition? Is the index clustered? 7-20

11 Clustering and Joins SELECT e.ename, p.pname FROM Emp e, Project p WHERE p.budget= AND e.city=p.city Clustering is especially important when accessing inner tuples in Index Nested Loop join. Should make index on e.city clustered. Suppose that the WHERE clause is instead: WHERE e.title= Programmer AND e.city=p.city If many employees are Programmers, Sort-Merge join may be worth considering. A clustered index on p.city would help. Summary: Clustering is useful whenever many tuples are to be retrieved Selecting Alternatives SELECT Ename FROM Emp e,works w WHERE e.eno = w.eno AND w.dur > 37 Strategy 1 Π ENAME (σ DUR>37 EMP.ENO=ASG.ENO (Emp Works)) Strategy 2 Π ENAME (Emp ENO (σ DUR>37 (Works))) Strategy 2 is better because It avoids Cartesian product It selects a subset of Works before joining How to determine the better alternative? 7-22

12 Query Optimization Issues Types of Optimizers Exhaustive search cost-based optimal combinatorial complexity in the number of relations Heuristics not optimal regroup common sub-expressions perform selection, projection as early as possible reorder operations to reduce intermediate relation size optimize individual operations 7-23 Query Optimization Issues Optimization Granularity Single query at a time cannot use common intermediate results Multiple queries at a time efficient if many similar queries decision space is much larger 7-24

13 Query Optimization Issues Optimization Timing Static compilation optimize prior to the execution difficult to estimate the size of the intermediate results error propagation can amortize over many executions Dynamic run time optimization exact information on the intermediate relation sizes have to reoptimize for multiple executions Hybrid compile using a static algorithm if the error in estimate sizes > threshold, reoptimize at run time 7-25 Query Optimization Issues Statistics Relation cardinality size of a tuple fraction of tuples participating in a join with another relation Attribute cardinality of domain actual number of distinct values Common assumptions independence between different attribute values uniform distribution of attribute values within their domain 7-26

14 Query Optimization Components Cost function (in terms of time) I/O cost + CPU cost These might have different weights Can also maximize throughput Solution space The set of equivalent algebra expressions (query trees). Search algorithm How do we move inside the solution space? Exhaustive search, heuristic algorithms (iterative improvement, simulated annealing, genetic, ) 7-27 Cost Calculation Cost function takes CPU and I/O processing into account Instruction and I/O path lengths Estimate the cost of executing each node of the query tree Is pipelining used or are temporary relations created? Estimate the size of the result of each node Selectivity of operations reduction factor Error propagation is possible 7-28

15 Intermediate Relation Sizes Selection size(r) = card(r) length(r) card(σ F (R)) = SF σ (F) card(r) where 1 S F σ (A = value) = card( A (R)) max(a) value S F σ (A > value) = max(a) min(a) value min(a) S F σ (A < value) = max(a) min(a) SF σ (p(a i ) p(a j )) = SF σ (p(a i )) SF σ (p(a j )) SF σ (p(a i ) p(a j )) = SF σ (p(a i )) + SF σ (p(a j )) (SF σ (p(a i )) SF σ (p(a j ))) SF σ (A value) = SF σ (A= value) card({values}) 7-29 Intermediate Relation Sizes Projection card(π A (R))=card(R) Cartesian Product Union card(r S) = card(r) card(s) upper bound: card(r S) = card(r) + card(s) lower bound: card(r S) = max{card(r), card(s)} Set Difference upper bound: card(r S) = card(r) lower bound:

16 Intermediate Relation Size Join Special case: A is a key of R and B is a foreign key of S; card(r A=B S) = card(s) More general: card(r S) = SF J card(r) card(s) 7-31 Search Space Characterized by equivalent query plans Equivalence is defined in terms of equivalent query results Equivalent plans are generated by means of algebraic transformation rules The cost of each plan may be different Focus on joins 7-32

17 Search Space Join Trees For N relations, there are O(N!) equivalent join trees that can be obtained by applying commutativity and associativity rules SELECT FROM WHERE AND Emp Ename,Resp Emp, Works, Project Project Emp.Eno=Works.Eno Works.PNO=Project.PNO Works Works Project Emp Works Project Emp 7-33 Transformation Rules Commutativity of binary operations R S S R R S S R R S S R Associativity of binary operations ( R S ) T R (S T) ( R S ) T R (S T ) Idempotence of unary operations Π A (Π A (R)) Π A (R) σ p1 (A 1 ) (σ p 2 (A 2 ) (R)) = σ p 1 (A 1 ) p 2 (A 2 ) (R) where R[A] and A' A, A" A and A' A" 7-34

18 Transformation Rules Commuting selection with projection Commuting selection with binary operations σ p(a) (R S) (σ p(a) (R)) S σ p(ai ) (R (A j,b k ) S) (σ p(a i ) (R)) (A j,b k ) S σ p(ai ) (R T) σ p(a i ) (R) σ p(a i ) (T) where A i belongs to R and T Commuting projection with binary operations Π C (R S) Π A (R) Π B (S) Π C (R (Aj,B k ) S) Π A (R) (A j,b k ) Π B (S) Π C (R S) Π C (R) Π C (S) where R[A] and S[B]; C = A' B' where A' A, B' B, A j A', B k B' 7-35 Example Consider the query: Find the names of employees other than J. Doe who worked on the CAD/CAM project for either one or two years. Π ENAME σ DUR=12 OR DUR=24 Project SELECT Ename FROM Project p, Works w, Emp e WHERE w.eno=e.eno AND w.pno=p.pno AND Ename<>`J. Doe AND p.pname=`cad/cam AND (Dur=12 OR Dur=24) σ PNAME= CAD/CAM σ ENAME J. DOE Select Join Project Works Emp 7-36

19 Equivalent Query Π Ename σ Pname=`CAD/CAM (Dur=12 Dur=24) Ename<>`J. DOE Works Project Emp 7-37 Another Equivalent Query Π Ename Π Pno,Ename Π Pno Π Pno,Eno Π Eno,Ename σ Pname = `CAD/CAM σ Dur =12 Dur=24 σ Ename <> `J. Doe Project Works Emp 7-38

20 Search Strategy How to move in the search space. Deterministic Start from base relations and build plans by adding one relation at each step Dynamic programming: breadth-first Greedy: depth-first Randomized Search for optimalities around a particular starting point Trade optimization time for execution time Better when > 5-6 relations Simulated annealing Iterative improvement 7-39 Search Algorithms Restrict the search space Use heuristics E.g., Perform unary operations before binary operations Restrict the shape of the join tree Consider only linear trees, ignore bushy ones Linear Join Tree R 4 Bushy Join Tree R 3 R 1 R 2 R 1 R 2 R 3 R

21 Deterministic Search Strategies R 4 R 3 R 3 R 1 R 2 R 1 R 2 Randomized R 1 R 2 R 3 R 2 R 1 R 2 R 1 R Summary Declarative SQL queries need to be converted into low level execution plans These plans need to be optimized to find the best plan Optimization involves Search space: identifies the alternative plans and alternative execution algorithms for algebra operators This is done by means of transformation rules Cost function: calculates the cost of executing each plan CPU and I/O costs Search algorithm: controls which alternative plans are investigated 7-42

Distributed Database Management Systems

Distributed Database Management Systems Page 1 Distributed Database Management Systems Outline Introduction Distributed DBMS Architecture Distributed Database Design Distributed Query Processing Distributed Concurrency Control Distributed Reliability

More information

Chapter 5: Overview of Query Processing

Chapter 5: Overview of Query Processing Chapter 5: Overview of Query Processing Query Processing Overview Query Optimization Distributed Query Processing Steps Acknowledgements: I am indebted to Arturas Mazeika for providing me his slides of

More information

Distributed Databases in a Nutshell

Distributed Databases in a Nutshell Distributed Databases in a Nutshell Marc Pouly Marc.Pouly@unifr.ch Department of Informatics University of Fribourg, Switzerland Priciples of Distributed Database Systems M. T. Özsu, P. Valduriez Prentice

More information

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao CMPSCI 445 Midterm Practice Questions NAME: LOGIN: Write all of your answers directly on this paper. Be sure to clearly

More information

Chapter 13: Query Processing. Basic Steps in Query Processing

Chapter 13: Query Processing. Basic Steps in Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Inside the PostgreSQL Query Optimizer

Inside the PostgreSQL Query Optimizer Inside the PostgreSQL Query Optimizer Neil Conway neilc@samurai.com Fujitsu Australia Software Technology PostgreSQL Query Optimizer Internals p. 1 Outline Introduction to query optimization Outline of

More information

Overview of Database Management

Overview of Database Management Overview of Database Management M. Tamer Özsu David R. Cheriton School of Computer Science University of Waterloo CS 348 Introduction to Database Management Fall 2012 CS 348 Overview of Database Management

More information

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs. Phases of database design Application requirements Conceptual design Database Management Systems Conceptual schema Logical design ER or UML Physical Design Relational tables Logical schema Physical design

More information

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Query optimization. DBMS Architecture. Query optimizer. Query optimizer.

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Query optimization. DBMS Architecture. Query optimizer. Query optimizer. DBMS Architecture INSTRUCTION OPTIMIZER Database Management Systems MANAGEMENT OF ACCESS METHODS BUFFER MANAGER CONCURRENCY CONTROL RELIABILITY MANAGEMENT Index Files Data Files System Catalog BASE It

More information

Evaluation of Expressions

Evaluation of Expressions Query Optimization Evaluation of Expressions Materialization: one operation at a time, materialize intermediate results for subsequent use Good for all situations Sum of costs of individual operations

More information

www.gr8ambitionz.com

www.gr8ambitionz.com Data Base Management Systems (DBMS) Study Material (Objective Type questions with Answers) Shared by Akhil Arora Powered by www. your A to Z competitive exam guide Database Objective type questions Q.1

More information

Chapter 13: Query Optimization

Chapter 13: Query Optimization Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Chapter 14: Query Optimization

Chapter 14: Query Optimization Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

CS 338 Join, Aggregate and Group SQL Queries

CS 338 Join, Aggregate and Group SQL Queries CS 338 Join, Aggregate and Group SQL Queries Bojana Bislimovska Winter 2016 Outline SQL joins Aggregate functions in SQL Grouping in SQL HAVING clause SQL Joins Specifies a table resulting from a join

More information

Relational Databases

Relational Databases Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 18 Relational data model Domain domain: predefined set of atomic values: integers, strings,... every attribute

More information

Distributed Database Design (Chapter 5)

Distributed Database Design (Chapter 5) Distributed Database Design (Chapter 5) Top-Down Approach: The database system is being designed from scratch. Issues: fragmentation & allocation Bottom-up Approach: Integration of existing databases (Chapter

More information

SQL Query Evaluation. Winter 2006-2007 Lecture 23

SQL Query Evaluation. Winter 2006-2007 Lecture 23 SQL Query Evaluation Winter 2006-2007 Lecture 23 SQL Query Processing Databases go through three steps: Parse SQL into an execution plan Optimize the execution plan Evaluate the optimized plan Execution

More information

Datenbanksysteme II: Implementation of Database Systems Implementing Joins

Datenbanksysteme II: Implementation of Database Systems Implementing Joins Datenbanksysteme II: Implementation of Database Systems Implementing Joins Material von Prof. Johann Christoph Freytag Prof. Kai-Uwe Sattler Prof. Alfons Kemper, Dr. Eickler Prof. Hector Garcia-Molina

More information

Relational Algebra. Query Languages Review. Operators. Select (σ), Project (π), Union ( ), Difference (-), Join: Natural (*) and Theta ( )

Relational Algebra. Query Languages Review. Operators. Select (σ), Project (π), Union ( ), Difference (-), Join: Natural (*) and Theta ( ) Query Languages Review Relational Algebra SQL Set operators Union Intersection Difference Cartesian product Relational Algebra Operators Relational operators Selection Projection Join Division Douglas

More information

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level) Comp 5311 Database Management Systems 16. Review 2 (Physical Level) 1 Main Topics Indexing Join Algorithms Query Processing and Optimization Transactions and Concurrency Control 2 Indexing Used for faster

More information

Distributed Database Systems. Prof. Dr. Carl-Christian Kanne

Distributed Database Systems. Prof. Dr. Carl-Christian Kanne Distributed Database Systems Prof. Dr. Carl-Christian Kanne 1 What is a Distributed Database System? A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed

More information

Guide to SQL Programming: SQL:1999 and Oracle Rdb V7.1

Guide to SQL Programming: SQL:1999 and Oracle Rdb V7.1 Guide to SQL Programming: SQL:1999 and Oracle Rdb V7.1 A feature of Oracle Rdb By Ian Smith Oracle Rdb Relational Technology Group Oracle Corporation 1 Oracle Rdb Journal SQL:1999 and Oracle Rdb V7.1 The

More information

Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr.

Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr. Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr. Dan Olteanu Submitted as part of Master of Computer Science Computing Laboratory

More information

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries

More information

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to: 14 Databases 14.1 Source: Foundations of Computer Science Cengage Learning Objectives After studying this chapter, the student should be able to: Define a database and a database management system (DBMS)

More information

C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods

C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods Overview 1 Determining Data Relationships 1 Understanding the Methods for Combining SAS Data Sets 3

More information

Advanced Oracle SQL Tuning

Advanced Oracle SQL Tuning Advanced Oracle SQL Tuning Seminar content technical details 1) Understanding Execution Plans In this part you will learn how exactly Oracle executes SQL execution plans. Instead of describing on PowerPoint

More information

Database Programming with PL/SQL: Learning Objectives

Database Programming with PL/SQL: Learning Objectives Database Programming with PL/SQL: Learning Objectives This course covers PL/SQL, a procedural language extension to SQL. Through an innovative project-based approach, students learn procedural logic constructs

More information

Information Systems SQL. Nikolaj Popov

Information Systems SQL. Nikolaj Popov Information Systems SQL Nikolaj Popov Research Institute for Symbolic Computation Johannes Kepler University of Linz, Austria popov@risc.uni-linz.ac.at Outline SQL Table Creation Populating and Modifying

More information

Oracle Database 11g: SQL Tuning Workshop

Oracle Database 11g: SQL Tuning Workshop Oracle University Contact Us: + 38516306373 Oracle Database 11g: SQL Tuning Workshop Duration: 3 Days What you will learn This Oracle Database 11g: SQL Tuning Workshop Release 2 training assists database

More information

IT2305 Database Systems I (Compulsory)

IT2305 Database Systems I (Compulsory) Database Systems I (Compulsory) INTRODUCTION This is one of the 4 modules designed for Semester 2 of Bachelor of Information Technology Degree program. CREDITS: 04 LEARNING OUTCOMES On completion of this

More information

Database Design Patterns. Winter 2006-2007 Lecture 24

Database Design Patterns. Winter 2006-2007 Lecture 24 Database Design Patterns Winter 2006-2007 Lecture 24 Trees and Hierarchies Many schemas need to represent trees or hierarchies of some sort Common way of representing trees: An adjacency list model Each

More information

Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?

Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file? Files What s it all about? Information being stored about anything important to the business/individual keeping the files. The simple concepts used in the operation of manual files are often a good guide

More information

Query Processing C H A P T E R12. Practice Exercises

Query Processing C H A P T E R12. Practice Exercises C H A P T E R12 Query Processing Practice Exercises 12.1 Assume (for simplicity in this exercise) that only one tuple fits in a block and memory holds at most 3 blocks. Show the runs created on each pass

More information

Symbol Tables. Introduction

Symbol Tables. Introduction Symbol Tables Introduction A compiler needs to collect and use information about the names appearing in the source program. This information is entered into a data structure called a symbol table. The

More information

DATABASE DESIGN - 1DL400

DATABASE DESIGN - 1DL400 DATABASE DESIGN - 1DL400 Spring 2015 A course on modern database systems!! http://www.it.uu.se/research/group/udbl/kurser/dbii_vt15/ Kjell Orsborn! Uppsala Database Laboratory! Department of Information

More information

Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division

Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division Answer Key UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division CS186 Fall 2003 Eben Haber Midterm Midterm Exam: Introduction to Database Systems This exam has

More information

Oracle 10g PL/SQL Training

Oracle 10g PL/SQL Training Oracle 10g PL/SQL Training Course Number: ORCL PS01 Length: 3 Day(s) Certification Exam This course will help you prepare for the following exams: 1Z0 042 1Z0 043 Course Overview PL/SQL is Oracle's Procedural

More information

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24 LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24 1. A database schema is a. the state of the db b. a description of the db using a

More information

Overview of Storage and Indexing

Overview of Storage and Indexing Overview of Storage and Indexing Chapter 8 How index-learning turns no student pale Yet holds the eel of science by the tail. -- Alexander Pope (1688-1744) Database Management Systems 3ed, R. Ramakrishnan

More information

CIS 631 Database Management Systems Sample Final Exam

CIS 631 Database Management Systems Sample Final Exam CIS 631 Database Management Systems Sample Final Exam 1. (25 points) Match the items from the left column with those in the right and place the letters in the empty slots. k 1. Single-level index files

More information

Live Event Count Issue

Live Event Count Issue Appendix 3 Live Event Document Version 1.0 Table of Contents 1 Introduction and High Level Summary... 3 2 Details of the Issue... 4 3 Timeline of Technical Activities... 6 4 Investigation on Count Day

More information

npsolver A SAT Based Solver for Optimization Problems

npsolver A SAT Based Solver for Optimization Problems npsolver A SAT Based Solver for Optimization Problems Norbert Manthey and Peter Steinke Knowledge Representation and Reasoning Group Technische Universität Dresden, 01062 Dresden, Germany peter@janeway.inf.tu-dresden.de

More information

IT2304: Database Systems 1 (DBS 1)

IT2304: Database Systems 1 (DBS 1) : Database Systems 1 (DBS 1) (Compulsory) 1. OUTLINE OF SYLLABUS Topic Minimum number of hours Introduction to DBMS 07 Relational Data Model 03 Data manipulation using Relational Algebra 06 Data manipulation

More information

Efficient Data Structures for Decision Diagrams

Efficient Data Structures for Decision Diagrams Artificial Intelligence Laboratory Efficient Data Structures for Decision Diagrams Master Thesis Nacereddine Ouaret Professor: Supervisors: Boi Faltings Thomas Léauté Radoslaw Szymanek Contents Introduction...

More information

Lecture 6: Query optimization, query tuning. Rasmus Pagh

Lecture 6: Query optimization, query tuning. Rasmus Pagh Lecture 6: Query optimization, query tuning Rasmus Pagh 1 Today s lecture Only one session (10-13) Query optimization: Overview of query evaluation Estimating sizes of intermediate results A typical query

More information

Effective Use of SQL in SAS Programming

Effective Use of SQL in SAS Programming INTRODUCTION Effective Use of SQL in SAS Programming Yi Zhao Merck & Co. Inc., Upper Gwynedd, Pennsylvania Structured Query Language (SQL) is a data manipulation tool of which many SAS programmers are

More information

Overview of Storage and Indexing. Data on External Storage. Alternative File Organizations. Chapter 8

Overview of Storage and Indexing. Data on External Storage. Alternative File Organizations. Chapter 8 Overview of Storage and Indexing Chapter 8 How index-learning turns no student pale Yet holds the eel of science by the tail. -- Alexander Pope (1688-1744) Database Management Systems 3ed, R. Ramakrishnan

More information

In-Memory Database: Query Optimisation. S S Kausik (110050003) Aamod Kore (110050004) Mehul Goyal (110050017) Nisheeth Lahoti (110050027)

In-Memory Database: Query Optimisation. S S Kausik (110050003) Aamod Kore (110050004) Mehul Goyal (110050017) Nisheeth Lahoti (110050027) In-Memory Database: Query Optimisation S S Kausik (110050003) Aamod Kore (110050004) Mehul Goyal (110050017) Nisheeth Lahoti (110050027) Introduction Basic Idea Database Design Data Types Indexing Query

More information

Part A: Data Definition Language (DDL) Schema and Catalog CREAT TABLE. Referential Triggered Actions. CSC 742 Database Management Systems

Part A: Data Definition Language (DDL) Schema and Catalog CREAT TABLE. Referential Triggered Actions. CSC 742 Database Management Systems CSC 74 Database Management Systems Topic #0: SQL Part A: Data Definition Language (DDL) Spring 00 CSC 74: DBMS by Dr. Peng Ning Spring 00 CSC 74: DBMS by Dr. Peng Ning Schema and Catalog Schema A collection

More information

Execution Strategies for SQL Subqueries

Execution Strategies for SQL Subqueries Execution Strategies for SQL Subqueries Mostafa Elhemali, César Galindo- Legaria, Torsten Grabs, Milind Joshi Microsoft Corp With additional slides from material in paper, added by S. Sudarshan 1 Motivation

More information

D B M G Data Base and Data Mining Group of Politecnico di Torino

D B M G Data Base and Data Mining Group of Politecnico di Torino Database Management Data Base and Data Mining Group of tania.cerquitelli@polito.it A.A. 2014-2015 Optimizer objective A SQL statement can be executed in many different ways The query optimizer determines

More information

K-Cover of Binary sequence

K-Cover of Binary sequence K-Cover of Binary sequence Prof Amit Kumar Prof Smruti Sarangi Sahil Aggarwal Swapnil Jain Given a binary sequence, represent all the 1 s in it using at most k- covers, minimizing the total length of all

More information

Understanding SQL Server Execution Plans. Klaus Aschenbrenner Independent SQL Server Consultant SQLpassion.at Twitter: @Aschenbrenner

Understanding SQL Server Execution Plans. Klaus Aschenbrenner Independent SQL Server Consultant SQLpassion.at Twitter: @Aschenbrenner Understanding SQL Server Execution Plans Klaus Aschenbrenner Independent SQL Server Consultant SQLpassion.at Twitter: @Aschenbrenner About me Independent SQL Server Consultant International Speaker, Author

More information

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Active database systems. Triggers. Triggers. Active database systems.

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Active database systems. Triggers. Triggers. Active database systems. Active database systems Database Management Systems Traditional DBMS operation is passive Queries and updates are explicitly requested by users The knowledge of processes operating on data is typically

More information

A Comparison of Database Query Languages: SQL, SPARQL, CQL, DMX

A Comparison of Database Query Languages: SQL, SPARQL, CQL, DMX ISSN: 2393-8528 Contents lists available at www.ijicse.in International Journal of Innovative Computer Science & Engineering Volume 3 Issue 2; March-April-2016; Page No. 09-13 A Comparison of Database

More information

Execution Plans: The Secret to Query Tuning Success. MagicPASS January 2015

Execution Plans: The Secret to Query Tuning Success. MagicPASS January 2015 Execution Plans: The Secret to Query Tuning Success MagicPASS January 2015 Jes Schultz Borland plan? The following steps are being taken Parsing Compiling Optimizing In the optimizing phase An execution

More information

MapReduce examples. CSE 344 section 8 worksheet. May 19, 2011

MapReduce examples. CSE 344 section 8 worksheet. May 19, 2011 MapReduce examples CSE 344 section 8 worksheet May 19, 2011 In today s section, we will be covering some more examples of using MapReduce to implement relational queries. Recall how MapReduce works from

More information

Instant SQL Programming

Instant SQL Programming Instant SQL Programming Joe Celko Wrox Press Ltd. INSTANT Table of Contents Introduction 1 What Can SQL Do for Me? 2 Who Should Use This Book? 2 How To Use This Book 3 What You Should Know 3 Conventions

More information

Distributed Aggregation in Cloud Databases. By: Aparna Tiwari tiwaria@umail.iu.edu

Distributed Aggregation in Cloud Databases. By: Aparna Tiwari tiwaria@umail.iu.edu Distributed Aggregation in Cloud Databases By: Aparna Tiwari tiwaria@umail.iu.edu ABSTRACT Data intensive applications rely heavily on aggregation functions for extraction of data according to user requirements.

More information

University of Wisconsin. yannis@cs.wisc.edu. Imagine yourself standing in front of an exquisite buæet ælled with numerous delicacies.

University of Wisconsin. yannis@cs.wisc.edu. Imagine yourself standing in front of an exquisite buæet ælled with numerous delicacies. Query Optimization Yannis E. Ioannidis æ Computer Sciences Department University of Wisconsin Madison, WI 53706 yannis@cs.wisc.edu 1 Introduction Imagine yourself standing in front of an exquisite buæet

More information

COMP 5138 Relational Database Management Systems. Week 5 : Basic SQL. Today s Agenda. Overview. Basic SQL Queries. Joins Queries

COMP 5138 Relational Database Management Systems. Week 5 : Basic SQL. Today s Agenda. Overview. Basic SQL Queries. Joins Queries COMP 5138 Relational Database Management Systems Week 5 : Basic COMP5138 "Relational Database Managment Systems" J. Davis 2006 5-1 Today s Agenda Overview Basic Queries Joins Queries Aggregate Functions

More information

Performance Tuning for the Teradata Database

Performance Tuning for the Teradata Database Performance Tuning for the Teradata Database Matthew W Froemsdorf Teradata Partner Engineering and Technical Consulting - i - Document Changes Rev. Date Section Comment 1.0 2010-10-26 All Initial document

More information

Query Optimization in a Memory-Resident Domain Relational Calculus Database System

Query Optimization in a Memory-Resident Domain Relational Calculus Database System Query Optimization in a Memory-Resident Domain Relational Calculus Database System KYU-YOUNG WHANG and RAVI KRISHNAMURTHY IBM Thomas J. Watson Research Center We present techniques for optimizing queries

More information

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML? CS2Bh: Current Technologies Introduction to XML and Relational Databases Spring 2005 Introduction to Databases CS2 Spring 2005 (LN5) 1 Why databases? Why not use XML? What is missing from XML: Consistency

More information

Data Structure [Question Bank]

Data Structure [Question Bank] Unit I (Analysis of Algorithms) 1. What are algorithms and how they are useful? 2. Describe the factor on best algorithms depends on? 3. Differentiate: Correct & Incorrect Algorithms? 4. Write short note:

More information

In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR

In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR 1 2 2 3 In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR The uniqueness of the primary key ensures that

More information

Why Query Optimization? Access Path Selection in a Relational Database Management System. How to come up with the right query plan?

Why Query Optimization? Access Path Selection in a Relational Database Management System. How to come up with the right query plan? Why Query Optimization? Access Path Selection in a Relational Database Management System P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, T. Price Peyman Talebifard Queries must be executed and execution

More information

Physical Database Design and Tuning

Physical Database Design and Tuning Chapter 20 Physical Database Design and Tuning Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley 1. Physical Database Design in Relational Databases (1) Factors that Influence

More information

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C

Krishna Institute of Engineering & Technology, Ghaziabad Department of Computer Application MCA-213 : DATA STRUCTURES USING C Tutorial#1 Q 1:- Explain the terms data, elementary item, entity, primary key, domain, attribute and information? Also give examples in support of your answer? Q 2:- What is a Data Type? Differentiate

More information

1. Physical Database Design in Relational Databases (1)

1. Physical Database Design in Relational Databases (1) Chapter 20 Physical Database Design and Tuning Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley 1. Physical Database Design in Relational Databases (1) Factors that Influence

More information

Big Data Begets Big Database Theory

Big Data Begets Big Database Theory Big Data Begets Big Database Theory Dan Suciu University of Washington 1 Motivation Industry analysts describe Big Data in terms of three V s: volume, velocity, variety. The data is too big to process

More information

Access Path Selection in a Relational Database Management System

Access Path Selection in a Relational Database Management System Access Path Selection in a Relational Database Management System P. Griffiths Selinger M. M. Astrahan D. D. Chamberlin R. A. Lorie T. G. Price IBM Research Division, San Jose, California 95193 ABSTRACT:

More information

Semantic Errors in SQL Queries: A Quite Complete List

Semantic Errors in SQL Queries: A Quite Complete List Semantic Errors in SQL Queries: A Quite Complete List Christian Goldberg, Stefan Brass Martin-Luther-Universität Halle-Wittenberg {goldberg,brass}@informatik.uni-halle.de Abstract We investigate classes

More information

University of Aarhus. Databases 2009. 2009 IBM Corporation

University of Aarhus. Databases 2009. 2009 IBM Corporation University of Aarhus Databases 2009 Kirsten Ann Larsen What is good performance? Elapsed time End-to-end In DB2 Resource consumption CPU I/O Memory Locks Elapsed time = Sync. I/O + CPU + wait time I/O

More information

Temporal Database System

Temporal Database System Temporal Database System Jaymin Patel MEng Individual Project 18 June 2003 Department of Computing, Imperial College, University of London Supervisor: Peter McBrien Second Marker: Ian Phillips Abstract

More information

SQL Server Query Tuning

SQL Server Query Tuning SQL Server Query Tuning Klaus Aschenbrenner Independent SQL Server Consultant SQLpassion.at Twitter: @Aschenbrenner About me Independent SQL Server Consultant International Speaker, Author Pro SQL Server

More information

Chapter 3: Distributed Database Design

Chapter 3: Distributed Database Design Chapter 3: Distributed Database Design Design problem Design strategies(top-down, bottom-up) Fragmentation Allocation and replication of fragments, optimality, heuristics Acknowledgements: I am indebted

More information

Evaluation of Sub Query Performance in SQL Server

Evaluation of Sub Query Performance in SQL Server EPJ Web of Conferences 68, 033 (2014 DOI: 10.1051/ epjconf/ 201468033 C Owned by the authors, published by EDP Sciences, 2014 Evaluation of Sub Query Performance in SQL Server Tanty Oktavia 1, Surya Sujarwo

More information

Unit 2.1. Data Analysis 1 - V2.0 1. Data Analysis 1. Dr Gordon Russell, Copyright @ Napier University

Unit 2.1. Data Analysis 1 - V2.0 1. Data Analysis 1. Dr Gordon Russell, Copyright @ Napier University Data Analysis 1 Unit 2.1 Data Analysis 1 - V2.0 1 Entity Relationship Modelling Overview Database Analysis Life Cycle Components of an Entity Relationship Diagram What is a relationship? Entities, attributes,

More information

HY345 Operating Systems

HY345 Operating Systems HY345 Operating Systems Recitation 2 - Memory Management Solutions Panagiotis Papadopoulos panpap@csd.uoc.gr Problem 7 Consider the following C program: int X[N]; int step = M; //M is some predefined constant

More information

Chapter 9 Joining Data from Multiple Tables. Oracle 10g: SQL

Chapter 9 Joining Data from Multiple Tables. Oracle 10g: SQL Chapter 9 Joining Data from Multiple Tables Oracle 10g: SQL Objectives Identify a Cartesian join Create an equality join using the WHERE clause Create an equality join using the JOIN keyword Create a non-equality

More information

Oracle Database 11g: SQL Tuning Workshop Release 2

Oracle Database 11g: SQL Tuning Workshop Release 2 Oracle University Contact Us: 1 800 005 453 Oracle Database 11g: SQL Tuning Workshop Release 2 Duration: 3 Days What you will learn This course assists database developers, DBAs, and SQL developers to

More information

INTEGER PROGRAMMING. Integer Programming. Prototype example. BIP model. BIP models

INTEGER PROGRAMMING. Integer Programming. Prototype example. BIP model. BIP models Integer Programming INTEGER PROGRAMMING In many problems the decision variables must have integer values. Example: assign people, machines, and vehicles to activities in integer quantities. If this is

More information

The Relational Algebra

The Relational Algebra The Relational Algebra Relational set operators: The data in relational tables are of limited value unless the data can be manipulated to generate useful information. Relational Algebra defines the theoretical

More information

ICAB4136B Use structured query language to create database structures and manipulate data

ICAB4136B Use structured query language to create database structures and manipulate data ICAB4136B Use structured query language to create database structures and manipulate data Release: 1 ICAB4136B Use structured query language to create database structures and manipulate data Modification

More information

5.1 Database Schema. 5.1.1 Schema Generation in SQL

5.1 Database Schema. 5.1.1 Schema Generation in SQL 5.1 Database Schema The database schema is the complete model of the structure of the application domain (here: relational schema): relations names of attributes domains of attributes keys additional constraints

More information

Question 1. Relational Data Model [17 marks] Question 2. SQL and Relational Algebra [31 marks]

Question 1. Relational Data Model [17 marks] Question 2. SQL and Relational Algebra [31 marks] EXAMINATIONS 2005 MID-YEAR COMP 302 Database Systems Time allowed: Instructions: 3 Hours Answer all questions. Make sure that your answers are clear and to the point. Write your answers in the spaces provided.

More information

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis

SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis SYSM 6304: Risk and Decision Analysis Lecture 5: Methods of Risk Analysis M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October 17, 2015 Outline

More information

Relational Algebra. Basic Operations Algebra of Bags

Relational Algebra. Basic Operations Algebra of Bags Relational Algebra Basic Operations Algebra of Bags 1 What is an Algebra Mathematical system consisting of: Operands --- variables or values from which new values can be constructed. Operators --- symbols

More information

DB2 for i5/os: Tuning for Performance

DB2 for i5/os: Tuning for Performance DB2 for i5/os: Tuning for Performance Jackie Jansen Senior Consulting IT Specialist jjansen@ca.ibm.com August 2007 Agenda Query Optimization Index Design Materialized Query Tables Parallel Processing Optimization

More information

SQL Query Performance Tuning: Tips and Best Practices

SQL Query Performance Tuning: Tips and Best Practices SQL Query Performance Tuning: Tips and Best Practices Pravasini Priyanka, Principal Test Engineer, Progress Software INTRODUCTION: In present day world, where dozens of complex queries are run on databases

More information

Best Practices for DB2 on z/os Performance

Best Practices for DB2 on z/os Performance Best Practices for DB2 on z/os Performance A Guideline to Achieving Best Performance with DB2 Susan Lawson and Dan Luksetich www.db2expert.com and BMC Software September 2008 www.bmc.com Contacting BMC

More information

Fallacies of the Cost Based Optimizer

Fallacies of the Cost Based Optimizer Fallacies of the Cost Based Optimizer Wolfgang Breitling breitliw@centrexcc.com Who am I Independent consultant since 1996 specializing in Oracle and Peoplesoft setup, administration, and performance tuning

More information

Protein Protein Interaction Networks

Protein Protein Interaction Networks Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics

More information

Query Optimization Approach in SQL to prepare Data Sets for Data Mining Analysis

Query Optimization Approach in SQL to prepare Data Sets for Data Mining Analysis Query Optimization Approach in SQL to prepare Data Sets for Data Mining Analysis Rajesh Reddy Muley 1, Sravani Achanta 2, Prof.S.V.Achutha Rao 3 1 pursuing M.Tech(CSE), Vikas College of Engineering and

More information

Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner

Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner 24 Horizontal Aggregations In SQL To Generate Data Sets For Data Mining Analysis In An Optimized Manner Rekha S. Nyaykhor M. Tech, Dept. Of CSE, Priyadarshini Bhagwati College of Engineering, Nagpur, India

More information

Relational Algebra. Module 3, Lecture 1. Database Management Systems, R. Ramakrishnan 1

Relational Algebra. Module 3, Lecture 1. Database Management Systems, R. Ramakrishnan 1 Relational Algebra Module 3, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Relational Query Languages Query languages: Allow manipulation and retrieval of data from a database. Relational model

More information

Evaluation of view maintenance with complex joins in a data warehouse environment (HS-IDA-MD-02-301)

Evaluation of view maintenance with complex joins in a data warehouse environment (HS-IDA-MD-02-301) Evaluation of view maintenance with complex joins in a data warehouse environment (HS-IDA-MD-02-301) Kjartan Asthorsson (kjarri@kjarri.net) Department of Computer Science Högskolan i Skövde, Box 408 SE-54128

More information

Run$me Query Op$miza$on

Run$me Query Op$miza$on Run$me Query Op$miza$on Robust Op$miza$on for Graphs 2006-2014 All Rights Reserved 1 RDF Join Order Op$miza$on Typical approach Assign es$mated cardinality to each triple pabern. Bigdata uses the fast

More information