Evaluation of Expressions

Size: px
Start display at page:

Download "Evaluation of Expressions"

Transcription

1 Query Optimization

2 Evaluation of Expressions Materialization: one operation at a time, materialize intermediate results for subsequent use Good for all situations Sum of costs of individual operations + cost of writing intermediate results to disk, may be costly Pipeline: evaluate several operations simultaneously, no need to store temporary results May not be applicable in some situations Double buffering: use two output buffers for each operation, when one is full write it to disk while the other is getting filled allowing overlap of disk writes with computation and reducing execution time CMPT 454: Database II -- Query Optimization 2

3 Pipelining Demand driven (pulling data up): the system keeps making requests for tuples from the operation at the top of the pipeline Producer driven (pushing data up): operations generate tuples eagerly into output buffers until they are full Iterators: many operations are active at once, pipelining Join using pipelining balance Iterator Hasnext() Next() σ balance<2500 ; use index 1 account CMPT 454: Database II -- Query Optimization 3

4 Query Optimization Example Query: find the names of all customers who have an account at any branch located in Brooklyn Relational expression: Pushing selection down CMPT 454: Database II -- Query Optimization 4

5 Steps in Query Optimization Input: an expression Generate expressions that are logically equivalent to the given expression Using equivalence rules Estimate the cost of each evaluation plan Using statistical information about the relations, e.g., size, indices, etc. Annotate the resultant expressions in alternative ways to generate alternative query-evaluation plans Cost-based optimization: find the query plan of the lowest cost CMPT 454: Database II -- Query Optimization 5

6 Equivalence Rules Two relational algebra expressions are said to be equivalent if on every legal database instance the two expressions generate the same set of tuples Two expressions in the multiset version of the relational algebra are said to be equivalent if on every legal database instance the two expressions generate the same multiset of tuples Order of tuples does not matter An equivalence rule says that expressions of two forms are equivalent We can replace one form by the other σ ( θ ) ( ( )) 1 θ E = σ E 2 θ σ 1 θ 2 σ ( θ σ ( )) ( ( )) 1 θ E = σ E 2 θ σ 2 θ 1 Π Π ( K( Π ( E)) K)) = Π ( ) L ( 1 L E 2 Ln L 1 σ θ (E 1 X E 2 ) = E 1 θ E 2 σ θ1 (E 1 θ2 E 2 ) = E 1 θ1 θ2 E 2 CMPT 454: Database II -- Query Optimization 6

7 More Equivalence Rules E 1 θ E 2 = E 2 θ E 1 (E 1 E 2 ) E 3 = E 1 (E 2 E 3 ) (E 1 θ1 E 2 ) θ2 θ3 E 3 = E 1 θ2 θ3 (E 2 θ2 E 3 ) σ θ0 (E 1 θ E 2 ) = (σ θ0 (E 1 )) θ E 2 σ θ1 θ2 (E 1 θ E 2 ) = (σ θ1 (E 1 )) θ (σ θ2 (E 2 )) CMPT 454: Database II -- Query Optimization 7

8 More Equivalence Rules L L ( E θ E2) = ( L ( E1)) θ ( L ( 2)) 1 L2 1 E 1 2 ( E θ E ) = 2 L (( L L L ( E1)) ( L L ( 2))) 1 L 1 E θ 2 4 E 1 E 2 = E 2 E 1 E 1 E 2 = E 2 E 1 (E 1 E 2 ) E 3 = E 1 (E 2 E 3 ) (E 1 E 2 ) E 3 = E 1 (E 2 E 3 ) σ θ (E 1 E 2 ) = σ θ (E 1 ) σ θ (E 2 ) and similarly for and in place of σ θ (E 1 for E 2 ) = σ θ (E 1 ) E 2 and similarly for in place of, but not Π L (E 1 E 2 ) = (Π L (E 1 )) (Π L (E 2 )) CMPT 454: Database II -- Query Optimization 8

9 Example Find the names of all customers with an account at a Brooklyn branch whose account balance is over $1000 Π customer_name( (σ branch_city = Brooklyn balance > 1000 (branch (account depositor))) Transformation using join associatively (Rule 6a): Π customer_name ((σ branch_city = Brooklyn balance > 1000 (branch account)) depositor) Second form provides an opportunity to apply the perform selections early rule, resulting in the subexpression σ branch_city = Brooklyn (branch) σ balance > 1000 (account) When we compute (σ branch_city = Brooklyn (branch) account ), we obtain a relation whose schema is (branch_name, branch_city, assets, account_number, balance) Push projections and eliminate unneeded attributes from intermediate results to get Π customer_name ((Π account_number ( (σ branch_city = Brooklyn (branch) account )) depositor ) Performing the projection as early as possible reduces the size of the relation to be joined CMPT 454: Database II -- Query Optimization 9

10 Example CMPT 454: Database II -- Query Optimization 10

11 Join Order Consider the expression Π customer_name ((σ branch_city = Brooklyn (branch)) (account depositor)) Could compute account depositor first, and join result with σ branch_city = Brooklyn (branch), but account depositor is likely to be a large relation Only a small fraction of the bank s customers are likely to have accounts in branches located in Brooklyn it is better to compute σ branch_city = Brooklyn (branch) account first CMPT 454: Database II -- Query Optimization 11

12 Catalog Information Stored in DBMS catalog n r : number of tuples in a relation r b r : number of blocks containing tuples of r l r : size of a tuple of r f r : blocking factor of r the number of tuples of r that fit into one block V(A, r): number of distinct values that appear in r for attribute A; same as the size of π A (r) If tuples of r are stored together physically in a file, then Maintained periodically Maintaining exact information is too expensive! Statistics information in real-world systems More information Histogram n b = f r r r Attribute age can be divided into ranges of 0-9, 10-19,, 90-99, 100+ The count of tuples in each range CMPT 454: Database II -- Query Optimization 12

13 Obtaining Statistics in SQL Server CREATE STATISTICS statistics_name ON { table view } ( column [,...n ] ) [ WITH [ [ FULLSCAN SAMPLE number { PERCENT ROWS } ] [, ] ] [ NORECOMPUTE ] ] CREATE STATISTICS names ON Customers (CompanyName, ContactName) WITH SAMPLE 5 PERCENT CREATE STATISTICS anames ON authors (au_lname, au_fname) WITH SAMPLE 50 PERCENT CMPT 454: Database II -- Query Optimization 13

14 Selection Size Estimation Simple case: σ A=v (r) Under uniform distribution: n r / V(A, r) n r : number of tuples in a relation r V(A, r): number of distinct values that appear in r for attribute A Uniform distribution often does not hold in real data sets A reasonable approximation of reality in many cases, keep our presentation relatively simple Histogram information can be used Estimation of Single Comparison σ A V (r) C = 0 if v < min(a,r) v min( A, r) C = n r max( A, r) min( A, r) min(a, r) and max(a, r): the minimum and maximum values that appear in r for attribute A, where n r is number of tuples in a relation r If statistics information is unavailable, then estimate c as n r / 2 CMPT 454: Database II -- Query Optimization 14

15 Estimation of Complex Cases Selectivity of a condition θ i The probability that a tuple in the relation r satisfies θ i If s i is the number of satisfying tuples in r, the selectivity of θ i is s i /n r Conjunction: σ θ1 θ2... θn (r): Disjunction: σ θ1 θ2... θn (r): 1 Negation: σ θ (r): n r size(σ θ (r)) n r n j= 1 s (1 n j r ) n r n j= 1 n n r s j CMPT 454: Database II -- Query Optimization 15

16 Join Size Estimation Cartesian product r x s: (n r * n s ) tuples R S = : r s is the same as r x s R S is a key for R: r s is no greater than the number of tuples in s A tuple of s will join with at most one tuple from r R Sin S is a foreign key in S referencing R: r s has the same number of tuples as s Example Consider the natural join R S U Statistics for three relations R(a,b) S(b,c) U(c,d) T(R) = 1000 V(R,b) = 20 T(S) = 2000 V(S,b) = 50 V(S,c) = 100 T(U) = 5000 V(U,c) = 500 CMPT 454: Database II -- Query Optimization 16

17 Example (Continued) Estimate for (R S) U T(R S) is T(R)T(S) / max(v(r,b), V(S,b)) T(R S) = 1000 * 2000 / 50 = Join R S with U T(R S)T(U) / max(v(r S,c), V(U,c)) V(R S,c) is the same as V(S,c) (=100) The result is, * 5000 / max(100, 500), or Could start by joining S and U first The estimate of any natural join is the same, regardless of how we order the joins CMPT 454: Database II -- Query Optimization 17

18 The Case of Non-Key Each value appears with equal probability s All tuples in r produce tuples in r s Each tuple t in r produces tuples in r s s All tuples t in s produce tuples in r s Choose the smaller one as the estimation n r ns max{ V ( A, r), V ( A, s)} n r n V ( A, s) n s V ( A, s) n r n V ( A, r) CMPT 454: Database II -- Query Optimization 18

19 Estimation for Other Operations Projection: estimated size of π A (r) = V(A,r) Aggregation: estimated size of A g F (r) = V(A,r) Outer join size of r s = size of r s + size of r size of r s = size of r s + size of r + size of s Inaccurate, only upper bounds on the sizes Set operations Unions/intersections of selections on the same relation: rewrite and use size estimate for selections σ θ1 (r) σ θ2 (r) can be rewritten as σ θ1 σ θ2 (r) Operations on different relations: Estimated size of r s = size of r + size of s Estimated size of r s = minimum size of r and size of s Estimated size of r s = r Inaccurate, upper bounds on the sizes CMPT 454: Database II -- Query Optimization 19

20 Estimation of Number of Distinct Values for Selections σ θ (r) If θ forces A to take a specified value: V(A,σ θ (r)) = 1 If θ forces A to take one of a specified set of values: V(A,σ θ (r)) = number of specified values e.g., (A = 1 V A= 3 V A = 4) If the selection condition θ is of the form A op r, estimated V(A,σ θ (r)) = V(A.r) * s, where s is the selectivity of the selection In all the other cases: use approximate estimate of min(v(a,r), n σθ (r) ) More accurate estimate is feasible using probability theory, but the above works fine generally CMPT 454: Database II -- Query Optimization 20

21 Enumeration of Equivalent Plans Use equivalence rules to systematically generate expressions equivalent to the given expression For each expression found so far, use all applicable equivalence rules, and add newly generated expressions to the set of expressions found so far Repeat until no more expressions can be found Reduce space cost: share common subexpressions CMPT 454: Database II -- Query Optimization 21

22 A Local Optimal Method Choose the best algorithm for each operator The global effect may not be optimal A merge sort may be costlier than a hash join, but enables fast algorithms for later operations, e.g., duplicate elimination, insertion, or another merge join All possible algorithms should be considered Consider all the possible plans and choose the best one in a cost-based fashion Use heuristics to choose a plan Practical system incorporates elements from both approaches CMPT 454: Database II -- Query Optimization 22

23 Cost-Based Optimization Generate a range of query-evaluation plans using the equivalence rules Choose the one with the least cost For a complex query, the number of different possible query plans can be large For r 1 r 2... R n, there are (2(n 1))!/(n 1)! different join orders! No need to generate all the join orders Use dynamic programming, the least-cost join order for any subset of {r 1, r 2,... r n } is computed only once and stored for future use CMPT 454: Database II -- Query Optimization 23

24 Find the Best Plan procedure findbestplan(s) Complexity: O(3 n ) if (bestplan[s].cost ) Space complexity: O(2 n ) return bestplan[s] // else bestplan[s] has not been computed earlier, compute it now for each non-empty proper subset S1 of S P1= findbestplan(s1) P2= findbestplan(s - S1) A = best algorithm for joining results of P1 and P2 cost = P1.cost + P2.cost + cost of A if cost < bestplan[s].cost bestplan[s].cost = cost bestplan[s].plan = execute P1.plan; execute P2.plan; join results of P1 and P2 using A return bestplan[s] CMPT 454: Database II -- Query Optimization 24

25 Heuristic Optimization Deconstruct conjunctive selections into a sequence of single selection operations Move selection operations down the query tree for the earliest possible execution Execute first those selection and join operations that will produce the smallest relations Replace Cartesian product operations that are followed by a selection condition by join operations Deconstruct and move as far down the tree as possible lists of projection attributes, creating new projections where needed Identify those subtrees whose operations can be pipelined, and execute them using pipelining CMPT 454: Database II -- Query Optimization 25

26 Left-Deep Join Order Only consider those join orders where the right operand of each join is one of the initial relations Convenient for pipelined evaluation, only one input the each join is pipelined If only left-deep join orders are considered, time complexity is O(n!) By dynamic programming, the complexity is O(n2 n ) For an n-way join, consider n sets of left-deep join orders such that each set starts with a different one of the n relations Left-deep join tree Non-left-deep join tree CMPT 454: Database II -- Query Optimization 26

Chapter 14: Query Optimization

Chapter 14: Query Optimization Chapter 14: Query Optimization Database System Concepts 5 th Ed. See www.db-book.com for conditions on re-use Chapter 14: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Chapter 13: Query Optimization

Chapter 13: Query Optimization Chapter 13: Query Optimization Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 13: Query Optimization Introduction Transformation of Relational Expressions Catalog

More information

Chapter 13: Query Processing. Basic Steps in Query Processing

Chapter 13: Query Processing. Basic Steps in Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Query Processing. Q Query Plan. Example: Select B,D From R,S Where R.A = c S.E = 2 R.C=S.C. Himanshu Gupta CSE 532 Query Proc. 1

Query Processing. Q Query Plan. Example: Select B,D From R,S Where R.A = c S.E = 2 R.C=S.C. Himanshu Gupta CSE 532 Query Proc. 1 Q Query Plan Query Processing Example: Select B,D From R,S Where R.A = c S.E = 2 R.C=S.C Himanshu Gupta CSE 532 Query Proc. 1 How do we execute query? One idea - Do Cartesian product - Select tuples -

More information

Query Processing C H A P T E R12. Practice Exercises

Query Processing C H A P T E R12. Practice Exercises C H A P T E R12 Query Processing Practice Exercises 12.1 Assume (for simplicity in this exercise) that only one tuple fits in a block and memory holds at most 3 blocks. Show the runs created on each pass

More information

SQL Query Evaluation. Winter 2006-2007 Lecture 23

SQL Query Evaluation. Winter 2006-2007 Lecture 23 SQL Query Evaluation Winter 2006-2007 Lecture 23 SQL Query Processing Databases go through three steps: Parse SQL into an execution plan Optimize the execution plan Evaluate the optimized plan Execution

More information

Simple SQL Queries (3)

Simple SQL Queries (3) Simple SQL Queries (3) What Have We Learned about SQL? Basic SELECT-FROM-WHERE structure Selecting from multiple tables String operations Set operations Aggregate functions and group-by queries Null values

More information

SUBQUERIES AND VIEWS. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 6

SUBQUERIES AND VIEWS. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 6 SUBQUERIES AND VIEWS CS121: Introduction to Relational Database Systems Fall 2015 Lecture 6 String Comparisons and GROUP BY 2! Last time, introduced many advanced features of SQL, including GROUP BY! Recall:

More information

Relational Databases

Relational Databases Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 18 Relational data model Domain domain: predefined set of atomic values: integers, strings,... every attribute

More information

SQL is capable in manipulating relational data SQL is not good for many other tasks

SQL is capable in manipulating relational data SQL is not good for many other tasks Embedded SQL SQL Is Not for All SQL is capable in manipulating relational data SQL is not good for many other tasks Control structures: loops, conditional branches, Advanced data structures: trees, arrays,

More information

Introduction to SQL (3.1-3.4)

Introduction to SQL (3.1-3.4) CSL 451 Introduction to Database Systems Introduction to SQL (3.1-3.4) Department of Computer Science and Engineering Indian Institute of Technology Ropar Narayanan (CK) Chatapuram Krishnan! Summary Parts

More information

Comp 5311 Database Management Systems. 3. Structured Query Language 1

Comp 5311 Database Management Systems. 3. Structured Query Language 1 Comp 5311 Database Management Systems 3. Structured Query Language 1 1 Aspects of SQL Most common Query Language used in all commercial systems Discussion is based on the SQL92 Standard. Commercial products

More information

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries

More information

Chapter 6: Integrity Constraints

Chapter 6: Integrity Constraints Chapter 6: Integrity Constraints Domain Constraints Referential Integrity Assertions Triggers Functional Dependencies Database Systems Concepts 6.1 Silberschatz, Korth and Sudarshan c 1997 Domain Constraints

More information

More on SQL. Juliana Freire. Some slides adapted from J. Ullman, L. Delcambre, R. Ramakrishnan, G. Lindstrom and Silberschatz, Korth and Sudarshan

More on SQL. Juliana Freire. Some slides adapted from J. Ullman, L. Delcambre, R. Ramakrishnan, G. Lindstrom and Silberschatz, Korth and Sudarshan More on SQL Some slides adapted from J. Ullman, L. Delcambre, R. Ramakrishnan, G. Lindstrom and Silberschatz, Korth and Sudarshan SELECT A1, A2,, Am FROM R1, R2,, Rn WHERE C1, C2,, Ck Interpreting a Query

More information

Lecture 6: Query optimization, query tuning. Rasmus Pagh

Lecture 6: Query optimization, query tuning. Rasmus Pagh Lecture 6: Query optimization, query tuning Rasmus Pagh 1 Today s lecture Only one session (10-13) Query optimization: Overview of query evaluation Estimating sizes of intermediate results A typical query

More information

Advance DBMS. Structured Query Language (SQL)

Advance DBMS. Structured Query Language (SQL) Structured Query Language (SQL) Introduction Commercial database systems use more user friendly language to specify the queries. SQL is the most influential commercially marketed product language. Other

More information

Chapter 5: Overview of Query Processing

Chapter 5: Overview of Query Processing Chapter 5: Overview of Query Processing Query Processing Overview Query Optimization Distributed Query Processing Steps Acknowledgements: I am indebted to Arturas Mazeika for providing me his slides of

More information

Execution Strategies for SQL Subqueries

Execution Strategies for SQL Subqueries Execution Strategies for SQL Subqueries Mostafa Elhemali, César Galindo- Legaria, Torsten Grabs, Milind Joshi Microsoft Corp With additional slides from material in paper, added by S. Sudarshan 1 Motivation

More information

The MonetDB Architecture. Martin Kersten CWI Amsterdam. M.Kersten 2008 1

The MonetDB Architecture. Martin Kersten CWI Amsterdam. M.Kersten 2008 1 The MonetDB Architecture Martin Kersten CWI Amsterdam M.Kersten 2008 1 Try to keep things simple Database Structures Execution Paradigm Query optimizer DBMS Architecture M.Kersten 2008 2 End-user application

More information

SQL DATA DEFINITION: KEY CONSTRAINTS. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 7

SQL DATA DEFINITION: KEY CONSTRAINTS. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 7 SQL DATA DEFINITION: KEY CONSTRAINTS CS121: Introduction to Relational Database Systems Fall 2015 Lecture 7 Data Definition 2 Covered most of SQL data manipulation operations Continue exploration of SQL

More information

Relational Division and SQL

Relational Division and SQL Relational Division and SQL Soulé 1 Example Relations and Queries As a motivating example, consider the following two relations: Taken(,Course) which contains the courses that each student has completed,

More information

In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR

In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR 1 2 2 3 In this session, we use the table ZZTELE with approx. 115,000 records for the examples. The primary key is defined on the columns NAME,VORNAME,STR The uniqueness of the primary key ensures that

More information

MapReduce examples. CSE 344 section 8 worksheet. May 19, 2011

MapReduce examples. CSE 344 section 8 worksheet. May 19, 2011 MapReduce examples CSE 344 section 8 worksheet May 19, 2011 In today s section, we will be covering some more examples of using MapReduce to implement relational queries. Recall how MapReduce works from

More information

Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr.

Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr. Query Optimization for Distributed Database Systems Robert Taylor Candidate Number : 933597 Hertford College Supervisor: Dr. Dan Olteanu Submitted as part of Master of Computer Science Computing Laboratory

More information

Index Selection Techniques in Data Warehouse Systems

Index Selection Techniques in Data Warehouse Systems Index Selection Techniques in Data Warehouse Systems Aliaksei Holubeu as a part of a Seminar Databases and Data Warehouses. Implementation and usage. Konstanz, June 3, 2005 2 Contents 1 DATA WAREHOUSES

More information

You can use command show database; to check if bank has been created.

You can use command show database; to check if bank has been created. Banking Example primary key is underlined) branch branch-name, branch-city, assets) customer customer-name, customer-street, customer-city) account account-number, branch-name, balance) loan loan-number,

More information

Schema Mappings and Data Exchange

Schema Mappings and Data Exchange Schema Mappings and Data Exchange Phokion G. Kolaitis University of California, Santa Cruz & IBM Research-Almaden EASSLC 2012 Southwest University August 2012 1 Logic and Databases Extensive interaction

More information

COMP 5138 Relational Database Management Systems. Week 5 : Basic SQL. Today s Agenda. Overview. Basic SQL Queries. Joins Queries

COMP 5138 Relational Database Management Systems. Week 5 : Basic SQL. Today s Agenda. Overview. Basic SQL Queries. Joins Queries COMP 5138 Relational Database Management Systems Week 5 : Basic COMP5138 "Relational Database Managment Systems" J. Davis 2006 5-1 Today s Agenda Overview Basic Queries Joins Queries Aggregate Functions

More information

3. Relational Model and Relational Algebra

3. Relational Model and Relational Algebra ECS-165A WQ 11 36 3. Relational Model and Relational Algebra Contents Fundamental Concepts of the Relational Model Integrity Constraints Translation ER schema Relational Database Schema Relational Algebra

More information

DATABASE DESIGN - 1DL400

DATABASE DESIGN - 1DL400 DATABASE DESIGN - 1DL400 Spring 2015 A course on modern database systems!! http://www.it.uu.se/research/group/udbl/kurser/dbii_vt15/ Kjell Orsborn! Uppsala Database Laboratory! Department of Information

More information

Inside the PostgreSQL Query Optimizer

Inside the PostgreSQL Query Optimizer Inside the PostgreSQL Query Optimizer Neil Conway neilc@samurai.com Fujitsu Australia Software Technology PostgreSQL Query Optimizer Internals p. 1 Outline Introduction to query optimization Outline of

More information

Relational Algebra. Basic Operations Algebra of Bags

Relational Algebra. Basic Operations Algebra of Bags Relational Algebra Basic Operations Algebra of Bags 1 What is an Algebra Mathematical system consisting of: Operands --- variables or values from which new values can be constructed. Operators --- symbols

More information

Introducing Cost Based Optimizer to Apache Hive

Introducing Cost Based Optimizer to Apache Hive Introducing Cost Based Optimizer to Apache Hive John Pullokkaran Hortonworks 10/08/2013 Abstract Apache Hadoop is a framework for the distributed processing of large data sets using clusters of computers

More information

Access Queries (Office 2003)

Access Queries (Office 2003) Access Queries (Office 2003) Technical Support Services Office of Information Technology, West Virginia University OIT Help Desk 293-4444 x 1 oit.wvu.edu/support/training/classmat/db/ Instructor: Kathy

More information

low-level storage structures e.g. partitions underpinning the warehouse logical table structures

low-level storage structures e.g. partitions underpinning the warehouse logical table structures DATA WAREHOUSE PHYSICAL DESIGN The physical design of a data warehouse specifies the: low-level storage structures e.g. partitions underpinning the warehouse logical table structures low-level structures

More information

Analysis of Algorithms I: Binary Search Trees

Analysis of Algorithms I: Binary Search Trees Analysis of Algorithms I: Binary Search Trees Xi Chen Columbia University Hash table: A data structure that maintains a subset of keys from a universe set U = {0, 1,..., p 1} and supports all three dictionary

More information

External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1

External Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing

More information

Oracle Database 11g: SQL Tuning Workshop

Oracle Database 11g: SQL Tuning Workshop Oracle University Contact Us: + 38516306373 Oracle Database 11g: SQL Tuning Workshop Duration: 3 Days What you will learn This Oracle Database 11g: SQL Tuning Workshop Release 2 training assists database

More information

Part A: Data Definition Language (DDL) Schema and Catalog CREAT TABLE. Referential Triggered Actions. CSC 742 Database Management Systems

Part A: Data Definition Language (DDL) Schema and Catalog CREAT TABLE. Referential Triggered Actions. CSC 742 Database Management Systems CSC 74 Database Management Systems Topic #0: SQL Part A: Data Definition Language (DDL) Spring 00 CSC 74: DBMS by Dr. Peng Ning Spring 00 CSC 74: DBMS by Dr. Peng Ning Schema and Catalog Schema A collection

More information

Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division

Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division Answer Key UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division CS186 Fall 2003 Eben Haber Midterm Midterm Exam: Introduction to Database Systems This exam has

More information

Chapter 4: SQL. Schema Used in Examples

Chapter 4: SQL. Schema Used in Examples Chapter 4: SQL Basic Structure Set Operations Aggregate Functions Null Values Nested Subqueries Derived Relations Views Modification of the Database Joined Relations Data Definition Language Embedded SQL,

More information

Analysis of Algorithms I: Optimal Binary Search Trees

Analysis of Algorithms I: Optimal Binary Search Trees Analysis of Algorithms I: Optimal Binary Search Trees Xi Chen Columbia University Given a set of n keys K = {k 1,..., k n } in sorted order: k 1 < k 2 < < k n we wish to build an optimal binary search

More information

SQL SELECT Query: Intermediate

SQL SELECT Query: Intermediate SQL SELECT Query: Intermediate IT 4153 Advanced Database J.G. Zheng Spring 2012 Overview SQL Select Expression Alias revisit Aggregate functions - complete Table join - complete Sub-query in where Limiting

More information

x a x 2 (1 + x 2 ) n.

x a x 2 (1 + x 2 ) n. Limits and continuity Suppose that we have a function f : R R. Let a R. We say that f(x) tends to the limit l as x tends to a; lim f(x) = l ; x a if, given any real number ɛ > 0, there exists a real number

More information

Double Integrals in Polar Coordinates

Double Integrals in Polar Coordinates Double Integrals in Polar Coordinates. A flat plate is in the shape of the region in the first quadrant ling between the circles + and +. The densit of the plate at point, is + kilograms per square meter

More information

Introduction to tuple calculus Tore Risch 2011-02-03

Introduction to tuple calculus Tore Risch 2011-02-03 Introduction to tuple calculus Tore Risch 2011-02-03 The relational data model is based on considering normalized tables as mathematical relationships. Powerful query languages can be defined over such

More information

University of Aarhus. Databases 2009. 2009 IBM Corporation

University of Aarhus. Databases 2009. 2009 IBM Corporation University of Aarhus Databases 2009 Kirsten Ann Larsen What is good performance? Elapsed time End-to-end In DB2 Resource consumption CPU I/O Memory Locks Elapsed time = Sync. I/O + CPU + wait time I/O

More information

C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods

C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods Overview 1 Determining Data Relationships 1 Understanding the Methods for Combining SAS Data Sets 3

More information

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model CS2Bh: Current Technologies Introduction to XML and Relational Databases Spring 2005 The Relational Model CS2 Spring 2005 (LN6) 1 The relational model Proposed by Codd in 1970. It is the dominant data

More information

Advanced Oracle SQL Tuning

Advanced Oracle SQL Tuning Advanced Oracle SQL Tuning Seminar content technical details 1) Understanding Execution Plans In this part you will learn how exactly Oracle executes SQL execution plans. Instead of describing on PowerPoint

More information

1Z0-117 Oracle Database 11g Release 2: SQL Tuning. Oracle

1Z0-117 Oracle Database 11g Release 2: SQL Tuning. Oracle 1Z0-117 Oracle Database 11g Release 2: SQL Tuning Oracle To purchase Full version of Practice exam click below; http://www.certshome.com/1z0-117-practice-test.html FOR Oracle 1Z0-117 Exam Candidates We

More information

SQL: Queries, Programming, Triggers

SQL: Queries, Programming, Triggers SQL: Queries, Programming, Triggers CSC343 Introduction to Databases - A. Vaisman 1 R1 Example Instances We will use these instances of the Sailors and Reserves relations in our examples. If the key for

More information

Translating SQL into the Relational Algebra

Translating SQL into the Relational Algebra Translating SQL into the Relational Algebra Jan Van den Bussche Stijn Vansummeren Required background Before reading these notes, please ensure that you are familiar with (1) the relational data model

More information

Effective Use of SQL in SAS Programming

Effective Use of SQL in SAS Programming INTRODUCTION Effective Use of SQL in SAS Programming Yi Zhao Merck & Co. Inc., Upper Gwynedd, Pennsylvania Structured Query Language (SQL) is a data manipulation tool of which many SAS programmers are

More information

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs. Phases of database design Application requirements Conceptual design Database Management Systems Conceptual schema Logical design ER or UML Physical Design Relational tables Logical schema Physical design

More information

MapReduce and the New Software Stack

MapReduce and the New Software Stack 20 Chapter 2 MapReduce and the New Software Stack Modern data-mining applications, often called big-data analysis, require us to manage immense amounts of data quickly. In many of these applications, the

More information

Load Balancing. Load Balancing 1 / 24

Load Balancing. Load Balancing 1 / 24 Load Balancing Backtracking, branch & bound and alpha-beta pruning: how to assign work to idle processes without much communication? Additionally for alpha-beta pruning: implementing the young-brothers-wait

More information

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design Chapter 6: Physical Database Design and Performance Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 23 Database

More information

Oracle Database 11g: SQL Tuning Workshop Release 2

Oracle Database 11g: SQL Tuning Workshop Release 2 Oracle University Contact Us: 1 800 005 453 Oracle Database 11g: SQL Tuning Workshop Release 2 Duration: 3 Days What you will learn This course assists database developers, DBAs, and SQL developers to

More information

Creating QBE Queries in Microsoft SQL Server

Creating QBE Queries in Microsoft SQL Server Creating QBE Queries in Microsoft SQL Server When you ask SQL Server or any other DBMS (including Access) a question about the data in a database, the question is called a query. A query is simply a question

More information

Advanced Query for Query Developers

Advanced Query for Query Developers for Developers This is a training guide to step you through the advanced functions of in NUFinancials. is an ad-hoc reporting tool that allows you to retrieve data that is stored in the NUFinancials application.

More information

5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1

5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1 5 INTEGER LINEAR PROGRAMMING (ILP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1 General Integer Linear Program: (ILP) min c T x Ax b x 0 integer Assumption: A, b integer The integrality condition

More information

www.dotnetsparkles.wordpress.com

www.dotnetsparkles.wordpress.com Database Design Considerations Designing a database requires an understanding of both the business functions you want to model and the database concepts and features used to represent those business functions.

More information

External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13

External Sorting. Why Sort? 2-Way Sort: Requires 3 Buffers. Chapter 13 External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing

More information

Oracle Database 10g: Introduction to SQL

Oracle Database 10g: Introduction to SQL Oracle University Contact Us: 1.800.529.0165 Oracle Database 10g: Introduction to SQL Duration: 5 Days What you will learn This course offers students an introduction to Oracle Database 10g database technology.

More information

Query Optimization Over Web Services Using A Mixed Approach

Query Optimization Over Web Services Using A Mixed Approach Query Optimization Over Web Services Using A Mixed Approach Debajyoti Mukhopadhyay 1, Dhaval Chandarana 1, Rutvi Dave 1, Sharyu Page 1, Shikha Gupta 1 1 Maharashtra Institute of Technology, Pune 411038

More information

Distributed Aggregation in Cloud Databases. By: Aparna Tiwari tiwaria@umail.iu.edu

Distributed Aggregation in Cloud Databases. By: Aparna Tiwari tiwaria@umail.iu.edu Distributed Aggregation in Cloud Databases By: Aparna Tiwari tiwaria@umail.iu.edu ABSTRACT Data intensive applications rely heavily on aggregation functions for extraction of data according to user requirements.

More information

Relational Database: Additional Operations on Relations; SQL

Relational Database: Additional Operations on Relations; SQL Relational Database: Additional Operations on Relations; SQL Greg Plaxton Theory in Programming Practice, Fall 2005 Department of Computer Science University of Texas at Austin Overview The course packet

More information

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level)

Comp 5311 Database Management Systems. 16. Review 2 (Physical Level) Comp 5311 Database Management Systems 16. Review 2 (Physical Level) 1 Main Topics Indexing Join Algorithms Query Processing and Optimization Transactions and Concurrency Control 2 Indexing Used for faster

More information

Boyce-Codd Normal Form

Boyce-Codd Normal Form 4NF Boyce-Codd Normal Form A relation schema R is in BCNF if for all functional dependencies in F + of the form α β at least one of the following holds α β is trivial (i.e., β α) α is a superkey for R

More information

Efficiently Identifying Inclusion Dependencies in RDBMS

Efficiently Identifying Inclusion Dependencies in RDBMS Efficiently Identifying Inclusion Dependencies in RDBMS Jana Bauckmann Department for Computer Science, Humboldt-Universität zu Berlin Rudower Chaussee 25, 12489 Berlin, Germany bauckmann@informatik.hu-berlin.de

More information

Rethinking SIMD Vectorization for In-Memory Databases

Rethinking SIMD Vectorization for In-Memory Databases SIGMOD 215, Melbourne, Victoria, Australia Rethinking SIMD Vectorization for In-Memory Databases Orestis Polychroniou Columbia University Arun Raghavan Oracle Labs Kenneth A. Ross Columbia University Latest

More information

Guide to Performance and Tuning: Query Performance and Sampled Selectivity

Guide to Performance and Tuning: Query Performance and Sampled Selectivity Guide to Performance and Tuning: Query Performance and Sampled Selectivity A feature of Oracle Rdb By Claude Proteau Oracle Rdb Relational Technology Group Oracle Corporation 1 Oracle Rdb Journal Sampled

More information

Optimizing Your Data Warehouse Design for Superior Performance

Optimizing Your Data Warehouse Design for Superior Performance Optimizing Your Data Warehouse Design for Superior Performance Lester Knutsen, President and Principal Database Consultant Advanced DataTools Corporation Session 2100A The Problem The database is too complex

More information

Oracle EXAM - 1Z0-117. Oracle Database 11g Release 2: SQL Tuning. Buy Full Product. http://www.examskey.com/1z0-117.html

Oracle EXAM - 1Z0-117. Oracle Database 11g Release 2: SQL Tuning. Buy Full Product. http://www.examskey.com/1z0-117.html Oracle EXAM - 1Z0-117 Oracle Database 11g Release 2: SQL Tuning Buy Full Product http://www.examskey.com/1z0-117.html Examskey Oracle 1Z0-117 exam demo product is here for you to test the quality of the

More information

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao

University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao University of Massachusetts Amherst Department of Computer Science Prof. Yanlei Diao CMPSCI 445 Midterm Practice Questions NAME: LOGIN: Write all of your answers directly on this paper. Be sure to clearly

More information

Introduction to database design

Introduction to database design Introduction to database design KBL chapter 5 (pages 127-187) Rasmus Pagh Some figures are borrowed from the ppt slides from the book used in the course, Database systems by Kiefer, Bernstein, Lewis Copyright

More information

Relational Algebra. Module 3, Lecture 1. Database Management Systems, R. Ramakrishnan 1

Relational Algebra. Module 3, Lecture 1. Database Management Systems, R. Ramakrishnan 1 Relational Algebra Module 3, Lecture 1 Database Management Systems, R. Ramakrishnan 1 Relational Query Languages Query languages: Allow manipulation and retrieval of data from a database. Relational model

More information

Database Design Patterns. Winter 2006-2007 Lecture 24

Database Design Patterns. Winter 2006-2007 Lecture 24 Database Design Patterns Winter 2006-2007 Lecture 24 Trees and Hierarchies Many schemas need to represent trees or hierarchies of some sort Common way of representing trees: An adjacency list model Each

More information

Efficient Computation of Multiple Group By Queries Zhimin Chen Vivek Narasayya

Efficient Computation of Multiple Group By Queries Zhimin Chen Vivek Narasayya Efficient Computation of Multiple Group By Queries Zhimin Chen Vivek Narasayya Microsoft Research {zmchen, viveknar}@microsoft.com ABSTRACT Data analysts need to understand the quality of data in the warehouse.

More information

First, here are our three tables in a fictional sales application:

First, here are our three tables in a fictional sales application: Introduction to Oracle SQL Tuning Bobby Durrett Introduction Tuning a SQL statement means making the SQL query run as fast as you need it to. In many cases this means taking a long running query and reducing

More information

Nearest Neighbor Join with Groups and Predicates

Nearest Neighbor Join with Groups and Predicates Nearest Neighbor Join with Groups and Predicates F. Cafagna, M. Boehlen, A. Bracher Department of Computer Science Agroscope University of Zürich Switzerland DOLAP, Melbourne 23.0.20 Francesco Cafagna

More information

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example Data warehousing in Oracle Materialized views and SQL extensions to analyze data in Oracle data warehouses SQL extensions for data warehouse analysis Available OLAP functions Computation windows window

More information

1 Structured Query Language: Again. 2 Joining Tables

1 Structured Query Language: Again. 2 Joining Tables 1 Structured Query Language: Again So far we ve only seen the basic features of SQL. More often than not, you can get away with just using the basic SELECT, INSERT, UPDATE, or DELETE statements. Sometimes

More information

6.830 Lecture 3 9.16.2015 PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization

6.830 Lecture 3 9.16.2015 PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization 6.830 Lecture 3 9.16.2015 PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization Animals(name,age,species,cageno,keptby,feedtime) Keeper(id,name)

More information

Performing Queries Using PROC SQL (1)

Performing Queries Using PROC SQL (1) SAS SQL Contents Performing queries using PROC SQL Performing advanced queries using PROC SQL Combining tables horizontally using PROC SQL Combining tables vertically using PROC SQL 2 Performing Queries

More information

CIS 631 Database Management Systems Sample Final Exam

CIS 631 Database Management Systems Sample Final Exam CIS 631 Database Management Systems Sample Final Exam 1. (25 points) Match the items from the left column with those in the right and place the letters in the empty slots. k 1. Single-level index files

More information

Quiz 2: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354

Quiz 2: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354 Quiz 2: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354 1. (10) Convert the below E/R diagram to a relational database schema, using each of the following approaches: (3)The straight

More information

Performance Tuning for the Teradata Database

Performance Tuning for the Teradata Database Performance Tuning for the Teradata Database Matthew W Froemsdorf Teradata Partner Engineering and Technical Consulting - i - Document Changes Rev. Date Section Comment 1.0 2010-10-26 All Initial document

More information

Temporal Database System

Temporal Database System Temporal Database System Jaymin Patel MEng Individual Project 18 June 2003 Department of Computing, Imperial College, University of London Supervisor: Peter McBrien Second Marker: Ian Phillips Abstract

More information

Lesson 8: Introduction to Databases E-R Data Modeling

Lesson 8: Introduction to Databases E-R Data Modeling Lesson 8: Introduction to Databases E-R Data Modeling Contents Introduction to Databases Abstraction, Schemas, and Views Data Models Database Management System (DBMS) Components Entity Relationship Data

More information

Unit 4.3 - Storage Structures 1. Storage Structures. Unit 4.3

Unit 4.3 - Storage Structures 1. Storage Structures. Unit 4.3 Storage Structures Unit 4.3 Unit 4.3 - Storage Structures 1 The Physical Store Storage Capacity Medium Transfer Rate Seek Time Main Memory 800 MB/s 500 MB Instant Hard Drive 10 MB/s 120 GB 10 ms CD-ROM

More information

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows TECHNISCHE UNIVERSITEIT EINDHOVEN Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows Lloyd A. Fasting May 2014 Supervisors: dr. M. Firat dr.ir. M.A.A. Boon J. van Twist MSc. Contents

More information

The Volcano Optimizer Generator: Extensibility and Efficient Search

The Volcano Optimizer Generator: Extensibility and Efficient Search The Volcano Optimizer Generator: Extensibility and Efficient Search Goetz Graefe Portland State University graefe @ cs.pdx.edu Abstract Emerging database application domains demand not only new functionality

More information

Exercises. a. Find the total number of people who owned cars that were involved in accidents

Exercises. a. Find the total number of people who owned cars that were involved in accidents 42 Chapter 4 SQL Exercises 4.1 Consider the insurance database of Figure 4.12, where the primary keys are underlined. Construct the following SQL queries for this relational database. a. Find the total

More information

Scheduling Shop Scheduling. Tim Nieberg

Scheduling Shop Scheduling. Tim Nieberg Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations

More information

Nonlinear Algebraic Equations Example

Nonlinear Algebraic Equations Example Nonlinear Algebraic Equations Example Continuous Stirred Tank Reactor (CSTR). Look for steady state concentrations & temperature. s r (in) p,i (in) i In: N spieces with concentrations c, heat capacities

More information

We know how to query a database using SQL. A set of tables and their schemas are given Data are properly loaded

We know how to query a database using SQL. A set of tables and their schemas are given Data are properly loaded E-R Diagram Database Development We know how to query a database using SQL A set of tables and their schemas are given Data are properly loaded But, how can we develop appropriate tables and their schema

More information

DBMS / Business Intelligence, SQL Server

DBMS / Business Intelligence, SQL Server DBMS / Business Intelligence, SQL Server Orsys, with 30 years of experience, is providing high quality, independant State of the Art seminars and hands-on courses corresponding to the needs of IT professionals.

More information