Query Languages
Douglas S. Kerr
CIS 671 Query Languages

Review
    Relational Algebra
    SQL

Relational Algebra Operators
    Set operators
        Union
        Intersection
        Difference
        Cartesian product
    Relational operators
        Selection
        Projection
        Join
        Division

Select (σ), Project (π), Union (∪), Difference (-), Join: Natural (*) and Theta (⋈)
    Natural join (*)
        Q ← R1 *(list1),(list2) R2
        list i has attributes from relation R i
        list 1 names go in Q
    Theta-join (θ-join)
        θ ∈ {≤, <, =, >, ≥, ≠}

Theta (θ) join (⋈)
    Q ← R1 ⋈<join condition> R2
    where <join condition> is of the form
        <condition> AND <condition> AND ... AND <condition>
    where each condition is of the form Ai θ Bj,
        Ai is an attribute of R1,
        Bj is an attribute of R2,
        θ (theta) ∈ {<, ≤, >, ≥, =, ≠}
    e.g. For each employee, those making higher salary.
        EMP1 ⋈(EMP1.SAL < EMP2.SAL) EMP2

EMP(EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION)
DEPT(DNO, DNAME, LOC)

Exists (∃)
    π DNO (π JOB (σ DNO = 'D3' (EMP)) * EMP)
    Answer: D1 and D2
    What about department D3?
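The θ-join example above (for each employee, those making a higher salary) amounts to a Cartesian product followed by a selection on Ai θ Bj. The following is a minimal Python sketch of that reading. The function name theta_join and the three sample rows are hypothetical and do not come from the slides.

```python
import operator
from itertools import product

# Comparison operators allowed as theta in a theta-join.
THETA = {
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
    ">": operator.gt, ">=": operator.ge, "<>": operator.ne,
}

def theta_join(r1, r2, attr1, theta, attr2):
    """Return pairs (t1, t2) from r1 x r2 where t1[attr1] theta t2[attr2]."""
    compare = THETA[theta]
    return [(t1, t2) for t1, t2 in product(r1, r2)
            if compare(t1[attr1], t2[attr2])]

# Hypothetical sample data (not from the slides).
EMP = [
    {"EMPNO": 100, "SAL": 55000},
    {"EMPNO": 200, "SAL": 60000},
    {"EMPNO": 300, "SAL": 66000},
]

# EMP1 joined with EMP2 on EMP1.SAL < EMP2.SAL:
# each employee is paired with every employee who earns more.
higher = theta_join(EMP, EMP, "SAL", "<", "SAL")
pairs = [(t1["EMPNO"], t2["EMPNO"]) for t1, t2 in higher]
print(pairs)
```

Employee 300 has the top salary in this sample, so it never appears on the left side of a pair.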
Divide (÷), for all (∀)
    EMP(EMPNO, DNO, JOB, ...)
    D3JOBS ← π JOB (σ DNO = 'D3' (EMP))
    DEPT_JOBS ← π DNO, JOB (EMP)
    GOOD_DEPTS ← DEPT_JOBS ÷ D3JOBS
    As a single expression:
        (π DNO, JOB (EMP)) ÷ (π JOB (σ DNO = 'D3' (EMP)))

Functions and groups continued
    Q14. List the departments (DNO) and the average salary of each.
        DNO ℱ AVERAGE SAL (EMP)

    EMP(EMPNO, DNO, SAL, ...)
        100    D3    66,000
        200    D3    55,000
        300    D3    66,000
        400    D1    66,000
        500    D1    55,000
        600    D1    60,000
        700    D2    66,000
        800    D2    60,000
        900    D2    66,000

    Result (DNO, AVG SAL)
        D3    62,333
        D1    60,333
        D2    64,000

Outer joins
    P(P#, PNAME, CITY)          S(S#, SNAME, CITY)
        P1  Nut    London           S1  Smith  London
        P3  Screw  Rome             S5  Adams  Athens

    Q15a. For each part list all suppliers in the same city.
        P ⋈(P.CITY = S.CITY) S          natural join (inner join)
        (P#, PNAME, S#, SNAME, CITY)
            P1  Nut    S1  Smith  London

    Q15b. For each part list all suppliers in the same city.
          For a part with no supplier in the city, list null.
        P ⟕(P.CITY = S.CITY) S          left outer join
        (P#, PNAME, S#, SNAME, CITY)
            P1  Nut    S1  Smith  London
            P3  Screw  ?   ?      Rome

Outer joins continued
    Q15c. For each part list all suppliers in the same city.
          For a supplier with no part in the city, list null.
        P ⟖(P.CITY = S.CITY) S          right outer join
        (P#, PNAME, S#, SNAME, CITY)
            P1  Nut    S1  Smith  London
            ?   ?      S5  Adams  Athens

    Q15d. For each part list all suppliers in the same city.
          For a part with no supplier in the city, list null.
          For a supplier with no part in the city, list null.
        P ⟗(P.CITY = S.CITY) S          outer join
        (P#, PNAME, S#, SNAME, CITY)
            P1  Nut    S1  Smith  London
            P3  Screw  ?   ?      Rome
            ?   ?      S5  Adams  Athens

Recursive closure
    EMP(EMPNO, MGR, ...)
        100    700
        200    700
        300    700
        400    800
        500    800
        600    800
        700    900
        800    900
        900    900

    Q16. List all the superiors of EMPNO 100.
        700
        900
    Q17. List all those supervised by EMPNO 800.
        400
        500
        600
    Can't express these queries? WHY?
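One way to see why Q16 and Q17 are a problem: a relational algebra expression contains a fixed number of joins, but the MGR chain may be any number of levels deep, so the join has to be repeated until nothing new turns up. The Python sketch below does exactly that over the EMP(EMPNO, MGR) rows from the slide. The function names superiors and supervised_by are hypothetical.

```python
# EMP(EMPNO, MGR) rows from the slide; note that 900 is its own manager.
MGR = {100: 700, 200: 700, 300: 700, 400: 800, 500: 800,
       600: 800, 700: 900, 800: 900, 900: 900}

def superiors(empno):
    """Follow the MGR chain upward until no new employee is reached."""
    found = []
    current = MGR.get(empno)
    while current is not None and current not in found and current != empno:
        found.append(current)
        current = MGR.get(current)
    return found

def supervised_by(mgr):
    """Repeat the step 'employees whose MGR is in the frontier' until it adds nothing."""
    result = set()
    frontier = {mgr}
    while frontier:
        step = {e for e, m in MGR.items() if m in frontier and e != mgr}
        frontier = step - result   # keep only newly discovered employees
        result |= frontier
    return sorted(result)

print(superiors(100))       # Q16
print(supervised_by(800))   # Q17
```

The number of loop iterations depends on the data (the depth of the hierarchy), which is what a single fixed algebra expression cannot adapt to.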
SEQUEL: Structured English QUEry Language
    D. D. Chamberlin, et al., "SEQUEL 2: A Unified Approach to Data
    Definition, Manipulation, and Control," IBM J. Res. Develop.,
    Nov. 1976, pp. 560-575.
SQL: Structured Query Language

SQL - Parts of the Language
    Data Definition Language (DDL)
        create table
        create index
    Data Manipulation Language (DML)
        select (retrieve)
        update
        insert
        delete

Select - Basic Form (select from where)
    Cartesian product followed by select and project.
        select project-list
        from Cartesian-product-list
        where select-condition(s)
    Abstract example: Given tables R(A,B) and S(B,C)
        select R.A, R.B, S.C
        from R, S
        where R.A > 10
    BUT - Duplicates NOT eliminated. Bag vs. Set.

Select as a JOIN
    Cartesian product followed by select (join & select conditions) and project.
        select project-list
        from Cartesian-product-list
        where join-condition and select-condition
    Abstract example: Given tables R(A,B) and S(B,C)
        select R.A, R.B, S.C
        from R, S
        where R.B = S.B     /* join condition */
          and R.A > 10      /* select condition */

Using EMP and DEPT: From Relational Algebra to SQL
    EMP(EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION)
    DEPT(DNO, DNAME, LOC)
    List the names, employee numbers, department numbers and locations
    for all clerks.
        select NAME, EMPNO, E.DNO, LOC
        from EMP E, DEPT D
        where E.DNO = D.DNO     /* join condition */
          and JOB = 'Clerk'     /* select condition */
    Note use of alias in from clause.

Duplicates in project - must use explicit distinct
    List the different department numbers in the EMP table
    (eliminate duplicates).
        select distinct DNO
        from EMP

Specify sort order
    List employee number, name, and salary of employees in department 50.
        select EMPNO, NAME, SAL
        from EMP
        where DNO = 50
        order by EMPNO
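The "Cartesian product, then select, then project" reading of select-from-where, and the bag-versus-set point, can be tried out directly. The sketch below uses Python's sqlite3 module with an in-memory database. The tables R(A,B) and S(B,C) are the abstract ones from the slides, but the rows are hypothetical and were chosen so that the join yields a duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table R (A integer, B integer)")
conn.execute("create table S (B integer, C integer)")
# Hypothetical rows; R contains the row (20, 1) twice on purpose.
conn.executemany("insert into R values (?, ?)",
                 [(20, 1), (20, 1), (5, 1), (30, 2)])
conn.executemany("insert into S values (?, ?)",
                 [(1, 7), (2, 8), (3, 9)])

# No join condition: every R row with A > 10 pairs with every S row.
product_rows = conn.execute(
    "select R.A, R.B, S.C from R, S where R.A > 10").fetchall()

# Join condition plus select condition; duplicates are kept (bag).
join_rows = conn.execute(
    "select R.A, R.B, S.C from R, S "
    "where R.B = S.B and R.A > 10 order by R.A, S.C").fetchall()

# distinct removes the duplicates (set).
distinct_rows = conn.execute(
    "select distinct R.A, R.B, S.C from R, S "
    "where R.B = S.B and R.A > 10 order by R.A, S.C").fetchall()

print(len(product_rows))   # 3 qualifying R rows times 3 S rows
print(join_rows)
print(distinct_rows)
```

Without the join condition the result has nine rows; with it, three rows of which two are identical; with distinct, two rows.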
Union
    List the numbers of those departments which have an employee named
    Smith or are located in Columbus.
        select DNO
        from EMP
        where NAME = 'Smith'
        union
        select DNO
        from DEPT
        where LOC = 'Columbus'
    Duplicates ARE eliminated by default.
    union all - leaves duplicates

Functions and Groups
    List the departments (DNO) and the average salary of each.
        select E.DNO, avg(SAL)
        from EMP E, DEPT D
        where E.DNO = D.DNO
        group by E.DNO

SQL is (was) NOT an Algebra
    List the departments (DNO, DNAME) in which the average employee
    salary < $25,000.
        select E.DNO, DNAME
        from EMP E, DEPT D
        where E.DNO = D.DNO
        group by E.DNO, DNAME
        having avg(SAL) < 25000

Nested Select: No analog in Relational Algebra
    List names of employees in departments 25, 47 and 53.
        select NAME
        from EMP
        where DNO in (25, 47, 53)
    List names of employees who work in departments in Ann Arbor.
        select NAME
        from EMP
        where DNO in
            (select DNO
             from DEPT
             where LOC = 'Ann Arbor')

Big Summary
    For all departments in Columbus with average salary > $25,000, list
    the department's number, name, and average salary ordered by average
    salary in descending order.
        select EMP.DNO, DNAME, avg(SAL)
        from EMP, DEPT
        where EMP.DNO = DEPT.DNO
          and LOC = 'Columbus'
        group by EMP.DNO, DNAME
        having avg(SAL) > 25000
        order by 3 desc

Null Values
    None of the following conditions is ever true. Each evaluates to
    unknown, and a where clause keeps only rows whose condition is true.
        null > 25     null < 25     null = 25      null <> 25
        null >= 25    null <= 25    null = null    null <> null
    However we can use the following:
        select NAME
        from EMP
        where SAL < 35000 or SAL is null
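The null rules above are easy to check by experiment. The sketch below uses Python's sqlite3 module; the three EMP rows are hypothetical, and the helper name names is illustrative. SQLite returns Python None for a null, so the comparison null = null comes back as None rather than as a true or false value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table EMP (NAME text, SAL integer)")
# Hypothetical rows; Chen's salary is null.
conn.executemany("insert into EMP values (?, ?)",
                 [("Adams", 30000), ("Baker", 40000), ("Chen", None)])

def names(where):
    """Return the NAME column for rows that satisfy the given where clause."""
    sql = f"select NAME from EMP where {where} order by NAME"
    return [row[0] for row in conn.execute(sql)]

low = names("SAL < 35000")                      # Chen is not returned
not_low = names("not (SAL < 35000)")            # Chen is not returned here either
low_or_null = names("SAL < 35000 or SAL is null")
equals_null = names("SAL = null")               # never matches, even for Chen

# The comparison itself yields null (unknown), not true or false.
unknown = conn.execute("select null = null").fetchone()[0]

print(low, not_low, low_or_null, equals_null, unknown)
```

Chen appears in neither low nor not_low, which is the practical consequence of the unknown truth value: a row with a null can fall out of both a condition and its negation.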
Views
    In SQL
        A named, derived table (a virtual table).
        Derived from base tables and/or other views.
    In Three-schema Architecture (pp. 27-29)
        External View
        A collection of several tables, some views, other base tables.

Views: Changing Attribute Names
    Create a view called PROG consisting of the EMPNO, name and salary
    of all programmers. Include the locations of their departments.
        create view PROG(EMPNO, NAME, SALARY, HOMEBASE) as
            select EMPNO, NAME, SAL, LOC
            from EMP, DEPT
            where EMP.DNO = DEPT.DNO
              and EMP.JOB = 'Programmer'

Exists (∃) and For All (∀)
    EMP(EMPNO, NAME, DNO, JOB, MGR, SAL, COMMISSION)
    DEPT(DNO, DNAME, LOC)

Exists (∃)
    Answer: D1 and D2
        ...
        where exists
            ( ...
              and exists
                ( ... EY
                  ...
                  and EY.DNO = DEPT.DNO))
    The order of the two selects does not matter.
    Still gets D3.

For all (∀)
    However, no "for all" exists in SQL.
        ...
        where for all
            ( ...
              and exists
                ( ... EY
                  ...
                  and EY.DNO = DEPT.DNO))
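The nested exists pattern can be sketched as a complete query. The version below assumes the question is the one answered by the algebra expression π DNO (π JOB (σ DNO = 'D3' (EMP)) * EMP), that is, departments with some employee doing a job that is also done in D3. The alias EX, the simplified DEPT and EMP tables, and all rows are hypothetical; only the alias EY and the condition EY.DNO = DEPT.DNO come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of DEPT and EMP (fewer columns than the slides).
conn.execute("create table DEPT (DNO text)")
conn.execute("create table EMP (EMPNO integer, DNO text, JOB text)")
conn.executemany("insert into DEPT values (?)",
                 [("D1",), ("D2",), ("D3",), ("D4",)])
conn.executemany("insert into EMP values (?, ?, ?)", [
    (100, "D3", "Clerk"), (200, "D3", "Programmer"),
    (300, "D1", "Clerk"), (400, "D1", "Programmer"), (500, "D1", "Analyst"),
    (600, "D2", "Clerk"), (700, "D2", "Manager"),
    (800, "D4", "Manager"),
])

# There exists a D3 employee EX and an employee EY of this department
# such that EY does the same job as EX.
SOME_D3_JOB = """
    select DNO from DEPT
    where exists
        (select * from EMP EX
         where EX.DNO = 'D3'
           and exists
               (select * from EMP EY
                where EY.JOB = EX.JOB
                  and EY.DNO = DEPT.DNO))
"""

with_d3 = [r[0] for r in conn.execute(SOME_D3_JOB + " order by DNO")]
without_d3 = [r[0] for r in conn.execute(
    SOME_D3_JOB + " and DNO <> 'D3' order by DNO")]

print(with_d3)      # D3 trivially shares jobs with itself
print(without_d3)
```

With this sample data the query returns D1, D2 and D3, which illustrates the "still gets D3" remark; adding DNO <> 'D3' leaves D1 and D2.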
For all (∀) with not exists
    However, no "for all" exists in SQL.
    Use two not exists.
        ...
        where not exists
            ( ...
              and not exists
                ( ... EY
                  ...
                  and EY.DNO = DEPT.DNO))

    Eliminate D3.
        ...
        where not exists
            ( ...
              and not exists
                ( ... EY
                  ...
                  and EY.DNO = DEPT.DNO))
          and DNO <> 'D3'

SQL: for all (∀) using count Function
    First Attempt:
        select DNO
        from DEPT D
        where (select count(distinct JOB)
               from EMP
               where DNO = 'D3')
            = (select count(distinct JOB)
               from EMP EY
               where EY.DNO = D.DNO)
    D3JOBS: Jobs done by employees in department D3.
    DJOBS: Jobs done by employees in department D.

SQL: for all (∀) using count Function
    Second Attempt:
        select DNO
        from DEPT D
        where (select count(distinct JOB)
               from EMP
               where DNO = 'D3')
            = (select count(distinct EY.JOB)
               from EMP EY, EMP ED3
               where EY.DNO = D.DNO
                 and EY.JOB = ED3.JOB
                 and ED3.DNO = 'D3')
    D3JOBS: Jobs done by employees in department D3.
    DJOBS: Jobs done by employees in department D that are also done by
           employees in department D3.

Why does this approach work?
    In the second attempt, the where clause is applied to a specific
    department D.
    Works if DJOBS ⊆ D3JOBS.
    Why is this a general rule?
    Why is the rule satisfied in this case?
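The three formulations of "for all" can be compared side by side. The counting argument is that when DJOBS ⊆ D3JOBS and both sets are finite, equal counts force the sets to be equal; the first attempt counts all jobs in D, so that subset condition does not hold. The sketch below runs all three with Python's sqlite3 module. The alias EX, the simplified tables, and every row are hypothetical, and the double not exists query is one plausible completion of the pattern, not necessarily the slide's exact text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical versions of DEPT and EMP.
conn.execute("create table DEPT (DNO text)")
conn.execute("create table EMP (EMPNO integer, DNO text, JOB text)")
conn.executemany("insert into DEPT values (?)",
                 [("D1",), ("D2",), ("D3",), ("D4",)])
conn.executemany("insert into EMP values (?, ?, ?)", [
    (100, "D3", "Clerk"), (200, "D3", "Programmer"),
    (300, "D1", "Clerk"), (400, "D1", "Programmer"), (500, "D1", "Analyst"),
    (600, "D2", "Clerk"), (700, "D2", "Manager"),
    (800, "D4", "Manager"),
])

def dnos(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# No D3 job exists that is not done by some employee of this department.
double_not_exists = dnos("""
    select DNO from DEPT
    where not exists
        (select * from EMP EX
         where EX.DNO = 'D3'
           and not exists
               (select * from EMP EY
                where EY.JOB = EX.JOB and EY.DNO = DEPT.DNO))
      and DNO <> 'D3'
    order by DNO""")

# First attempt: compares against ALL jobs done in D.
first_attempt = dnos("""
    select DNO from DEPT D
    where (select count(distinct JOB) from EMP where DNO = 'D3')
        = (select count(distinct JOB) from EMP EY where EY.DNO = D.DNO)
    order by DNO""")

# Second attempt: counts only jobs in D that are also done in D3.
second_attempt = dnos("""
    select DNO from DEPT D
    where (select count(distinct JOB) from EMP where DNO = 'D3')
        = (select count(distinct EY.JOB) from EMP EY, EMP ED3
           where EY.DNO = D.DNO and EY.JOB = ED3.JOB and ED3.DNO = 'D3')
    order by DNO""")

print(double_not_exists, first_attempt, second_attempt)
```

In this sample, D2 has two jobs (Clerk, Manager) and D3 has two jobs (Clerk, Programmer), so the first attempt wrongly accepts D2 and wrongly rejects D1, which does three jobs. The second attempt returns D1 and D3, matching the double not exists query before D3 is eliminated.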