CS 338 Join, Aggregate and Group SQL Queries Bojana Bislimovska Winter 2016
Outline SQL joins Aggregate functions in SQL Grouping in SQL HAVING clause
SQL Joins Specifies a table resulting from a join operation of other tables in the FROM clause Join conditions can be specified in the WHERE clause NATURAL JOIN No join condition explicitly specified Join between attributes with the same name from the different relations It is possible to rename the attributes first so that they match Each pair of attributes is included only once in the resulting relation SELECT Fname, Lname, Address FROM (EMPLOYEE NATURAL JOIN DEPARTMENT AS DEPT(Dname, Dno, Mssn, MsDate))) WHERE Dname= Research ;
SQL Joins INNER JOIN (default type) A tuple is included in the result only if a matching tuple exists in the other relation OUTER JOIN LEFT OUTER JOIN- every tuple in the left table must appear in the result If it does not have a matching tuple, it is padded with NULL values for the attributes of the right table SELECT E.Lname AS Employee_name, S.Lname AS Supervisor_name FROM (EMPLOYEE AS E LEFT OUTER JOIN EMPLOYEE AS S ON E.Super_ssn=S.Ssn);
SQL Joins OUTER JOIN keyword OUTER may be omitted RIGHT OUTER JOIN- every tuple in the right table must appear in the result If it does not have a matching tuple, it is padded with NULL values for the attributes of the left table FULL OUTER JOIN every tuple in the right table and every tuple in the left table must appear in the result CROSS JOIN cartesian product of the two tables, returning all posible combinations of all tuples Multiway join join multiple tables in a single joined table, nesting join specifications SELECT Pnumber, Dnum, Lname, Address, Bdate FROM ((PROJECT JOIN DEPARTMENT ON Dnum=Dnumber) JOIN EMPLOYEE ON Mgr_ssn=Ssn) WHERE Plocation= Stafford ;
Aggregate Functions in SQL Major research Used to summarize information from multiple tuples into a singletuple summary Common built-in aggregate functions: COUNT, SUM, AVG, MAX, MIN Examples 1. SELECT SUM (Salary) FROM (EMPLOYEE JOIN DEPARTMENT ON Dno=Dnumber) WHERE Dname= Research 2. SELECT COUNT(*) FROM EMPLOYEE 3. SELECT COUNT(DISTINCT SALARY) FROM EMPLOYEE
Grouping in SQL Creates subgroups of tuples before aggregation Example: Find average salary of employees in each department Partition the relation into nonoverlapping tuples that have same value of some (grouping) attribute(s) Apply function to each tuple independently to produce summary information about each group GROUP BY clause specifies the grouping attributes Grouping attributes must also appear in the SELECT clause SELECT Dno, AVG(Salary) FROM EMPLOYEE GROUP BY Dno;
Grouping in SQL WHERE clause can be used SELECT Dno, COUNT(*) FROM EMPLOYEE WHERE Salary>25000 GROUP BY Dno; Results can be ordered and the aggregate column can have a name SELECT Dno, COUNT(*) AS NO_EMP FROM EMPLOYEE WHERE Salary>25000 GROUP BY Dno ORDER BY Dno
HAVING clause in SQL Provides condition on the summary information regarding the group of tuples associated with each value of the grouping attributes Only groups that satisfy the condition are retrieved as a result of the query SELECT Dname, COUNT(*) FROM DEPARTMENT, EMPLOYEE WHERE Dnumber=Dno AND Salary>40000 GROUP BY Dname HAVING COUNT(*)>5; HAVING clause cannot reference individual tuples It selects individual groups of tuples WHERE applies to individual tuples WHERE clause is executed first, HAVING clause is applied later
Exercises 1. Consider the following relations: Emp(eno, ename, title, city) Proj(pno, pname, budget, city) Works(eno, pno, resp, dur) Pay(title, salary) where the primary keys are underlined, and Emp.title is a foreign key to Pay.title, Works.eno is a foreign key to Emp.eno, and Works.pno is a foreign key to Proj.pno. Write the following SQL queries: 1) Find average salary of all employees working on the project XYZ. 2) For each city, how many projects are located in that city and what is the total budget over all projects in the city?
Exercises 2. A database for an organization that shelters animals, and people can go and adopt animals that they shelter, has the following set of relations: Animals(ID: integer, Name: string, Owner: integer, DateAdmitted: date, Type:string) Adopter(PID: integer, Name: string, Address: string) Adoption(AnimalID:integer, PID:integer, AdoptDate: date, chipno: integer) where the type of each attribute is given following the attribute (e.g., ID: integer), and the primary keys are underlined. Furthermore, (a) Animals stores information about the animals. Each is given an ID, and their names (attribute Name) together with the PID of their owners (attribute Owner) if they have been adopted, and their date of admission is recorded (so Owner is a foreign key to Adopter.PID). Type refers to the type of animal (dog, cat, etc). Note that this is a historical table, so it keeps track of all admitted animals, even if some of them have been adopted. (b) Adopter is the relation that holds information about animal adopters. The PID of the person who has adopted an animal is recorded as well as his/her Name and Address. (c) Adoption.AnimalID refers to Animals.ID and has the same type. Similarly, Adoption.PID refers to Adopter. PID and has the same type. Attribute chipno stores the number on the microchip that is implanted on the animal for tracking.
Exercises Formulate the following queries in SQL: 1) Retrieve the total number of dogs that were brought to the shelter on 18 April 2015. 2) For each animal type, list the animal type and total number of adoptions on 14 June 2015. 3) For each adopter who has made at least two adoptions, list their names and addresses.