The SQL Query Language. Creating Relations in SQL. Referential Integrity in SQL. Basic SQL Query. Primary and Candidate Keys in SQL



Similar documents
SQL: Queries, Programming, Triggers

Example Instances. SQL: Queries, Programming, Triggers. Conceptual Evaluation Strategy. Basic SQL Query. A Note on Range Variables

Chapter 5. SQL: Queries, Constraints, Triggers

SQL: Queries, Programming, Triggers

Relational Algebra and SQL

Boats bid bname color 101 Interlake blue 102 Interlake red 103 Clipper green 104 Marine red. Figure 1: Instances of Sailors, Boats and Reserves

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

The Relational Model. Why Study the Relational Model? Relational Database: Definitions. Chapter 3

Lecture 6. SQL, Logical DB Design

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

UNIT 6. Structured Query Language (SQL) Text: Chapter 5

SQL NULL s, Constraints, Triggers

The Relational Model. Why Study the Relational Model?

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model

More on SQL. Juliana Freire. Some slides adapted from J. Ullman, L. Delcambre, R. Ramakrishnan, G. Lindstrom and Silberschatz, Korth and Sudarshan

COMP 5138 Relational Database Management Systems. Week 5 : Basic SQL. Today s Agenda. Overview. Basic SQL Queries. Joins Queries

Database Management Systems. Chapter 1

Relational Databases

Part A: Data Definition Language (DDL) Schema and Catalog CREAT TABLE. Referential Triggered Actions. CSC 742 Database Management Systems

The Relational Data Model and Relational Database Constraints

Outline. Data Modeling. Conceptual Design. ER Model Basics: Entities. ER Model Basics: Relationships. Ternary Relationships. Yanlei Diao UMass Amherst

Relational model. Relational model - practice. Relational Database Definitions 9/27/11. Relational model. Relational Database: Terminology

Instant SQL Programming

SQL DATA DEFINITION: KEY CONSTRAINTS. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 7

Review Entity-Relationship Diagrams and the Relational Model. Data Models. Review. Why Study the Relational Model? Steps in Database Design

Comp 5311 Database Management Systems. 3. Structured Query Language 1

Relational Algebra. Module 3, Lecture 1. Database Management Systems, R. Ramakrishnan 1

EECS 647: Introduction to Database Systems

There are five fields or columns, with names and types as shown above.

4. SQL. Contents. Example Database. CUSTOMERS(FName, LName, CAddress, Account) PRODUCTS(Prodname, Category) SUPPLIERS(SName, SAddress, Chain)

Data Modeling. Database Systems: The Complete Book Ch ,

Oracle SQL. Course Summary. Duration. Objectives

Introduction to Microsoft Jet SQL

Query-by-Example (QBE)

Relational Database: Additional Operations on Relations; SQL

IT2305 Database Systems I (Compulsory)

Databases 2011 The Relational Model and SQL

IT2304: Database Systems 1 (DBS 1)

Lecture 4: More SQL and Relational Algebra

3. Relational Model and Relational Algebra

Introduction to database design

CSC 443 Data Base Management Systems. Basic SQL

How To Create A Table In Sql (Ahem)

Structured Query Language (SQL)

Database Design Overview. Conceptual Design ER Model. Entities and Entity Sets. Entity Set Representation. Keys

Θεµελίωση Βάσεων εδοµένων

SUBQUERIES AND VIEWS. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 6

Information Systems SQL. Nikolaj Popov

Advance DBMS. Structured Query Language (SQL)

Oracle Database: SQL and PL/SQL Fundamentals

ICAB4136B Use structured query language to create database structures and manipulate data

Intermediate SQL C H A P T E R4. Practice Exercises. 4.1 Write the following queries in SQL:

History of SQL. Relational Database Languages. Tuple relational calculus ALPHA (Codd, 1970s) QUEL (based on ALPHA) Datalog (rule-based, like PROLOG)

Oracle Database: SQL and PL/SQL Fundamentals NEW

SQL: QUERIES, CONSTRAINTS, TRIGGERS

Chapter 6: Integrity Constraints

Chapter 8. SQL-99: SchemaDefinition, Constraints, and Queries and Views

The Entity-Relationship Model

CS143 Notes: Views & Authorization

Duration Vendor Audience 5 Days Oracle End Users, Developers, Technical Consultants and Support Staff

Oracle Database: SQL and PL/SQL Fundamentals

Review: Participation Constraints

Programming with SQL

SQL Simple Queries. Chapter 3.1 V3.0. Napier University Dr Gordon Russell

How To Manage Data In A Database System

Week 4 & 5: SQL. SQL as a Query Language

CSI 2132 Lab 3. Outline 09/02/2012. More on SQL. Destroying and Altering Relations. Exercise: DROP TABLE ALTER TABLE SELECT

A Comparison of Database Query Languages: SQL, SPARQL, CQL, DMX

BCA. Database Management System

2874CD1EssentialSQL.qxd 6/25/01 3:06 PM Page 1 Essential SQL Copyright 2001 SYBEX, Inc., Alameda, CA

Microsoft Access 3: Understanding and Creating Queries

Financial Data Access with SQL, Excel & VBA

MOC 20461C: Querying Microsoft SQL Server. Course Overview

3.GETTING STARTED WITH ORACLE8i

SQL SELECT Query: Intermediate

Introduction to SQL: Data Retrieving

unisys OS 2200 Relational Data Management System (RDMS 2200) and IPF SQL Interface End Use Guide imagine it. done. ClearPath OS

DataBase Management Systems Lecture Notes

Oracle Database 10g: Introduction to SQL

Databasesystemer, forår 2005 IT Universitetet i København. Forelæsning 3: Business rules, constraints & triggers. 3. marts 2005

Data Structure: Relational Model. Programming Interface: JDBC/ODBC. SQL Queries: The Basic From

Introduction to Querying & Reporting with SQL Server

Oracle Database 12c: Introduction to SQL Ed 1.1

Chapter 14: Query Optimization

Simple SQL Queries (3)

SQL - QUICK GUIDE. Allows users to access data in relational database management systems.

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Active database systems. Triggers. Triggers. Active database systems.

Define terms Write single and multiple table SQL queries Write noncorrelated and correlated subqueries Define and use three types of joins

T-SQL STANDARD ELEMENTS

Translating SQL into the Relational Algebra

DBMS / Business Intelligence, SQL Server

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.

Guide to SQL Programming: SQL:1999 and Oracle Rdb V7.1

Chapter 13: Query Optimization

Netezza SQL Class Outline

Advanced SQL. Lecture 3. Outline. Unions, intersections, differences Subqueries, Aggregations, NULLs Modifying databases, Indexes, Views

1. SELECT DISTINCT f.fname FROM Faculty f, Class c WHERE f.fid = c.fid AND c.room = LWSN1106

Introduction to tuple calculus Tore Risch

CS 338 Join, Aggregate and Group SQL Queries

Transcription:

COS 597A: Principles of Database and Information Systems SQL: Overview and highlights The SQL Query Language Structured Query Language Developed by IBM (system R) in the 1970s Need for a standard since it is used by many vendors Standards: SQL-86 SQL-92 (major revision) SQL-99 (major extensions) SQL 2003 (XML SQL) continue enhancements Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 1 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 2 Creating Relations in SQL CREATE TABLE Movie ( name CHAR(30), producer CHAR(30), rel_date CHAR(8), rating CHAR, PRIMARY KEY (name, producer, rel_date) ) CREATE TABLE Employee (SS# CHAR(9), name CHAR(30), addr CHAR(50), startyr INT, PRIMARY KEY (SS#)) CREATE TABLE Assignment (position CHAR(20), SS# CHAR(9), manager SS# CHAR(9), PRIMARY KEY (position), FOREIGN KEY(SS# REFERENCES Employee), FOREIGN KEY (managerss# REFERENCES Employee) ) Observe that the type (domain) of each attribute is specified, and enforced by the DBMS whenever tuples are added or modified. Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 3 Referential Integrity in SQL SQL-92 on support all 4 options on deletes and updates. Default is NO ACTION (delete/update is rejected) CASCADE (also delete all tuples that refer to deleted tuple) SET NULL / SET DEFAULT (sets foreign key value of referencing tuple) CREATE TABLE Acct (bname CHAR(20) DEFAULT main, acctn CHAR(20), bal REAL, PRIMARY KEY ( acctn), FOREIGN KEY (bname) REFERENCES Branch ON DELETE SET DEFAULT ) BUT individual implementations may NOT support Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 4 Primary and Candidate Keys in SQL Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key. There at most one book with a given title and edition date, publisher and isbn are determined Used carelessly, an IC can prevent the storage of database instances that arise in practice! Title and ed suffice? UNIQUE (title, ed, pub)? CREATE TABLE Book (isbn CHAR(10) title CHAR(100), ed INTEGER, pub CHAR(30), date INTEGER, PRIMARY KEY (isbn), UNIQUE (title, ed )) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 5 Basic SQL Query SELECT WHERE [DISTINCT] select-list from-list qualification from-list A list of relation names (possibly with a tuple-variable after each name). select-list A list of attributes of relations in fromlist qualification Comparisons (Attr op const or Attr1 op Attr2, where op is one of <, >,=,,, ) combined using AND, OR and NOT. DISTINCT is an optional keyword indicating that the answer should not contain duplicates. Default is that duplicates are not eliminated! Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 6

Conceptual Evaluation Strategy Semantics of an SQL query defined in terms of the following conceptual evaluation strategy: Compute the cross-product of from-list. Discard resulting tuples if they fail qualifications. Delete attributes that are not in select-list. If DISTINCT is specified, eliminate duplicate rows. This strategy is probably the least efficient way to compute a query! An optimizer will find more efficient strategies to compute the same answers. Example Instances instance of Branch We will use these instances of the Acct and Branch relations in our examples. bname bcity assets pu Pton 10 nyu nyc 20 time sq nyc 30 bname acctn bal instance of pu 33 356 Acct nyu 45 500 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 7 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 8 Example of Conceptual Evaluation SELECT acctn Branch, Acct WHERE Branch.bname=Acct.bname AND assets<20 bname bcity assets bname acctn bal pu Pton 10 pu 33 356 pu Pton 10 nyu 45 500 nyu nyc 20 pu 33 356 nyu nyc 20 nyu 45 500 time sq nyc 30 pu 33 356 time sq nyc 30 nyu 45 500 Expressions and Strings SELECT name, age=2008-yrofbirth Alumni WHERE dept LIKE C%S Illustrates use of arithmetic expressions and string pattern matching: Find pairs (Alumnus(a) name and age defined by year of birth) for alums whose dept. begins with C and ends with S. LIKE is used for string matching. `_ stands for any one character and `% stands for 0 or more arbitrary characters. Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 9 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 10 Tuple Variables Refer to tuples from a relation Really needed only if the same relation appears twice in the clause. : SELECT acctn Branch, Acct WHERE Branch.bname=Acct.bname AND assets<20 OR OR SELECT R.acctn Branch S, Acct R WHERE S.bname=R.bname AND assets<20 SELECT R.acctn Branch as S, Acct as R WHERE S.bname=R.bname AND assets<20 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 11 CREATE TABLE Acct (bname CHAR(20), acctn CHAR(20), bal REAL, PRIMARY KEY ( acctn), FOREIGN KEY (bname REFERENCES Branch ) CREATE TABLE Branch CREATE TABLE Cust (bname CHAR(20), (name CHAR(20), bcity CHAR(30), street CHAR(30), assets REAL, city CHAR(30), PRIMARY KEY (bname) ) PRIMARY KEY (name) ) CREATE TABLE Owner (name CHAR(20), acctn CHAR(20), FOREIGN KEY (name REFERENCES Cust ) FOREIGN KEY (acctn REFERENCES Acct ) ) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 12

Nested Queries Find names of all branches with accts of cust. who live in Rome SELECT A.bname Acct A WHERE A.acctn IN (SELECT D.acctn Owner D, Cust C WHERE D.name = C.name AND C.city= Rome ) A very powerful feature of SQL: a WHERE clause can itself contain an SQL query! (Actually, so can and HAVING clauses.) What get if use NOT IN? To understand semantics of nested queries, think of a nested loops evaluation: For each Acct tuple, check the qualification by computing the subquery. Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 13 Nested Queries with Correlation Find acct no.s whose owners own at least one acct with a balance over 1000 SELECT D.acctn Owner D WHERE EXISTS (SELECT * Owner E, Acct R WHERE R.bal>1000 AND R.acctn=E.acctn AND E.name=D.name) EXISTS is another set comparison operator, like IN. If UNIQUE is used, and * is replaced by E.name, finds acct no.s whose owners own no more than one acct with a balance over 1000. (UNIQUE checks for duplicate tuples; * denotes all attributes. Why do we have to replace * by E.name?) Illustrates why, in general, subquery must be re-computed for each Branch tuple. Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 14 More on Set-Comparison Operators We ve already seen IN, EXISTS and UNIQUE. Can also use NOT IN, NOT EXISTS and NOT UNIQUE. Also available: op ANY, op ALL, op from >, <, =,!,", # Find names of branches with assets at least as large as the assets of some NYC branch: SELECT B.bname Branch B WHERE B.assets ANY (SELECT Q.assets Branch Q WHERE Q.bcity= NYC ) Includes NYC branches? note: key word SOME is interchangable with ANY - ANY easily confused with ALL Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 15 Division in SQL Find tournament winners who have won all tournaments. SELECT R.wname Winners R WHERE NOT EXISTS ((SELECT S.tourn Winners S) EXCEPT (SELECT T.tourn Winners T WHERE T.wname=R.wname)) CREATE TABLE Winners (wname CHAR((30), tourn CHAR(30), year INTEGER) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 16 Division in SQL simple template Schemas WholeRelation: (r 1, r 2,, r m, q 1, q 2, q n ) DivisorRelation: (q 1, q 2, q n ) WholeRelation DivisorRelation: (r 1, r 2,, r m ) SELECT R.r 1, R.r 2,, R.r m WholeRelation R WHERE NOT EXISTS ( (SELECT * DivisorRelation Q ) EXCEPT (SELECT T.q 1, T.q 2, T.q n WholeRelation T WHERE R.r 1 = T.r 1 R.r 2 = T.r 2 R.r m = T.r m ) ) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 17 Division in SQL general template SELECT WHERE NOT EXISTS ((SELECT WHERE ) EXCEPT (SELECT WHERE ) can do projections and other predicates within nested selects Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 18

Aggregate Operators Significant extension of relational algebra. Example: Find name and city of the poorest branch The first query is illegal! Is it poorest branch or poorest branches? COUNT (*) COUNT ( [DISTINCT] A) SUM ( [DISTINCT] A) AVG ( [DISTINCT] A) MAX (A) MIN (A) single column SELECT S.bname, MIN (S.assets) Branch S SELECT S.bname, S.assets Branch S WHERE S.assets = (SELECT MIN (T.assets) Branch T) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 19 GROUP BY and HAVING Sometimes, we want to apply aggregate operators to each of several groups of tuples. Find the maximum assets of all branches in a city for each city containing at least one branch. SELECT B.bcity, MAX(B.assets) Branch B GROUP BY B.bcity for each city - one name - aggregate assets Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 20 Queries With GROUP BY and HAVING SELECT [DISTINCT] select-list from-list WHERE qualification GROUP BY grouping-list HAVING group-qualification The select-list contains (i) attribute names (ii) terms with aggregate operations (e.g., MIN (S.age)). The attribute list (i) must be a subset of grouping-list. Intuitively, each answer tuple corresponds to a group, and these attributes must have a single value per group. (A group is a set of tuples that have the same value for all attributes in grouping-list.) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 21 Conceptual Evaluation The cross-product of from-list is computed, tuples that fail qualification are discarded, `unnecessary attributes are deleted, and the remaining tuples are partitioned into groups by the value of attributes in grouping-list. The group-qualification is then applied to eliminate some groups. Expressions in group-qualification must have a single value per group! In effect, an attribute in group-qualification that is not an argument of an aggregate op also appears in groupinglist. (SQL does not exploit primary key semantics here!) One answer tuple is generated per qualifying group. Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 22 What attributes are unnecessary? What attributes are necessary: Exactly those mentioned in SELECT, GROUP BY or HAVING clauses Find the maximum assets of all branches in a city for each city containing at least two branches. SELECT B.bcity, MAX(B.assets) Branch B GROUP BY B.bcity HAVING COUNT(*) >1 empty WHERE bcity assets Pton 10 Pton 8 nyc 20 nyc 30 bcity Pton 10 nyc 30 bname bcity assets pu Pton 10 pmc Pton 8 nyu nyc 20 time sq upenn nyc phili 30 50 2nd column of result is unnamed. (Use AS to name it.) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 23 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 24

Joins in SQL Outer Joins SQL has both inner joins and outer join Use in " " portion of query Inner join variations NATURAL INNER JOIN Generalized versions Outer join includes tuples that don t match fill in with nulls 3 varieties: left, right, full Left outer join of S and R: take inner join of S and R (with whatever qualification) add tuples of S that are not matched in inner join, filling in attributes coming from R with "null" Right outer join: as for left, but fill in tuple of R Full outer join: both left and right Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 25 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 26 Example Given Tables: sid 77 35 21 NATURAL INNER JOIN: residence GC Lawrence Butler 77 21 NATURAL LEFT OUTER JOIN add: sid dept 77 ELE 21 COS 42 MOL GC Butler ELE COS 35 Lawrence null Example Query assignment: (position, division, SS#, managerss#) SELECT DISTINCT M.academic_dept., A.division study M NATURAL LEFT OUTER JOIN assignment A study: (SS#, academic_dept., adviser) NATURAL RIGHT OUTER JOIN add: NATURAL FULL OUTER JOIN add both 42 null MOL What does this produce? Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 27 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 28 General form SQL Query Now seen all major components SELECT Structure of Query: Three set operations Only these combine separate SELECT statements. All other SELECTs nested. Scope of tuple variable within SELECT and nested subqueries in it select-list from-list WHERE qualification GROUP BY grouping-list HAVING group-qualification UNION or INTERSECT or EXCEPT SELECT select-list from-list WHERE qualification GROUP BY grouping-list HAVING group-qualification continuing general query form Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 29 Null Values represent unknown value or inapplicable attribute can test attribute value IS NULL or IS NOT NULL need a 3-valued logic (true, false and unknown) to deal with null values in predicates. comparisons with null evaluate to unknown Boolean operations on unknown depend on truth table can test IS UNKNOWN and IS NOT UNKNOWN meaning of constructs must be defined carefully Example: WHERE clause eliminates rows that don t evaluate to true aggregations, except COUNT(*), ignore nulls Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 30

Integrity Constraints (Review) An IC describes conditions that every legal instance of a relation must satisfy. Inserts/deletes/updates that violate IC s are disallowed. Can be used to ensure application semantics (e.g., sid is a key), or prevent inconsistencies (e.g., sname has to be a string, age must be < 200) Types of IC s: Domain constraints, primary key constraints, candidate key constraints, foreign key constraints, general constraints. General Constraints Useful when more general ICs than keys are involved. CREATE TABLE GasStation ( name CHAR(30), street CHAR(40), city CHAR(30), st CHAR(2), type CHAR(4), PRIMARY KEY (name, street, city, st), CHECK ( type= full OR type= self ), CHECK (st <> nj OR type= full ) ) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 31 Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 32 More General Constraints Can use queries to express constraint. Constraints can be named. Constraints can use other tables Must check if other table modified CREATE TABLE FroshSemEnroll ( sid CHAR(10), sem_title CHAR(40), PRIMARY KEY (sid, sem_title), FOREIGN KEY (sid) REFERENCES Students CONSTRAINT froshonly CHECK (2012 = ( SELECT S.classyear Students S WHERE S.sid=sid) ) ) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 33 Constraints Over Multiple Relations Number of bank branches in a city is less than 3 or the population of the city is greater than 100,000 Cannot impose as CHECK on each table. If either table is empty, the CHECK is satisfied Is conceptually wrong to associate with individual tables ASSERTION is the right solution; not associated with either table. Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 34 Number of bank branches in a city is less than 3 or the population of the city is greater than 100,000 CREATE ASSERTION branchlimit CHECK ( NOT EXISTS ( (SELECT C.name, C.state Cities C WHERE C.pop <=100000 ) INTERSECT ( SELECT D.name, D.state Cities D WHERE 3 <= (SELECT COUNT (*) Branches B WHERE B.bcity=D.name ) ) ) ) Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 35 Summary SQL an important factor in the early acceptance of the relational model more natural than earlier, procedural query languages. Significantly more expressive power than fundamental relational model Blend of relational algebra and calculus - plus extensions Relational queries often expressed more naturally in SQL Many alternative ways to write a query optimizer should look for most efficient evaluation plan when efficiency counts, users need to be aware of how queries are optimized and evaluated for best results SQL allows specification of rich integrity constraints Based on slides for Database Management Systems by R. Ramakrishnan and J. Gehrke 36