EECS 647: Introduction to Database Systems Instructor: Luke Huan Spring 2013
Administrative Take home background survey is due this coming Friday The grader of this course is Ms. Xiaoli Li and her email is xiaoli.li@ku.edu Your MySQL account has be created. One first in-class lab is this Friday at Eaton 1005C. Since we have more than 20 students in this class, you are encouraged to bring in your own laptop. Make sure that you have your EECS account id and password. 9/4/2013 Luke Huan Univ. of Kansas 2
Review What are the two terms used by ER model to describe a miniworld? Entity Relationship Students Takes Courses Student (sid: string, name: string, login: string, age: integer, gpa:real) 10101 11101 9/4/2013 Luke Huan Univ. of Kansas 3
Relational Model Today s Outline Relational Model Constraints and Relational Database Schemas 9/4/2013 Luke Huan Univ. of Kansas 4
Relational Model: Thanking in a tabular Way Entity: Employees, Departments Relationship: Works in E ID E Name Since D ID D Name Employees Works in Departments 9/4/2013 Luke Huan Univ. of Kansas 5
Tabular View Tabular View of Employees, Departments, and Works In E ID E Name 111 John Smith 112 Susan Wang D ID D name 10 HR 11 Research D ID E ID Since 10 111 2001 11 112 2008 9/4/2013 Luke Huan Univ. of Kansas 6
Example Name of the relation Attributes Tuples Employees E ID E Name 111 John Smith 112 Susan Wang 9/4/2013 Luke Huan Univ. of Kansas 7
Historically The model was first proposed by Dr. E.F. Codd of IBM in 1970 in the following paper: "A Relational Model for Large Shared Data Banks," Communications of the ACM, June 1970. The above paper caused a major revolution in the field of Database management and earned Ted Codd the coveted ACM Turing Award. from wikipedia 9/4/2013 Luke Huan Univ. of Kansas 8
Relational Model Concepts Relational database: a set of relations. Relation: made up of 2 parts: Schema : specifies name of relation, plus name and type of each column. E.g. Students (sid: string, name: string, login: string, age: integer, gpa: real) Instance : a table, with rows and columns. #rows = cardinality #fields = degree / arity 9/4/2013 Luke Huan Univ. of Kansas 9
Example Schema Student (SID: integer, name: string, age: integer, GPA: float) Course (CID: string, title: string) Enroll (SID: integer, CID: integer) Instance { 112, Bart, 18, 3.2; 113, Milhouse, 20, 3.1;...} 9/4/2013 Luke Huan Univ. of Kansas 10
Attribute Types Each attribute of a relation has a name, designating the role of the attribute The set of allowed values for each attribute is called the domain of the attribute Attribute values are (normally) required to be atomic; that is, indivisible E.g. the value of an attribute can be an account number, but cannot be a set of account numbers The special value null is a member of every domain 9/4/2013 Luke Huan Univ. of Kansas 11
Definition Summary Informal Terms Formal Terms Table Relation Column Attribute/Domain Row Tuple Possible values in a column Domain Table Definition Schema of a Relation Populated Table Instances 9/4/2013 Luke Huan Univ. of Kansas 12
Characteristics of Relation The tuples in a ration r(r) are not considered to be ordered, even though they appear to be in the tabular form. Consider the mathematical definition of relation R(A 1, A 2,..., A n ) = A 1 A 2 A n, elements are ordered. (well, DB is not math). All values are considered atomic (indivisible). A special null value is used to represent values that are unknown or inapplicable to certain tuples. 9/4/2013 Luke Huan Univ. of Kansas 13
Relational Integrity Constraints Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints: 1. Domain constraints 1. The value of a attribute must come from its domain 2. Key constraints 3. Referential integrity constraints sid cid grade 53666 Carnatic101 C 53666 Reggae203 B 53650 Topology112 A 53661 History105 B sid name login age gpa 53666 Jones jones@cs -1 3.4 53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8 9/4/2013 Luke Huan Univ. of Kansas 14
Primary Keys A set of fields is a candidate key for a relation if : 1. No two distinct tuples can have same values in all key fields, and 2. This is not true for any subset of the key. Part 2 false? A superkey. If there s >1 key for a relation, one of the keys is chosen (by DBA) to be the primary key. E.g., given a schema Student (sid: string, name: string, gpa: float) we have: sid is a key for Students. (What about name?) The set {sid, gpa} is a superkey.
Key Example CAR (licence_num: string, Engine_serial_num: string, maker: string, model: string, year: integer) What is the candidate key(s) Which one you may use as a primary key What are the super keys 9/4/2013 Luke Huan Univ. of Kansas 16
Key Constraint & Entity Integrity Key Constraint: every relation must have a primary key Entity Integrity: The primary key attributes of a relation can not take the NULL value Other attributes of R may be similarly constrained to disallow null values, even though they are not members of the primary key. 9/4/2013 Luke Huan Univ. of Kansas 17
Foreign Keys, Referential Integrity Foreign key : Set of fields in one relation that is used to `refer to a tuple in another relation. (Must correspond to primary key of the second relation.) Like a `logical pointer. E.g. sid is a foreign key referring to Students: Student(sid: string, name: string, gpa: float) Enrolled(sid: string, cid: string, grade: string) If all foreign key constraints are enforced, referential integrity is achieved, i.e., no dangling references. Can you name a data model w/o referential integrity? Links in HTML!
Foreign Keys Only students listed in the Students relation should be allowed to enroll for courses. Enrolled sid cid grade 53666 Carnatic101 C 53666 Reggae203 B 53650 Topology112 A 53661 History105 B Students sid name login age gpa 53666 Jones jones@cs 18 3.4 53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8 Or, use NULL as the value for the foreign key in the referencing tuple when the referenced tuple does not exist
In-Class Exercise Consider the following relations for a database that keeps track of student enrollment in courses and the books adopted for each course: STUDENT(SSN, Name, Major, Bdate) COURSE(Course#, Cname, Dept) ENROLL(SSN, Course#, Quarter, Grade) BOOK_ADOPTION(Course#, Quarter, Book_ISBN) TEXT(Book_ISBN, Book_Title, Publisher, Author) Draw a relational schema diagram specifying the foreign keys for this schema. 9/4/2013 Luke Huan Univ. of Kansas 20
Other Types of Constraints Semantic Integrity Constraints: based on application semantics and cannot be expressed by the model per se e.g., the max. no. of hours per employee for all projects he or she works on is 56 hrs per week SQL-99 allows triggers and ASSERTIONS to support some of these 9/4/2013 Luke Huan Univ. of Kansas 21
SQL SQL: Structured Query Language Pronounced S-Q-L or sequel The standard query language supported by most commercial DBMS A brief history IBM System R ANSI SQL89 ANSI SQL92 (SQL2) ANSI SQL99 (SQL3) ANSI SQL 2003 (+OLAP, XML, etc.) 9/4/2013 Luke Huan Univ. of Kansas 22
Creating and dropping tables CREATE TABLE table_name (, column_name i column_type i, ); DROP TABLE table_name; Examples create table Student (SID integer, name varchar(30), email varchar(30), age integer, GPA float); create table Course (CID char(10), title varchar(100)); create table Enroll (SID integer, CID char(10)); drop table Student; drop table Course; drop table Enroll; -- everything from -- to the end of the line is ignored. -- SQL is insensitive to white space. -- SQL is insensitive to case (e.g.,...course... is equivalent to --...COURSE...) 9/4/2013 Luke Huan Univ. of Kansas 23
A Few Options You may add quantifier to a domain: Not NULL Unique Primary Key CREATE TABLE Student (SID INTEGER PRIMARY KEY, name VARCHAR(30) NOT NULL, email VARCHAR(30) UNIQUE, age INTEGER, GPA FLOAT); CREATE TABLE Enroll (SID INTEGER NOT NULL, CID CHAR(10) NOT NULL, PRIMARY KEY(SID, CID)); Used for query processing 9/4/2013 Luke Huan Univ. of Kansas 24
Create Your Own Domain You may define your own domain CREATE DOMAIN <name of the domain> definition of the domain Example: CREATE DOMAIN persons_age INT CREATE DOMAIN persons_age INT NOT NULL DEFAULT 18 CHECK (Value > 0) CREATE TABLE Students (age persons_age, ssn char(9) primary key); Insert into Students values(-1, '123456789'); ERROR: value for domain persons_age violates check constraint "persons_age_check" 9/4/2013 Luke Huan Univ. of Kansas 25
Modify a Table CREATE TABLE Students (age persons_age, ssn char(9) primary key); Insert into Students values(-1, '123456789'); Delete From Student where Students.SSN = 123456789 ; Update Students Set Students.age = 19 Where Students.SSN = `123456788 9/4/2013 Luke Huan Univ. of Kansas 26
Referential Integrity CREATE Table Enrolled ( studentid Char (20), Cid Char (20), grade Char (10), Primary Key (studentid, Cid), Foreign Key (studentid) References Students On Delete CASCADE On UPDATE No Action 9/4/2013 Luke Huan Univ. of Kansas 27
Database Design 9/4/2013 Luke Huan Univ. of Kansas 28