Slide 1
The core theory of relational databases
"The best practice... is a good theory."

Slide 2
Bibliography
- M. Levene, G. Loizou, Guided Tour of Relational Databases and Beyond, Springer, 625 pages, 1999.
- H. Mannila, K. Räihä, The Design of Relational Databases, Addison-Wesley, Second Edition, 1994.
- R. Ramakrishnan, Database Management Systems, McGraw-Hill, Second Edition, 936 pages, 1999.
- S. Abiteboul, R. Hull, V. Vianu, Foundations of Databases, Addison-Wesley, 685 pages, 1995.
- C. Delobel, Bases de Données et Systèmes Relationnels, Dunod, 468 pages, 1983.
Slide 3
DBMS - Summary
From [Ramakrishnan99].
- A DBMS is used to maintain and query large datasets.
- Benefits include recovery from system crashes, concurrent access, quick application development, data integrity, and security.
- Levels of abstraction give data independence.
- A DBMS typically has a layered architecture.
- DBAs hold responsible jobs and are well paid!
- DBMS R&D is one of the broadest, most exciting areas in computer science.

Slide 4
Why Study the Relational Model?
- Simple and intuitive representation of data
- Strong formal foundation
- Most widely used model
  - Vendors: IBM, Informix, Microsoft, Oracle, Sybase, etc.
- Powerful and natural query languages exist
Slide 5
The relational data model
A relational model is a combination of three parts:
- Structural part: the data structure of the model, i.e., the relation and the relation schema
- Manipulative part: there are three relational query languages
  - Relational algebra: procedural language
  - Relational calculus: declarative language
  - Non-recursive Datalog: rule-based language
- Integrity part: restricts the allowable relations in a database to those satisfying certain logical conditions, called integrity constraints

Slide 6
Structural part of the relational model
The relational model has only one data structure: the relation.
- A relation is a set of tuples
- A relational database is a set of relations
- A schema of a relation (i.e., relation schema) is a set of attributes describing the components of tuples
- A schema of a database (i.e., database schema) is a set of relation schemas
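The set-based definitions above can be sketched in a few lines of Python. This is only an illustration: the relation name `Course`, its attributes, and its data are hypothetical, and modelling a database as a dictionary keyed by relation name is an assumption made for convenience.

```python
# A relation is modelled as a frozenset of tuples: no duplicates, no ordering.
# "Course", its attributes, and its data are hypothetical illustration values.
course_schema = ("Title", "Credits")

course = frozenset({
    ("Databases", 6),
    ("Logic", 4),
    ("Databases", 6),  # repeated tuple: collapses under set semantics
})

# A relational database is a collection of relations (here: name -> relation),
# and a database schema is a collection of relation schemas.
database = {"Course": course}
database_schema = {"Course": course_schema}

print(len(course))  # 2
```

Because a relation is a set, inserting the same tuple twice leaves the relation unchanged, and two relations are equal exactly when they contain the same tuples.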
Slide 7
Example and notation
The relation $r_1$ over Student:

    Name    Age  Address
    Toto    21   Clermont
    Dupond  35   Paris
    Durand  45   Lyon

$\mathrm{schema}(\mathrm{Student}) = \{\mathrm{Name}, \mathrm{Age}, \mathrm{Address}\}$

Slide 8
Attributes and domains
- Let $U$ be a countably infinite set of attribute names (or simply attributes). $U$ is called the universe of attributes.
- Let $D$ be a countably infinite set of constant values. $D$ is the underlying database domain.
- Given an attribute $A \in U$, the domain of $A$, denoted by $\mathrm{DOM}(A)$, is a subset of $D$. We refer to the elements of $\mathrm{DOM}(A)$ as constants or values.
- Unique Name Assumption (UNA): two constant values $c_1 \in \mathrm{DOM}(A_1)$ and $c_2 \in \mathrm{DOM}(A_2)$ are equal if and only if they are syntactically identical, i.e., they have the same name.

Example: $\mathrm{Toto} = \mathrm{Toto}$, but $\mathrm{Toto} \neq \mathrm{Dupond}$ and $\mathrm{Toto} \neq 21$.
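As a rough illustration of domains and the Unique Name Assumption, the Student relation above can be written down in Python. The domain predicates are assumptions chosen for this sketch (the slides do not say what DOM(Name), DOM(Age), or DOM(Address) contain). Domains are modelled as membership tests because they may be infinite.

```python
# Relation r1 over Student, with the data from the slide.
SCHEMA_STUDENT = ("Name", "Age", "Address")
r1 = {
    ("Toto", 21, "Clermont"),
    ("Dupond", 35, "Paris"),
    ("Durand", 45, "Lyon"),
}

# Assumed domains, given as membership predicates (DOM(A) may be infinite).
DOM = {
    "Name": lambda v: isinstance(v, str),
    "Age": lambda v: isinstance(v, int) and v >= 0,
    "Address": lambda v: isinstance(v, str),
}

def respects_domains(relation, schema, dom):
    """Check that every component of every tuple lies in its attribute's domain."""
    return all(dom[a](v) for t in relation for a, v in zip(schema, t))

# Unique Name Assumption: constants are equal iff they are syntactically identical.
una_same = ("Toto" == "Toto")
una_other_name = ("Toto" != "Dupond")
una_other_domain = ("Toto" != 21)

print(respects_domains(r1, SCHEMA_STUDENT, DOM))
```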
Slide 9
Relation schema
A relation schema $R$ has the following components:
- The name of the schema: a relation symbol $R$
- A similarity type $\mathrm{type}(R)$, which denotes the number of attributes of $R$
- A set of attributes $\{\mathrm{att}(1), \ldots, \mathrm{att}(\mathrm{type}(R))\}$, denoted by $\mathrm{schema}(R)$

Example
$\mathrm{type}(\mathrm{Student}) = 3$
$\mathrm{schema}(\mathrm{Student}) = \{\mathrm{att}(1), \mathrm{att}(2), \mathrm{att}(3)\}$, where $\mathrm{att}(1) = \mathrm{Name}$, $\mathrm{att}(2) = \mathrm{Age}$, and $\mathrm{att}(3) = \mathrm{Address}$

Slide 10
Creating Relations in SQL

    CREATE TABLE Students (
        Name    CHAR(20),
        Age     INTEGER,
        Address CHAR(40)
    );

Observe that the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified.
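To connect the two slides, the following sketch creates the Students table in an in-memory SQLite database and reads back the components of the relation schema. SQLite is chosen here only because it ships with Python; it is not the system the slides have in mind. Note also that SQLite uses flexible typing by default, so it does not enforce declared column types as strictly as the slide describes. The names `type_of_R` and `att` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (Name CHAR(20), Age INTEGER, Address CHAR(40))"
)

# PRAGMA table_info yields one row per column:
# (cid, name, declared type, notnull, default value, pk)
columns = conn.execute("PRAGMA table_info(Students)").fetchall()

# schema(R): the attribute names in declaration order.
schema = [name for _cid, name, *_rest in columns]

# type(R): the number of attributes.
type_of_R = len(schema)

def att(i):
    """Return att(i), using the 1-based numbering of the slides."""
    return schema[i - 1]

print(type_of_R, [att(i) for i in range(1, type_of_R + 1)])
```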
Slide 11
Database schema
- A database schema $\mathbf{R}$ is a collection $\{R_1, \ldots, R_n\}$ of relation schemas
- $\mathrm{schema}(\mathbf{R})$ is the union of all $\mathrm{schema}(R_i)$, $R_i \in \mathbf{R}$
- A relation schema $R$ is in First Normal Form (1NF) if the domains of all attributes $A_i \in \mathrm{schema}(R)$ are atomic
- A database schema $\mathbf{R}$ is in 1NF if all the relation schemas $R_i \in \mathbf{R}$ are in 1NF
- Universal Relation Schema Assumption (URSA): if an attribute $A$ appears in both $\mathrm{schema}(R_i)$ and $\mathrm{schema}(R_j)$, then it has the same meaning in both

Slide 12
Relations and databases
- A tuple over $R$, with $\mathrm{schema}(R) = \{A_1, \ldots, A_m\}$, is a member of the cartesian product $\mathrm{DOM}(A_1) \times \mathrm{DOM}(A_2) \times \cdots \times \mathrm{DOM}(A_m)$
- A tuple $t$ over $R$ can also be viewed as a total mapping from $\mathrm{schema}(R)$ to the union of the domains $\mathrm{DOM}(A_i)$ such that $\forall A_i \in \mathrm{schema}(R)$, $t(A_i) \in \mathrm{DOM}(A_i)$
- A relation over $R$ is a finite set of tuples over $R$
- A database $d$ over $\mathbf{R}$ is a collection $\{r_1, \ldots, r_n\}$ of relations $r_i$ over $R_i$
- Relations over 1NF relation schemas are called 1NF relations
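The two views of a tuple (member of a cartesian product, or total mapping from attributes to values) can be sketched as follows. The domain predicates are assumptions made for this illustration, and the helper names `is_tuple_over` and `as_product_member` are hypothetical.

```python
SCHEMA = ("Name", "Age", "Address")

# Assumed atomic domains, given as membership predicates.
DOM = {
    "Name": lambda v: isinstance(v, str),
    "Age": lambda v: isinstance(v, int),
    "Address": lambda v: isinstance(v, str),
}

def is_tuple_over(t, schema, dom):
    """t is a dict viewed as a mapping from attributes to values.

    It is a tuple over the schema if it is total (defined on exactly the
    attributes of the schema) and every value lies in its attribute's domain.
    """
    return set(t) == set(schema) and all(dom[a](t[a]) for a in schema)

def as_product_member(t, schema):
    """Convert the mapping view into the cartesian-product view."""
    return tuple(t[a] for a in schema)

t = {"Name": "Toto", "Age": 21, "Address": "Clermont"}
print(is_tuple_over(t, SCHEMA, DOM), as_product_member(t, SCHEMA))
```

A mapping that omits an attribute is not total, and one that stores a list of addresses under `Address` falls outside the assumed atomic domain, so neither counts as a tuple over this schema.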
Slide 13
Projection and active domain
Projection
- The projection of a tuple $t$ in a relation $r$ over a relation schema $R$ onto the attribute $A_i = \mathrm{att}(i)$ is the $i$-th coordinate of $t$, i.e., $t(i)$
- Let $Y = \{\mathrm{att}(i_1), \ldots, \mathrm{att}(i_n)\} \subseteq \mathrm{schema}(R)$. The projection of $t$ onto $Y$, denoted $t[Y]$, is defined by $t[Y] = \langle t(i_1), \ldots, t(i_n) \rangle$
Active domain
- The active domain of a relation $r$ over $R$, denoted $\mathrm{ADOM}(r)$, is the set of constant values that appear in the tuples of $r$
- The active domain of a database $d$ over $\mathbf{R}$, denoted $\mathrm{ADOM}(d)$, is the union of the active domains of its relations

Slide 14
Keys and Superkeys
- Superkey. A subset $SK$ of $\mathrm{schema}(R)$ is a superkey for $R$ if, for all relations $r$ over $R$, the projection $t[SK]$ of a tuple $t \in r$ uniquely identifies a single tuple in $r$
- Key. A key for a relation schema $R$ is a superkey for $R$ which is minimal, i.e., no proper subset of it is a superkey for $R$
- Primary key. A primary key for $R$ is a key which is designated by the database designer
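The definitions of projection, active domain, and superkey can be tried out on the Student data used earlier in the deck. One caveat: a superkey is defined with respect to all relations over R, so a check on a single relation can only refute a candidate, never confirm it. The function names below are illustrative, and the extra tuple in `r2` is hypothetical.

```python
from itertools import combinations

SCHEMA = ("Name", "Age", "Address")
r1 = {
    ("Toto", 21, "Clermont"),
    ("Dupond", 35, "Paris"),
    ("Durand", 45, "Lyon"),
}

def project(t, schema, Y):
    """t[Y]: the components of t for the attributes in Y, in the order of Y."""
    return tuple(t[schema.index(a)] for a in Y)

def adom(r):
    """ADOM(r): the set of constants appearing in the tuples of r."""
    return {v for t in r for v in t}

def is_superkey_in(r, schema, SK):
    """Instance-level check: no two distinct tuples of r agree on SK."""
    projections = [project(t, schema, SK) for t in r]
    return len(set(projections)) == len(projections)

def minimal_keys_in(r, schema):
    """Non-empty attribute sets that are minimal superkeys in this instance."""
    keys = []
    for size in range(1, len(schema) + 1):
        for cand in combinations(schema, size):
            covered = any(set(k) <= set(cand) for k in keys)
            if not covered and is_superkey_in(r, schema, cand):
                keys.append(cand)
    return keys

# Hypothetical extra tuple: a second student named Toto.
r2 = r1 | {("Toto", 30, "Nice")}

print(minimal_keys_in(r1, SCHEMA))
print(minimal_keys_in(r2, SCHEMA))
```

In `r1` every single attribute happens to identify a tuple, while in `r2` the attribute `Name` no longer does. This shows why key declarations express the designer's intent about all possible relations rather than a property read off one instance.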
Slide 15
Foreign Key
Foreign key. Let $\mathbf{R}$ be a database schema, $R_1, R_2 \in \mathbf{R}$, and assume that $K$ is the primary key of $R_2$. $FK \subseteq \mathrm{schema}(R_1)$ is a foreign key for $R_1$ referencing the primary key $K$ of $R_2$ if the following condition holds:
$\forall d = \{r_1, r_2, \ldots, r_n\}$ over $\mathbf{R}$ and $\forall t_1 \in r_1$, if $t_1[FK]$ does not contain any null values, then $\exists t_2 \in r_2$ such that $t_1[FK] = t_2[K]$

Slide 16
Specifying keys in SQL

    CREATE TABLE Students (
        sid  CHAR(20),
        ssn  CHAR(20) NOT NULL,
        Name CHAR(20),
        Age  INTEGER,
        dep  CHAR(5),
        PRIMARY KEY (sid),
        UNIQUE (ssn),
        FOREIGN KEY (dep) REFERENCES Department
    );
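The foreign key condition can be observed with Python's built-in `sqlite3` module. This is a sketch under assumptions: the slides do not define the Department table, so its columns (`dep`, `title`) and all inserted rows are hypothetical. SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Hypothetical Department schema; only its primary key matters here.
conn.execute("CREATE TABLE Department (dep CHAR(5) PRIMARY KEY, title CHAR(40))")
conn.execute("""
    CREATE TABLE Students (
        sid  CHAR(20),
        ssn  CHAR(20) NOT NULL,
        Name CHAR(20),
        Age  INTEGER,
        dep  CHAR(5),
        PRIMARY KEY (sid),
        UNIQUE (ssn),
        FOREIGN KEY (dep) REFERENCES Department
    )
""")

conn.execute("INSERT INTO Department VALUES ('CS', 'Computer Science')")
# t1[FK] = 'CS' matches a Department tuple: accepted.
conn.execute("INSERT INTO Students VALUES ('s1', '111', 'Toto', 21, 'CS')")
# t1[FK] is null: the condition does not apply, so the tuple is accepted.
conn.execute("INSERT INTO Students VALUES ('s2', '222', 'Dupond', 35, NULL)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# No Department tuple has dep = 'MATH': the foreign key rejects the row.
dangling_rejected = not try_insert(("s3", "333", "Durand", 45, "MATH"))
# ssn '111' is already used by s1: the UNIQUE constraint rejects the row.
duplicate_ssn_rejected = not try_insert(("s4", "111", "Martin", 28, "CS"))

student_count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
print(dangling_rejected, duplicate_ssn_rejected, student_count)
```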