Logical Schema Design: The Relational Data Model Basics of the Relational Model From Conceptual to Logical Schema Logical Schema Design Select data model Hierarchical data model: hierarchies of record types mainframe oldie, still in use, outdated Network data model: graph like data structures, still in use, outdated Relational data model: the most important one today Object-Oriented data model: more flexible data structures, less powerful data manipulation language Object-Relational: the best of two worlds? Transform conceptual model into logical schema of data model easy for Relational Data Model (RDM) can be performed automatically (e.g. by Oracle Designer) 1.2 1
Logical Schema Design: Relational Data Model The Relational Data Model Important terms 1970 introduced by E.F. Codd, honoured by Turing award (Codd: A Relational Model of Data for Large Shared Data Banks. ACM Comm. 13( 6): 377-387, 1970) Basic ideas: Database is collection of relations Relation R = set of n-tuples Relation schema R(A 1,A 2, A n ) Attributes A i = atomic values of domain D i =dom(a i ) relation (table) attribute tuple FName Tina Anna Carla Name Müller Katz Maus Student Email mueller@... katz@... piep@... relation schema: Student(Fname, Name, Email, SID) or Student(Fname:string, Name:string, Email:string, SID:number) SID 13555 12555 11222 1.3 Logical Schema Design: Relational Data Model Relation Schema R(A 1, A 2,, A n ) notation sometimes R:{[A 1, A 2,, A n ]} Relation R dom(a 1 ) x dom(a 2 ) x x dom(a n ) Attribute set R ={A 1, A 2,, A n } Degree of a relation: number of attributes Example: Student(Fname, Name, Email, SID) Student string x string x string x number R (Student) ={Fname, Name, Email, SID} Database Schema is set of relation schemas 1.4 2
Logical Schema Design: Relational Data Model Time-variant relations Relations have state at each point in time. Integrity constraints on state part of DB schema Tuples (rows, records) Not ordered No duplicate tuples (Relations are sets) Null-values for some attributes possible Distinguishable based on tuple values (key-concept) Different to object identification in o-o languages: there, each object has implicit identity, usually completely unrelated to its fields 1.5 Logical Schema Design: Relational Data Model Superkey: Important terms subset of attributes of a relation schema R for which no two tuples have the same value. Every relation at least 1 superkey. Example: Student(Fname, Name, Email, SID) Superkeys: {Fname, Name, Email, SID}, {Name, Email, SID}, {Fname, Email, SID}, {Fname, Name, SID}, {Fname, SID}, {Name, SID}, {Fname, Name, Email}, {Fname, Email}, {Name, Email}, {Email, SID}, {Email}, {SID} 1.6 3
Logical Schema Design: Relational Data Model Candidate Key: Important terms superkey K of relation R such that if any attribute A K is removed, the set of attributes K\A is not a superkey of R. Example: Student(Fname, Name, Email, SID) Candidate Keys: {Email}, {SID} Primary Key: arbitrarily designated candidate key Example: Student(Fname, Name, Email, SID) Primary Key: {SID} 1.7 Logical Schema Design: Relational Data Model Foreign Key: Important terms set of attributes FK in relation schema R1 that references relation R2 if: Attributes of FK have the same domain as the attributes of primary key K 2 of R 2 A value of FK in tuple t 1 in R 1 either occurs as a value of K 2 for some t 2 in R 2 or is null. Example: R 1 : Exercise(EID, Tutor, ExcerciseHours, Lecture) R 2 : Student(Fname, Name, Email, SID) K 1 ={EID}, K 2 ={SID} Tutor is foreign key of Student (from Exercise ). 1.8 4
Logical Schema Design: Transformation 1. Select data model relational data model 2. Transform conceptual model into logical schema of relational data model Define relational schema, table names, attributes and types, invariants Design steps: Translate entities into relations Translate relationships into relations Simplify the design Define tables in SQL Define additional invariants (Formal analysis of the schema) 1.9 Logical Schema Design: Entities Step 1: Transform entities For each entity E create relation R with all attributes Key attributes of E transform to keys of the relation category year director id Price_per_day title length Movie Movie(id, title, category, year, director, Price_per_day, length) 1.10 5
Logical Schema Design: Weak Entities Step 2: Transform weak entities For each weak entity WE create relation R Include all attributes of WE Add key of identifying entity to weak entities key Part of key is foreign key eid (0,*) (1,1) Employee has Child cid Employee(eid) Child(cid, eid) Note: weak entity and defining relationship transformed together! 1.11 Logical Schema Design: Weak Entities Step 2: Transform weak entities Example: Item has (1,1) iid date did Delivery (1,*) (1,1) contains Order (1,*) contact Delivery(did, date) Order(oid, did, date, contact) Item(iid, oid, did) date oid 1.12 6
Logical Schema Design: Relationships Step 3: Transform Relationships For each 1:1-relationship between E1, E2 create relation R Include all key attributes Define as key: key of entity E1 or entity E2 Choose entity with total participation for key (driving licence) (1,1) (1,1) Country has President CName PName Country(CName) President(PName) has(cname, PName) or has(cname, PName) 1.13 Logical Schema Design: Relationships Step 3: Transform Relationships For each 1:N-relationship between E1, E2 create relation R Include all key attributes Define as key: key of N-side of relationship lnr Lecture hold Lecturer (1,1) (0,*) lid name Lecture(lnr) Lecturer(lid, name) hold(lnr,lid) 1.14 7
Logical Schema Design: Relationships Step 3: Transform Relationships For each N:M-relationship create relation R Include all key attributes Define as key: keys from both entities User_account has emailmessage (0,*) (1,*) username User_account(username) emailmessage(id, from) has(username, messageid) from id 1.15 Logical Schema Design: Relationships Step 3: Transform Relationships Include all relationship attributes in relation R lnr Lecture hold Lecturer (1,1) (0,N) date Lecture(lnr) Lecturer(lid, name) hold(lnr,lid, date) lid name lnr Rename keys in recursive relationships require(prelnr, succlnr) Lecture predecessor successor (0,1) (0,*) require 1.16 8
Logical Schema Design: Relationships Step 3: Transform Relationships For each n-ary relationship (n>2) create relation R Include all key attributes Define as key: keys from all numerously involved entities rating (0,*) (1,*) Lecturer recommend Textbook ISBN lid name (0,*) Lecture Lecture(lnr) Lecturer(lid, name) Textbook(ISBN, title, author) recommend(lid, lnr, ISBN, rating) lnr title author 1.17 Logical Schema Design: Generalization Step 4: Generalization 3 Variants for Transformation Person FName Name Email pid SID Student Lecturer Contract Person(pid, name, fname, email) Student(pid, sid) Lecturer(pid, contract) Student(pid, name, fname, email, sid) Lecturer(pid, name, fname, email, contract) Person(pid, name, fname, email, sid, contract) 1.18 9
Logical Schema Design: Generalization (1) Separate relation for each entity Key in specialized relations: foreign key to general relation b B A C aid a c A(aid, a) B(aid, b) C(aid, c) A: B: C: Gathering data from different tables time consuming Appropriate specialization prevents unnecessary data access 1.19 Logical Schema Design: Generalization (2) Relations for specialization Separate tables which include A s attributes Separate keys for relations b B A C aid a c AB(aid, a, b) AC(aid, a, c) AB: AC: Only valid for exhaustive specialization (no data in A only) Time consuming data retrieval for non-distinct specializations 1.20 10
Logical Schema Design: Generalization (3) one relation for all entities A aid a ABC(aid, a, b, c) b B C c ABC: NULLs, if subtypes are disjoint Empty, if subtypes are exhaustive Removes generalization! Relation may contain many NULL-values 1.21 Logical Schema Design: Example charge name id from until Format belong_to Tape is_in Rental have (0,*) (1,1) (0,*) (1,1) (1,1) hold (1,1) Telephone (0,*) title Address category (1,*) Customer First_name year Movie director Last_name (0,*) Mem_no id stage_name Price_per_day length play Actor real_name (1,*) birthday 1.22 11
Logical Schema Design: Example Format(name, charge) Tape(id) Movie(id, title, category, year, director, price_per_day, length) Actor(stage_name, real_name, birthday) Customer(mem_no, last_name, first_name, address, telephone) Rental(tape_id, member, from, until) Belong_to(formatname, tape_id) Hold(tape_id, movie_id) Play(movie_id, actor_name) Reduce relation number by simplification! 1.23 Logical Schema Design: Simplification Collect those relation schemas with the same key attributes into one relation schema Semantics of attributes must match, not literal name Do not destroy generalization! hours title Lecture (3,N) attend (1,N) Student SID LID (1,1) (0,N) hold Person Lecturer Contract FName Name Email pid Lecture(lid, title, hours) Person(pid, fname, name, email) Student(pid, sid) Lecturer(pid, contract) Attend(lid, pid) Hold(lid, pid) Lecture(lid, title, hours, lecturer) 1.24 12
Logical Schema Design: Example Format(name, charge) Tape(id) Movie(id, title, category, year, director, price_per_day, length) Actor(stage_name, real_name, birthday) Customer(mem_no, last_name, first_name, address, telephone) Rental(tape_id, member, from, until) Belong_to(formatname, tape_id) Hold(tape_id, movie_id) Play(movie_id, actor_name) Simplified relation from Tape/Belong_to/Hold: Tape(id, format, movie_id) No simplification based on partial keys! 1.25 Logical Schema Design: Simplification Restrictions on schema simplification Simplification (folding) of relations may result in NULL values in some tables id Student get Grant (0,1) (0,1) since responsible Student(id, ) Grant(gid, duation, amount) Get(student, gid, since, responsible) duration gid amount Grant(gid, duation, amount, student, since, responsible) Consequence: optional relationships should not be transformed if many NULLS to be expected May lead to time consuming retrieval 1.26 13
Logical Schema Design: Short summary Relational data model Representation data in relations (tables) the most important data model today Important terms & concepts Relation: set of n-tuples Relation schema defines structure Attribute: property of relation, atomic values Superkey, candidate key, primary key identify tuple Transformation rules for entities, relationships Simplification Open aspects of logical schema design DDL for defining and changing relation schemas Definition of constraints (Min-cardinalities, Value restrictions for attributes, ) 1.27 14