Database Design and the ER model

1. Database design

- Database design: Description of the data environment. An abstraction, driven by anticipated applications
- Database model: A formalism for specifying database designs
- ER model: Embeds many well-known data modeling features (pedagogical value). ER designs are later converted to designs for actual models (e.g., relational)

Analogy: programming
- Database design = Programming
- Database model = Programming language
- ER model = Flowcharting
- Actual model = C or Pascal

2. Basic concepts of the ER model

Entities and entity-sets
- Entity: An object which is distinguishable from other objects
- Entity-set: A homogeneous (of the same kind) set of entities

Examples:
- Entities: The student John, the student Betty, the student Mike. Entity-set: Student
- Entities: The course INFS-614, the course Math-501, the course INFS-760. Entity-set: Course

Attributes of entity-sets
- Each entity is represented by a set of attributes
- Each attribute describes the entity by means of a value(s)
- These values are the actual content of the entity
- The set of values permitted for an attribute is its domain
- Attributes are associated with the entity-set, assuring that every entity in the entity-set is described with a similar set of values (homogeneity)

Types of attributes
- Simple vs. composite attributes: A simple attribute consists of one field of information; a composite attribute is a structure consisting of multiple fields
- Single-valued vs. multi-valued attributes: A single-valued attribute associates one value with each entity; a multi-valued attribute associates a set of values with each entity
- Derived attribute: The value of this attribute can be computed from other related attributes or entities
- Null value: Special value denoting not applicable (does not exist), missing (exists but not known), or unknown (unclear whether it exists or not)

Example: Student
- Student_Id, Year_of_Birth: simple, single-valued
- Telephone: simple, multi-valued
- Address: composite, single-valued
- Age: derived

Relationships and relationship-sets
- Relationship: Association among two or more entities
- Relationship-set: A set of relationships of the same kind, where the corresponding entities are from the same entity-set (homogeneity)

Example:
- Relationships:
  The student John is enrolled in the course INFS-614
  The student Betty is enrolled in the course INFS-614
  The student Betty is enrolled in the course Math-501
  The student Mike is enrolled in the course INFS-760
- Relationship-set: Enrollment

Attributes of relationship-sets
- Each relationship-set may be associated with attributes
- Semester is an attribute of the relationship-set Enrollment
- Note that Semester is an attribute of neither Student nor Course

Intension (entity-sets, relationship-sets) vs. extension (entities, relationships)

ER diagrams (Part 1)
- Entity-set: rectangle
- Relationship-set: diamond, connected with edges to participating entity-sets
- Attribute: oval (double oval for a multi-valued attribute, descendant ovals for the component attributes of a composite attribute, dashed oval for a derived attribute)

Binary vs. n-ary relationship-sets
- An n-ary relationship-set associates n entity-sets; in a binary relationship-set n=2
- A non-binary relationship-set can always be simulated by several binary relationship-sets (and the addition of a new entity-set)

Example: Meeting
- As a 3-way relationship-set: 3 entity-sets Subject, Time, Location, and one 3-way relationship-set Meeting
- As binary relationship-sets: 4 entity-sets Meeting, Subject, Time, Location, and 3 binary relationship-sets MS, MT, ML
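
The simulation of the 3-way Meeting relationship-set by binary ones can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the notes: relationships are modeled as tuples in Python sets, the sample meetings and the surrogate identifiers m1, m2, ... are made up, and the function names to_binary and to_ternary are hypothetical.

```python
# Illustrative sketch: simulate a 3-way relationship-set with binary ones.
# The sample data and all function names are assumptions for this example.

def to_binary(meetings):
    """Replace the 3-way relationship-set by a new entity-set Meeting
    and three binary relationship-sets MS, MT, ML."""
    meeting_entities = set()
    ms, mt, ml = set(), set(), set()
    # sorted() only makes the surrogate identifiers deterministic.
    for i, (subject, time, location) in enumerate(sorted(meetings), start=1):
        m = f"m{i}"                 # one new Meeting entity per relationship
        meeting_entities.add(m)
        ms.add((m, subject))        # Meeting-Subject
        mt.add((m, time))           # Meeting-Time
        ml.add((m, location))       # Meeting-Location
    return meeting_entities, ms, mt, ml


def to_ternary(meeting_entities, ms, mt, ml):
    """Rebuild the original (subject, time, location) triples."""
    subject_of, time_of, location_of = dict(ms), dict(mt), dict(ml)
    return {(subject_of[m], time_of[m], location_of[m]) for m in meeting_entities}


meetings = {
    ("Budget", "Mon 10:00", "Room 101"),
    ("Budget", "Tue 14:00", "Room 202"),
    ("Hiring", "Mon 10:00", "Room 202"),
}
entities, ms, mt, ml = to_binary(meetings)
assert to_ternary(entities, ms, mt, ml) == meetings  # no information lost
```

The new entity-set Meeting is what ties the three binary relationship-sets together: pairs such as (Subject, Time) and (Time, Location) alone would not record which subject goes with which time and location.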

3. Integrity constraints

- Impose restrictions on the extension
- Contribute to the integrity (validity) of the extension, by rejecting any modifications (updates) that would result in violations of the restrictions

1. Mapping cardinalities: Dictate how many entities may participate in each relationship of a binary relationship-set (mapping cardinality constraints on non-binary relationships are not straightforward)
  - Types of mapping cardinalities: 1:1, 1:many, many:many
    Student Enrolled-in Course (many:many)
    Faculty Advises Student (1:many)
    Department Chair Faculty (1:1)
  - Cardinality limits: lowest and highest cardinality allowed (* means no limit)
    Student Participates-in Project, where one student may work on 3-5 different projects and 1-5 students may work on the same project. The edge Student Participates-in is annotated 3..5; the edge Participates-in Project is annotated 1..5.

2. Total participation: Mandatory participation of entities in a relationship-set
  - Example: Faculty Member Department
    If each faculty must belong to a department, then Faculty Member is a total participation constraint.
    If each department must have at least one faculty member, then Member Department is a total participation constraint.
  - There is some overlap between cardinality limits and the combination of mapping type and total participation:
    highest=1: 1:many relationship (but arrow on other edge!)
    lowest=1: total participation (double-line this edge!)

3. Keys: Limit to one the number of entities that may share the same value of the key attribute(s)
  - Allow identification of a unique entity within an entity-set, by providing a value of the attribute(s)
  - Superkey: A subset of the attributes of an entity-set that uniquely identifies the entities
  - Candidate key: A minimal superkey
  - Primary key: A designated candidate key

Example: Student
- Attributes: Student_Id, Last_Name, First_Name, Telephone_No, Age, Sex
- Superkeys: (Student_Id), (Student_Id, Age), (Student_Id, Sex), etc.
- Candidate keys: (Student_Id), (Telephone_No, Last_Name, First_Name)
- Primary key: (Student_Id)

Simple vs. composite keys: A key with more than one attribute is composite; otherwise it is simple

Weak vs. strong entity-sets: An entity-set with a key is strong; otherwise it is weak

Example: INFS614_Student Submit Homework
- Student=(Student_Id, Student_Name, Major): strong
- Homework=(Homework_No, Grade): weak

A weak entity-set is permitted in ER designs if
  1. It is associated with a strong entity-set via a 1:many relationship-set
  2. Participation of the weak entity-set in this relationship-set is total
  3. The entities (of the weak entity-set) that are associated with the same entity (from the strong entity-set) are distinguishable by a subset of the attributes of the weak entity-set (called the discriminator)
The discriminator attributes, together with the primary key of the strong entity-set, constitute a key for the weak entity-set

ER diagrams (Part 2)
- Mapping cardinality: arrowhead on the edge that represents the "1" side of the relationship
- Cardinality limits: pair of values lowest..highest on the edge
- Key attribute(s): underlined (dashed underline for discriminator attribute(s) in a weak entity-set)
- Weak entity-set: double rectangle (also double edge and double diamond to indicate the connection to the strong entity-set)
- Total participation: double edge

4. Design possibilities

Entity-set participates in multiple relationship-sets
- Faculty Advises Student, Faculty Teaches Course

Multiple relationship-sets among the same entity-sets
- Department Chair Faculty, Department Member Faculty (note the possibility of the chair not being a member of the department)

Recursive relationship-sets
- Example: Employee Manages Employee
- Solution 1: Multi-level hierarchy with entity-sets President, Vice President, Manager, Employee and three 1:many relationship-sets Manages
- Solution 2: Two identical entity-sets Employee1, Employee2 and one 1:many relationship-set Manages
- Solution 3: One entity-set Employee and one recursive 1:many relationship-set Manages
- Recursive relationships require annotating the role: one edge is marked boss, the other subordinate

5. Advanced features

Generalization/Specialization with inheritance
- Entity-set B is a specialization of entity-set A (entity-set A is a generalization of entity-set B) if the entities in B are a subset of the entities in A
- The relationship-set among the entity-sets A and B is 1:1
- Entity-set B does not store the attributes of A, as this would be redundant. Instead, these attributes are inherited from A
- A generalization hierarchy of multiple levels is possible

Examples:
- Grad_Student and Undergrad_Student are specializations of Student
- Student and Faculty are specializations of University_Person
- Programmer and Engineer are specializations of Employee
- Part_Time and Full_Time are also specializations of Employee

Disjoint vs. overlapping: Is it possible for an entity to belong to more than one subclass?
- Grad_Student and Undergrad_Student are disjoint
- Programmer, Engineer, Part_Time, Full_Time are overlapping

Total vs. partial: Can there be superclass entities that do not belong to any subclass?
- Partial: Honor_Student is a specialization of Student. Students who are not honor students belong only to the superclass
- Total: Honor_Student and Non_Honor_Student are specializations of Student. All students belong to a subclass

Generalizations are justified if there are applications that would use both the general and specialized entity-sets

Generalizations can also be modeled with weak entity-sets and 1:1 relationship-sets

Aggregation
- Encapsulate a relationship-set and its associated entity-sets in one (higher-level) entity-set
- Example: The 3-way relationship-set Meeting (and its entity-sets Subject, Time, Location) can be aggregated in a single entity-set Meeting1, which may then participate in a many:many relationship with the entity-set Employee
- Advantage: Solves modeling problems that are otherwise difficult to represent unambiguously

ER diagrams (Part 3)
- Generalization/specialization: triangle standing on its tip
- Total generalization: double edge to the higher entity-set
- Disjoint generalization: the word "disjoint" next to the triangle
- Aggregation: box the aggregated portion in a rectangle (similar to an entity-set)

6. Design issues

Avoid redundancies, as they might introduce inconsistencies
- Student Submit Term_Paper, where Term_Paper=(Student_No, Course, Grade): Student_No is represented twice, and it becomes possible for one student to own term papers by other students
- Prefer Student Major Department over Major as an attribute of Student

ER design alternatives: Attribute or entity-set?
- If there are things to be said about the attribute, then it should become an entity-set
  Example: If we want to record the capacity of a room, then Room should become an entity-set
- If we need to ask about values not used, then it should become an entity-set
  Example: "Find a room available on Monday between 4 and 5" justifies an entity-set Room

ER design methodologies

Strategy for large design tasks:
  1. Design several independent views, each describing one natural subpart (or pertaining to a subset of the applications)
  2. Merge the independent views into a single design
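
To make the earlier attribute-or-entity-set guideline on "values not used" concrete, the following Python sketch contrasts the two designs for the room query. It is a minimal illustration only: the room names, capacities, meetings, time-slot strings, and the helper name free_rooms are all hypothetical and do not come from the notes.

```python
# Illustrative sketch: why "values not used" call for an entity-set.
# All data and names here are made up for the example.

# Design A: Room is only an attribute of Meeting.
# Rooms that host no meeting appear nowhere in the data.
meetings = [
    {"subject": "Budget", "slot": "Mon 16-17", "room": "R101"},
    {"subject": "Hiring", "slot": "Tue 10-11", "room": "R202"},
]

# Design B: Room is an entity-set of its own, so every room is recorded
# (together with things to say about it, such as capacity), used or not.
rooms = {
    "R101": {"capacity": 30},
    "R202": {"capacity": 12},
    "R303": {"capacity": 50},
}


def free_rooms(rooms, meetings, slot):
    """Return the rooms that have no meeting in the given slot."""
    busy = {m["room"] for m in meetings if m["slot"] == slot}
    return sorted(set(rooms) - busy)


print(free_rooms(rooms, meetings, "Mon 16-17"))  # ['R202', 'R303']
```

With Design A alone, R303 could never be returned, because a room that is not used by any meeting is not represented at all.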