Roadmap. Database Design. Roadmap. The E-R Model. Entities. Conceptual Modelling

Similar documents
Database Design. Database Design I: The Entity-Relationship Model. Entity Type (con t) Chapter 4. Entity: an object that is involved in the enterprise

Database Design. Goal: specification of database schema Methodology: E-R Model is viewed as a set of

Limitations of E-R Designs. Relational Normalization Theory. Redundancy and Other Problems. Redundancy. Anomalies. Example

Database Management Systems. Redundancy and Other Problems. Redundancy

Database Design Overview. Conceptual Design ER Model. Entities and Entity Sets. Entity Set Representation. Keys

Limitations of DB Design Processes

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Conceptual Design Using the Entity-Relationship (ER) Model

Chapter 8. Database Design II: Relational Normalization Theory

Outline. Data Modeling. Conceptual Design. ER Model Basics: Entities. ER Model Basics: Relationships. Ternary Relationships. Yanlei Diao UMass Amherst

Databases Model the Real World. The Entity- Relationship Model. Conceptual Design. Steps in Database Design. ER Model Basics. ER Model Basics (Contd.

Week 11: Normal Forms. Logical Database Design. Normal Forms and Normalization. Examples of Redundancy

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

The Relational Model. Why Study the Relational Model? Relational Database: Definitions. Chapter 3

Schema Design and Normal Forms Sid Name Level Rating Wage Hours

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen

Review: Participation Constraints

SQL DDL. DBS Database Systems Designing Relational Databases. Inclusion Constraints. Key Constraints

2. Conceptual Modeling using the Entity-Relationship Model

The Relational Data Model: Structure

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms.

CS 377 Database Systems. Database Design Theory and Normalization. Li Xiong Department of Mathematics and Computer Science Emory University

Chapter 2: Entity-Relationship Model. Entity Sets. " Example: specific person, company, event, plant

Why Is This Important? Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) Example (Contd.)

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1

Data Modeling. Database Systems: The Complete Book Ch ,

Theory of Relational Database Design and Normalization

IV. The (Extended) Entity-Relationship Model

Lecture 6. SQL, Logical DB Design

CSCI-GA Database Systems Lecture 7: Schema Refinement and Normalization

The Entity-Relationship Model

Relational model. Relational model - practice. Relational Database Definitions 9/27/11. Relational model. Relational Database: Terminology

DATABASE MANAGEMENT SYSTEMS. Question Bank:

Schema Refinement, Functional Dependencies, Normalization

There are five fields or columns, with names and types as shown above.

Chapter 10 Functional Dependencies and Normalization for Relational Databases

Schema Refinement and Normalization

Entity-Relationship Model. Purpose of E/R Model. Entity Sets

CS 4604: Introduc0on to Database Management Systems. B. Aditya Prakash Lecture #5: En-ty/Rela-onal Models- - - Part 1

Exercise 1: Relational Model

Entity-Relationship Model

The Entity-Relationship Model

DATABASE DESIGN. - Developing database and information systems is performed using a development lifecycle, which consists of a series of steps.

Announcements. SQL is hot! Facebook. Goal. Database Design Process. IT420: Database Management and Organization. Normalization (Chapter 3)

Database Systems Concepts, Languages and Architectures

The Entity-Relationship Model

Data Modeling: Part 1. Entity Relationship (ER) Model

DATABASE NORMALIZATION

Normalization in OODB Design

Databases and BigData

Data Modeling Basics

SQL. Ilchul Yoon Assistant Professor State University of New York, Korea. on tables. describing schema. CSE 532 Theory of Database Systems

Chapter 10. Functional Dependencies and Normalization for Relational Databases

CS143 Notes: Normalization Theory

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization

Theory of Relational Database Design and Normalization

Databases What the Specification Says

Functional Dependencies

IT2305 Database Systems I (Compulsory)

XV. The Entity-Relationship Model

Database Management System

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

Normal forms and normalization

Database Design and Normal Forms

Review Entity-Relationship Diagrams and the Relational Model. Data Models. Review. Why Study the Relational Model? Steps in Database Design

Database Design and Normalization

IT2304: Database Systems 1 (DBS 1)

Normalization. CIS 331: Introduction to Database Systems

Normalization in Database Design

Database Constraints and Design

Chapter 3. Data Modeling Using the Entity-Relationship (ER) Model

Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina

Modern Systems Analysis and Design

The Relational Data Model and Relational Database Constraints

Graham Kemp (telephone , room 6475 EDIT) The examiner will visit the exam room at 15:00 and 17:00.

The University of British Columbia

Database Design Process. Databases - Entity-Relationship Modelling. Requirements Analysis. Database Design

How To Write A Diagram

Fundamentals of Database System

CMU - SCS / Database Applications Spring 2013, C. Faloutsos Homework 1: E.R. + Formal Q.L. Deadline: 1:30pm on Tuesday, 2/5/2013

Functional Dependencies and Finding a Minimal Cover

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model

Database Design Process

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

Lecture 12: Entity Relationship Modelling

Relational Database Design: FD s & BCNF

MCQs~Databases~Relational Model and Normalization

CSC 742 Database Management Systems

three Entity-Relationship Modeling chapter OVERVIEW CHAPTER

DATABASE DESIGN: NORMALIZATION NOTE & EXERCISES (Up to 3NF)

7.1 The Information system

Entity Relationship Diagram

Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model

Winter Winter

3. Relational Model and Relational Algebra

COSC344 Database Theory and Applications. Lecture 9 Normalisation. COSC344 Lecture 9 1

Database Design Process

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

Functional Dependencies and Normalization

Transcription:

Conceptual Modelling 1 Database Design Goal: specification of database schema Methodology: Use conceptual model (ER/UML) to get a high-level graphical view of essential components of enterprise and how they are related Convert E-R/UML diagram to SQL DDL E-R Model: enterprise is viewed as a set of among entities class diagrams: static data viewed as Classes Associations among classes The E-R Model Invented by Peter Chen [1976] No standardized graphical representation Used in industry UML taking over Tool for conceptual modelling More general than relational model Entities Entity: an object that is involved in the enterprise Ex: John, CSE305 Entity Type: set of similar objects Ex: students, courses Attribute: describes one aspect of an entity type Ex: name, maximum enrollment 1

Entity Type Entity type described by set of attributes Person: Id, Name, Address, Hobbies Domain: possible values of an attribute Value can be a set (in contrast to relational model) (111111, John, 123 Main St, {stamps, coins}) Key: minimum set of attributes that uniquely identifies an entity (candidate key) Entity Schema: entity type name, attributes (and associated domain), key constraints Entity Type Graphical Representation in E-R diagram: Set valued Relationships Relationship: relates two or more entities John majors in Computer Science Relationship Type: set of similar relationships Student (entity type) related to (entity type) by MajorsIn (relationship type). Distinction: relation (relational model) - set of tuples relationship (E-R Model) describes relationship between entities of an enterprise Both entity types and relationship types (E-R model) may be represented as relations (in the relational model) Attributes and Roles Attribute of a relationship type describes the relationship e.g., John majors in CS since 2000 John and CS are related 2000 describes relationship - value of SINCE attribute of MajorsIn relationship type Role of a relationship type names one of the related entities e.g., John is value of Student role, CS value of role of MajorsIn relationship type (John, CS; 2000) describes a relationship Relationship Type Described by set of attributes and roles e.g., MajorsIn: Student,, Since Here we have used as the role name (Student) the name of the entity type (Student) of the participant in the relationship, but... Roles Problem: relationship can relate elements of same entity type e.g., ReportsTo relationship type relates two elements of Employee entity type: Bob reports to Mary since 2000 We do not have distinct names for the roles It is not clear who reports to whom 2

Roles Schema of a Relationship Type Solution: role name of relationship type need not be same as name of entity type from which participants are drawn ReportsTo has roles Subordinate and Supervisor and attribute Since Values of Subordinate and Supervisor both drawn from entity type Employee Role names, R i, and their corresponding entity sets. Roles must be single valued (number of roles = degree of relationship) Attribute names, A j, and their corresponding domains. Attributes may be set valued Key: Minimum set of roles and attributes that uniquely identify a relationship Relationship: <e 1, e n ; a 1, a k > e i is an entity, a value from R i s entity set a j is a set of attribute values with elements from domain of A j Graphical Representation Roles are edges labeled with role names (omitted if role name = name of entity set). Most attributes have been omitted. Entity Type Hierarchies One entity type might be subtype of another Freshman is a subtype of Student A relationship exists between a Freshman entity and the corresponding Student entity e.g., Freshman John is related to Student John This relationship is called IsA Freshman IsA Student The two entities related by IsA are always descriptions of the same real-world object IsA Student IsA Represents 4 relationship types Freshman Sophmore Junior Senior 3

Properties of IsA Inheritance - Attributes of supertype apply to subtype. E.g., GPA attribute of Student applies to Freshman Subtype inherits all attributes of supertype. Key of supertype is key of subtype Transitivity - Hierarchy of IsA Student is subtype of Person, Freshman is subtype of Student, so Freshman is also a subtype of Student Advantages of IsA Can create a more concise and readable E-R diagram Attributes common to different entity sets need not be repeated They can be grouped in one place as attributes of supertype Attributes of (sibling) subtypes can be different IsA Hierarchy - Example Constraints on Type Hierarchies Might have associated constraints: Covering constraint: Union of subtype entities is equal to set of supertype entities Employee is either a secretary or a technician (or both) Disjointness constraint: Sets of subtype entities are disjoint from one another Freshman, Sophomore, Junior, Senior are disjoint set Single-role Key Constraint If, for a particular participant entity type, each entity participates in at most one relationship, corresponding role is a key of relationship type E.g., role is unique in Representation in E-R diagram: arrow 4

Participation Constraint If every entity participates in at least one relationship, a participation constraint holds: A participation constraint of entity type E having role ρ in relationship type R states that for e in E there is an r in R such that ρ(r) = e. e.g., every professor works in at least one department Participation and Key Constraint If every entity participates in exactly one relationship, both a participation and a key constraint hold: e.g., every professor works in exactly one department E-R representation: thick line Reprsentation in E-R Representation of Entity Types in the Relational Model Redundancy Problem In general, an entity type corresponds to a relation Relation s attributes = entity type s attributes Problem: entity type can have set valued attributes, e.g., Person: Id, Name, Address, Hobbies Straightforward Solution: Use several rows to represent a single entity (111111, John, 123 Main St, stamps) (111111, John, 123 Main St, coins) Problems with this solution: Redundancy Key of entity type (Id) not key of relation Hence, the resulting relation must be further transformed ER Model Relational Model SSN Name Address Hobby 1111 Joe 123 Main {biking, hiking} SSN Name Address Hobby 1111 Joe 123 Main biking 1111 Joe 123 Main hiking. Redundancy 5 5

Redundancy Problem Dependencies between attributes cause redundancy Ex. All addresses in the same town have the same zip code SSN Name Town Zip 1234 Joe Stony Brook 11790 4321 Mary Stony Brook 11790 5454 Tom Stony Brook 11790. Redundancy Anomalies Redundancy leads to anomalies: Update anomaly: A change in Address must be made in several places Deletion anomaly: Suppose a person gives up all hobbies. Do we: Set Hobby attribute to null? No, since Hobby is part of key Delete the entire row? No, since we lose other information in the row Insertion anomaly: Hobby value must be supplied for any inserted row since Hobby is part of key Decomposition Solution: use two relations to store Person information Person1 (SSN, Name, Address) Hobbies (SSN, Hobby) The decomposition is more general: people with hobbies can now be described No update anomalies: Name and address stored once A hobby can be separately supplied or deleted Decomposition guided by normal forms Described in Chapter 6 Relies on notion of Functional Dependencies (FD) Functional Dependencies Definition: A functional dependency (FD) on a relation schema R is a constraint X Y, where X and Y are subsets of attributes of R. Definition: An FD X Y is satisfied in an instance r of R if for every pair of tuples, t and s: if t and s agree on all attributes in X then they must agree on all attributes in Y Key constraint is a special kind of functional dependency: all attributes of relation occur on the right-hand side of the FD: SSN SSN, Name, Address Functional Dependencies Address ZipCode Stony Brook s ZIP is 11733 ArtistName BirthYear Picasso was born in 1881 Autobrand Manufacturer, Engine type Pontiac is built by General Motors with gasoline engine Author, Title PublDate Shakespeare s Hamlet published in 1600 Functional Dependency - Example Consider a brokerage firm that allows multiple clients to share an account, but each account is managed from a single office and a client can have no more than one account in an office HasAccount (AcctNum, ClientId, OfficeId) keys are (ClientId, OfficeId), (AcctNum, ClientId) Client, OfficeId AcctNum AcctNum OfficeId Thus, attribute values need not depend only on key values 6

Boyce-Codd Normal Form Definition: A relation schema R is in BCNF if for every FD X Y associated with R either Y X (i.e., the FD is trivial) or X is a superkey of R Example: Person1(SSN, Name, Address) The only FD is SSN Name, Address Since SSN is a key, Person1 is in BCNF (non) BCNF Examples Person (SSN, Name, Address, Hobby) The FD SSN Name, Address does not satisfy requirements of BCNF since the key is (SSN, Hobby) HasAccount (AcctNum, ClientId, OfficeId) The FD AcctNum OfficeId does not satisfy BCNF requirements since keys are (ClientId, OfficeId) and (AcctNum, ClientId); not AcctNum. BCNF and Redundancy Suppose R has a FD A B, and A is not a superkey. If an instance has 2 rows with same value in A, they must also have same value in B (=> redundancy, if the A-value repeats twice) redundancy SSN Name, Address SSN Name Address Hobby 1111 Joe 123 Main stamps 1111 Joe 123 Main coins Representation of Relationship Types in the Relational Model Typically, a relationship becomes a relation in the relational model Attributes of the corresponding relation are Attributes of relationship type For each role, the primary key of the entity type associated with that role Example: CrsCode Enroll RoomNo DeptId Name SectNo S2000Courses Teaching TAs Id If A is a superkey, there cannot be two rows with same value of A Hence, BCNF eliminates redundancy S2000Courses (CrsCode, SectNo, Enroll) (Id, DeptId, Name) Teaching (CrsCode, SecNo, Id, RoomNo, TAs) Representation of Relationship Types in the Relational Model Representation of Single Role Key Constraints in the Relational Model Candidate key of corresponding table = candidate key of roles Foreign keys in corresponding table = candidate key of roles Except when there are set valued attributes Example: Teaching (CrsCode, SectNo, Id, RoomNo, TAs) Key of relationship type = (CrsCode, SectNo) Key of relation = (CrsCode, SectNo, TAs) CrsCode SectNo Id RoomNo TAs CSE305 1 1234 Hum 22 Joe CSE305 1 1234 Hum 22 Mary Set valued Relational model representation: key of the relation corresponding to the entity type is key of the relation corresponding to the relationship type Id is primary key of ; ProfId is key of. 4100 does not participate in the relationship. Cannot use foreign key in to refer to since some professors may not work in any dept. (But ProfId is a foreign key in that refers to.) Id 1123 4100 3216 ProfId 1123 CSE 3216 AMS Key 7

Example 1 Since Status Project Example 2 Date Price Sold Part CREATE TABLE ( Since DATE, -- attribute Status CHAR (10), -- attribute ProfId INTEGER, -- role (key of ) DeptId CHAR (4), -- role (key of ) PRIMARY KEY (ProfId), -- since a professor works in at most one department FOREIGN KEY (ProfId) REFERENCES (Id), FOREIGN KEY (DeptId) REFERENCES ) Supplier CREATE TABLE Sold ( Price INTEGER, -- attribute Date DATE, -- attribute ProjId INTEGER, -- role SupplierId INTEGER, -- role PartNumber INTEGER, -- role PRIMARY KEY (ProjId, SupplierId, PartNumber, Date), FOREIGN KEY (ProjId) REFERENCES Project, FOREIGN KEY (SupplierId) REFERENCES Supplier (Id), FOREIGN KEY (PartNumber) REFERENCES Part (Number) ) Representing Type Hierarchies in the Relational Model Supertypes and subtypes can be realized as separate relations Need a way of identifying subtype entity with its (unique) related supertype entity Choose a candidate key and make it an attribute of all entity types in hierarchy Type Hierarchies and the Relational Model Id attribs0 Student Translated by adding the primary key of supertype to all subtypes. Plus foreign key from subtypes to the supertype. Id attribs1 Id attribs2 Id attribs3 Id attribs4 Freshman Sophmore Junior Senior FOREIGN KEY Id REFERENCES Student in Freshman, Sophomore, Sunior, Senior Type Hierarchies and the Relational Model Redundancy eliminated if IsA is not disjoint For individuals who are both employees and students, Name and DOB are stored only once Person Employee Student SSN Name DOB SSN Salary SSN GPA StartDate 1234 Mary 1950 1234 Accounting 35000 1234 3.5 1997 Representing Participation Constraints in the Relational Model Inclusion dependency: Every professor works in at least one dep t. in the relational model: (easy) (Id) references (ProfId) in SQL: Simple case: If ProfId is a key in (i.e., every professor works in exactly one department) then it is easy: FOREIGN KEY Id REFERENCES (ProfId) General case ProfId is not a key in, so can t use foreign key constraint (not so easy): Needs an assertion 8

Entity or Attribute? Sometimes information can be represented as either an entity or an attribute. Student Student Transcript Course Transcript Course Grade Semester Grade Semester Appropriate if Semester has attributes (next slide) Entity or Relationship? (Non-) Equivalence of Diagrams Transformations between binary and ternary relationships. Date Project Sold Price Part Supplier UML Notation 9

UML Example ER Example Summary ER and UML are tools for conceptual modelling / Relationships Careful translation into relational model and SQL Consider functional dependencies 10