Designing a Database Schema



Similar documents
Information Systems Analysis and Design CSC John Mylopoulos Database Design Information Systems Analysis and Design CSC340

DATABASE DESIGN. - Developing database and information systems is performed using a development lifecycle, which consists of a series of steps.

Database Management Systems

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

THE ENTITY- RELATIONSHIP (ER) MODEL CHAPTER 7 (6/E) CHAPTER 3 (5/E)

Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model

not necessarily strictly sequential feedback loops exist, i.e. may need to revisit earlier stages during a later stage

7.1 The Information system

XV. The Entity-Relationship Model

SCHEMAS AND STATE OF THE DATABASE

Foundations of Information Management

Lesson 8: Introduction to Databases E-R Data Modeling

2. Conceptual Modeling using the Entity-Relationship Model

Database Design Methodology

Database Design Overview. Conceptual Design ER Model. Entities and Entity Sets. Entity Set Representation. Keys

IV. The (Extended) Entity-Relationship Model

Chapter 2: Entity-Relationship Model. E-R R Diagrams

Introduction to Computing. Lectured by: Dr. Pham Tran Vu

Concepts of Database Management Seventh Edition. Chapter 6 Database Design 2: Design Method

Foundations of Information Management

Chapter 2: Entity-Relationship Model

THE OPEN UNIVERSITY OF TANZANIA FACULTY OF SCIENCE TECHNOLOGY AND ENVIRONMENTAL STUDIES BACHELOR OF SIENCE IN DATA MANAGEMENT

Database Design Methodology

Chapter 2: Entity-Relationship Model. Entity Sets. " Example: specific person, company, event, plant

5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2

Data Modelling and E-R Diagrams

THE OPEN UNIVERSITY OF TANZANIA FACULTY OF SCIENCE TECHNOLOGY AND ENVIRONMENTAL STUDIES BACHELOR OF SIENCE IN INFORMATION AND COMMUNICATION TECHNOLOGY

Database Design Process

TIM 50 - Business Information Systems

CHAPTER 6 DATABASE MANAGEMENT SYSTEMS. Learning Objectives

Database Design Process

ISM 318: Database Systems. Objectives. Database. Dr. Hamid R. Nemati

Physical Database Design Process. Physical Database Design Process. Major Inputs to Physical Database. Components of Physical Database Design

Database Design Methodologies

Entity Relationship Diagram

How To Write A Diagram

DATABASE MANAGEMENT SYSTEMS. Question Bank:

Exercise 1: Relational Model

Lecture 12: Entity Relationship Modelling

Databases and BigData

We know how to query a database using SQL. A set of tables and their schemas are given Data are properly loaded

Relational Database Concepts

Conceptual Design Using the Entity-Relationship (ER) Model


Modern Systems Analysis and Design

CSC 742 Database Management Systems

AS LEVEL Computer Application Databases

Databases Model the Real World. The Entity- Relationship Model. Conceptual Design. Steps in Database Design. ER Model Basics. ER Model Basics (Contd.

Chapter 3. Data Modeling Using the Entity-Relationship (ER) Model

Entity-Relationship Model

Data Analysis 1. SET08104 Database Systems. Napier University

Foundations of Business Intelligence: Databases and Information Management

Data Modeling Basics

Relational Database Basics Review

Unit 2.1. Data Analysis 1 - V Data Analysis 1. Dr Gordon Russell, Napier University

The Entity-Relationship Model

Outline. Data Modeling. Conceptual Design. ER Model Basics: Entities. ER Model Basics: Relationships. Ternary Relationships. Yanlei Diao UMass Amherst

Course MIS. Foundations of Business Intelligence

Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

three Entity-Relationship Modeling chapter OVERVIEW CHAPTER

COMHAIRLE NÁISIÚNTA NA NATIONAL COUNCIL FOR VOCATIONAL AWARDS PILOT. Consultative Draft Module Descriptor. Relational Database

Tutorial on Relational Database Design

IT2305 Database Systems I (Compulsory)

Fundamentals of Database System

Foundations of Business Intelligence: Databases and Information Management

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

Databases What the Specification Says

Fundamentals of Database Design

Designing a Dimensional Model

Designing Databases. Introduction

IT2304: Database Systems 1 (DBS 1)

CSE 132A. Database Systems Principles

Chapter 1: Introduction. Database Management System (DBMS)

CIS 631 Database Management Systems Sample Final Exam

The Entity-Relationship Model

Data Hierarchy. Traditional File based Approach. Hierarchy of Data for a Computer-Based File

OVERVIEW 1.1 DATABASE MANAGEMENT SYSTEM (DBMS) DEFINITION:-

Database Design and Normalization

Entity/Relationship Modelling. Database Systems Lecture 4 Natasha Alechina

Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?

SAMPLE FINAL EXAMINATION SPRING SESSION 2015

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.

Foundations of Business Intelligence: Databases and Information Management

DATA MODELING AND RELATIONAL DATABASE DESIGN IN ARCHAEOLOGY

Database Concepts. Database & Database Management System. Application examples. Application examples

Database Design. Adrienne Watt. Port Moody

ER modelling, Weak Entities, Class Hierarchies, Aggregation

Overview. Physical Database Design. Modern Database Management McFadden/Hoffer Chapter 7. Database Management Systems Ramakrishnan Chapter 16

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Lecture Notes INFORMATION RESOURCES

Databases and Information Management

Chapter 1: Introduction

A brief overview of developing a conceptual data model as the first step in creating a relational database.

Ursuline College Accelerated Program

Chapter 1: Introduction. Database Management System (DBMS) University Database Example

Introduction to normalization. Introduction to normalization

Transcription:

Week 10: Database Design Database Design From an ER Schema to a Relational One Restructuring an ER schema Performance Analysis Analysis of Redundancies, Removing Generalizations Translation into a Relational Schema Designing a Database Schema Part (1,1) supplies (1,N) Supplier (1,N) orders (1,N) Customer Date Hierarchical Part name Supplier Customer Relational Network Part(Name,Description,Part#) Supplier(Name, Addr) Customer(Name, Addr) Supplies(Name,Part#, Date) Orders(Name,Part#) Part Supplier Customer Database Design 1 Database Design 2 (Relational) Database Design Given a conceptual schema (ER, but could also be a UML), generate a logical (relational) schema. This is not just a simple translation from one model to another for two main reasons: o not all the constructs of the Entity-Relationship model can be translated naturally into the relational model; o the schema must be restructured in such a way as to make the execution of the projected operations as efficient as possible. The topic is covered in Section 3.5 of the textbook. Database Design Process Database Design 3 Database Design 4 1

o Logical Design Steps It is helpful to divide the design into two steps: o Restructuring of the Entity- Relationship schema, based on criteria for the optimization of the schema and the simplification of the following step; o Translation into the logical model, based on the features of the logical model (in our case, the relational model). Performance Analysis An ER schema is restructured to optimize: Cost of an operation (evaluated in terms of the number of occurrences of entities and relationships that are visited during the execution of an operation); Storage requirements (evaluated in terms of number of bytes necessary to store the data described by the schema). In order to study these parameters, we need to know: Projected volume of data; Projected operation characteristics. Database Design 5 Database Design 6 Cost Model The cost of an operation is measured in terms of the number of disk accesses required. A disk access is, generally, orders of magnitude more expensive than in-memory accesses, or CPU operations. For a coarse estimate of cost, we assume that a Read operation (for one entity or relationship) requires 1 disk access; A Write operation (for one entity or relationship) requires 2 disk accesses (read from disk, change, write back to disk). There are many other cost models depending on use and type of DB Warehouse (OLAP - On-Line Analysis Processing) Operational DB (OLTP - On-Line Transaction Processing) Database Design 7 Employee-Department Example Database Design 8 2

Typical Operations Operation 1: Assign an employee to a project. Operation 2: Find an employee record, including her department, and the projects she works for. Operation 3: Find records of employees for a department. Operation 4: For each branch, retrieve its departments, and for each department, retrieve the last names of their managers, and the list of their employees. Need operations and their volume/frequency Workload Design During initial design and requirements analysis Estimate operations and frequency Gross estimates at best After database is operational Tools record actual workload characteristics Database Design 9 Database Design 10 Analysis Steps Analysis of Redundancies A redundancy in a conceptual schema corresponds to a piece of information that can be derived (that is, obtained through a series of retrieval operations) from other data in the database. An Entity-Relationship schema may contain various forms of redundancy. Database Design 11 Database Design 12 3

Examples of Redundancies Deciding About Redundancies The presence of a redundancy in a database may be an advantage: a reduction in the number of accesses necessary to obtain the derived information; a disadvantage: because of larger storage requirements, (but, usually at negligible cost) and the necessity to carry out additional operations in order to keep the derived data consistent. The decision to maintain or eliminate a redundancy is made by comparing the cost of operations that involve the redundant information and the storage needed, in the case of presence or absence of redundancy. Database Design 13 Database Design 14 Cost Comparison: An Example Removing Generalizations The relational model does not allow direct representation of generalizations that may be present in an E-R diagram. For example, here is an ER schema with generalizations: In this schema the attribute NumberOfInhabitants is redundant. Database Design 15 Database Design 16 4

Option 1 Option 3 Note! Possible Restructuring s...two More... Option 2 Option 4 (combination) Database Design 17 Database Design 18 General Rules For Removing Generalization Option 1 is convenient when the operations involve the occurrences and the attributes of E 0, E 1 and E 2 more or less in the same way. Option 2 is possible only if the generalization satisfies the coverage constraint (i.e., every instance of E 0 is either an instance of E 1 or E 2 ) and is useful when there are operations that apply only to occurrences of E 1 or E 2. Option 3 is useful when the generalization is not coverage-compliant and the operations refer to either occurrences and attributes of E 1 (E 2 ) or of E 0, and therefore make distinctions between child and parent entities. Available options can be combined (see option 4) Partitioning and Merging of Entities and Relationships Entities and relationships of an E-R schema can be partitioned or merged to improve the efficiency of operations, using the following principle: Accesses are reduced by separating attributes of the same concept that are accessed by different operations and by merging attributes of different concepts that are accessed by the same operations. The same criteria with those discussed for redundancies are valid in making a decision about this type of restructuring. Database Design 19 Database Design 20 5

Example of Partitioning Deletion of Multi-Valued Attribute Recall this is a weak entity Database Design 21 Database Design 22 Merging Entities Partitioning of a Relationship Suppose that composition represents current and past compositions of a team Are these equivalent? Two candidate keys Database Design 23 Database Design 24 6

Selecting a Primary Key Every relation must have a unique primary key. The criteria for this decision are as follows: Attributes with null values cannot form primary keys; One/few attributes is preferable to many attributes; Internal key preferable to external ones (weak entity); A key that is used by many operations to access the instances of an entity is preferable to others. At this stage, if none of the candidate keys satisfies the above requirements, it may be best to introduce a new attribute (e.g., social insurance #, student #, ) Database Design 25 Translation into a Logical Schema The second step of logical design consists of a translation between different data models. Starting from an E-R schema, an equivalent relational schema is constructed. By equivalent, we mean a schema capable of representing the same information. We will deal with the translation problem systematically, beginning with the fundamental case, that of entities linked by many-to-many relationships. Database Design 26 Many-to-Many Relationships Many-to-Many Recursive Relationships Employee(Number, Surname, Salary) Project(Code, Name, Budget) Participation(Number, Code, StartDate) Product(Code, Name, Cost) Composition(Part, SubPart, Quantity) Database Design 27 Database Design 28 7

Ternary Relationships One-to-Many Relationships Supplier(SupplierID, SupplierName) Product(Code, Type) Department(Name, Telephone) Supply(Supplier, Product, Department, Quantity) Player(Surname, DateOfBirth, Position) Team(Name, Town, TeamColours) Contract(PlayerSurname, PlayerDateOfBirth, Team, Salary) OR Player(Surname, DateOfBirth, Position, TeamName, Salary) Team(Name, Town, TeamColours) Database Design 29 Database Design 30 Weak Entities One-to-One Relationships Student(RegistrationNumber, University, Surname, EnrolmentYear) University(Name, Town, Address) Head(Number, Name, Salary, Department, StartDate) Department(Name, Telephone, Branch) OR Head(Number, Name, Salary, StartDate) Department(Name, Telephone, HeadNumber, Branch) Database Design 31 Database Design 32 8

Optional One-to-One Relationships A Sample ER Schema Employee(Number, Name, Salary) Department(Name, Telephone, Branch, Head, StartDate) Or, if both entities are optional Employee(Number, Name, Salary) Department(Name, Telephone, Branch) Management(Head, Department, StartDate) Database Design 33 Database Design 34 Entities with Internal Identifiers 1-1 and Optional 1-1 Relationships R3 E5 R4 E6 E5 E6 R5 E4 E3(A31, A32) E4(A41, A42) E5(A51, A52) E6(A61, A62, A63) E3 E4 1-1 or optional 1-1 relationships can lead to messy transformations E3 Database Design 35 E5(A51, A52, A61R3, A62R3, AR3, A61R4, A62R4, A61R5, A62R5, AR5) Database Design 36 9

Weak Entities R3 Many-to-Many Relationships R3 E1 R1 E5 R4 E6 E1 R1 E5 R4 E6 R6 R5 R6 R5 E2 E3 E4 E1(A11, A51, A12) E2(A21, A11, A51, A22) E2 E3 R2 E4 R2(A21, A11, A51, A31, A41, AR21, AR22) Database Design 37 Database Design 38 Homework - Fill in a Translation Summary of Transformation Rules E1(A11, A51, A12) E2(A21, A11, A51, A22) E3(A31, A32) E4(A41,A42) E5(A51, A52, A61R3, A62R3, AR3, A61R4, A62R4, A61R5, A62R5, AR5) E6(A61, A62, A63) R2(A21, A11, A51, A31, A41, AR21, AR22) Database Design 39 Database Design 40 10

...More Rules... Even More Rules... Database Design 41 Database Design 42 and the Last One... The Training Company Revisited Database Design 43 Database Design 44 11

Logical Design Using CASE Tools The logical design phase is partially supported by database design tools: the translation to the relational model is carried out by such tools semi-automatically; the restructuring step is difficult to automate and CASE tools provide little or no support for it. Most commercial CASE tools will generate automatically SQL code for the creation of the database. Some tools allow direct connection with a DBMS and can construct the corresponding database automatically. [CASE = Computer-Aided Software Engineering] Logical Design with a CASE Tool Database Design 45 Database Design 46 12