Introduction to normalization




Introduction to normalization
Lecture 4
Instructor: Anna Sidorova

Agenda
- Presentation
- Review of relational models, in-class exercise
- Introduction to normalization
- In-class exercises
- Discussion of HW2

Next class
- Review for the midterm
- HW 2 is due

March 7
- Midterm exam

HW 2
- Chapter 4, Problem 6: develop a relational schema
- Convert the ERD for Ch. 2, Problem 20 (a part of your HW1) into a relational schema (must be based on the correct solution)
- Chapter 4, Problems 7 and 8 (we will discuss the relevant material next class)
- Handout: normalization exercises

Based on Hoffer, Prescott and Topi, Modern Database Management, (c) Prentice Hall 2009

Review of relational data models
Figure 2-7: Three-schema architecture
- Different people have different views of the database; these are the external schema
- The internal schema is the underlying design and implementation

Relation
Definition: A relation is a named, two-dimensional table of data. The table consists of rows (records) and columns (attributes or fields).
Requirements for a table to qualify as a relation:
- It must have a unique name
- Every attribute value must be atomic (not multivalued, not composite)
- Every row must be unique (can't have two rows with exactly the same values for all their fields)
- Attributes (columns) in tables must have unique names
- The order of the columns must be irrelevant
- The order of the rows must be irrelevant
NOTE: all relations are in 1st Normal Form

Translating ERD into relational schema
- Map each entity into a relation
- Map each weak entity into a relation (include the identifier of the strong entity as a part of the primary key)
- Map each multivalued attribute into a relation (include the identifier of the entity as a part of the primary key)
- Map many-to-many relationships and associative entities into a relation
- Represent one-to-one and one-to-many relationships using foreign keys

Normalization: Learning Objectives
- Define Normalization
- Define 1st, 2nd and 3rd Normal Forms
- Discuss the normalization process

Normalization: Definitions
- Normalization is a method used to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data
- The process of decomposing relations with anomalies to produce smaller, well-structured relations

Well-Structured Relations
A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies.
Goal is to avoid anomalies:
- Insertion Anomaly: adding new rows forces the user to create duplicate data
- Deletion Anomaly: deleting rows may cause a loss of data that would be needed for other future rows
- Modification Anomaly: changing data in a row forces changes to other rows because of duplication

Example (Figure 5-2b)
- Question: Is this a relation? Answer: Yes. Unique rows and no multivalued attributes.
- Question: What's the primary key? Answer: Composite: Emp_ID, Course_Title.

Anomalies in this Table
- Insertion: can't enter a new employee without having the employee take a class
- Deletion: if we remove employee 140, we lose information about the existence of a Tax Acc class
- Modification: giving a salary increase to employee 100 forces us to update multiple records
Why do these anomalies exist? Because there are two themes (entity types) in this one relation. This results in data duplication and an unnecessary dependency between the entities.

Normalization Process
The goal is to bring each relation into the Third Normal Form. The process of bringing a relation into the 3rd Normal Form goes through stages:
- 1st Normal Form
- 2nd Normal Form
- 3rd Normal Form

Functional Dependencies
Functional Dependency: a particular relationship between two attributes. For a given relation, attribute B is functionally dependent on attribute A if, for every valid value of A, that value of A uniquely determines the value of B.
- Instances (or sample data) in a relation do not prove the existence of a functional dependency
- Knowledge of the problem domain is the most reliable method for identifying functional dependencies
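Sample data cannot prove a functional dependency, but it can contradict one. The following is a minimal sketch of that check, using the STUDENT sample rows that appear later in this lecture; the function name `violates_fd` and the dictionary layout are illustrative assumptions, not part of the course material.

```python
def violates_fd(rows, determinant, dependent):
    """Return True if the sample rows contradict determinant -> dependent.

    A False result does NOT prove the dependency; it only means this
    particular sample contains no counterexample.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        if key in seen and seen[key] != value:
            return True  # same determinant value, different dependent value
        seen.setdefault(key, value)
    return False


# STUDENT sample rows from the lecture slides.
student = [
    {"Stud_ID": 101, "Name": "Lennon", "Course_ID": "MSI 250", "Units": 3.00},
    {"Stud_ID": 101, "Name": "Lennon", "Course_ID": "MSI 415", "Units": 3.00},
    {"Stud_ID": 125, "Name": "Jonson", "Course_ID": "MSI 331", "Units": 3.00},
]

name_contradicted = violates_fd(student, ["Stud_ID"], ["Name"])
course_contradicted = violates_fd(student, ["Stud_ID"], ["Course_ID"])
```

Here the sample is consistent with Stud_ID → Name but contradicts Stud_ID → Course_ID, because student 101 appears with two different courses. Whether Stud_ID → Name really holds still has to come from knowledge of the problem domain.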

Functional Dependencies: Notations in Problems
- A → B: attribute B is functionally dependent on attribute A (A determines B)
- A, B → C: attributes A and B together determine attribute C
- A → B, C: both attributes B and C are determined by (functionally dependent on) attribute A

Functional Dependencies
We can draw functional dependencies between attributes of a relation as follows:

STUDENT
Stud_ID  F_Name  L_Name  E-mail
111      Mary    Jones   mary@hotmail.com
122      Sara    Smith   smith@hotmail.com

Important Definitions
Multivalued attributes (repeating groups): non-key attributes, or groups of non-key attributes, whose values are not uniquely identified by (that is, are not functionally dependent on, directly or indirectly) the value of the primary key or a part of it.

Important Definitions
A relation is unnormalized (not in the 1st Normal Form) if it has multivalued attributes or repeating groups.

STUDENT (Course_ID and Units form the repeating group)
Stud_ID  Name    Course_ID  Units
101      Lennon  MSI 250    3.00
101      Lennon  MSI 415    3.00
125      Jonson  MSI 331    3.00

Important Definitions
A relation is in the 1st Normal Form if it has no multivalued attributes or repeating groups.

STUDENT
Stud_ID  Name    Course_ID  Units
101      Lennon  MSI 250    3.00
101      Lennon  MSI 415    3.00
125      Jonson  MSI 331    3.00

Important Definitions
Partial Dependency: when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.

CUSTOMER (contains a partial dependency)
Cust_ID  Name   Order_ID
101      AT&T   1234
101      AT&T   156
125      Cisco  1250

Important Definitions
A relation is NOT in the 2nd Normal Form if it has partial dependencies.

CUSTOMER (contains a partial dependency)
Cust_ID  Name   Order_ID
101      AT&T   1234
101      AT&T   156
125      Cisco  1250

Important Definitions
A relation is in the 2nd Normal Form if it is in the 1st Normal Form AND has no partial dependencies.

EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sara    Smith   2        Mktg

Important Definitions
Transitive Dependency: when a non-key attribute determines another non-key attribute.
A relation is NOT in the 3rd Normal Form if it has transitive dependencies.

EMPLOYEE (contains a transitive dependency)
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sara    Smith   2        Mktg

Important Definitions
A relation is in the 3rd Normal Form if it is in the 2nd Normal Form and has no transitive dependencies.

EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID
111     Mary    Jones   1
122     Sara    Smith   2

Normal Forms: Review
- Unnormalized: there are multivalued attributes or repeating groups
- 1NF: no multivalued attributes or repeating groups
- 2NF: 1NF plus no partial dependencies
- 3NF: 2NF plus no transitive dependencies
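The review above can be turned into a small checker. This is a rough sketch under the simplified definitions used in this lecture, not a general normalization tool: it inspects only the functional dependencies that are listed explicitly, and the names `closure` and `normal_form` are illustrative assumptions. It is applied to the BOOK, ORDER and PART relations from the worked examples in this lecture.

```python
def closure(attrs, fds):
    """Attributes determined, directly or indirectly, by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def normal_form(attributes, primary_key, fds):
    """Classify a relation as unnormalized, 1NF, 2NF or 3NF."""
    attributes, primary_key = set(attributes), set(primary_key)
    fds = [(set(lhs), set(rhs)) for lhs, rhs in fds]
    non_key = attributes - primary_key
    # 1NF test used in the lecture: every attribute must be determined,
    # directly or indirectly, by the primary key.
    if not non_key <= closure(primary_key, fds):
        return "unnormalized"
    # Partial dependency: a proper part of the key determines a non-key attribute.
    if any(lhs < primary_key and rhs & non_key for lhs, rhs in fds):
        return "1NF"
    # Transitive dependency: non-key attributes determine another non-key attribute.
    if any(lhs and lhs <= non_key and (rhs - lhs) & non_key for lhs, rhs in fds):
        return "2NF"
    return "3NF"


book = normal_form(
    ["ISBN", "Title", "Publisher", "Address"], ["ISBN"],
    [(["ISBN"], ["Title"]), (["ISBN"], ["Publisher"]), (["Publisher"], ["Address"])],
)
order = normal_form(
    ["Order_No", "Product_ID", "Description"], ["Order_No", "Product_ID"],
    [(["Product_ID"], ["Description"])],
)
part = normal_form(
    ["Part_ID", "Descr", "Price", "Comp_ID", "No"], ["Part_ID"],
    [(["Part_ID"], ["Descr"]), (["Part_ID"], ["Price"]), (["Part_ID", "Comp_ID"], ["No"])],
)
```

With these inputs the sketch reproduces the conclusions of the worked examples: BOOK is in 2NF, ORDER is in 1NF, and PART is unnormalized.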

Example 1: Determine NF
Given: ISBN → Title; ISBN → Publisher; Publisher → Address
BOOK (ISBN, Title, Publisher, Address)
- All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.
- There is no COMPOSITE primary key, therefore there can't be partial dependencies. Therefore, the relation is at least in 2NF.

Example 1: Determine NF (continued)
Given: ISBN → Title; ISBN → Publisher; Publisher → Address
BOOK (ISBN, Title, Publisher, Address)
- Publisher is a non-key attribute, and it determines Address, another non-key attribute. Therefore, there is a transitive dependency, which means that the relation is NOT in 3NF.
- We know that the relation is at least in 2NF, and it is not in 3NF. Therefore, we conclude that the relation is in 2NF.

Example 1: Determine NF (justification)
Given: ISBN → Title; ISBN → Publisher; Publisher → Address
BOOK (ISBN, Title, Publisher, Address)
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) No partial dependencies, therefore at least 2NF
3) There is a transitive dependency (Publisher → Address), therefore not 3NF
Conclusion: The relation is in 2NF

Example 2: Determine NF
Given: Product_ID → Description
ORDER (Order_No, Product_ID, Description)
- All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.

Example 2: Determine NF (continued)
Given: Product_ID → Description
ORDER (Order_No, Product_ID, Description)
- The relation is at least in 1NF. There is a COMPOSITE primary key (Order_No, Product_ID), therefore there can be partial dependencies. Product_ID, which is a part of the primary key, determines Description; hence, there is a partial dependency. Therefore, the relation is not in 2NF. There is no need to check for transitive dependencies.
- We know that the relation is at least in 1NF, and it is not in 2NF. Therefore, we conclude that the relation is in 1NF.

Example 2: Determine NF (justification)
Given: Product_ID → Description
ORDER (Order_No, Product_ID, Description)
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID → Description), therefore not in 2NF
Conclusion: The relation is in 1NF

Example 3: Determine NF
Given: Part_ID → Description; Part_ID → Price; Part_ID, Comp_ID → No
PART (Part_ID, Descr, Price, Comp_ID, No)
- Comp_ID and No are not determined by the primary key; therefore, the relation is NOT in 1NF. There is no need to look at partial or transitive dependencies.

Example 3: Determine NF (justification)
Given: Part_ID → Description; Part_ID → Price; Part_ID, Comp_ID → No
PART (Part_ID, Descr, Price, Comp_ID, No)
In your solution you will write the following justification:
1) There are M/V attributes; therefore, not 1NF
Conclusion: The relation is unnormalized.

Bringing a Relation to 1NF

STUDENT
Stud_ID  Name    Course_ID  Units
101      Lennon  MSI 250    3.00
101      Lennon  MSI 415    3.00
125      Jonson  MSI 331    3.00

Bringing a Relation to 1NF
Option 1: Make the determinant of the repeating group (or the multivalued attribute) a part of the primary key.
Option 2: Remove the entire repeating group from the relation. Create another relation that contains all the attributes of the repeating group, plus the primary key from the first relation. In this new relation, the primary key from the original relation and the determinant of the repeating group together comprise the primary key.

STUDENT (Option 1: composite primary key Stud_ID, Course_ID)
Stud_ID  Name    Course_ID  Units
101      Lennon  MSI 250    3.00
101      Lennon  MSI 415    3.00
125      Jonson  MSI 331    3.00

Bringing a Relation to 1NF

STUDENT
Stud_ID  Name
101      Lennon
125      Jonson

STUDENT_COURSE
Stud_ID  Course_ID  Units
101      MSI 250    3
101      MSI 415    3
125      MSI 331    3

Bringing a Relation to 2NF

STUDENT (composite primary key: Stud_ID, Course_ID)
Stud_ID  Name    Course_ID  Units
101      Lennon  MSI 250    3.00
101      Lennon  MSI 415    3.00
125      Jonson  MSI 331    3.00

Bringing a Relation to 2NF
Goal: Remove partial dependencies.
- Remove from the original relation the attributes that depend on a part, but not the whole, of the primary key.
- For each partial dependency, create a new relation, with the corresponding part of the primary key from the original relation as its primary key.

STUDENT (composite primary key: Stud_ID, Course_ID; contains partial dependencies)
Stud_ID  Name    Course_ID  Units
101      Lennon  MSI 250    3.00
101      Lennon  MSI 415    3.00
125      Jonson  MSI 331    3.00
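The two steps above amount to projecting the original relation onto smaller sets of columns and dropping duplicate rows. A minimal sketch follows, assuming the partial dependencies in STUDENT are Stud_ID → Name and Course_ID → Units; the helper name `project` is illustrative and not part of the lecture.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Original STUDENT relation with composite primary key (Stud_ID, Course_ID).
student_1nf = [
    {"Stud_ID": 101, "Name": "Lennon", "Course_ID": "MSI 250", "Units": 3.00},
    {"Stud_ID": 101, "Name": "Lennon", "Course_ID": "MSI 415", "Units": 3.00},
    {"Stud_ID": 125, "Name": "Jonson", "Course_ID": "MSI 331", "Units": 3.00},
]

# One new relation per partial dependency, keyed by its part of the primary key.
student = project(student_1nf, ["Stud_ID", "Name"])
course = project(student_1nf, ["Course_ID", "Units"])
# What remains of the original relation: only the composite key.
student_course = project(student_1nf, ["Stud_ID", "Course_ID"])
```

Dropping duplicates matters: after the projection, student 101 appears once in `student`, so every row of each new relation stays unique.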

Bringing a Relation to 2NF

Original relation:
STUDENT
Stud_ID  Name    Course_ID  Units
101      Lennon  MSI 250    3.00
101      Lennon  MSI 415    3.00
125      Jonson  MSI 331    3.00

Resulting relations:
STUDENT_COURSE
Stud_ID  Course_ID
101      MSI 250
101      MSI 415
125      MSI 331

STUDENT
Stud_ID  Name
101      Lennon
125      Jonson

COURSE
Course_ID  Units
MSI 250    3.00
MSI 415    3.00
MSI 331    3.00

Bringing a Relation to 3NF
Goal: Get rid of transitive dependencies.

EMPLOYEE (contains a transitive dependency)
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sara    Smith   2        Mktg

Bringing a Relation to 3NF
- Remove from the original relation the attributes that are dependent on a non-key attribute.
- For each transitive dependency, create a new relation with the non-key attribute that is the determinant in the transitive dependency as the primary key, and the dependent non-key attribute as a dependent.

Original relation:
EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sara    Smith   2        Mktg

Resulting relations:
EMPLOYEE
Emp_ID  F_Name  L_Name  Dept_ID
111     Mary    Jones   1
122     Sara    Smith   2

DEPARTMENT
Dept_ID  Dept_Name
1        Acct
2        Mktg
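The 3NF result above can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the lowercase table and column names, the column types, and the foreign key clause are assumptions added for illustration. It shows that joining EMPLOYEE and DEPARTMENT on Dept_ID gives back the rows of the original relation, so no information is lost by the decomposition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    f_name  TEXT NOT NULL,
    l_name  TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
INSERT INTO department VALUES (1, 'Acct'), (2, 'Mktg');
INSERT INTO employee VALUES (111, 'Mary', 'Jones', 1), (122, 'Sara', 'Smith', 2);
""")

# Rebuild the original five-column EMPLOYEE relation with a join.
original_rows = conn.execute("""
    SELECT e.emp_id, e.f_name, e.l_name, e.dept_id, d.dept_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Each department name is now stored once, so a rename touches a single row.
renamed = conn.execute(
    "UPDATE department SET dept_name = 'Accounting' WHERE dept_id = 1"
).rowcount
```

Because the department name lives in one row of `department`, the modification anomaly discussed earlier in the lecture does not arise for department names.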

Other Normal Forms (from Appendix B)
- Boyce-Codd NF: all determinants are candidate keys; there is no determinant that is not a unique identifier
- Usually, if a relation is in 3NF it is also in BCNF, except when a part of the primary key is determined by a non-key attribute
- 4th NF and 5th NF: used primarily for theoretical purposes

Merging Relations
View Integration: combining entities from multiple ER models into common relations.
Issues to watch out for when merging entities from different ER models:
- Synonyms: two or more attributes with different names but the same meaning
- Homonyms: attributes with the same name but different meanings
- Transitive dependencies: even if relations are in 3NF prior to merging, they may not be after merging
- Supertype/subtype relationships: may be hidden prior to merging
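Returning to the Boyce-Codd definition in the notes above, the condition on determinants can be sketched as follows. The sketch uses the common formulation that every listed determinant must determine all attributes of the relation (that is, be a superkey); the function names are illustrative, and the ENROLLMENT relation is a hypothetical example that is not taken from the lecture.

```python
def closure(attrs, fds):
    """Attributes determined, directly or indirectly, by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_bcnf(attributes, fds):
    """True if every listed determinant determines all attributes."""
    attributes = set(attributes)
    fds = [(set(lhs), set(rhs)) for lhs, rhs in fds]
    return all(closure(lhs, fds) >= attributes for lhs, _ in fds)


# EMPLOYEE after the 3NF decomposition from the lecture: Emp_ID is the only determinant.
employee_ok = is_bcnf(
    ["Emp_ID", "F_Name", "L_Name", "Dept_ID"],
    [(["Emp_ID"], ["F_Name", "L_Name", "Dept_ID"])],
)

# Hypothetical ENROLLMENT(Student, Course, Instructor) with primary key
# (Student, Course): the non-key attribute Instructor determines Course,
# which is a part of the primary key.
enrollment_ok = is_bcnf(
    ["Student", "Course", "Instructor"],
    [(["Student", "Course"], ["Instructor"]), (["Instructor"], ["Course"])],
)
```

The hypothetical ENROLLMENT relation has no partial or transitive dependency in the sense used in this lecture, yet it fails the check because Instructor is a determinant that is not a key. This is the exception mentioned in the BCNF note.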

Enterprise Keys (advice from some experts)
- Primary keys that are unique in the whole database, not just within a single relation
- Corresponds with the concept of an object ID in object-oriented systems

In-class exercise
See handout