Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms.



Similar documents
Introduction to normalization. Introduction to normalization

Fundamentals of Database System

A. TRUE-FALSE: GROUP 2 PRACTICE EXAMPLES FOR THE REVIEW QUIZ:

MCQs~Databases~Relational Model and Normalization

Normalization. CIS 3730 Designing and Managing Data. J.G. Zheng Fall 2010

Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases

CS 377 Database Systems. Database Design Theory and Normalization. Li Xiong Department of Mathematics and Computer Science Emory University

DATABASE NORMALIZATION

RELATIONAL DATABASE DESIGN

DATABASE SYSTEMS. Chapter 7 Normalisation

Normalization in OODB Design

COSC344 Database Theory and Applications. Lecture 9 Normalisation. COSC344 Lecture 9 1

Normalization of Database


Chapter 10 Functional Dependencies and Normalization for Relational Databases

Chapter 10. Functional Dependencies and Normalization for Relational Databases

Topic 5.1: Database Tables and Normalization

Announcements. SQL is hot! Facebook. Goal. Database Design Process. IT420: Database Management and Organization. Normalization (Chapter 3)

Normalization in Database Design

Theory of Relational Database Design and Normalization

Limitations of E-R Designs. Relational Normalization Theory. Redundancy and Other Problems. Redundancy. Anomalies. Example

Theory of Relational Database Design and Normalization

Normalization. Functional Dependence. Normalization. Normalization. GIS Applications. Spring 2011

Normalisation 6 TABLE OF CONTENTS LEARNING OUTCOMES

Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina

Part 6. Normalization

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

Functional Dependency and Normalization for Relational Databases

Chapter 10. Functional Dependencies and Normalization for Relational Databases. Copyright 2007 Ramez Elmasri and Shamkant B.

Lecture 2 Normalization

Databases -Normalization III. (N Spadaccini 2010 and W Liu 2012) Databases - Normalization III 1 / 31

SQL Server. 1. What is RDBMS?

Database Design and Normalization

Functional Dependencies

DATABASE MANAGEMENT SYSTEMS. Question Bank:

Database Management System

Database Normalization. Mohua Sarkar, Ph.D Software Engineer California Pacific Medical Center

Normal forms and normalization

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen

Normalisation 1. Chapter 4.1 V4.0. Napier University

Chapter 6. Database Tables & Normalization. The Need for Normalization. Database Tables & Normalization

If it's in the 2nd NF and there are no non-key fields that depend on attributes in the table other than the Primary Key.

BCA. Database Management System

CSCI-GA Database Systems Lecture 7: Schema Refinement and Normalization

C# Cname Ccity.. P1# Date1 Qnt1 P2# Date2 P9# Date9 1 Codd London Martin Paris Deen London

Database Constraints and Design

Chapter 5: FUNCTIONAL DEPENDENCIES AND NORMALIZATION FOR RELATIONAL DATABASES

Schema Design and Normal Forms Sid Name Level Rating Wage Hours

Schema Refinement, Functional Dependencies, Normalization

Database Design Methodology

Normalization. Reduces the liklihood of anomolies

Module 5: Normalization of database tables

Unit 3.1. Normalisation 1 - V Normalisation 1. Dr Gordon Russell, Napier University

Fundamentals of Database Design

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

DATABASE DESIGN: NORMALIZATION NOTE & EXERCISES (Up to 3NF)

C HAPTER 4 INTRODUCTION. Relational Databases FILE VS. DATABASES FILE VS. DATABASES

Database Management Systems. Redundancy and Other Problems. Redundancy

IT2305 Database Systems I (Compulsory)

IT2304: Database Systems 1 (DBS 1)

CS143 Notes: Normalization Theory

Database Design and the Reality of Normalisation

Chapter 7: Relational Database Design

Week 11: Normal Forms. Logical Database Design. Normal Forms and Normalization. Examples of Redundancy

1.264 Lecture 10. Data normalization

The Relational Database Model

The process of database development. Logical model: relational DBMS. Relation

UNDERSTANDING NORMALIZATION

Functional Dependencies and Finding a Minimal Cover

Overview. Physical Database Design. Modern Database Management McFadden/Hoffer Chapter 7. Database Management Systems Ramakrishnan Chapter 16

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

Optimum Database Design: Using Normal Forms and Ensuring Data Integrity. by Patrick Crever, Relational Database Programmer, Synergex

How To Handle Missing Information Without Using NULL

Teaching Database Modeling and Design: Areas of Confusion and Helpful Hints

Why Is This Important? Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) Example (Contd.)

Database Design and Normal Forms

Chapter 9: Normalization

How To Write A Diagram

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Overview - detailed. Goal. Faloutsos CMU SCS

RELATIONAL DATABASE DESIGN. Basic Concepts

Design of Relational Database Schemas

Limitations of DB Design Processes

Normalization. Purpose of normalization Data redundancy Update anomalies Functional dependency Process of normalization

Normalization. CIS 331: Introduction to Database Systems

Graham Kemp (telephone , room 6475 EDIT) The examiner will visit the exam room at 15:00 and 17:00.

Databases What the Specification Says

Physical Database Design and Tuning

Relational Database Design

DBMS. Normalization. Module Title?

1. Physical Database Design in Relational Databases (1)

Introduction to Computing. Lectured by: Dr. Pham Tran Vu

Database Design Overview. Conceptual Design ER Model. Entities and Entity Sets. Entity Set Representation. Keys


Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?

Tutorial on Relational Database Design

Data Hierarchy. Traditional File based Approach. Hierarchy of Data for a Computer-Based File

B2.2-R3: INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model

LOGICAL DATABASE DESIGN

Transcription:

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden Robert C. Nickerson ISYS 464 Spring 2003 Topic 22 Introduction to Normalization Normalization: The process of transforming relations into forms that make them easier to manage in certain situations. Objective of normalization: To eliminate modification anomalies from a relation, which are problems that can occur when a relation is modified (that is, when data is added, deleted, or changed). Normal Forms Normal forms: Forms that a relation can take during normalization. First normal form 1NF Lower Second normal form 2NF Third normal form 3NF Boyce-Codd normal form - BCNF Fourth normal form 4NF Fifth normal form 5NF Domain/key normal form DKNF Higher Normal Forms Any relation in a higher normal form is also in all lower normal forms. Each higher normal form results in fewer modification anomalies. A relation in DKNF has no modification anomalies. Practical View of Normalization Only up through 4NF is practical. We don't even know fully the consequences of a relation not being in 5NF. It is very difficult to determine if a relation is in DKNF. Not all relations can be put in DKNF. Practical View of Normalization Higher normal forms make it easier to update a database. In lower normal forms, modification anomalies have to be handled through complex programming that takes longer to execute. But, higher normal forms can make it harder to query a database. In higher normal forms, some queries require complex programming and take longer to execute. 1

De-normalization De-normalization: The process of changing a relation into a lower normal form to make the database easier to query even though there will be more modification anomalies in it. Place of Normalization in Database Development Some people consider normalization to be a critical step in logical design. They recommend that all relations be put in as high a normal form as possible (fully normalize all relations) My view: Normalization is related to performance tradeoff: What trade off are we willing to make between ease of modifying the database and ease of querying the database? Place of Normalization in Database Development Good conceptual design usually results in relations in a relatively high normal form. Normal forms should be checked after completing the logical design (which is based on the conceptual design) to determine if the relations are in an acceptable normal form given the modification/query tradeoff. Data Normalization Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data The process of decomposing relations with anomalies to produce smaller, wellstructured relations Well-Structured Relations A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies Goal is to avoid modification anomalies Insertion Anomaly adding new rows forces user to create duplicate data; adding data about more than one entity Deletion Anomaly deleting rows may cause a loss of data that would be needed for other future rows; deleting data about more than one entity Modification (update) Anomaly changing data in a row forces changes to other rows because of duplication; changing data about entity in more than one tuple General rule of thumb: a table should not pertain to more than one entity type Question Is this a relation? Example Figure 5.2b Question What s the primary key? Answer Yes: unique rows and no multivalued attributes Answer Composite: Emp_ID, Course_Title 2

Anomalies in this Table Insertion can t enter a new employee without having the employee take a class Deletion if we remove employee 140, we lose information about the existence of a Tax Acc class Modification (update) giving a salary increase to employee 100 forces us to update multiple records Why do these anomalies exist? Because we ve combined two themes (entity types) into one relation. This results in duplication, and an unnecessary dependency between the entities Functional Dependencies 1NF though BCNF deal with functional dependencies Functional dependency (FD): Given a relation R(X,Y, ), Y is functionally dependent on X (or X functionally determines Y) if and only if every value of X has associated with it only one value of Y at any one time. (Given a value of X, there is only one value of Y associated with it at any one time.) Notation: X Y (functional dependency formula) Functional Dependencies Functionally Not functionally Dependent Dependent X Y X Y 1 A 1 A 1 A 1 B 2 B 2 B Functional Dependencies X and Y in definition of functional dependency can be composite A, B C D E, F Determinant: The attribute or composite attribute on the left of a functional dependency formula Functional Dependencies and Keys Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute Candidate Key: A unique identifier. One of the candidate keys will become the primary key E.g. perhaps there is both credit card number and SS# in a table in this case both are candidate keys Each non-key field is functionally dependent on every candidate key 5.22 -Steps in normalization 3

First Normal Form No multivalued attributes Every attribute value is atomic Fig. 5-2a on page 168 is not in 1 st Normal Form (multivalued attributes) it is not a relation Emp(Emp_ID, Name, Dept_Name, Salary, (Course_Title, Date_Completed)) Fig. 5-2b is in 1 st Normal form Emp(Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed) All relations are in 1 st Normal Form Second Normal Form 1NF plus every non-key attribute is fully functionally dependent on the ENTIRE primary key Every non-key attribute must be defined by the entire key, not by only part of the key No partial functional dependencies Fig. 5-2b is NOT in 2 nd Normal Form (see fig 5-23b) Fig 5.23(b) Functional Dependencies in EMPLOYEE2 Dependency on entire primary key EmpID CourseTitle Name DeptName Salary DateCompleted Dependency on only part of the key EmpID, CourseTitle DateCompleted EmpID Name, DeptName, Salary Modification Anomalies What are the modification anomalies in this example? Insertion anomaly: Adding an employee requires adding course data Deletion anomaly: Deleting an employee involves deleting course data Update anomaly: Updating an employees salary may involve updating it in several tuples Therefore, NOT in 2 nd Normal Form!! Getting it into 2 nd Normal Form See p193 decomposed into two separate relations EmpID Name DeptName Salary Both are full functional dependencies Transitive Dependency Transitive dependency: Given a relation R(A,B,C, ), C is transitively dependent on A if A B and B C. (That is, one attribute functionally determines a second, which functionally determines a third.) EmpID CourseTitle DateCompleted 4

Figure 5-24 -- Relation with transitive dependency Figure 5-24(b) Relation with transitive dependency (a) SALES relation with simple data CustID Name CustID Salesperson CustID Region All this is OK (2 nd NF) BUT CustID Salesperson Region Transitive dependency (not 3 rd NF) Modification Anomalies What are the modification anomalies in this example? Insertion anomaly: Inserting a customer requires inserting data (region) about a salesperson Deletion anomaly: Deleting a customer deletes data (region) about a salesperson Update anomaly: Changing a salesperson s region requires changing data in more than one tuple Third Normal Form 2NF PLUS no transitive dependencies involving non-key attributes Figure 5.25 -- Removing a transitive dependency (a) Decomposing the SALES relation Figure 5.25(b) Relations in 3NF Salesperson Region CustID Name CustID Salesperson Now, there are no transitive dependencies Both relations are in 3 rd NF 5

Boyce-Codd Normal Form (Appendix B) All determinants are candidate keys 3NF relations can sometimes have anomalies related to functional dependencies BCNF relations eliminate these anomalies Other Normal Forms (from Appendix B) 4 th NF No multivalued dependencies 5 th NF No lossless joins Domain-key NF The ultimate NF perfect elimination of all possible anomalies Is a higher normal form always better than a lower one? Emp_address (ID, Name, Street, City, State, Zip) ID Name, Street, City, State, Zip Street, City, State Zip In 2NF, but not in 3NF because of transitive dependency To put in 3NF, put Zip into separate relation: Emp_address (ID, Name, Street, City, State) Zip (Street, City, State, Zip) But then join required each time address is needed Better to keep in 2NF because Zips rarely change Integrity Constraints Domain Constraints Allowable values for an attribute. Entity Integrity No primary key attribute may be null. All primary key fields MUST have data Referential Integrity Any foreign key value (on the relation of the many side) MUST match a primary key value in the relation of the one side. (Or the foreign key can be null) Figure 5-5: Referential integrity constraints (Pine Valley Furniture) Referential integrity constraints are drawn via arrows from dependent to parent table 6