Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina



Similar documents
COSC344 Database Theory and Applications. Lecture 9 Normalisation. COSC344 Lecture 9 1

Database Design and Normalization

DATABASE NORMALIZATION

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Overview - detailed. Goal. Faloutsos CMU SCS

Normalization in Database Design

Design of Relational Database Schemas

Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases

Functional Dependencies and Finding a Minimal Cover

Database Management System

Functional Dependency and Normalization for Relational Databases

Schema Refinement and Normalization

Unit 3.1. Normalisation 1 - V Normalisation 1. Dr Gordon Russell, Napier University

C# Cname Ccity.. P1# Date1 Qnt1 P2# Date2 P9# Date9 1 Codd London Martin Paris Deen London

Databases -Normalization III. (N Spadaccini 2010 and W Liu 2012) Databases - Normalization III 1 / 31

RELATIONAL DATABASE DESIGN

Introduction to Database Systems. Normalization

DATABASE SYSTEMS. Chapter 7 Normalisation

Normalization. CIS 331: Introduction to Database Systems

Normalization of Database

Chapter 10. Functional Dependencies and Normalization for Relational Databases. Copyright 2007 Ramez Elmasri and Shamkant B.

Chapter 9: Normalization

Theory of Relational Database Design and Normalization

Schema Design and Normal Forms Sid Name Level Rating Wage Hours

Normalisation 1. Chapter 4.1 V4.0. Napier University

Relational Database Design Theory

Chapter 10. Functional Dependencies and Normalization for Relational Databases

DATABASE DESIGN: NORMALIZATION NOTE & EXERCISES (Up to 3NF)

normalisation Goals: Suppose we have a db scheme: is it good? define precise notions of the qualities of a relational database scheme

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms.

Lecture Notes on Database Normalization

Normalisation 6 TABLE OF CONTENTS LEARNING OUTCOMES

MCQs~Databases~Relational Model and Normalization

In This Lecture. Physical Design. RAID Arrays. RAID Level 0. RAID Level 1. Physical DB Issues, Indexes, Query Optimisation. Physical DB Issues

CS143 Notes: Normalization Theory

Introduction to normalization. Introduction to normalization

Normalisation. Why normalise? To improve (simplify) database design in order to. Avoid update problems Avoid redundancy Simplify update operations

Database Design and the Reality of Normalisation

Part 6. Normalization

Schema Refinement, Functional Dependencies, Normalization

Database Design and Normal Forms

Teaching Database Modeling and Design: Areas of Confusion and Helpful Hints

Introduction to Databases, Fall 2005 IT University of Copenhagen. Lecture 5: Normalization II; Database design case studies. September 26, 2005

In This Lecture. SQL Data Definition SQL SQL. Notes. Non-Procedural Programming. Database Systems Lecture 5 Natasha Alechina

CSCI-GA Database Systems Lecture 7: Schema Refinement and Normalization

Theory of Relational Database Design and Normalization

Normalization. Normalization. Normalization. Data Redundancy

Theory behind Normalization & DB Design. Satisfiability: Does an FD hold? Lecture 12

Relational Database Design

Normalization in OODB Design

Functional Dependencies and Normalization

Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010

Introduction to Database Systems. Chapter 4 Normal Forms in the Relational Model. Chapter 4 Normal Forms

Relational Database Design: FD s & BCNF

DATABASE DESIGN: Normalization Exercises & Answers

Chapter 6. Database Tables & Normalization. The Need for Normalization. Database Tables & Normalization

Chapter 10 Functional Dependencies and Normalization for Relational Databases

SQL Data Definition. Database Systems Lecture 5 Natasha Alechina

In This Lecture. Security and Integrity. Database Security. DBMS Security Support. Privileges in SQL. Permissions and Privilege.

DBMS. Normalization. Module Title?

Week 11: Normal Forms. Logical Database Design. Normal Forms and Normalization. Examples of Redundancy


A. TRUE-FALSE: GROUP 2 PRACTICE EXAMPLES FOR THE REVIEW QUIZ:

Database Management Systems. Redundancy and Other Problems. Redundancy

Topic 5.1: Database Tables and Normalization

Functional Dependencies

Why Is This Important? Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) Example (Contd.)

Entity/Relationship Modelling. Database Systems Lecture 4 Natasha Alechina

If it's in the 2nd NF and there are no non-key fields that depend on attributes in the table other than the Primary Key.

Chapter 7: Relational Database Design

Boyce-Codd Normal Form

An Algorithmic Approach to Database Normalization

CS 377 Database Systems. Database Design Theory and Normalization. Li Xiong Department of Mathematics and Computer Science Emory University

Lecture 2 Normalization

Database Constraints and Design

Determination of the normalization level of database schemas through equivalence classes of attributes

Introduction to Computing. Lectured by: Dr. Pham Tran Vu

Introduction Decomposition Simple Synthesis Bernstein Synthesis and Beyond. 6. Normalization. Stéphane Bressan. January 28, 2015

Benefits of Normalisation in a Data Base - Part 1

Chapter 8. Database Design II: Relational Normalization Theory

Normalization. Functional Dependence. Normalization. Normalization. GIS Applications. Spring 2011

Normalization. Reduces the liklihood of anomolies

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization

Objectives of Database Design Functional Dependencies 1st Normal Form Decomposition Boyce-Codd Normal Form 3rd Normal Form Multivalue Dependencies

Module 5: Normalization of database tables

Chapter 5: FUNCTIONAL DEPENDENCIES AND NORMALIZATION FOR RELATIONAL DATABASES

Normalization. Normalization. First goal: to eliminate redundant data. for example, don t storing the same data in more than one table

Database Design and Normalization

Limitations of E-R Designs. Relational Normalization Theory. Redundancy and Other Problems. Redundancy. Anomalies. Example

Introduction to Databases

Normalization. CIS 3730 Designing and Managing Data. J.G. Zheng Fall 2010

Inf202 Introduction to Data and Databases (Spring 2011)

C HAPTER 4 INTRODUCTION. Relational Databases FILE VS. DATABASES FILE VS. DATABASES

Notes. Information Systems. Higher Still. Higher. HSN31010 Database Systems First Normal Form. Contents

Normal forms and normalization

BCA. Database Management System

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

Normalisation and Data Storage Devices

Announcements. SQL is hot! Facebook. Goal. Database Design Process. IT420: Database Management and Organization. Normalization (Chapter 3)

In This Lecture. More Concurrency. Deadlocks. Precedence/Wait-For Graphs. Example. Example

In this Lecture SQL SELECT. Example Tables. SQL SELECT Overview. WHERE Clauses. DISTINCT and ALL SQL SELECT. For more information

Transcription:

Normalisation to 3NF Database Systems Lecture 11 Natasha Alechina

In This Lecture Normalisation to 3NF Data redundancy Functional dependencies Normal forms First, Second, and Third Normal Forms For more information Connolly and Begg chapter 13 Ullman and Widom ch.3.6.6 (2 nd edition), 3.5 (3 rd edition)

Redundancy and Normalisation Redundant data Can be determined from other data in the database Leads to various problems INSERT anomalies UPDATE anomalies DELETE anomalies Normalisation Aims to reduce data redundancy Redundancy is expressed in terms of dependencies Normal forms are defined that do not have certain types of dependency

First Normal Form In most definitions of the relational model All data values should be atomic This means that table entries should be single values, not sets or composite objects A relation is said to be in first normal form (1NF) if all data values are atomic

Normalisation to 1NF To convert to a 1NF relation, split up any non-atomic values 1NF Unnormalised Module Dept Lecturer Texts M1 D1 L1 T1, T2 M2 D1 L1 T1, T3 M3 D1 L2 T4 M4 D2 L3 T1, T5 M5 D2 L4 T6 Module Dept Lecturer Text M1 D1 L1 T1 M1 D1 L1 T2 M2 D1 L1 T1 M2 D1 L1 T3 M3 D1 L2 T4 M4 D2 L3 T1 M4 D2 L3 T5 M5 D2 L4 T6

Problems in 1NF 1NF Module Dept Lecturer Text M1 D1 L1 T1 M1 D1 L1 T2 M2 D1 L1 T1 M2 D1 L1 T3 M3 D1 L2 T4 M4 D2 L3 T1 M4 D2 L3 T5 M5 D2 L4 T6 INSERT anomalies Can't add a module with no texts UPDATE anomalies To change lecturer for M1, we have to change two rows DELETE anomalies If we remove M3, we remove L2 as well

Functional Dependencies Redundancy is often caused by a functional dependency A functional dependency (FD) is a link between two sets of attributes in a relation We can normalise a relation by removing undesirable FDs A set of attributes, A, functionally determines another set, B, or: there exists a functional dependency between A and B (A B), if whenever two rows of the relation have the same values for all the attributes in A, then they also have the same values for all the attributes in B.

Example {ID, modcode} {First, Last, modname} {modcode} {modname} {ID} {First, Last} ID First Last modcode modname 111 Joe Bloggs G51PRG Programming 222 Anne Smith G51DBS Databases

FDs and Normalisation We define a set of 'normal forms' Each normal form has fewer FDs than the last Since FDs represent redundancy, each normal form has less redundancy than the last Not all FDs cause a problem We identify various sorts of FD that do Each normal form removes a type of FD that is a problem We will also need a way to remove FDs

Properties of FDs In any relation The primary key FDs any set of attributes in that relation K X K is the primary key, X is a set of attributes Same for candidate keys Any set of attributes is FD on itself X X Rules for FDs Reflexivity: If B is a subset of A then A B Augmentation: If A B then A U C B U C Transitivity: If A B and B C then A C

FD Example 1NF Module Dept Lecturer Text M1 D1 L1 T1 M1 D1 L1 T2 M2 D1 L1 T1 M2 D1 L1 T3 M3 D1 L2 T4 M4 D2 L3 T1 M4 D2 L3 T5 M5 D2 L4 T6 The primary key is {Module, Text} so {Module, Text} {Dept, Lecturer} 'Trivial' FDs, eg: {Text, Dept} {Text} {Module} {Module} {Dept, Lecturer} { }

FD Example 1NF Module Dept Lecturer Text M1 D1 L1 T1 M1 D1 L1 T2 M2 D1 L1 T1 M2 D1 L1 T3 M3 D1 L2 T4 M4 D2 L3 T1 M4 D2 L3 T5 M5 D2 L4 T6 Other FDs are {Module} {Lecturer} {Module} {Dept} {Lecturer} {Dept} These are non-trivial and determinants (left hand side of the dependency) are not keys.

Partial FDs and 2NF Partial FDs: A FD, A B is a partial FD, if some attribute of A can be removed and the FD still holds Formally, there is some proper subset of A, C A, such that C B Let us call attributes which are part of some candidate key, key attributes, and the rest non-key attributes. Second normal form: A relation is in second normal form (2NF) if it is in 1NF and no non-key attribute is partially dependent on a candidate key In other words, no C B where C is a strict subset of a candidate key and B is a non-key attribute.

Second Normal Form 1NF Module Dept Lecturer Text M1 D1 L1 T1 M1 D1 L1 T2 M2 D1 L1 T1 M2 D1 L1 T3 M3 D1 L2 T4 M4 D2 L3 T1 M4 D2 L3 T5 M5 D2 L4 T6 1NF is not in 2NF We have the FD {Module, Text} {Lecturer, Dept} But also {Module} {Lecturer, Dept} And so Lecturer and Dept are partially dependent on the primary key

Removing FDs Suppose we have a relation R with scheme S and the FD A B where A B = { } Let C = S (A U B) In other words: A attributes on the left hand side of the FD B attributes on the right hand side of the FD C all other attributes It turns out that we can split R into two parts: R1, with scheme C U A R2, with scheme A U B The original relation can be recovered as the natural join of R1 and R2: R = R1 NATURAL JOIN R2

1NF to 2NF Example 1NF Module Dept Lecturer Text M1 D1 L1 T1 M1 D1 L1 T2 M2 D1 L1 T1 M2 D1 L1 T3 M3 D1 L2 T4 M4 D2 L3 T1 M4 D2 L3 T5 M5 D2 L4 T6 2NFa Module Dept Lecturer M1 D1 L1 M2 D1 L1 M3 D1 L2 M4 D2 L3 M5 D2 L4 2NFb Module Text M1 T1 M1 T2 M2 T1 M2 T3 M3 T4 M4 T1 M4 T5 M1 T6

Problems Resolved in 2NF Problems in 1NF INSERT Can't add a module with no texts UPDATE To change lecturer for M1, we have to change two rows DELETE If we remove M3, we remove L2 as well In 2NF the first two are resolved, but not the third one 2NFa Module Dept Lecturer M1 D1 L1 M2 D1 L1 M3 D1 L2 M4 D2 L3 M5 D2 L4

Problems Remaining in 2NF INSERT anomalies Can't add lecturers who teach no modules UPDATE anomalies To change the department for L1 we must alter two rows DELETE anomalies If we delete M3 we delete L2 as well 2NFa Module Dept Lecturer M1 D1 L1 M2 D1 L1 M3 D1 L2 M4 D2 L3 M5 D2 L4

Transitive FDs and 3NF Transitive FDs: A FD, A C is a transitive FD, if there is some set B such that A B and B C are non-trivial FDs A B non-trivial means: B is not a subset of A We have A B C Third normal form A relation is in third normal form (3NF) if it is in 2NF and no non-key attribute is transitively dependent on a candidate key

Third Normal Form 2NFa Module Dept Lecturer M1 D1 L1 M2 D1 L1 M3 D1 L2 M4 D2 L3 M5 D2 L4 2NFa is not in 3NF We have the FDs {Module} {Lecturer} {Lecturer} {Dept} So there is a transitive FD from the primary key {Module} to {Dept}

2NF to 3NF Example 2NFa Module Dept Lecturer M1 D1 L1 M2 D1 L1 M3 D1 L2 M4 D2 L3 M5 D2 L4 3NFa Lecturer Dept L1 D1 L2 D1 L3 D2 L4 D2 3NFb Module Lecturer M1 L1 M2 L1 M3 L2 M4 L3 M5 L4

Problems Resolved in 3NF Problems in 2NF INSERT Can't add lecturers who teach no modules UPDATE To change the department for L1 we must alter two rows DELETE If we delete M3 we delete L2 as well In 3NF all of these are resolved (for this relation but 3NF can still have anomalies!) 3NFb 3NFa Lecturer Dept L1 D1 L2 D1 L3 D2 L4 D2 Module Lecturer M1 L1 M2 L1 M3 L2 M4 L3 M5 L4

Normalisation and Design Normalisation is related to DB design A database should normally be in 3NF at least If your design leads to a non-3nf DB, then you might want to revise it When you find you have a non-3nf DB Identify the FDs that are causing a problem Think if they will lead to any insert, update, or delete anomalies Try to remove them

Next Lecture More normalisation Lossless decomposition; why our reduction to 2NF and 3NF is lossless Boyce-Codd normal form (BCNF) Higher normal forms Denormalisation For more information Connolly and Begg chapter 14 Ullman and Widom chapter 3.6