Normalisation

Why normalise? To improve (simplify) database design in order to:
- Avoid update problems
- Avoid redundancy
- Simplify update operations

Example (the practical difference between a first normal form relation and an unnormalised table)

EMPLOYEE-PROJECT
EMP#  EMP-NAME  PROJECT
E1    Smith     CS 101, CS 203, EE 121
E2    Jones     CS 202, CS 101
E3    Lee       EE 410

Consider two transactions on EMPLOYEE-PROJECT:
T1: Insert the fact that Hanks works on project EE 202.
T2: Insert the fact that Smith works on project EE 202.

With the first normal form relation EMPLOYEE-PROJECT* there is no difference between the two transactions: each one simply inserts a new tuple.

EMPLOYEE-PROJECT*
EMP#  EMP-NAME  PROJECT
E1    Smith     CS 101
E1    Smith     CS 203
E1    Smith     EE 121
E2    Jones     CS 202
E2    Jones     CS 101
E3    Lee       EE 410
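The difference between the two tables can be sketched in Python. This is only an illustration: the list-of-dicts representation, the function name `to_1nf`, and the employee number E4 for Hanks are assumptions, not part of the slides.

```python
# Hypothetical in-memory copy of the unnormalised EMPLOYEE-PROJECT table:
# each row holds a list of projects (a repeating group).
unnormalised = [
    {"EMP#": "E1", "EMP-NAME": "Smith", "PROJECT": ["CS 101", "CS 203", "EE 121"]},
    {"EMP#": "E2", "EMP-NAME": "Jones", "PROJECT": ["CS 202", "CS 101"]},
    {"EMP#": "E3", "EMP-NAME": "Lee", "PROJECT": ["EE 410"]},
]


def to_1nf(rows):
    """Return one row per (employee, project) pair so every value is atomic."""
    return [
        {"EMP#": r["EMP#"], "EMP-NAME": r["EMP-NAME"], "PROJECT": p}
        for r in rows
        for p in r["PROJECT"]
    ]


flat = to_1nf(unnormalised)

# In the flat relation, T1 and T2 are the same kind of operation: one insert.
flat.append({"EMP#": "E4", "EMP-NAME": "Hanks", "PROJECT": "EE 202"})  # T1 (E4 is assumed)
flat.append({"EMP#": "E1", "EMP-NAME": "Smith", "PROJECT": "EE 202"})  # T2
```

In the unnormalised list, by contrast, T1 appends a new row while T2 has to find Smith's existing row and modify its project list.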

Normalisation theory allows us to detect such cases and shows how relations can be converted to more suitable forms.

Numerous normal forms have been defined:
- 1NF: first normal form
- 2NF: second normal form
- 3NF: third normal form
- BCNF: Boyce-Codd normal form
- 4NF: fourth normal form
- 5NF: fifth normal form

Each succeeding normal form improves on the previous one by specifying further constraints on the relations.

Definition: 1NF
A relation is in first normal form (1NF) if and only if it contains atomic values only.
1NF relations can still have undesirable features.

Recall the S-P-SP Database
Suppose we design the S-P-SP database differently. Instead of
    S(S#, SNAME, STATUS, CITY)
    SP(S#, P#, QTY)
we have a single relation
    SSP(S#, SNAME, STATUS, CITY, P#, QTY)
(leaving the P relation as it is).

S
S#  SNAME  STATUS  CITY
S1  Smith  20      London
S2  Jones  10      Paris
S3  Blake  30      Paris
S4  Clark  20      London
S5  Adams  30      Athens

SP
S#  P#  QTY
S1  P1  300
S1  P2  200
S1  P3  400
S1  P4  200
S1  P5  100
S1  P6  100
S2  P1  300
S2  P2  400
S3  P2  200
S4  P2  200
S4  P4  300
S4  P5  400

An instance of SSP
S#  SNAME  STATUS  CITY    P#  QTY
S1  Smith  20      London  P1  300
S1  Smith  20      London  P2  200
S1  Smith  20      London  P3  400
S1  Smith  20      London  P4  200
S1  Smith  20      London  P5  100
S1  Smith  20      London  P6  100
S2  Jones  10      Paris   P1  300
S2  Jones  10      Paris   P2  400
S3  Blake  30      Paris   P2  200
S4  Clark  20      London  P2  200
S4  Clark  20      London  P4  300
S4  Clark  20      London  P5  400

Do you see any problems with SSP?
- Redundancies: e.g. every tuple for supplier S1 shows SNAME to be Smith and CITY to be London.
- Update problems: suppose supplier S1 moves from London to Paris.

Compare with the original table S
S#  SNAME  STATUS  CITY
S1  Smith  20      London
S2  Jones  10      Paris
S3  Blake  30      Paris
S4  Clark  20      London
S5  Adams  30      Athens

There are other problems associated with the design of the SSP relation which we will discuss later. To be able to identify all these problems and solutions in general, we have to know about functional dependencies.

Functional Dependencies
Given a relation R, and X, Y subsets of the set of attributes of R, Y is functionally dependent on X if and only if each X-value in R has associated with it at most one Y-value in R.
In other words, whenever two tuples of R agree on their X-value, they must also agree on their Y-value.
The functional dependency of Y on X is expressed by X → Y (read as "X functionally determines Y").

Examples
    S.S# → S.SNAME
    S.S# → S.STATUS
    S.S# → S.CITY
or more succinctly
    S.S# → S.(SNAME, STATUS, CITY)
or
    S# → SNAME, STATUS, CITY
if the context of relation S is understood.

In relation SP:
    S#, P# → QTY
Note: Dependencies are a matter of the semantics of the data, not merely a matter of the data values that happen to appear in a relation at some particular time.
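The "agree on X implies agree on Y" test can be sketched in Python for one particular instance. The function name `fd_holds` and the list-of-dicts representation are assumptions made for illustration. In line with the note above, a passing check only shows that the instance does not violate the dependency; it cannot prove that the dependency holds as a matter of semantics.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on the `lhs` attributes but differ on `rhs`."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for each X-value; any later
        # disagreement is a violation.
        if seen.setdefault(x, y) != y:
            return False
    return True


# A few SP tuples taken from the instance shown earlier.
sp = [
    {"S#": "S1", "P#": "P1", "QTY": 300},
    {"S#": "S1", "P#": "P2", "QTY": 200},
    {"S#": "S2", "P#": "P1", "QTY": 300},
    {"S#": "S2", "P#": "P2", "QTY": 400},
]

print(fd_holds(sp, ["S#", "P#"], ["QTY"]))  # True
print(fd_holds(sp, ["S#"], ["QTY"]))        # False: S1 has quantities 300 and 200
```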

Exercise: Functional Dependencies
Find the functional dependencies amongst the following attributes:
- Snumber
- Name
- TutorName
- TutorRoom#
- Degree
- LectureCourse (LC)
- LectureCourseGrade (LCG)
- LectureRoom# (LR)
- LectureTime (LT)
- LectureRoomCapacity (LRC)

Which of these hold?
    Snumber → Name, TutorName, Degree
    TutorName → TutorRoom#
    Snumber → TutorRoom#
    Snumber → LectureCourse
    LectureCourse → LectureRoom#
    LectureRoom# → LectureRoomCapacity
    LectureCourse → LectureTime

Which of these hold?
    Snumber, Name → TutorName
    LectureCourse, LectureTime → LectureRoom#
    TutorName → LectureCourse

Inference Axioms for Functional Dependencies
Given a set of functional dependencies, we can derive others using the following inference axioms. In these axioms:
- X, Y, Z, W denote sets of attributes;
- XY is shorthand for X ∪ Y.

LHS ⊨ RHS
If you know the items given on the LHS, then you can infer the items on the RHS.

Axioms for functional dependencies
A1: Reflexivity
    ⊨ X → X
A2: Augmentation
    X → Y ⊨ XZ → Y
    X → Y ⊨ XZ → YZ
A3: Transitivity
    (X → Y) ∧ (Y → Z) ⊨ X → Z
The axioms A1-A3 are called Armstrong's axioms.

Example of Augmentation:
    Snumber → TutorName ⊨ Snumber, SName → TutorName

Axioms (continued)
A4: Additivity
    (X → Y) ∧ (X → Z) ⊨ X → YZ
A5: Projectivity
    X → YZ ⊨ X → Y
    X → YZ ⊨ X → Z
A6: Pseudotransitivity
    (X → Y) ∧ (YZ → W) ⊨ XZ → W
A4-A6 can be derived from A1-A3.

Exercise
Derive A4 from A1-A3.

    X → Y     (given)
    X → Z     (given)
    ???
    X → YZ

Some Useful Definitions
A functional dependency of the form X → Y is trivial iff Y ⊆ X (e.g. AB → A).
Let F be a set of fds. The closure of F, denoted F⁺, is the set of all fds logically implied by F.
Let X be a set of attributes. The set of all attributes functionally determined by X under a set F of fds is called the closure of X under F, and is denoted X⁺_F.

Example:
F:  B → C
    C → D
    A → E
    CE → F
Then B⁺_F = {B, C, D}
AB⁺_F = ???

An algorithm for computing X⁺_F:

result := X;
while (changes to result) do
    for each fd B → C in F do
    begin
        if B ⊆ result then result := result ∪ C
    end;

Example: Find AG⁺_F where F is:
F:  A → B
    CG → I
    C → D
    H → C
    B → H

result = {A, G}
result = {A, G, B, H}
result = {A, G, B, H, C, I, D}
result = {A, G, B, H, C, I, D}
AG⁺_F = {A, B, C, D, G, H, I}
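The attribute-closure algorithm can be sketched in Python as follows. The representation of F as a list of (left-hand side, right-hand side) pairs of sets and the name `closure` are assumptions chosen for illustration. The intermediate values of `result` depend on the order in which the fds are scanned, but the final closure does not.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:  # "while (changes to result)"
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is covered and rhs adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


F = [
    ({"A"}, {"B"}),
    ({"C", "G"}, {"I"}),
    ({"C"}, {"D"}),
    ({"H"}, {"C"}),
    ({"B"}, {"H"}),
]

print(sorted(closure({"A", "G"}, F)))  # ['A', 'B', 'C', 'D', 'G', 'H', 'I']
```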

The speed of the algorithm depends on the size of F. These functional dependencies are also part of the integrity constraints on the data stored in the database, so they have to be checked and maintained whenever the database is updated. It can therefore pay to reduce the size of F without changing its closure.

Integrity Constraints

Name      Degree  Length of Degree
Smith A.  BEng    3
Smith B.  MEng    4
Fran C.   MSc     1
Jones D.  MSc     1

Name → Degree
Degree → Length of Degree

Update: Change Length of Degree of Fran C. to 2.
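A small Python sketch shows why this update matters as an integrity check. The helper `fd_holds` and the dictionary keys are assumptions for illustration. Changing only Fran C.'s row leaves two MSc rows with different lengths, so the instance no longer satisfies Degree → Length of Degree.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on the `lhs` attributes but differ on `rhs`."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


students = [
    {"Name": "Smith A.", "Degree": "BEng", "Length": 3},
    {"Name": "Smith B.", "Degree": "MEng", "Length": 4},
    {"Name": "Fran C.", "Degree": "MSc", "Length": 1},
    {"Name": "Jones D.", "Degree": "MSc", "Length": 1},
]

before = fd_holds(students, ["Degree"], ["Length"])

# The update from the slide, applied to Fran C.'s row only.
for row in students:
    if row["Name"] == "Fran C.":
        row["Length"] = 2

after = fd_holds(students, ["Degree"], ["Length"])
print(before, after)  # True False
```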

Definition: Equivalent Sets of FDs
Two sets of functional dependencies S1 and S2 are equivalent if and only if S1⁺ = S2⁺, i.e. S1 implies all the fds in S2, and vice versa.
Informally: S1 and S2 are equivalent if they contain exactly the same information.
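Equivalence can be tested without enumerating either closure of fds: it is enough to check that each fd in one set follows from the other set, using attribute closure. The sketch below assumes fds are (lhs, rhs) pairs of sets; the function names are illustrative only.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under the list of fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is logically implied by `fds`."""
    return rhs <= closure(lhs, fds)


def equivalent(s1, s2):
    """True if s1 implies every fd in s2 and vice versa."""
    return (all(implies(s1, l, r) for l, r in s2)
            and all(implies(s2, l, r) for l, r in s1))


set1 = [({"A"}, {"B", "C"})]
set2 = [({"A"}, {"B"}), ({"A"}, {"C"})]
set3 = [({"A"}, {"B"})]

print(equivalent(set1, set2))  # True
print(equivalent(set1, set3))  # False: set3 does not imply A -> C
```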

Definition: Irreducible/Canonical Cover
An irreducible cover for a set F of fds, denoted Fc, is a set of fds that satisfies the following four conditions:
1. F and Fc are equivalent.
2. The right-hand side of every dependency in Fc involves just one attribute.
3. The left-hand side of every dependency in Fc is irreducible, i.e. no attribute in any left-hand side can be discarded without changing the closure.
4. No fd in Fc can be discarded without changing the closure.

For each set of functional dependencies there exists at least one irreducible cover, but the irreducible cover of a set is not necessarily unique.

Example: Irreducible Cover

F:  A → BC
    B → C
    A → B
    AB → C
    AC → D

First rewrite F as:
    A → B
    A → C
    B → C
    A → B
    AB → C
    AC → D
This set is equivalent to F by additivity and projectivity.

Delete the duplication of A → B:
    A → B
    A → C
    B → C
    AB → C
    AC → D

AB → C can be deleted without changing the closure, because it is implied by A → C by augmentation:
    A → B
    A → C
    B → C
    AC → D

C can be deleted from AC → D, because A → D is implied by A → C and AC → D by augmentation and transitivity:
    A → B
    A → C
    B → C
    A → D

Finally, A → C can be deleted because it is implied by A → B and B → C by transitivity:
    A → B
    B → C
    A → D
This final set is an irreducible cover for F.
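The same reduction can be sketched in Python. This is one possible mechanisation, not the slides' own procedure: it splits right-hand sides, shrinks left-hand sides, and then drops implied fds, testing each step with attribute closure. It assumes single-letter attribute names so that a string such as "AC" can stand for the set {A, C}; the function names are illustrative. Because an irreducible cover is not necessarily unique, a different processing order may return a different cover for other inputs.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under the list of fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def irreducible_cover(fds):
    """Return one irreducible cover of `fds` (pairs of attribute strings)."""
    # Step 1: one attribute per right-hand side, duplicates removed.
    cover = list(dict.fromkeys(
        (frozenset(lhs), frozenset(a)) for lhs, rhs in fds for a in sorted(rhs)
    ))
    # Step 2: discard left-hand attributes that are not needed.
    for i, (lhs, rhs) in enumerate(cover):
        for a in sorted(lhs):
            smaller = lhs - {a}
            if smaller and rhs <= closure(smaller, cover):
                lhs = smaller
                cover[i] = (lhs, rhs)
    cover = list(dict.fromkeys(cover))  # shrinking may create duplicates
    # Step 3: discard fds implied by the remaining ones.
    for fd in list(cover):
        rest = [g for g in cover if g != fd]
        if fd[1] <= closure(fd[0], rest):
            cover = rest
    return cover


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C"), ("AC", "D")]
for lhs, rhs in irreducible_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For the F above this prints A -> B, B -> C and A -> D, matching the cover reached by hand.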

Definitions: Candidate, Primary and Alternate Keys
A set of attributes K of a relation R is a candidate key of R iff K satisfies the following two conditions:
1) K → R
   This is shorthand for saying that K functionally determines all attributes of R.
2) ¬∃K' (K' ⊂ K and K' → R)
   i.e. there is no proper subset of K that functionally determines all attributes of R.

Every relation has at least one candidate key. Some relations may have exactly one, but it is possible that some may have two or more.
Historically, in the relational model, for any given relation, one of the candidate keys is chosen as the primary key, and then the remainder (if any) are called alternate keys.

Example
Suppose in relation S, S# and SNAME both uniquely identify each supplier. Then S would have two candidate keys: S# and SNAME. We may choose S# as the primary key. Then SNAME becomes an alternate key.
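The two conditions in the candidate key definition translate directly into a brute-force search, sketched below in Python. The function names and the representation of fds as pairs of sets are assumptions. The search tries every subset of attributes, so it is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of the attribute set `attrs` under the list of fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets K whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    # Smaller sets are tried first, so any set containing an already
    # found key is a non-minimal superkey and can be skipped.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            k = set(combo)
            if any(key <= k for key in keys):
                continue
            if closure(k, fds) == attributes:
                keys.append(k)
    return keys


# Relation S from the example, where SNAME is assumed to be unique too.
s_attrs = {"S#", "SNAME", "STATUS", "CITY"}
s_fds = [
    ({"S#"}, {"SNAME", "STATUS", "CITY"}),
    ({"SNAME"}, {"S#"}),
]
print(candidate_keys(s_attrs, s_fds))  # two keys: {'S#'} and {'SNAME'}
```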

The Entity Integrity Rule
In the relational model, the primary key is constrained by the following integrity rule.
Entity Integrity Rule: No attribute participating in the primary key of a relation is allowed to accept NULL values.
(NULL values represent unknown or non-existent values.)

Name   Postcode  Telephone
Smith  SW7 2BZ   02075948822
Jones  SW5 5AT   NULL
Pitt   NULL      02086662222

NULL = do not know / does not exist
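The rule can be checked mechanically. In the sketch below NULL is represented by Python's `None`, and both the function name and the choice of primary keys are assumptions for illustration, since the slides do not say which attributes form the primary key of this table.

```python
def entity_integrity_violations(rows, primary_key):
    """Return the rows that have a NULL (None) in any primary key attribute."""
    return [row for row in rows if any(row[a] is None for a in primary_key)]


contacts = [
    {"Name": "Smith", "Postcode": "SW7 2BZ", "Telephone": "02075948822"},
    {"Name": "Jones", "Postcode": "SW5 5AT", "Telephone": None},
    {"Name": "Pitt", "Postcode": None, "Telephone": "02086662222"},
]

# With Name as the (assumed) primary key, no row breaks the rule.
print(entity_integrity_violations(contacts, ["Name"]))
# With (Name, Postcode) as the key, Pitt's row breaks it.
print(entity_integrity_violations(contacts, ["Name", "Postcode"]))
```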

Phew... After all this work on FDs, we will now use them to analyse the design of relations and to improve the design through normalisation.