Introduction to Database Systems. Normalization



Similar documents
Lecture Notes on Database Normalization

COSC344 Database Theory and Applications. Lecture 9 Normalisation. COSC344 Lecture 9 1

Chapter 10. Functional Dependencies and Normalization for Relational Databases. Copyright 2007 Ramez Elmasri and Shamkant B.

Design of Relational Database Schemas

normalisation Goals: Suppose we have a db scheme: is it good? define precise notions of the qualities of a relational database scheme

Database Design and Normalization

CS143 Notes: Normalization Theory

Relational Database Design: FD s & BCNF

Schema Refinement and Normalization

Databases -Normalization III. (N Spadaccini 2010 and W Liu 2012) Databases - Normalization III 1 / 31

Relational Database Design

Theory of Relational Database Design and Normalization

Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina

Chapter 10. Functional Dependencies and Normalization for Relational Databases

Schema Design and Normal Forms Sid Name Level Rating Wage Hours

Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases

Theory behind Normalization & DB Design. Satisfiability: Does an FD hold? Lecture 12

Relational Database Design Theory

Database Design and Normalization

Database Design and Normal Forms

Functional Dependencies and Finding a Minimal Cover

Introduction to Databases, Fall 2005 IT University of Copenhagen. Lecture 5: Normalization II; Database design case studies. September 26, 2005

Chapter 7: Relational Database Design

Week 11: Normal Forms. Logical Database Design. Normal Forms and Normalization. Examples of Redundancy

Functional Dependencies and Normalization

Database Management Systems. Redundancy and Other Problems. Redundancy

CS 377 Database Systems. Database Design Theory and Normalization. Li Xiong Department of Mathematics and Computer Science Emory University

CSCI-GA Database Systems Lecture 7: Schema Refinement and Normalization

Schema Refinement, Functional Dependencies, Normalization

Introduction Decomposition Simple Synthesis Bernstein Synthesis and Beyond. 6. Normalization. Stéphane Bressan. January 28, 2015

Objectives of Database Design Functional Dependencies 1st Normal Form Decomposition Boyce-Codd Normal Form 3rd Normal Form Multivalue Dependencies

Chapter 7: Relational Database Design

Relational Normalization Theory (supplemental material)

Database Systems Concepts, Languages and Architectures

DATABASE NORMALIZATION

Theory of Relational Database Design and Normalization

Functional Dependency and Normalization for Relational Databases

Why Is This Important? Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) Example (Contd.)

Limitations of E-R Designs. Relational Normalization Theory. Redundancy and Other Problems. Redundancy. Anomalies. Example

Chapter 8. Database Design II: Relational Normalization Theory

Functional Dependencies

Normalization in OODB Design

Database Constraints and Design

Chapter 10 Functional Dependencies and Normalization for Relational Databases

Introduction to Database Systems. Chapter 4 Normal Forms in the Relational Model. Chapter 4 Normal Forms

Quiz 3: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354

Database Management System

Introduction to Databases

Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization

MCQs~Databases~Relational Model and Normalization

Advanced Relational Database Design

An Algorithmic Approach to Database Normalization

RELATIONAL DATABASE DESIGN

Normalization. CIS 331: Introduction to Database Systems

Design Theory for Relational Databases: Functional Dependencies and Normalization

SQL DDL. DBS Database Systems Designing Relational Databases. Inclusion Constraints. Key Constraints

Limitations of DB Design Processes

Normalization of database model. Pazmany Peter Catholic University 2005 Zoltan Fodroczi

Boyce-Codd Normal Form

C# Cname Ccity.. P1# Date1 Qnt1 P2# Date2 P9# Date9 1 Codd London Martin Paris Deen London

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms.

Unit 3.1. Normalisation 1 - V Normalisation 1. Dr Gordon Russell, Napier University

How To Find Out What A Key Is In A Database Engine

Normalization in Database Design

Entity-Relationship Model. Purpose of E/R Model. Entity Sets

Teaching Database Modeling and Design: Areas of Confusion and Helpful Hints

A Web-Based Environment for Learning Normalization of Relational Database Schemata

Normalization of Database

Relational Normalization: Contents. Relational Database Design: Rationale. Relational Database Design. Motivation

Announcements. SQL is hot! Facebook. Goal. Database Design Process. IT420: Database Management and Organization. Normalization (Chapter 3)

Normalization. Reduces the liklihood of anomolies

Chapter 5: FUNCTIONAL DEPENDENCIES AND NORMALIZATION FOR RELATIONAL DATABASES

Normalisation. Why normalise? To improve (simplify) database design in order to. Avoid update problems Avoid redundancy Simplify update operations

Normalization for Relational DBs

The class slides, your notes, and the sample problem that we worked in class may be helpful for reference.

Lecture 2 Normalization

Normalization. Normalization. Normalization. Data Redundancy

Topic 5.1: Database Tables and Normalization

3. Database Design Functional Dependency Introduction Value in design Initial state Aims

Normal forms and normalization

Fundamentals of Database System

Figure 14.1 Simplified version of the

Module 5: Normalization of database tables

Transactions, Views, Indexes. Controlling Concurrent Behavior Virtual and Materialized Views Speeding Accesses to Data

Part 6. Normalization

DATABASE DESIGN: NORMALIZATION NOTE & EXERCISES (Up to 3NF)

Normalisation 1. Chapter 4.1 V4.0. Napier University

Normalisation in the Presence of Lists

DATABASE SYSTEMS. Chapter 7 Normalisation

DBMS. Normalization. Module Title?

1. A table design is shown which violates 1NF. Give a design which will store the same information but which is in 1NF.

Database Sample Examination

Unique column combinations

Transcription:

Introduction to Database Systems Normalization Werner Nutt 1 Normalization 1. Anomalies 1. Anomalies 2. Boyce-Codd Normal Form 3. 3 rd Normal Form 2

Anomalies The goal of relational schema design is to avoid anomalies and redundancy: Update anomaly : one occurrence of a fact is changed, but not all occurrences Deletion anomaly : a valid fact is lost when a tuple is deleted 3 Example of Bad Design Drinkers(name, addr, beersliked, manf, favbeer) name addr beersliked manf favbeer Janeway Voyager Bud A.B. WickedAle Janeway??? WickedAle Pete s??? Spock Enterprise Bud??? Bud Data is redundant, because each of the???s can be figured out by using the FDs name addr favbeer beersliked manf 4

This Bad Design Also Exhibits Anomalies name addr beersliked manf favbeer Janeway Voyager Bud A.B. WickedAle Janeway Voyager WickedAle Pete s WickedAle Spock Enterprise Bud A.B. Bud Update anomaly: if Janeway is transferred to Intrepid, will we remember to change each of her tuples? Deletion anomaly: If nobody likes Bud, we lose track of the fact that Anheuser-Busch manufactures Bud. 5 Normalization 2. Boyce-Codd Normal Form 1. Anomalies 2. Boyce-Codd Normal Form 3. 3 rd Normal Form 6

Boyce-Codd Normal Form A relation R is in Boyce-Codd Normal Form (BCNF) if whenever X A is a nontrivial FD that holds in R, then X is a superkey Remember: nontrivial means A X a superkey is any superset of a key (not necessarily a strict superset) Each attribute must describe the key, the whole key, and nothing but the key 7 Example Drinkers(name, addr, beersliked, manf, favbeer) FDs: name addr favbeer, beersliked manf The only key is {name, beersliked} In each FD above, the left side is not a superkey Any one of these FDs shows Drinkers is not in BCNF Each of the above FDs is a partial dependency, i.e., the right side depends only on a part of the key 8

Another Example Beers(name, manf, manfaddr) FD s: name manf, manf manfaddr The only key is {name} name manf does not violate BCNF, but manf manfaddr does The second FD is a transitive dependency, because manfaddr depend on manf, which is not part of any key 9 Decomposition into BCNF Given: relation R with FDs F Goal: decompose R into relations R 1,,R m such that each R i is a projection of R each R i is in BCNF (wrt the projection of F) R is the natural join of R 1,,R m Intuition: R is broken into pieces that contain the same information as R, but are free of redundancy 10

The Algorithm: Divide and Conquer Look in F for an FD X B that violates BCNF (If any FD following from F violates BCNF, then there is surely an FD in F itself that violates BCNF) Compute X + (X + does not contain all attributes of R, otherwise X would be superkey) Decompose R using X B, i.e., replace R by relations with schemas R 1 = X + R 2 = (R X + ) X Compute the projections F 1, F 2 of F on R 1, R 2 Continue with R 1,F 1 and R 2,F 2 11 Decomposition Picture X + R 1 R-X + X X + -X R 2 R 12

Example Drinkers(name, addr, beersliked, manf, favbeer) F = {name addr, name favbeer, beersliked manf} Pick the BCNF violation name addr Close the left side: {name} + = {name, addr, favbeer} Decomposed relations: Drinkers1(name, addr, favbeer) Drinkers2(name, beersliked, manf) 13 Projecting FDs: Example (cntd) For Drinkers1(name, addr, favbeer), the only relevant FDs are name addr and name favbeer: the only key is {name} Drinkers1 is in BCNF For Drinkers2(name, beersliked, manf), the only relevant FD is beersliked manf: the only key is {name, beersliked} violation of BCNF (because there is partial dependency) Continue with Drinkers2 14

Example (cntd) Drinkers2(name, beersliked, manf) F 2 = {beersliked manf} Pick the BCNF violation beersliked manf Close the left side: {beersliked} + = {beersliked, manf} Decomposed relations: Drinkers3(beersLiked, manf) Drinkers4(name, beersliked) 15 Example (cntd) Projecting FDs: For Drinkers3(beersLiked, manf), the only relevant FD is beersliked manf: the only key is {beersliked} Drinkers3 is in BCNF For Drinkers4(name, beersliked), there is no relevant FD: the only key is {name, beersliked} Drinkers4 is in BCNF 16

Example (concluded) We have decomposed Drinkers(name, addr, beersliked, manf, favbeer) into Drinkers1(name, addr, favbeer) Drinkers3(beersLiked, manf) Drinkers4(name, beersliked) Notice: Drinkers1 is about drinkers Drinkers3 is about beers Drinkers4 is about the relationship between drinkers and the beers they like 17 Normalization 3. 3rd Normal Form 1. Anomalies 2. Boyce-Codd Normal Form 3. 3 rd Normal Form 18

Third Normal Form Motivation There is one structure of FDs that causes trouble when we decompose: AB C and C B Examples: A = street address, B = city, C = zip code A = lecturer, B = hour, C = course There are two keys, { A, B } and { A, C } C B is a BCNF violation, so we must decompose into AC BC 19 We Cannot Enforce FDs If we decompose ABC into AC and BC, then we cannot enforce AB C by checking FDs in the decomposed relations Example with A = street, B = city C = zip on the next slide 20

An Unenforceable FD street zip 545 Tech Sq. 02138 545 Tech Sq. 02139 city zip Cambridge 02138 Cambridge 02139 Join tuples with equal zip codes. street city zip 545 Tech Sq. Cambridge 02138 545 Tech Sq. Cambridge 02139 Although no FDs were violated in the decomposed relations, FD street city zip is violated by the database as a whole 21 3NF Lets Us Avoid This Problem 3 rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation An attribute is prime if it is a member of any key X A violates 3NF if and only if X is not a superkey and also A is not prime 22

Back to the ABC Example On ABC, we have FDs AB C and C B There are two keys: AB and AC A, B, and C are each prime Although C B violates BCNF, it does not violate 3NF 23 What 3NF and BCNF Give You There are two important properties of a decomposition: Losslessness: It should be possible to project the original relation onto the decomposed schema and then reconstruct the original by a natural join Dependency Preservation: It should be possible to check in the projected relations whether all the given FDs are satisfied 24

3NF and BCNF Continued We can get (1) with a BCNF decomposition We can get both (1) and (2) with a 3NF decomposition But we can t always get (1) and (2) with a BCNF decomposition street-city-zip is an example 25 Exercise Consider the relation PI (= passengerinfo) with the attributes PI(FlightNo, Date, DepartureTime, SeatNo, TicketNo, Name, Address, LuggageId, Weight) and the FDs FlightNo, Date DepartureTime FlightNo, Date, TicketNo SeatNo TicketNo Name Address LuggageId Weight Date TicketNo Is the relation in Boyce-Codd normal form? If not, decompose into relation that are in BCNF. Is the resulting decomposition dependency preserving? 26

Exercise Let R be a relation with attributes ABCD. Consider the following combinations of FDs on R: AB C, C D, D A B C, B D AB C, BC D, CD A, AD B A B, B C, C D, D A For each collection of FDs do the following: 1. Indicate all the BCNF violations 2. Decompose the relations into collections of relations that are in BCNF 3. Are the decompositions dependency preserving? 27 References In preparing the lectures I have used several sources. The main ones are the following: Books: A First Course in Database Systems, by J. Ullman and J. Widom Fundamentals of Database Systems, by R. Elmasri and S. Navathe Slides: The slides of this chapter are based on slides prepared by Jeff Ullman for his introductory course on database systems at Stanford University 28