Introduction to Databases, Fall 2005 IT University of Copenhagen. Lecture 5: Normalization II; Database design case studies. September 26, 2005




Lecturer: Rasmus Pagh

Today's lecture
Normalization II:
- 3rd normal form.
- Multivalued dependencies.
- 4th normal form.
- Some observations on normalization.
Case studies in database design:
- Internet bookstore.
- TV series database.

Next: 3rd normal form

Interrelation dependencies
Consider the relation with schema Bookings(title, theater, city). Under certain assumptions it has the FD theater → city, but theater is not a superkey. The BCNF decomposition yields the relation schemas Bookings1(theater, city) and Bookings2(theater, title). These schemas and their FDs allow, e.g., the relation instances:

Bookings1
theater   city
Guild     Menlo Park
Park      Menlo Park

Bookings2
theater   title
Guild     The net
Park      The net

The join of these instances contains two tuples with the same title and city but different theaters, which violates the presumed FD title city → theater. Thus, there are implicit dependencies between values in different relations. We cannot detect a violation of such a dependency by checking FDs separately in each relation.
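The following is a minimal Python sketch of this observation, not part of the original slides. It uses the slide's instances (theaters Guild and Park, both in Menlo Park, both showing The net); the helper name satisfies_fd is an illustrative assumption. Each relation satisfies its own FD, yet the join violates title city → theater.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Instances of the decomposed relations, as on the slide.
b1 = [{"theater": "Guild", "city": "Menlo Park"},
      {"theater": "Park", "city": "Menlo Park"}]
b2 = [{"theater": "Guild", "title": "The net"},
      {"theater": "Park", "title": "The net"}]

# Natural join on the shared attribute theater.
joined = [{**r1, **r2} for r1 in b1 for r2 in b2
          if r1["theater"] == r2["theater"]]

print(satisfies_fd(b1, ["theater"], ["city"]))               # True
print(satisfies_fd(joined, ["title", "city"], ["theater"]))  # False
```

The second check needs data from both relations, which is exactly why the dependency cannot be enforced by looking at Bookings1 or Bookings2 alone.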

Splitting keys
As we just saw, decomposition can result in a relational database schema in which a functional dependency has disappeared. The problem in the previous example arose because we decomposed according to the FD theater → city, where city is part of a key for the Bookings relation. Thus we ended up splitting the key {title, city}. This problem of FDs that are not preserved never arises if we do not decompose in this case.

Third normal form
We have motivated the following normal form, which never splits a key of the original relation.
An attribute that is a member of some key is called prime.
A relation is in 3rd normal form (3NF) if any nontrivial functional dependency among its attributes is either unavoidable (its left-hand side is a superkey), or has a prime attribute on the right-hand side.
In words: A relation is in 3NF if there are no avoidable functional dependencies with a non-prime attribute on the right-hand side.
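As an illustration of the definition, here is a small Python sketch that tests the Bookings schema against BCNF and 3NF. It is not from the slides: the function names (closure, candidate_keys, violations) are assumptions, the key search is brute force and only suitable for small schemas, and only the listed FDs are inspected.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            # Smaller keys are found first, so supersets of a key are skipped.
            if closure(s, fds) == set(schema) and not any(k <= s for k in keys):
                keys.append(frozenset(s))
    return keys


def violations(schema, fds, allow_prime_rhs):
    """Listed FDs violating BCNF (allow_prime_rhs=False) or 3NF (True)."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(schema):
            continue  # unavoidable: the left-hand side is a superkey
        for a in rhs - lhs:  # skip the trivial part of the FD
            if not (allow_prime_rhs and a in prime):
                bad.append((set(lhs), a))
    return bad


schema = {"title", "theater", "city"}
fds = [
    (frozenset({"theater"}), frozenset({"city"})),
    (frozenset({"title", "city"}), frozenset({"theater"})),
]
print(violations(schema, fds, allow_prime_rhs=False))  # BCNF check
print(violations(schema, fds, allow_prime_rhs=True))   # 3NF check
```

For Bookings the keys are {title, city} and {theater, title}, so every attribute is prime: theater → city is reported as a BCNF violation, while the 3NF check reports nothing.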

When to stop decomposition at 3NF?
Whether it is a good idea to stop decomposition when third normal form is reached depends on the specific scenario.
Mostly, 3NF and BCNF coincide, so there is nothing to consider. If not, the redundancy among tuples in 3NF should be weighed against the fact that some FD is difficult to check/maintain in BCNF.
Example: In the Bookings example, we might want to make the DBMS check that for every title and city, there is at most one theater. For the BCNF decomposed relations, this would involve a query on Bookings1 for every change of Bookings2, and vice versa.

Next: Multivalued dependencies.

Redundancy in BCNF relations
Boyce-Codd normal form eliminates redundancy in each tuple, but may leave redundancy among tuples in a relation. This happens, for example, if two many-many relationships are represented in a relation.
[Figure 3.29 shown on slide]
Example: In the relation StarsIn(name, street, city, title, year) we could represent two many-many relationships: between actors and addresses, and between actors and movies.

Curing it with NULL values?
Then what about something like one of these:

name        street          city        title                 year
C. Fisher   123 Maple St.   Hollywood   NULL                  NULL
C. Fisher   5 Locust Ln.    Malibu      NULL                  NULL
C. Fisher   NULL            NULL        Star Wars             1977
C. Fisher   NULL            NULL        Empire Strikes Back   1980
C. Fisher   NULL            NULL        Return of the Jedi    1983

name        street          city        title                 year
C. Fisher   123 Maple St.   Hollywood   Star Wars             1977
C. Fisher   5 Locust Ln.    Malibu      Empire Strikes Back   1980
C. Fisher   NULL            NULL        Return of the Jedi    1983

Problem session (5 minutes): Criticize the above solutions.

Decomposition
A better idea is to eliminate redundancy by decomposing StarsIn as follows:

name        street          city
C. Fisher   123 Maple St.   Hollywood
C. Fisher   5 Locust Ln.    Malibu

name        title                 year
C. Fisher   Star Wars             1977
C. Fisher   Empire Strikes Back   1980
C. Fisher   Return of the Jedi    1983

When can we decompose?
When can we decompose a relation R? Suppose we decompose into two relations (for simplicity we assume that there is just one common attribute):
R1(A, B1, B2, ..., Bm)
R2(A, C1, C2, ..., Ck)
Now consider a specific value a for attribute A, occurring in the set of tuples T1 from R1 and in the set of tuples T2 from R2. When we join R1 and R2, every tuple from T1 is combined with every tuple from T2. Hence the decomposition reproduces R only if R contains all of these combinations.
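A short Python sketch of this test, under the assumption that a relation is given as a list of dicts: project R onto the two attribute sets, join the projections, and compare with R. The function names and the sample values (x, y, p, q) are hypothetical and not taken from the slides.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(r1, r2):
    """Join two projections (sets of (attribute, value) tuples) on shared attributes."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out


def decomposes(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    original = {frozenset(row.items()) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# For A = 1 the relation holds every combination of B in {x, y} and C in {p, q}.
full = [{"A": 1, "B": b, "C": c} for b in "xy" for c in "pq"]
print(decomposes(full, ["A", "B"], ["A", "C"]))      # True
# With one combination missing, the join produces a tuple that was not in R.
print(decomposes(full[:3], ["A", "B"], ["A", "C"]))  # False
```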

When can we decompose (2)?
Example: [tables shown on slide; extracted cell values: a c a b 1 N 1 S 2 U 2 D 1 E 1 W 1 NE 1 NW 1 SE 1 SW 2 45 2 90]

Multivalued dependencies
When we can decompose R into relations
R1(A1, A2, ..., An, B1, B2, ..., Bm)
R2(A1, A2, ..., An, C1, C2, ..., Ck)
(with no Bs among the Cs) then we say that there is a multivalued dependency (MVD) from the As to the Bs, written
A1 A2...An →→ B1 B2...Bm
Example: Since StarsIn can be decomposed into StarsIn1(name, street, city) and StarsIn2(name, title, year), it has the MVD name →→ street city.

Multi-valued dependencies, book's definition
A1 A2...An →→ B1 B2...Bm holds exactly if:
For every pair of tuples t and u from R that agree on all As, we can find some tuple v in R that agrees:
- with both t and u on the As,
- with t on the Bs,
- with u on the Cs.
[Figure 3.30 shown on slide]
Homework: Convince yourselves that this definition, used in the coursebook, is equivalent to the one given previously in this lecture.
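The tuple-based definition translates almost directly into code. The sketch below is an illustration rather than course material: satisfies_mvd is a hypothetical helper, and the StarsIn instance is constructed here as every combination of the two addresses and three movies listed for C. Fisher.

```python
def satisfies_mvd(rows, a_attrs, b_attrs, all_attrs):
    """Check the MVD a_attrs ->> b_attrs on rows (a list of dicts)."""
    # The Cs are the attributes that are neither As nor Bs.
    c_attrs = [x for x in all_attrs if x not in a_attrs and x not in b_attrs]
    present = {tuple(row[x] for x in all_attrs) for row in rows}
    for t in rows:
        for u in rows:
            if any(t[x] != u[x] for x in a_attrs):
                continue  # t and u do not agree on the As
            # v agrees with t on the As and Bs, and with u on the Cs.
            v = {**{x: t[x] for x in a_attrs + b_attrs},
                 **{x: u[x] for x in c_attrs}}
            if tuple(v[x] for x in all_attrs) not in present:
                return False
    return True


ATTRS = ["name", "street", "city", "title", "year"]
addresses = [("123 Maple St.", "Hollywood"), ("5 Locust Ln.", "Malibu")]
movies = [("Star Wars", 1977), ("Empire Strikes Back", 1980),
          ("Return of the Jedi", 1983)]
stars_in = [
    dict(zip(ATTRS, ("C. Fisher", street, city, title, year)))
    for street, city in addresses
    for title, year in movies
]

print(satisfies_mvd(stars_in, ["name"], ["street", "city"], ATTRS))       # True
print(satisfies_mvd(stars_in[:-1], ["name"], ["street", "city"], ATTRS))  # False
```

Dropping a single tuple breaks the MVD, because the missing address and movie combination can still be assembled from two remaining tuples.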

Unavoidable and trivial MVDs
If {A1, A2, ..., An} form a superkey, then for any B1, B2, ..., Bm we unavoidably have:
A1 A2...An →→ B1 B2...Bm
An MVD is said to be trivial if either
- one of the Bs is among the As, or
- all the attributes of R are among the As and Bs.

Next: 4th normal form.

4th normal form
Roughly speaking, a relation is in 4th normal form if it cannot be meaningfully decomposed into two relations.
More precisely: A relation is in fourth normal form (4NF) if any multivalued dependency among its attributes is either unavoidable or trivial.
Example: StarsIn has the MVD name →→ street city, which is nontrivial. Since name is not a superkey, the relation is not in 4NF.

Decomposing a relation into 4NF
Suppose we have a relation R which is not in 4NF. Then there is a nontrivial MVD which is not unavoidable:
A1 A2...An →→ B1 B2...Bm
To eliminate the MVD we split R into two relations:
- One with all attributes of R except B1, B2, ..., Bm.
- One with attributes A1, A2, ..., An, B1, B2, ..., Bm.
If any of the resulting relations is not in 4NF, the process is repeated.

4NF decomposition example
Recall the relation StarsIn with schema
StarsIn(name, street, city, title, year)
It has the following nontrivial MVD, which is not unavoidable:
name →→ street city
Thus the decomposition yields the following relations (both in 4NF):
StarsIn1(name, street, city)
StarsIn2(name, title, year)
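A single decomposition step can be sketched in a few lines of Python. The helper split_on_mvd is a hypothetical name, and it assumes the caller has already established that the MVD is nontrivial and not unavoidable. It only computes the two attribute lists and does not check the results for 4NF.

```python
def split_on_mvd(schema, lhs, rhs):
    """One 4NF decomposition step for the MVD lhs ->> rhs.

    Returns two attribute lists in the original attribute order:
    the schema without the Bs, and the As together with the Bs.
    """
    without_bs = [a for a in schema if a not in rhs]
    as_and_bs = [a for a in schema if a in lhs or a in rhs]
    return without_bs, as_and_bs


stars_in_schema = ["name", "street", "city", "title", "year"]
stars_in2, stars_in1 = split_on_mvd(stars_in_schema, ["name"], ["street", "city"])
print(stars_in1)  # ['name', 'street', 'city']
print(stars_in2)  # ['name', 'title', 'year']
```

In a full normalization procedure this step would be repeated on each resulting schema for as long as a violating MVD can be found.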

Next: Some observations on normalization

Relationship among normal forms
Inclusion among normal forms:
- Any relation in 4NF is also in BCNF.
- Any relation in BCNF is also in 3NF.
[Figure 3.31 shown on slide]
Properties of normal forms: A higher normal form has less redundancy, but may not preserve functional and multivalued dependencies.
[Figure 3.32 shown on slide]

How should normal forms be used?
The various normal forms may be seen as guidelines for designing a good relation schema. Some complexities that arise are:
- Should we split keys, introducing dependencies between relations (in 3NF we do not)?
- What is the effect of decomposition on performance?
- How does decomposition affect query programming?

Problem session (10 minutes)
Consider the E/R diagram shown on the slide, which is supposed to model data on sales to customers.
- Convert the E/R diagram to relations.
- Consider whether there are any 4NF violations, and if so perform the decomposition into 4NF.

Most important points in this lecture
After this week you should:
- Be able to determine whether a relation is in 3rd or 4th normal form.
- Be able to split a relation into several relations to achieve 3rd or 4th normal form.