Normalization. CIS 331: Introduction to Database Systems





Normalization: Reminder
Why do we need to normalize?
- To avoid redundancy (less storage space is needed, and the data stays consistent)
- To avoid update/delete anomalies
Vladimir Vacic, Temple University

Normalization: Reminder

Ssn  c-id   Grade  Name   Address
123  cs331  A      smith  Main
123  cs351  B      smith  Main
123  cs211  A      smith  Main

Clearly, Name and Address are redundant: the relation is larger than it needs to be, and updating the Address means updating 3 rows.

Normalization: Reminder

Ssn  c-id   Grade  Name   Address
123  cs331  A      smith  Main
234  null   null   jones  Forbes

Insertion anomaly: we cannot record Jones's address because he is not taking any classes.

Normalization: Reminder

Ssn  c-id   Grade  Name   Address
123  cs331  A      jones  123 Main
124  cs351  B      smith  124 Broad
125  cs211  A      shmoo  42 Penn

Delete anomaly: we cannot delete Shmoo's enrolment without losing his address as well.

Normal Forms
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Fourth Normal Form (4NF)
- Fifth Normal Form (5NF)
(so far conveniently named)
- Boyce-Codd Normal Form (BCNF)

First Normal Form (1NF)
1NF: all attributes are atomic ("no repeating groups")

Last Name        First Name
Smith            Peter, Mary, John
Rumpelstiltskin  Anne, Michael

Not in 1NF

First Normal Form (1NF)

Last Name        First Name
Smith            Peter
Smith            Mary
Smith            John
Rumpelstiltskin  Anne
Rumpelstiltskin  Michael

Normalized to 1NF

First Normal Form (1NF)

Name      Weight
Michael   187 lb
Raphael   192 lb
Gabriel   201 lb
Uriel     165 lb
Metatron  195 kg

Not in 1NF

First Normal Form (1NF)

Name      Weight  Unit
Michael   187     lb
Raphael   192     lb
Gabriel   201     lb
Uriel     165     lb
Metatron  195     kg

Normalized to 1NF

First Normal Form (1NF)

Supplier  Part
S1        P1, P3, P4
S2        P3
S3        P2, P3

Not in 1NF

First Normal Form (1NF)
Is this relation in 1NF?

Supplier  Part1  Part2  Part3
S1        P1     P3     P4
S2        P3     null   null
S3        P2     P3     null

Formally yes, but in essence, NO!

First Normal Form (1NF)

Supplier  Part
S1        P1
S1        P3
S1        P4
S2        P3
S3        P2
S3        P3

Normalized to 1NF
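The step from the multi-valued Part column to one row per part can be sketched in a few lines of Python. This is only an illustration: the in-memory list and the function name `to_1nf` are assumptions, not part of the slides.

```python
# Hypothetical in-memory copy of the slide's Supplier/Part table, where
# Part holds a comma-separated list (not atomic, so not in 1NF).
unnormalized = [
    ("S1", "P1, P3, P4"),
    ("S2", "P3"),
    ("S3", "P2, P3"),
]


def to_1nf(rows):
    """Emit one (supplier, part) row per part so every value is atomic."""
    result = []
    for supplier, parts in rows:
        for part in parts.split(","):
            result.append((supplier, part.strip()))
    return result


normalized = to_1nf(unnormalized)
for row in normalized:
    print(row)
```

Each output row holds a single part, which is why (Supplier, Part) together can serve as the key of the normalized relation.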

Second Normal Form (2NF)
2NF: 1NF, and all non-key attributes are fully dependent on the primary key (PK) ("no partial dependencies")

Student  Course_ID  Grade  Address
Erik     CIS331     A      80 Ericsson Av.
Sven     CIS331     B      12 Olafson St.
Inge     CIS331     C      192 Odin Blvd.
Hildur   CIS362     A      212 Reykjavik St.

Not in 2NF: Address depends on Student alone, which is only part of the key (Student, Course_ID).

Second Normal Form (2NF)

Student  Address
Erik     80 Ericsson Av.
Sven     12 Olafson St.
Inge     192 Odin Blvd.
Hildur   212 Reykjavik St.

Student  Course_ID  Grade
Erik     CIS331     A
Sven     CIS331     B
Inge     CIS331     C
Hildur   CIS362     A

Normalized to 2NF

Second Normal Form (2NF)
[Figure: dependency diagram over the attributes Student, Address, Course_ID, Grade]

Third Normal Form (3NF)
3NF: 2NF and no transitive dependencies

Student  Course_ID  Grade  Grade_value
Erik     CIS331     A      4.00
Sven     CIS331     B      3.00
Inge     CIS331     C      2.00
Hildur   CIS362     A      4.00

Not in 3NF

Third Normal Form (3NF)

Student  Course_ID  Grade
Erik     CIS331     A
Sven     CIS331     B
Inge     CIS331     C
Hildur   CIS362     A

Grade  Grade_value
A      4.00
B      3.00
C      2.00

Normalized to 3NF

Third Normal Form (3NF)
[Figure: dependency diagram over the attributes Student, Course_ID, Grade, Value]

BCNF: Reminder
Informally: everything depends on the full key, and nothing but the key.
Formally: for every FD a → b in F+,
- a → b is trivial (a is a superset of b), or
- a is a superkey
(or both).
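The formal condition can be tested with attribute closures. The sketch below is a minimal illustration, not a definitive implementation: the function names are hypothetical, and the FDs for the grade example are an assumed reading of the 3NF slides. To decide whether a relation itself is in BCNF it is enough to examine the given FDs rather than all of F+; that shortcut does not carry over to the sub-relations of a decomposition.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each a tuple of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the given FDs that are non-trivial and whose left side is not a superkey."""
    relation = set(relation)
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        superkey = closure(lhs, fds) >= relation
        if not trivial and not superkey:
            violations.append((lhs, rhs))
    return violations


# Assumed FDs for the grade example: the key determines Grade,
# and Grade determines Grade_value.
RELATION = ("Student", "Course_ID", "Grade", "Grade_value")
FDS = [
    (("Student", "Course_ID"), ("Grade",)),
    (("Grade",), ("Grade_value",)),
]

violations = bcnf_violations(RELATION, FDS)
print(violations)
```

Under these assumptions the only reported violation is Grade → Grade_value, the same dependency that the 3NF decomposition moves into its own table.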

Normal forms: BCNF vs. 3NF
- If a relation is in BCNF, it is always in 3NF (but not the converse)
- In practice, aim for BCNF
- If that's impossible, we accept 3NF, but we insist on lossless join and dependency preservation

Normal forms: BCNF vs. 3NF
Let's consider the classic example: STC (Student, Teacher, Course)
- Teacher → Course
- (Student, Course) → Teacher
Is it in BCNF?

Normal forms: BCNF vs. 3NF
STC (Student, Teacher, Course)
- Teacher → Course
- (Student, Course) → Teacher
1) (Teacher, Course), (Student, Course): BCNF? Yes + Yes. Lossless? No. Dependency-preserving? No.
2) (Teacher, Course), (Student, Teacher): BCNF? Yes + Yes. Lossless? Yes. Dependency-preserving? No.
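The lossless and dependency-preservation answers for the two candidate decompositions can be checked mechanically. The following is a sketch using the standard textbook tests, with hypothetical function names: a two-way split is lossless when the shared attributes determine all of one side, and an FD is preserved when its right side can be reached by closing repeatedly within the individual parts.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary test: the common attributes must be a superkey of r1 or of r2."""
    determined = closure(set(r1) & set(r2), fds)
    return determined >= set(r1) or determined >= set(r2)


def is_preserved(fd, parts, fds):
    """Check one FD using only what can be enforced inside individual parts."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            part = set(part)
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


def preserves_all(parts, fds):
    """A decomposition is dependency-preserving if every FD is preserved."""
    return all(is_preserved(fd, parts, fds) for fd in fds)


FDS = [(("Teacher",), ("Course",)), (("Student", "Course"), ("Teacher",))]
option_1 = [("Teacher", "Course"), ("Student", "Course")]
option_2 = [("Teacher", "Course"), ("Student", "Teacher")]

for name, parts in (("1", option_1), ("2", option_2)):
    print(name, is_lossless(parts[0], parts[1], FDS), preserves_all(parts, FDS))
```

In both options the FD that cannot be enforced within a single table is (Student, Course) → Teacher, because no part contains all three attributes.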

Normal forms: BCNF vs. 3NF
STC (Student, Teacher, Course)
- Teacher → Course
- (Student, Course) → Teacher
In this case it is impossible to have both BCNF and dependency preservation, so we have to use the (weaker) 3NF.

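Normal forms: BCNF vs. 3NF
STC (Student, Teacher, Course)
- Teacher → Course
- (Student, Course) → Teacher
[Figure: dependency diagram over S, C, T with one arrow drawn in red]
Informally, 3NF "forgives" the red arrow. The FD that violates BCNF here is Teacher → Course (Teacher is not a superkey); 3NF tolerates it because Course is part of the candidate key (Student, Course).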

Normalization: Examples

Supplier  Name       Status_City  City    Part  Qty
S1        Jones      10           Paris   P3    257
S1        Jones      10           Paris   P1    500
S1        Jones      10           Paris   P4    125
S2        Spiritoso  12           London  P3    (null)
S7        Kohl       10           Paris   P4    342

Start by determining functional dependencies!

Normalization: Example
FDs:
- Supplier → Name
- Supplier → City
- City → Status_City
- Supplier → Status_City
- (Supplier, Part) → Qty
Partial dependencies:
- (Supplier, Part) → Name
- (Supplier, Part) → City
- (Supplier, Part) → Status_City
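Partial dependencies like these can be found by closing each proper subset of the key and seeing which non-key attributes it already determines. The sketch below assumes (Supplier, Part) is the only candidate key and encodes the FDs as read from this slide; `partial_dependencies` is a hypothetical helper name.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, attributes, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    key = tuple(key)
    non_key = set(attributes) - set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = determined
    return found


# FDs as read from the slide; (Supplier, Part) is assumed to be the only key.
ATTRIBUTES = ("Supplier", "Name", "Status_City", "City", "Part", "Qty")
KEY = ("Supplier", "Part")
FDS = [
    (("Supplier",), ("Name",)),
    (("Supplier",), ("City",)),
    (("City",), ("Status_City",)),
    (("Supplier", "Part"), ("Qty",)),
]

partial = partial_dependencies(KEY, ATTRIBUTES, FDS)
print(partial)
```

Supplier alone determines Name, City and Status_City, so those three attributes move out with Supplier, while Qty stays with the full key.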

Normalization: Example

Supplier  Name       Status_City  City
S1        Jones      10           Paris
S2        Spiritoso  12           London
S7        Kohl       10           Paris

Supplier  Part  Qty
S1        P1    500
S1        P3    257
S1        P4    125
S2        P3    (null)
S7        P4    342

Normalization: Example
We took care of the partial dependencies, but what about transitive dependencies?

Supplier  Name       Status_City  City
S1        Jones      10           Paris
S2        Spiritoso  12           London
S7        Kohl       10           Paris

Normalization: Example

Supplier  Part  Qty
S1        P1    500
S1        P3    257
S1        P4    125
S2        P3    (null)
S7        P4    342

Supplier  Name       City
S1        Jones      Paris
S2        Spiritoso  London
S7        Kohl       Paris

City    Status_City
Paris   10
London  12
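One way to convince yourself that this three-table design loses nothing is to project the original rows onto the three schemas and join them back. The sketch below does that in plain Python; the helper functions and the table names `shipments`, `suppliers` and `cities` are hypothetical, and (null) is represented as `None`.

```python
COLUMNS = ("Supplier", "Name", "Status_City", "City", "Part", "Qty")
ROWS = [
    ("S1", "Jones", 10, "Paris", "P3", 257),
    ("S1", "Jones", 10, "Paris", "P1", 500),
    ("S1", "Jones", 10, "Paris", "P4", 125),
    ("S2", "Spiritoso", 12, "London", "P3", None),  # (null) quantity
    ("S7", "Kohl", 10, "Paris", "P4", 342),
]
original = [dict(zip(COLUMNS, row)) for row in ROWS]


def project(rows, attrs):
    """Project onto attrs and drop duplicate tuples (set semantics)."""
    seen = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in seen:
            seen.append(projected)
    return seen


def natural_join(left, right):
    """Combine rows that agree on every attribute the two inputs share."""
    joined = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in common):
                joined.append({**l_row, **r_row})
    return joined


def as_set(rows):
    """Order-independent, hashable view of a list of dict rows."""
    return {tuple(sorted(row.items(), key=lambda kv: kv[0])) for row in rows}


shipments = project(original, ("Supplier", "Part", "Qty"))
suppliers = project(original, ("Supplier", "Name", "City"))
cities = project(original, ("City", "Status_City"))

rejoined = natural_join(natural_join(shipments, suppliers), cities)
print(as_set(rejoined) == as_set(original))
```

A check on sample data only shows that these particular rows survive the round trip; the general guarantee comes from the FDs, since Supplier is a key of the supplier table and City is a key of the city table.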

Normalization: Examples
Silberschatz et al.:
- 7.2 (lossless-join decomposition). Hint: use the theorem! A decomposition into two relations is a lossless-join decomposition if the joining attributes form a superkey in at least one of the new relations.
- 7.16 (lossless-join decomposition)
- 7.18 (dependency-preserving decomposition)

Normalization: Final Thoughts
- There are higher normal forms (4NF, 5NF), but we will not talk about them
- In practice, "normalized" means in BCNF or 3NF
- Luckily, in practice, ER diagrams lead to normalized tables (but do not rely on luck)

Normalization: Overview
Why do we normalize?
- To avoid redundancy (less storage space is needed, and the data stays consistent)
- To avoid update/delete anomalies
A good decomposition should:
- be a lossless-join decomposition (you can recover the original tables with a join)
- preserve dependencies (FDs should not span two tables)

Normalization: Overview
Boyce-Codd Normal Form (BCNF): everything should depend on the key, the whole key, and nothing but the key ("so help me Codd"; joke attributed to C.J. Date)
- 1NF (all attributes are atomic)
- 2NF (no partial dependencies)
- 3NF (no transitive dependencies)