Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications. Overview - detailed. Goal. Faloutsos CMU SCS 15-415

Similar documents
Normalization. CIS 331: Introduction to Database Systems

COSC344 Database Theory and Applications. Lecture 9 Normalisation. COSC344 Lecture 9 1

Database Design and Normalization

Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina

normalisation Goals: Suppose we have a db scheme: is it good? define precise notions of the qualities of a relational database scheme

Databases -Normalization III. (N Spadaccini 2010 and W Liu 2012) Databases - Normalization III 1 / 31

Schema Refinement and Normalization

Schema Design and Normal Forms Sid Name Level Rating Wage Hours

Relational Database Design Theory

Chapter 10. Functional Dependencies and Normalization for Relational Databases

Chapter 10. Functional Dependencies and Normalization for Relational Databases. Copyright 2007 Ramez Elmasri and Shamkant B.

Theory of Relational Database Design and Normalization

Relational Database Design: FD s & BCNF

CS143 Notes: Normalization Theory

Introduction to Databases, Fall 2005 IT University of Copenhagen. Lecture 5: Normalization II; Database design case studies. September 26, 2005

Introduction to Database Systems. Normalization

Relational Database Design

Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases

Schema Refinement, Functional Dependencies, Normalization

CSCI-GA Database Systems Lecture 7: Schema Refinement and Normalization

CS 377 Database Systems. Database Design Theory and Normalization. Li Xiong Department of Mathematics and Computer Science Emory University

Design of Relational Database Schemas

DATABASE NORMALIZATION

Lecture Notes on Database Normalization

Database Management Systems. Redundancy and Other Problems. Redundancy

RELATIONAL DATABASE DESIGN

Database Management System

Functional Dependency and Normalization for Relational Databases

Normalization of database model. Pazmany Peter Catholic University 2005 Zoltan Fodroczi

Database Design and Normal Forms

Functional Dependencies and Finding a Minimal Cover

Limitations of E-R Designs. Relational Normalization Theory. Redundancy and Other Problems. Redundancy. Anomalies. Example

Theory of Relational Database Design and Normalization

Objectives of Database Design Functional Dependencies 1st Normal Form Decomposition Boyce-Codd Normal Form 3rd Normal Form Multivalue Dependencies

Chapter 7: Relational Database Design

Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010

Theory behind Normalization & DB Design. Satisfiability: Does an FD hold? Lecture 12

Week 11: Normal Forms. Logical Database Design. Normal Forms and Normalization. Examples of Redundancy

Chapter 7: Relational Database Design

Limitations of DB Design Processes

Why Is This Important? Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) Example (Contd.)

Normalization in Database Design

Unit 3.1. Normalisation 1 - V Normalisation 1. Dr Gordon Russell, Napier University

Chapter 10 Functional Dependencies and Normalization for Relational Databases

Relational Normalization Theory (supplemental material)

Functional Dependencies and Normalization

Introduction Decomposition Simple Synthesis Bernstein Synthesis and Beyond. 6. Normalization. Stéphane Bressan. January 28, 2015

Database Design and Normalization

Introduction to Database Systems. Chapter 4 Normal Forms in the Relational Model. Chapter 4 Normal Forms

C# Cname Ccity.. P1# Date1 Qnt1 P2# Date2 P9# Date9 1 Codd London Martin Paris Deen London

Theory I: Database Foundations

Quiz 3: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354

SQL DDL. DBS Database Systems Designing Relational Databases. Inclusion Constraints. Key Constraints

Normalisation 1. Chapter 4.1 V4.0. Napier University

Database Constraints and Design

DATABASE SYSTEMS. Chapter 7 Normalisation

Boyce-Codd Normal Form

Normalization in OODB Design

Functional Dependencies

Chapter 8. Database Design II: Relational Normalization Theory

Design Theory for Relational Databases: Functional Dependencies and Normalization

Part 6. Normalization

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization

Normal forms and normalization

Chapter 6. Database Tables & Normalization. The Need for Normalization. Database Tables & Normalization

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms.

Announcements. SQL is hot! Facebook. Goal. Database Design Process. IT420: Database Management and Organization. Normalization (Chapter 3)

Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline. We ll learn: Faloutsos CMU SCS

DATABASE DESIGN: NORMALIZATION NOTE & EXERCISES (Up to 3NF)

Topic 5.1: Database Tables and Normalization

Lecture 2 Normalization

Introduction to normalization. Introduction to normalization

Conceptual Design Using the Entity-Relationship (ER) Model

Chapter 5: FUNCTIONAL DEPENDENCIES AND NORMALIZATION FOR RELATIONAL DATABASES

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

Module 5: Normalization of database tables

Relational Normalization: Contents. Relational Database Design: Rationale. Relational Database Design. Motivation

Normalization. Reduces the liklihood of anomolies

Teaching Database Modeling and Design: Areas of Confusion and Helpful Hints

Database Sample Examination

Normalization for Relational DBs

How To Find Out What A Key Is In A Database Engine

EECS 647: Introduction to Database Systems

Lecture 6. SQL, Logical DB Design

If it's in the 2nd NF and there are no non-key fields that depend on attributes in the table other than the Primary Key.

Graham Kemp (telephone , room 6475 EDIT) The examiner will visit the exam room at 15:00 and 17:00.

DATABASE DESIGN: Normalization Exercises & Answers

Normalization of Database

Normalization. Normalization. Normalization. Data Redundancy

An Algorithmic Approach to Database Normalization

Fundamentals of Database System

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

Database Systems Concepts, Languages and Architectures

Database Design and the Reality of Normalisation

The University of British Columbia

Databases Model the Real World. The Entity- Relationship Model. Conceptual Design. Steps in Database Design. ER Model Basics. ER Model Basics (Contd.

Advanced Relational Database Design

MCQs~Databases~Relational Model and Normalization

Chapter 9: Normalization

Normalization. Normalization. First goal: to eliminate redundant data. for example, don t storing the same data in more than one table

Transcription:

Faloutsos 15-415 Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #17: Schema Refinement & Normalization - Normal Forms (R&G, ch. 19) Overview - detailed DB design and normalization pitfalls of bad design decomposition normal forms Faloutsos 15-415 2 Goal Design good tables sub-goal#1: define what good means sub-goal#2: fix bad tables in short: we want tables where the attributes depend on the primary key, on the whole key, and nothing but the key Let s see why, and how: Faloutsos 15-415 3 1

Faloutsos 15-415 Pitfalls takes1 (ssn, c-id, grade, name, address) Faloutsos 15-415 4 Pitfalls Bad - why? because: ssn->address, name Faloutsos 15-415 5 Redundancy space Pitfalls (inconsistencies) insertion/deletion anomalies: Faloutsos 15-415 6 2

Faloutsos 15-415 Pitfalls insertion anomaly: jones registers, but takes no class - no place to store his address! Faloutsos 15-415 7 Pitfalls deletion anomaly: delete the last record of smith (we lose his address!) Faloutsos 15-415 8 Solution: decomposition split offending table in two (or more), eg.:?? Faloutsos 15-415 9 3

Faloutsos 15-415 Overview - detailed DB design and normalization pitfalls of bad design decomposition lossless join decomp. dependency preserving normal forms Faloutsos 15-415 10 Decompositions There are bad decompositions. Good ones are: lossless and dependency preserving Faloutsos 15-415 11 Decompositions - lossy: R1(ssn, grade, name, address) R2(c-id, grade) ssn->name, address ssn, c-id -> grade Faloutsos 15-415 12 4

Faloutsos 15-415 Decompositions - lossy: can not recover original table with a join! ssn->name, address ssn, c-id -> grade Faloutsos 15-415 13 Decompositions example of non-dependency preserving S# -> address, status address -> status S# -> address S# -> status Faloutsos 15-415 14 (drill: is it lossless?) Decompositions S# -> address, status address -> status S# -> address S# -> status Faloutsos 15-415 15 5

Faloutsos 15-415 Decompositions - lossless Definition: consider schema R, with FD F. R1, R2 is a lossless join decomposition of R if we always have: An easier criterion? Faloutsos 15-415 16 Decomposition - lossless Theorem: lossless join decomposition if the joining attribute is a superkey in at least one of the new tables Formally: Faloutsos 15-415 17 example: Decomposition - lossless R1 R2 ssn, c-id -> grade ssn->name, address ssn->name, address ssn, c-id -> grade Faloutsos 15-415 18 6

Faloutsos 15-415 Overview - detailed DB design and normalization pitfalls of bad design decomposition lossless join decomp. dependency preserving normal forms Faloutsos 15-415 19 Decomposition - depend. pres. informally: we don t want the original FDs to span two tables - counter-example: S# -> address, status S# -> address S# -> status address -> status Faloutsos 15-415 20 Decomposition - depend. pres. dependency preserving decomposition: S# -> address, status address -> status S# -> address address -> status (but: S#->status?) Faloutsos 15-415 21 7

Faloutsos 15-415 Decomposition - depend. pres. informally: we don t want the original FDs to span two tables. More specifically: the FDs of the canonical cover. Faloutsos 15-415 22 Decomposition - depend. pres. why is dependency preservation good? S# -> address S# -> status S# -> address address -> status (address->status: lost ) Faloutsos 15-415 23 Decomposition - depend. pres. A: eg., record that Philly has status A S# -> address S# -> status S# -> address address -> status (address->status: lost ) Faloutsos 15-415 24 8

Faloutsos 15-415 Decomposition - conclusions decompositions should always be lossless joining attribute -> superkey whenever possible, we want them to be dependency preserving (occasionally, impossible - see STJ example later ) Faloutsos 15-415 25 Overview - detailed DB design and normalization pitfalls of bad design decomposition (-> how to fix the problem) normal forms (-> how to detect the problem) BCNF, 3NF (1NF, 2NF) Faloutsos 15-415 26 We saw how to fix bad schemas - but what is a good schema? Answer: good, if it obeys a normal form, ie., a set of rules. Typically: Boyce-Codd Normal form Faloutsos 15-415 27 9

Faloutsos 15-415 Defn.: Rel. R is in BCNF wrt F, if informally: everything depends on the full key, and nothing but the key semi-formally: every determinant (of the cover) is a candidate key Faloutsos 15-415 28 Example and counter-example: ssn->name, address ssn->name, address ssn, c-id -> grade Faloutsos 15-415 29 Formally: for every FD a->b in F a->b is trivial (a superset of b) or a is a superkey Faloutsos 15-415 30 10

Faloutsos 15-415 Theorem: given a schema R and a set of FD F, we can always decompose it to schemas R1, Rn, so that R1, Rn are in BCNF and the decompositions are lossless. (but, some decomp. might lose dependencies) Faloutsos 15-415 31 How? algorithm in book: for a relation R - for every FD X->A that violates BCNF, decompose to tables (X,A) and (R-A) - repeat recursively eg. TAKES1(ssn, c-id, grade, name, address) ssn -> name, address ssn, c-id -> grade Faloutsos 15-415 32 eg. TAKES1(ssn, c-id, grade, name, address) ssn -> name, address ssn, c-id -> grade grade ssn c-id name address Faloutsos 15-415 33 11

Faloutsos 15-415 ssn, c-id -> grade ssn->name, address ssn->name, address ssn, c-id -> grade Faloutsos 15-415 34 pictorially: we want a star shape grade ssn c-id name address :not in BCNF Faloutsos 15-415 35 pictorially: we want a star shape F A C B or D E G H Faloutsos 15-415 36 12

Faloutsos 15-415 or a star-like: (eg., 2 cand. keys): STUDENT(ssn, st#, name, address) ssn name address = ssn name address st# st# Faloutsos 15-415 37 but not: B F A or D G D E C H Faloutsos 15-415 38 consider the classic case: STJ( Student, Teacher, subject) T-> J S,J -> T is it BCNF? S J T Faloutsos 15-415 39 13

Faloutsos 15-415 STJ( Student, Teacher, subject) T-> J S,J -> T How to decompose it to BCNF? S J T Faloutsos 15-415 40 STJ( Student, Teacher, subject) T-> J S,J -> T 1) R1(T,J) R2(S,J) (BCNF? - lossless? - dep. pres.? ) 2) R1(T,J) R2(S,T) (BCNF? - lossless? - dep. pres.? ) Faloutsos 15-415 41 STJ( Student, Teacher, subject) T-> J S,J -> T 1) R1(T,J) R2(S,J) (BCNF? Y+Y - lossless? N - dep. pres.? N ) 2) R1(T,J) R2(S,T) (BCNF? Y+Y - lossless? Y - dep. pres.? N ) Faloutsos 15-415 42 14

Faloutsos 15-415 STJ( Student, Teacher, subject) T-> J S,J -> T in this case: impossible to have both BCNF and dependency preservation Welcome 3NF! Faloutsos 15-415 43 STJ( Student, Teacher, subject) T-> J S,J -> T S J T informally, 3NF forgives the red arrow in the can. cover Faloutsos 15-415 44 STJ( Student, Teacher, subject) T-> J S,J -> T Formally, a rel. R with FDs F is in 3NF if: for every a->b in F: S J T it is trivial or a is a superkey or b: part of a candidate key Faloutsos 15-415 45 15

Faloutsos 15-415 how to bring a schema to 3NF? two algo s in book: First one: start from ER diagram and turn to tables then we have a set of tables R1,... Rn which are in 3NF for each FD (X->A) in the cover that is not preserved, create a table (X,A) Faloutsos 15-415 46 how to bring a schema to 3NF? two algo s in book: Second one ( synthesis ) take all attributes of R for each FD (X->A) in the cover, add a table (X,A) if not lossless, add a table with appropriate key Faloutsos 15-415 47 Example: R: ABC F: A->B, C->B Q1: what is the cover? Q2: what is the decomposition to 3NF? Faloutsos 15-415 48 16

Faloutsos 15-415 Example: R: ABC F: A->B, C->B Q1: what is the cover? A1: F is the cover Q2: what is the decomposition to 3NF? Faloutsos 15-415 49 Example: R: ABC F: A->B, C->B Q1: what is the cover? A1: F is the cover Q2: what is the decomposition to 3NF? A2: R1(A,B), R2(C,B),... [is it lossless??] Faloutsos 15-415 50 Example: R: ABC F: A->B, C->B Q1: what is the cover? A1: F is the cover Q2: what is the decomposition to 3NF? A2: R1(A,B), R2(C,B), R3(A,C) Faloutsos 15-415 51 17

Faloutsos 15-415 vs BCNF If R is in BCNF, it is always in 3NF (but not the reverse) In practice, aim for BCNF; lossless join; and dep. preservation if impossible, we accept 3NF; but insist on lossless join and dep. preservation Faloutsos 15-415 52 Normal forms - more details why 3 NF? what is 2NF? 1NF? 1NF: attributes are atomic (ie., no setvalued attr., a.k.a. repeating groups ) not 1NF Faloutsos 15-415 53 Normal forms - more details 2NF: 1NF and non-key attr. fully depend on the key counter-example: TAKES1(ssn, c-id, grade, name, address) ssn -> name, address ssn, c-id -> grade grade ssn c-id name address Faloutsos 15-415 54 18

Faloutsos 15-415 Normal forms - more details 3NF: 2NF and no transitive dependencies counter-example: A D B in 2NF, but not in 3NF C Faloutsos 15-415 55 Normal forms - more details 4NF, multivalued dependencies etc: IGNORE in practice, E-R diagrams usually lead to tables in BCNF Faloutsos 15-415 56 Overview - conclusions DB design and normalization pitfalls of bad design decompositions (lossless, dep. preserving) normal forms (BCNF or 3NF) everything should depend on the key, the whole key, and nothing but the key Faloutsos 15-415 57 19