Relational Database Design: FD s & BCNF



Similar documents
Database Design and Normalization

Lecture Notes on Database Normalization

Schema Design and Normal Forms Sid Name Level Rating Wage Hours

Schema Refinement and Normalization

CS143 Notes: Normalization Theory

Design of Relational Database Schemas

Database Design and Normal Forms

Schema Refinement, Functional Dependencies, Normalization

Relational Database Design

Functional Dependencies and Finding a Minimal Cover

Why Is This Important? Schema Refinement and Normal Forms. The Evils of Redundancy. Functional Dependencies (FDs) Example (Contd.)

Functional Dependencies and Normalization

Functional Dependencies

Chapter 7: Relational Database Design

Relational Database Design Theory

Week 11: Normal Forms. Logical Database Design. Normal Forms and Normalization. Examples of Redundancy

Database Management Systems. Redundancy and Other Problems. Redundancy

Databases -Normalization III. (N Spadaccini 2010 and W Liu 2012) Databases - Normalization III 1 / 31

Chapter 7: Relational Database Design

Advanced Relational Database Design

Theory of Relational Database Design and Normalization

Chapter 8. Database Design II: Relational Normalization Theory

SQL DDL. DBS Database Systems Designing Relational Databases. Inclusion Constraints. Key Constraints

Introduction to Database Systems. Normalization

SQL Tables, Keys, Views, Indexes

Theory behind Normalization & DB Design. Satisfiability: Does an FD hold? Lecture 12

Limitations of E-R Designs. Relational Normalization Theory. Redundancy and Other Problems. Redundancy. Anomalies. Example

Introduction to Databases, Fall 2005 IT University of Copenhagen. Lecture 5: Normalization II; Database design case studies. September 26, 2005

Chapter 5: FUNCTIONAL DEPENDENCIES AND NORMALIZATION FOR RELATIONAL DATABASES

Chapter 10. Functional Dependencies and Normalization for Relational Databases

COSC344 Database Theory and Applications. Lecture 9 Normalisation. COSC344 Lecture 9 1

Normalization in Database Design

Introduction Decomposition Simple Synthesis Bernstein Synthesis and Beyond. 6. Normalization. Stéphane Bressan. January 28, 2015

6.830 Lecture PS1 Due Next Time (Tuesday!) Lab 1 Out today start early! Relational Model Continued, and Schema Design and Normalization

normalisation Goals: Suppose we have a db scheme: is it good? define precise notions of the qualities of a relational database scheme

Functional Dependency and Normalization for Relational Databases

Database Management System

Database Systems Concepts, Languages and Architectures

Introduction to Database Systems. Chapter 4 Normal Forms in the Relational Model. Chapter 4 Normal Forms

CSCI-GA Database Systems Lecture 7: Schema Refinement and Normalization

How To Find Out What A Key Is In A Database Engine

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1

Normalisation. Why normalise? To improve (simplify) database design in order to. Avoid update problems Avoid redundancy Simplify update operations

Theory of Relational Database Design and Normalization

Determination of the normalization level of database schemas through equivalence classes of attributes

Chapter 10 Functional Dependencies and Normalization for Relational Databases

Limitations of DB Design Processes

Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases

SQL NULL s, Constraints, Triggers

Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina

Relational Normalization Theory (supplemental material)

CS 377 Database Systems. Database Design Theory and Normalization. Li Xiong Department of Mathematics and Computer Science Emory University

Lecture 6. SQL, Logical DB Design

Objectives of Database Design Functional Dependencies 1st Normal Form Decomposition Boyce-Codd Normal Form 3rd Normal Form Multivalue Dependencies

Relational Normalization: Contents. Relational Database Design: Rationale. Relational Database Design. Motivation

The Relational Model. Why Study the Relational Model?

Normalization of database model. Pazmany Peter Catholic University 2005 Zoltan Fodroczi

Boyce-Codd Normal Form

Lecture 2 Normalization

Chapter 10. Functional Dependencies and Normalization for Relational Databases. Copyright 2007 Ramez Elmasri and Shamkant B.

DATABASE NORMALIZATION

Database Constraints and Design

Normalization. CIS 331: Introduction to Database Systems

Database Sample Examination

Outline. Data Modeling. Conceptual Design. ER Model Basics: Entities. ER Model Basics: Relationships. Ternary Relationships. Yanlei Diao UMass Amherst

EECS 647: Introduction to Database Systems

The core theory of relational databases. Bibliography

Database Design. Marta Jakubowska-Sobczak IT/ADC based on slides prepared by Paula Figueiredo, IT/DB

The Relational Model. Why Study the Relational Model? Relational Database: Definitions. Chapter 3

SQL Programming. CS145 Lecture Notes #10. Motivation. Oracle PL/SQL. Basics. Example schema:

C# Cname Ccity.. P1# Date1 Qnt1 P2# Date2 P9# Date9 1 Codd London Martin Paris Deen London

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. Introduction to Databases. Why databases? Why not use XML?

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Entity Relationship Diagram

Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010

DATABASE DESIGN: NORMALIZATION NOTE & EXERCISES (Up to 3NF)

Part I: Entity Relationship Diagrams and SQL (40/100 Pt.)

Announcements. SQL is hot! Facebook. Goal. Database Design Process. IT420: Database Management and Organization. Normalization (Chapter 3)

3. Database Design Functional Dependency Introduction Value in design Initial state Aims

Quiz 3: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354

Normalisation in the Presence of Lists

Chapter 5: Logical Database Design and the Relational Model Part 2: Normalization. Introduction to Normalization. Normal Forms.

Conceptual Design Using the Entity-Relationship (ER) Model

Introduction to Databases

CS2Bh: Current Technologies. Introduction to XML and Relational Databases. The Relational Model. The relational model

model license person owns car report-number participated damage-amount E-R diagram for a Car-insurance company.

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer

Design Theory for Relational Databases: Functional Dependencies and Normalization

DATABASE MANAGEMENT SYSTEMS. Question Bank:

Database Design and Normalization

There are five fields or columns, with names and types as shown above.

Question 1. Relational Data Model [17 marks] Question 2. SQL and Relational Algebra [31 marks]

MCQs~Databases~Relational Model and Normalization

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

A Web-Based Environment for Learning Normalization of Relational Database Schemata

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

Entity Relationship Model

Transcription:

CS145 Lecture Notes #5 Relational Database Design: FD s & BCNF Motivation Automatic translation from E/R or ODL may not produce the best relational design possible Sometimes database designers like to start directly with a relational design, in which case the design could be really bad Notation,,... denote relations denotes the set of all attributes in,,... denote attributes,,... denote sets of attributes Functional Dependencies A functional dependency (FD) has the form, where and are sets of attributes in a relation Formally, means that whenever two tuples in agree on all the attributes of, they must also agree on all the attributes of FD s in Student(SID, SS#, name, CID, grade) Some FD s are more interesting than others: Trivial FD: where is a subset of Nontrivial FD: where is not a subset of Completely nontrivial FD: where and do not overlap Jun Yang 1 CS145 Spring 1999

Once we declare that an FD holds for a relation, this FD becomes a part of the relation schema Every instance of must satisfy this FD This FD should better make sense in the real world! A particular instance of may coincidentally satisfy some FD But this FD may not hold for in general name SID in Student? FD s are closely related to: Multiplicity of relationships Queens, Overlords, Zerglings Keys SID, CID is a key of Student Another definition of key: A set of attributes is a key for if (1) ; i.e., is a superkey (2) No proper subset of satisfies (1) Closures of Attribute Sets Given, a set of FD s that holds in, and a set of attributes in : The closure of with respect to (denoted ) is the set of all attributes that are functionally determined by Yet another definition of key: A set of attributes is a key for if (1) ; i.e., is a superkey (2) No proper subset of satisfies (1) Question: Given and, what is the closure of? Start with If is a given FD and is already inside the closure, then also add to the closure Repeat until the closure cannot be changed SID CID Student Jun Yang 2 CS145 Spring 1999

Question: Given and, what are the keys of? Brute-force approach: for every subset of, compute its closures and see if it covers Trick: start with small subsets; if, no need to try any superset of Trick: if does not appear on the right-hand side of any FD, then every key must contain what are the keys of Student? Closures of FD Sets Given and a set of FD s that holds in : The closure of in (denoted ) is the set of all FD s in that are logically implied by Question: Given and, is implied by? (Or, given and, is in?) Method 1: compute and check if it contains Method 2: try to prove using Armstrong s Axioms: Reflexivity: if, then Augmentation: if, then for any set Transitivity: if and, then or using other rules that follow from the axioms: Splitting: if, then and Combining: if and, then prove that SS# CID name grade Jun Yang 3 CS145 Spring 1999

Basis When specifying FD s for a relation : Obviously we do not want to list all FD s that hold in Instead, it suffices to specify a set of FD s from which all other FD s will follow logically; this set of FD s is a basis for the FD s in In fact, we should specify a minimal basis Every FD in the minimal basis is necessary; it cannot be proven using other FD s in the minimal basis Sounds tough, but in practice the minimality comes naturally There might be multiple minimal bases what is a minimal basis for the FD s in Student? BCNF (Boyce-Codd Normal Form) A relation is in BCNF if: For every nontrivial FD in, is a superkey In other words: All FD s follow from the fact key everything Intuition: When an FD is not of the form superkey other attributes, then there is typically an attempt to cram too much into one relation; this relation needs to be decomposed SID SS# is a BCNF violation the SID/SS# association is repeated multiple times BCNF Decomposition Algorithm Start with the relation in question Repeat until no BCNF violation can be found in any of your relations: Find a BCNF violation in Decompose into two relations: One with as its attributes (i.e., everything in the FD) One with as its attributes (i.e., left side of the FD plus everything not in the FD) Jun Yang 4 CS145 Spring 1999

Students(SID,SS#,name,CID,grade) SID SS# SS# name SS# SID SID CID grade In general, you may need to decompose several times To check for BCNF violations in, we need to know: All keys of A basis for the FD s that hold in Do we need to check any FD that is not in the basis but follows from the basis? No. If there is no BCNF violation in a basis, then there is no BCNF violation at all (why?) After the first iteration, the algorithm requires FD s to be projected onto smaller relations Be careful when deriving an FD basis for a smaller relation: don t miss any FD that follows from the FD s in the original relation (see textbook for an exhaustive algorithm; can usually do it with common sense though) SID name An optimization: instead of decomposing on any BCNF violation, decompose on This strategy avoids excessive fragmentation decompose on SID SS# name instead of SID SS# BCNF Good Design? BCNF removes all redundancies caused by FD s BCNF can decompose relations too much and complicate queries and constraint enforcement if we decompose Student on SID SS#, it will be difficult to enforce SS# name BCNF does not remove all redundancies in general Student(SID, club, CID) has no FD s, but still redundancy Actually this example is not good: it turns out that we can enforce SS# name by enforcing SS# SID and SID name independently in two different relations. For an example that makes more sense, stay tuned for the next lecture on the theory of decomposition. Jun Yang 5 CS145 Spring 1999