CST221, Dr. Zhen Jiang Normalization & design (see Appendix pages 42-55)



Normalization is a process for determining which attributes go into which tables. To understand this process, we need a few definitions.

Functional dependence: Attribute B is functionally dependent on attribute A (or on a group of attributes A) if for each value of A there is only one possible value of B.
We write A ---> B and say "A determines B" or "B is functionally dependent on A".

Note: Between a parent and its children, which do we consider the more important? In a database, which side of such a relationship is the dependent one?

Consider the following tables and list the functional dependencies:
  Student(StNum, SocSecNum, StName, age)
  Emp(EmpNum, SocSecNum, Name, age, start-date, CurrSalary)
  Registration(StNum, CNum, grade)

We can now define the primary key: a single attribute (or minimal group of attributes) that functionally determines all other attributes. If more than one attribute (or group of attributes) meets this condition, they are considered candidate keys, and one is chosen as the primary key.

The candidate keys
  for the Student table are: ______
  for the Emp table are: ______
  for the Registration table are: ______

For the tables with more than one candidate key, which one do you think should be chosen as the primary key?
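The definition of functional dependence above can be checked mechanically against sample data. The following is a minimal Python sketch, not part of the original notes: the helper name fd_holds and the sample Registration rows are made up for illustration. Note that sample data can only show that a dependency fails; it can never prove that the dependency holds in the real world.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first time we see a key we remember its value;
        # afterwards every row with that key must carry the same value.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample rows for Registration(StNum, CNum, grade)
registration = [
    {"StNum": "S1", "CNum": "CST221", "grade": "A"},
    {"StNum": "S1", "CNum": "CST101", "grade": "B"},
    {"StNum": "S2", "CNum": "CST221", "grade": "B"},
]

print(fd_holds(registration, ["StNum"], ["grade"]))          # False
print(fd_holds(registration, ["StNum", "CNum"], ["grade"]))  # True
```

In this sample, student S1 has two different grades, so StNum alone cannot determine grade, while the pair (StNum, CNum) is consistent with determining it.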

Determining functional dependencies

Consider the following table:
  Emp(Enum, Ename, age, acctNum, AcctName)
What are the functional dependencies? Can we be sure?

To determine the functional dependencies we need to know about the real-world model the database is based on. For example: does each employee work on a single account, or can multiple accounts be assigned to one employee? Can an account have more than one employee assigned to it?

Suppose we know:
a) each account is assigned to one employee
b) an employee can work on more than one account

Now we can determine the functional dependencies. They are: ______

Suppose we have this information for a new housing development:
  Houses(LotNum, Address, builder, model, upgradenum, upgradename, squarefeet)
What are the functional dependencies? We can't determine them unless we know more about the real-world situation.

Suppose we know:
1) Each builder has a list of models they offer, but each model is unique to a specific builder. For example:
     Builder A may offer 3 models: the Ashley, the Brigham, and the Cambridge
     Builder B may offer 2 models: the Axton and the Braxton
2) Each model always has the same number of square feet.
3) Each specific house comes with a list of upgrades. For example:
     the house with LotNum 101 might have
       upnum 652  granite countertops
       upnum 639  hardwood floors
       upnum 622  finished basement
     the house with LotNum 122 might have
       upnum 639  hardwood floors
       upnum 622  finished basement

Now what are the functional dependencies?
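Once a set of functional dependencies has been agreed on, the candidate keys follow mechanically: a candidate key is a minimal attribute set whose closure covers every attribute. The Python sketch below illustrates this for the Emp table. The dependency list is only one possible reading of rules (a) and (b) and is an assumption made for this illustration, not an answer key; the function names closure and candidate_keys are likewise made up.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole table (brute force)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            # Skip supersets of a key already found: they are not minimal.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


emp_attrs = {"Enum", "Ename", "age", "acctNum", "AcctName"}
# ASSUMED dependencies: one reading of rules (a) and (b) above.
emp_fds = [
    (("Enum",), ("Ename", "age")),
    (("acctNum",), ("AcctName", "Enum")),
]

print(closure({"Enum"}, emp_fds))
print(candidate_keys(emp_attrs, emp_fds))  # [('acctNum',)]
```

The brute-force search is exponential in the number of attributes, which is acceptable for classroom-sized tables but not for large schemas.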

The process of normalization involves converting your tables to 1st, then 2nd, then 3rd Normal Form.

1st Normal Form

According to Date's definition of 1NF, a table is in 1NF if and only if it is "isomorphic to some relation", which means, specifically, that it satisfies the following five conditions:
1. There's no top-to-bottom ordering to the rows.
2. There's no left-to-right ordering to the columns.
3. There are no duplicate rows.
4. Every row-and-column intersection contains exactly one value from the applicable domain (and nothing else).
5. All columns are regular [i.e. rows have no hidden components such as row IDs, object IDs, or hidden timestamps].
Chris Date, "What First Normal Form Really Means", pp. 127-8 [4]

Violation of any of these conditions would mean that the table is not strictly relational, and therefore that it is not in 1NF.

Put simply, a table is in first normal form if it contains no repeating groups, that is, no attribute that holds multiple values.

Customer
Customer ID   First Name   Surname     Telephone Number
123           Robert       Ingram      555-861-2025
456           Jane         Wright      555-403-1659
                                       555-776-4100
789           Maria        Fernandez   555-808-9633

Repeating groups across rows
Storing the customer in multiple records, one per telephone number, violates the key rule: Customer ID would no longer identify a single row.

Repeating groups across columns
The designer might attempt to get around this restriction by defining multiple Telephone Number columns:

Customer
Customer ID   First Name   Surname     Tel. No. 1     Tel. No. 2     Tel. No. 3
123           Robert       Ingram      555-861-2025
456           Jane         Wright      555-403-1659   555-776-4100   555-403-1659
789           Maria        Fernandez   555-808-9633

This design has several drawbacks:
- Difficulty in querying the table. Answering such questions as "Which customers have telephone number X?" and "Which pairs of customers share a telephone number?" is awkward.
- Inability to enforce uniqueness of Customer-to-Telephone Number links through the RDBMS. Customer 789 might mistakenly be given a Tel. No. 2 value that is exactly the same as her Tel. No. 1 value.

- Restriction of the number of telephone numbers per customer to three. If a customer with four telephone numbers comes along, we are constrained to record only three and leave the fourth unrecorded. This means that the database design is imposing constraints on the business process, rather than (as should ideally be the case) vice versa.

Repeating groups within a composite value
The designer might, alternatively, retain the single Telephone Number column but alter its domain, making it a string of sufficient length to accommodate multiple telephone numbers:

Customer
Customer ID   First Name   Surname     Telephone Numbers
123           Robert       Ingram      555-861-2025
456           Jane         Wright      555-403-1659, 555-776-4100
789           Maria        Fernandez   555-808-9633

- The Telephone Numbers heading becomes semantically woolly, as it can now represent either a telephone number, a list of telephone numbers, or indeed anything at all.
- A query such as "Which pairs of customers share a telephone number?" is more difficult to formulate, given the necessity to cater for lists of telephone numbers as well as individual telephone numbers.

To put a table into first normal form:
a) Determine the primary key (without considering the repetitions).
b) Remove each repeating group from the original table to a new one. Then copy its relevant key (the entire key, part of it, or even a single key column) into this new table.
c) Switch the key role and make the repeating group part of the key in the new table.

Thus, the Customer table would be broken up into two tables. They would be:
  Customer Name( ______ )
  Customer Phone( ______ )

a) What is the primary key of each table?
b) What is the relationship between the two tables? Which tables have foreign keys pointing to the primary key of another table? (A foreign key is an attribute in one table that, after the switch, matches (points to) the primary key of another table.)

Normalization Exercise 1:

For each of the following tables, convert to a group of tables in first normal form.
- Show the primary key of each table.
- Show the foreign key of each table (and what table it points to).
- Sketch the relationships between the tables.

  Student(StNum, StName, SocSecNum, age, clubs, awards)
  Faculty(FacNum, SocSecNum, degree, rank, committees, papers-written)
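As an illustration of the first normal form steps (using the Customer example from the discussion above, not the exercise tables), the following Python sketch builds the two-table design in an in-memory SQLite database. The table and column names (CustomerName, CustomerPhone, and so on) and the choice of a composite key on the phone table are assumptions made for this sketch; the sample rows come from the Customer table shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerName (
    CustomerID INTEGER PRIMARY KEY,
    FirstName  TEXT,
    Surname    TEXT
);
-- The repeating group moves to its own table, together with a copy
-- of the customer key, and becomes part of the new table's key.
CREATE TABLE CustomerPhone (
    CustomerID      INTEGER REFERENCES CustomerName(CustomerID),
    TelephoneNumber TEXT,
    PRIMARY KEY (CustomerID, TelephoneNumber)
);
""")

conn.executemany(
    "INSERT INTO CustomerName VALUES (?, ?, ?)",
    [(123, "Robert", "Ingram"), (456, "Jane", "Wright"), (789, "Maria", "Fernandez")],
)
conn.executemany(
    "INSERT INTO CustomerPhone VALUES (?, ?)",
    [(123, "555-861-2025"), (456, "555-403-1659"),
     (456, "555-776-4100"), (789, "555-808-9633")],
)

# Any number of phone numbers per customer, one value per cell.
phones = conn.execute(
    "SELECT TelephoneNumber FROM CustomerPhone "
    "WHERE CustomerID = ? ORDER BY TelephoneNumber",
    (456,),
).fetchall()
print(phones)  # [('555-403-1659',), ('555-776-4100',)]

# The composite key rejects a repeated customer-to-number link.
try:
    conn.execute("INSERT INTO CustomerPhone VALUES (?, ?)", (456, "555-403-1659"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

With this layout the limit of three numbers per customer disappears, and the duplicate link that the multi-column design could not prevent is refused by the database itself.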

2nd Normal Form

2NF was originally defined by E.F. Codd in 1971. [1] A table that is in first normal form (1NF) must meet additional criteria if it is to qualify for second normal form. Specifically: a 1NF table is in 2NF if and only if, given any candidate key K and any attribute A that is not a constituent of a candidate key, A depends upon the whole of K rather than just a part of it.

In slightly more formal terms: a 1NF table is in 2NF if and only if all its non-prime attributes are functionally dependent on the whole of every candidate key.

Note that when a 1NF table has no composite candidate keys (candidate keys consisting of more than one attribute), the table is automatically in 2NF.

Consider the following table, where each project has many employees and each employee works on many projects (some sample data is shown):

ProjectInfo(PrjNum, PrjName, budget, empnum, empname, hrsworked)
PrjNum   PrjName   budget   empnum   empname   hrsworked
P22      Cyclone   50000    E1001    Joe       12
P22      Cyclone   50000    E2002    Pat       50
P21      IMB       20000    E3003    Ed        40
P21      IMB       20000    E2002    Pat       30
P21      IMB       20000    E1001    Joe       70

What problems do you see keeping all this data in one table?
a) What are the functional dependencies?
b) What is the key? (the minimum group of attributes that determine all other attributes)

c) What are the partial dependencies? Is it in 2nd Normal Form?

Since there are partial dependencies, this table is NOT in 2nd N.F.

To convert to 2nd N.F.:
a) Remove each partial dependency from the original table to a new table: copy the partially dependent attributes to a new table together with the relevant key columns, then remove those non-key columns from the original table.
b) Leave only the attributes that depend on the WHOLE key in the original table.

In the above example, we would create 2 new tables from the partial dependencies:
  Project(PrjNum, ______ )
  Emp(EmpNum, ______ )
And the original table would include the key and all attributes dependent on the whole key:
  Assignments(PrjNum, EmpNum, ______ )

c) What are the primary keys for each of the tables listed above? What are the relationships between the tables? What are the foreign keys?

Normalization Exercise 2:
Consider the following table, with some sample data shown (assume each order can contain various numbers of different products, but is placed by one customer on one date):

Order(OrdNum, date, CustNum, prodNum, num-ordered, unit-price, total-price)
OrdNum   date     CustNum   prodNum   num-ordered   unit-price   total-price
O23      1/1/08   C22       P21       5             7.00         35.00
O23      1/1/08   C22       P25       3             21.00        63.00
O24      1/1/08   C22       P23       3             7.00         21.00
O25      2/1/08   C44       P21       11            7.00         77.00
O25      2/1/08   C44       P28       4             99.00        396.00
O25      2/1/08   C44       P25       1             21.00        21.00

1) Determine the functional dependencies.
2) Determine the primary key.
3) Determine if there are any partial dependencies.
4) Convert to 2nd N.F. Show all keys (primary and foreign).

Exercise 3:
Consider the following table involving course info and the grades given in those courses:
  Courses(Cnum, Cname, credits, stuNum, stName, stAge, semester, grade)
a) List the functional dependencies.
b) Determine the primary key.
c) List any partial dependencies.
d) Convert to 2nd N.F. Show all keys (primary and foreign) and sketch the relationships between the tables.

Exercise 4:
Given the following table:
  T(A, B, C, D, E, F, G)
Suppose we have the following dependencies:
  A,B ---> C;  A,B ---> F;  A ---> D;  A ---> E;  B ---> G
a) What is the primary key? (the minimum set of attributes that determine all the other attributes)
b) What are the partial dependencies?
c) Convert to a set of tables in 2nd N.F. Indicate primary and foreign keys and sketch the relationships between the tables.
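The 2nd N.F. conversion steps described before these exercises can be sketched in Python. The example below uses the ProjectInfo table from the discussion rather than any exercise table. The dependency list is an assumption read off the sample data and the stated rule that employees work on many projects; the function names partial_dependencies and to_2nf are made up for this sketch.

```python
def partial_dependencies(key, fds):
    """Dependencies whose left side is only PART of the key."""
    key = set(key)
    partial = []
    for lhs, rhs in fds:
        non_key = tuple(a for a in rhs if a not in key)
        # set(lhs) < key means "proper subset of the key".
        if set(lhs) < key and non_key:
            partial.append((lhs, non_key))
    return partial


def to_2nf(all_attrs, key, fds):
    """Move each partial dependency to its own table; keep the rest."""
    partial = partial_dependencies(key, fds)
    moved = {a for _, rhs in partial for a in rhs}
    new_tables = [tuple(lhs) + tuple(rhs) for lhs, rhs in partial]
    remaining = tuple(a for a in all_attrs if a not in moved)
    return new_tables + [remaining]


attrs = ("PrjNum", "PrjName", "budget", "empnum", "empname", "hrsworked")
key = ("PrjNum", "empnum")
# ASSUMED dependencies for ProjectInfo.
fds = [
    (("PrjNum",), ("PrjName", "budget")),
    (("empnum",), ("empname",)),
    (("PrjNum", "empnum"), ("hrsworked",)),
]

print(partial_dependencies(key, fds))
# [(('PrjNum',), ('PrjName', 'budget')), (('empnum',), ('empname',))]
print(to_2nf(attrs, key, fds))
# [('PrjNum', 'PrjName', 'budget'), ('empnum', 'empname'), ('PrjNum', 'empnum', 'hrsworked')]
```

This sketch assumes each partial dependency has a distinct left side; dependencies sharing a left side would need to be merged into one table first.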

3rd Normal Form

The third normal form (3NF) is a normal form used in database normalization. 3NF was originally defined by E.F. Codd in 1971. [1] Codd's definition states that a table is in 3NF if and only if both of the following conditions hold:
- The relation R (table) is in second normal form (2NF).
- Every non-prime attribute of R is non-transitively dependent (i.e. directly dependent) on every candidate key of R.

A non-prime attribute of R is an attribute that does not belong to any candidate key of R. [2] A transitive dependency is a functional dependency in which X ---> Z (X determines Z) indirectly, by virtue of X ---> Y and Y ---> Z (where it is not the case that Y ---> X). [3]

Put simply, a table is in 3rd N.F. if it is in 2nd N.F. and the only dependencies involve the candidate keys.

Consider the following table:
  Student(SocSecNum, StNum, name, age)
What are the dependencies?
Since the only dependencies involve candidate keys, the Student table is in 3rd N.F.

Consider the following table (where each employee works in one department):
  Emp(empNum, SocSecNum, age, deptNum, deptName)
What are the dependencies?
What is the primary key?
Is it in 2nd Normal Form? Explain.
Is it in 3rd Normal Form? Explain.
What problems do you see using the employee table above?

Thus even tables in 2nd Normal Form create problems, so we must carry the normalization process further.

Here is an example of a table that violates 3rd N.F.:
  T(A, B, C, D, E, F, G)
If A is the primary key, then we know A ---> B,C,D,E,F,G.
If there are other dependencies that don't involve the key, such as
  D ---> E,F
  C ---> G
then the table is not in 3rd N.F.

To convert to 3rd N.F.:

For each non-key dependency (a dependency that doesn't involve a candidate key):
1) Copy this non-key dependency from the original table to a new table, i.e., the attribute(s) that depend on something other than the primary key, together with the attribute(s) they depend on.
2) Remove the dependent attribute(s) from the original table, leaving the determining attribute(s) behind as a foreign key.

Using table T above, we would copy the dependencies D ---> E,F and C ---> G to two new tables:
  T1(D, E, F)
  T2(C, G)
We now have 3 tables, with the following primary and foreign keys:
  T(A, B, C, D)     fkey D ---> T1, fkey C ---> T2
  T1(D, E, F)
  T2(C, G)

In the above example:
  Emp(empNum, SocSecNum, age, deptNum, deptName)
the candidate keys are ______ and the dependency ______ doesn't involve these keys.
We create a table from the dependency that doesn't involve the candidate keys:
  Dept( ______ )
What are the keys of the Dept and Emp tables above (primary and foreign)? What is the relationship between these two tables?

Exercise 5:
Consider the following table, with some possible data (each order goes to a single customer and is shipped from one warehouse):

Order(ordNum, date, warehousenum, warehouseloc, custnum, custname)
ordNum   date      warehousenum   warehouseloc   custnum   custname
Or55     1/1/07    w5             NY             C55       Acme
Or66     1/1/07    w3             WC             C66       IBM
Or77     4/4/07    w5             NY             C77       Intel
Or88     4/12/07   w3             WC             C55       Acme

We can see that this is not a good table because it is about more than one entity. It will also have redundant information about customers and warehouses.

a) What are the functional dependencies of the Order table? What is the primary key?

b) Why must it be in 2nd Normal Form?
c) Is it in 3rd Normal Form? Explain. (Are there any dependencies not involving candidate keys?)
d) Convert to 3rd Normal Form. Indicate primary and foreign keys. Indicate the relationships among these tables.

Exercise 6:
Consider the following table with the given dependencies:
  T(A, B, C, D, E, F, G)
  E ---> G
  B ---> C,A
  D ---> A,B,C,E,F,G
a) What is the primary key?
b) Why must it be in 2nd Normal Form? (Why can't there be any partial dependencies?)
c) Is it in 3rd Normal Form? Explain. (Are there any dependencies involving attributes that are not candidate keys?)
d) Convert to a set of tables in 3rd Normal Form. Indicate primary and foreign keys. Indicate the relationships among these tables.

Exercise 7:
Consider the following table and dependencies:
  T(A, B, C, D, E, F, G, H)
  D,E ---> A,B,C
  D ---> F
  E ---> G,H
  H ---> G
a) What is the key?
b) Are there any partial dependencies? List them.

c) Is T in 2nd Normal Form? Convert to a group of tables in 2nd N.F.
d) Are the tables in part c) in 3rd Normal Form? Explain.
e) Convert to 3rd Normal Form. For each of your final tables, indicate the primary key and any foreign keys. Indicate what table each foreign key refers to. Show the relationships between these tables.

Exercise 8:
Consider the table involving a chain of bookstores, books and publishers:
  (BranchNum, BranchAddr, BkNum, Title, PubNum, PubName, List-Price, InStock, Selling-Price)
Suppose we know the following:
1) Each branch can sell a book at whatever Selling-Price they want.
2) Each book has a set list price.
3) Each book is published by a single publisher.

a) What are the functional dependencies?
b) If we keep everything in this one table, what is the primary key? (Remember, the key is the minimum group of attributes that determine all other attributes.)
c) Is the table in 2nd N.F.? (Remember, if we have any partial dependencies it is not in 2nd N.F.) List any dependencies which involve part of the primary key (partial dependencies).
d) Convert to a set of tables in 2nd N.F. Show the primary key of each table. (Remember, we do this by creating new tables from each partial dependency

and then removing the items that are partially dependent on the key.)
e) Is the table in 3rd N.F.? (3rd N.F.: it must be in 2nd N.F. and all dependencies must involve candidate keys.) List any dependencies that don't involve the candidate keys.
f) Convert to a set of tables in 3rd N.F. Show all primary and foreign keys. For each foreign key, indicate what table it points to.
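For reference, the 3rd N.F. conversion of the example table T(A,B,C,D,E,F,G) from the 3rd Normal Form discussion (key A, with the extra dependencies D ---> E,F and C ---> G) can be sketched in Python as follows. The function name to_3nf is made up, and the sketch assumes the table is already in 2nd N.F. and has a single candidate key.

```python
def to_3nf(all_attrs, key, fds):
    """Split off every dependency whose left side does not contain the key.

    Assumes the table is already in 2nd N.F. with one candidate key.
    """
    key = set(key)
    non_key_fds = [(lhs, rhs) for lhs, rhs in fds if not key <= set(lhs)]
    # Attributes that depend on a non-key attribute leave the original table;
    # the determining attribute stays behind as a foreign key.
    moved = {a for _, rhs in non_key_fds for a in rhs}
    new_tables = [tuple(lhs) + tuple(rhs) for lhs, rhs in non_key_fds]
    remaining = tuple(a for a in all_attrs if a not in moved)
    return [remaining] + new_tables


attrs = ("A", "B", "C", "D", "E", "F", "G")
fds = [
    (("A",), ("B", "C", "D", "E", "F", "G")),
    (("D",), ("E", "F")),
    (("C",), ("G",)),
]

print(to_3nf(attrs, ("A",), fds))
# [('A', 'B', 'C', 'D'), ('D', 'E', 'F'), ('C', 'G')]
```

Removing the moved attributes leaves T with A, B, C and D, where D and C act as foreign keys into the two new tables.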