Design of Relational Database Schemas T. M. Murali October 27, November 1, 2010
Plan Till Thanksgiving What are the typical problems or anomalies in relational designs? Introduce the idea of decomposing a relation schema into two smaller schemas. Introduce Boyce-Codd normal form (BCNF), a condition on relational schemas that eliminates anomalies. BCNF stated using the concept of FDs. Use decomposition of schemas to bring them to BCNF. Define another type of constraint called Multivalued Dependencies (MDs). Define normal forms that eliminate MDs.
Closures of FDs Given a relation R and a set F of FDs that hold in R the closure {F} + is the set of all FDs that follow from R.
Closures of FDs Given a relation R and a set F of FDs that hold in R the closure {F} + is the set of all FDs that follow from R. Recall: An FD S follows from a set of FDs T if every relation instance that satisfies all the FDs in T also satisfies S. S = {A C} follows from T = {A B, B C}.
Computing Closures of FDs To compute the closure of a set of FDs, repeatedly apply Armstrong s Axioms until you cannot find any new FDs:
Computing Closures of FDs To compute the closure of a set of FDs, repeatedly apply Armstrong s Axioms until you cannot find any new FDs: Reflexivity: If Y X, then X Y Augmentation: If X Y then XZ YZ for any attribute Z. Transitivity: If X Y and Y Z then X Z.
Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A B, B C}. {F} + is
Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A B, B C}. {F} + is {A B, B C, A C, AC B, AB C}
Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A B, B C}. {F} + is {A B, B C, A C, AC B, AB C} F = {AB C, BC A, AC B}. {F} + is
Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A B, B C}. {F} + is {A B, B C, A C, AC B, AB C} F = {AB C, BC A, AC B}. {F} + is {AB C, BC A, AC B}
Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A B, B C}. {F} + is {A B, B C, A C, AC B, AB C} F = {AB C, BC A, AC B}. {F} + is {AB C, BC A, AC B} F = {A B, B C, C D}. {F} + is
Examples of Computing Closures of FDs Let us include only completely non-trivial FDs in these examples, with a single attribute on the right. Assume that there are no attributes other than those mentioned in the FDs. F = {A B, B C}. {F} + is {A B, B C, A C, AC B, AB C} F = {AB C, BC A, AC B}. {F} + is {AB C, BC A, AC B} F = {A B, B C, C D}. {F} + is {A B, B C, C D, A C, A D, B D,...}
Closures of FDs vs. Closures of Attributes Both algorithms take as input a relation R and a set of FDs F. Closure of FDs: Computes {F} +, the set of all FDs that follow from F. Output is a set of FDs. Output may contain an exponential number of FDs. Closure of attributes: In addition, takes a set {A1, A 2,..., A n } of attributes as input. Computes {A1, A 2,..., A n } +, the set of all attributes B such that the A 1 A 2... A n B follows from F. Output is a set of attributes. Output may contain at most the number of attributes in R.
Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = π Attributes (R). The projection F S of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing F S : 1. Compute {F} +. 2. F S is the set of all FDs in {F} + that involve only the attributes in S. Book describes a different algorithm on page 82 (Chapter 3.2.8). Book s algorithm also shows how to compute a minimal basis of F S.
Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = π Attributes (R). The projection F S of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing F S : 1. Compute {F} +. 2. F S is the set of all FDs in {F} + that involve only the attributes in S. Book describes a different algorithm on page 82 (Chapter 3.2.8). Book s algorithm also shows how to compute a minimal basis of F S. R(A, B, C, D), F = {A B, B C, C D}. Which FDs hold in S(A, C, D)?
Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = π Attributes (R). The projection F S of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing F S : 1. Compute {F} +. 2. F S is the set of all FDs in {F} + that involve only the attributes in S. Book describes a different algorithm on page 82 (Chapter 3.2.8). Book s algorithm also shows how to compute a minimal basis of F S. R(A, B, C, D), F = {A B, B C, C D}. Which FDs hold in S(A, C, D)? {F} + is
Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = π Attributes (R). The projection F S of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing F S : 1. Compute {F} +. 2. F S is the set of all FDs in {F} + that involve only the attributes in S. Book describes a different algorithm on page 82 (Chapter 3.2.8). Book s algorithm also shows how to compute a minimal basis of F S. R(A, B, C, D), F = {A B, B C, C D}. Which FDs hold in S(A, C, D)? {F} + is {A B, B C, C D, A C, A D, B D}
Projecting Sets of FDs Suppose we have a relation R and a set of FDs F. Let S be a relation obtained by projecting R into a subset of the attributes of R, i.e., S = π Attributes (R). The projection F S of F is the set of FDs that follow from F and hold in S (involve only attributes of S). Algorithm for computing F S : 1. Compute {F} +. 2. F S is the set of all FDs in {F} + that involve only the attributes in S. Book describes a different algorithm on page 82 (Chapter 3.2.8). Book s algorithm also shows how to compute a minimal basis of F S. R(A, B, C, D), F = {A B, B C, C D}. Which FDs hold in S(A, C, D)? {F} + is {A B, B C, C D, A C, A D, B D} F S is {C D, A C, A D}.
Design of Relational Database Schemas Careless design of relational schemas can cause problems. Example: Combining the relation for a many-many relationship with the relation for one of its entity sets causes redundancy.
Design of Relational Database Schemas Careless design of relational schemas can cause problems. Example: Combining the relation for a many-many relationship with the relation for one of its entity sets causes redundancy. Suppose we combine the schemas Courses(Number, DepartmentName, CourseName, Classroom, Enrollment) and Take(StudentName, Address, Number, DepartmentName) into one relation Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address).
Anomalies in Relational Schemas Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address)
Anomalies in Relational Schemas Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) Redundancy: information is repeated unnecessarily in several tuples. Update anomalies: We change information in one tuple but leave the old information in another tuple. Insertion anomalies: It is not possible to store some information unless some other, unrelated information is stored as well. Deletion anomalies: If a set of values becomes empty, we may lose other information as a side effect.
Decomposing Relations Accepted way to eliminate anomalies is to decompose relations. Given a relation R(A 1, A 2,..., A n ), two relations S(B 1, B 2,..., B m ) and T (C 1, C 2,..., C k ) form a decomposition of R if 1. the attributes of S and T together make up the attributes of R, i.e., {A 1, A 2,..., A n } = {B 1, B 2,..., B m } {C 1, C 2,..., C k }. 2. the tuples in S are the projections into {B 1, B 2,..., B m } of the tuples in R, i.e. S π B1,B 2,...,B m (R). 3. the tuples in T are the projections into {C 1, C 2,..., C k } of the tuples in R, i.e., T π C1,C 2,...,C k (R).
Example of Decomposition Decompose Courses into Courses1(Number, DepartmentName, CourseName, Classroom, Enrollment) and Courses2(Number, DepartmentName, StudentName, Address). Are the anomalies removed? Redundancy Update Insertion Deletion
Boyce-Codd Normal Form Condition on the FDs in a relation that guarantees that anomalies do not exist.
Boyce-Codd Normal Form Condition on the FDs in a relation that guarantees that anomalies do not exist. A relation R is in Boyce-Codd Normal Form (BCNF) if and only if for every non-trivial FD A 1 A 2... A n B for R, {A 1, A 2,..., A n } is a superkey for R. Informally, the left side of every non-trivial FD must be a superkey. A relation R violates BCNF if it has an FD such that the attributes of the left side of an FD do not form a superkey. In other words, there is some key of R such that only some (not all) of the attributes in this key appear on the left side of the FD.
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey.
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys!
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF?
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName CourseName Number DepartmentName Classroom Number DepartmentName Enrollment
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName CourseName Number DepartmentName Classroom Number DepartmentName Enrollment What is {Number, DepartmentName} +?
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName CourseName Number DepartmentName Classroom Number DepartmentName Enrollment What is {Number, DepartmentName} +? {Number, DepartmentName, Coursename, Classroom, Enrollment}.
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName CourseName Number DepartmentName Classroom Number DepartmentName Enrollment What is {Number, DepartmentName} +? {Number, DepartmentName, Coursename, Classroom, Enrollment}. Therefore, the key is {Number, DepartmentName, StudentName, Address}.
Checking for BCNF Violations 1. List all FDs. 2. Ensure that left hand side of each FD is a superkey. We have to first find all the keys! Is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address) in BCNF? FDs are Number DepartmentName CourseName Number DepartmentName Classroom Number DepartmentName Enrollment What is {Number, DepartmentName} +? {Number, DepartmentName, Coursename, Classroom, Enrollment}. Therefore, the key is {Number, DepartmentName, StudentName, Address}. The relation is not in BCNF.
Decomposition into BCNF Suppose R is a relation schema that violates BCNF. We can decompose R into a set S of new relations such that 1. each relation in S is in BCNF and 2. we can recover R from the relations in S, i.e., we can reconstruct R exactly from the relations in S.
BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X 1 X 2... X m Y violates BCNF.
BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X 1 X 2... X m Y violates BCNF. 1. Compute {X 1 X 2..., X m } +.
BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X 1 X 2... X m Y violates BCNF. 1. Compute {X 1 X 2..., X m } +. 2. Decompose R into two relations R 1 and R 2 with schemas R 1 : all the attributes in {X 1, X 2..., X m } + R 2 : all the attributes on the left side of the violating FD and all the attributes of R not in {X 1, X 2,..., X m } +, i.e., A {X 1, X 2..., X m } + {X 1, X 2..., X m }.
BCNF Normalisation Algorithm Let A be the set of all attributes of R. Let F be the set of all FDs of R. Suppose the FD X 1 X 2... X m Y violates BCNF. 1. Compute {X 1 X 2..., X m } +. 2. Decompose R into two relations R 1 and R 2 with schemas R 1 : all the attributes in {X 1, X 2..., X m } + R 2 : all the attributes on the left side of the violating FD and all the attributes of R not in {X 1, X 2,..., X m } +, i.e., A {X 1, X 2..., X m } + {X 1, X 2..., X m }. 3. Find FDs in R 1 and R 2 and decompose them if they are not in BCNF 4. For i = 1, 2 Compute F Ri the projection of the FDs in F into R i. If any of the FDs in F Ri violates BCNF, decompose R i recursively.
Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName CourseName Classroom Enrollment.
Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName CourseName Classroom Enrollment. What is {Number, DepartmentName} +?
Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName CourseName Classroom Enrollment. What is {Number, DepartmentName} +? {Number, DepartmentName, CourseName, Classroom, Enrollment
Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName CourseName Classroom Enrollment. What is {Number, DepartmentName} +? {Number, DepartmentName, CourseName, Classroom, Enrollment Decompose Courses into Courses1(Number, DepartmentName, CourseName, Classroom, Enrollment) and Courses2(Number, DepartmentName, StudentName, Address).
Decomposing Courses Schema is Courses(Number, DepartmentName, CourseName, Classroom, Enrollment, StudentName, Address). BCNF-violating FD is Number DepartmentName CourseName Classroom Enrollment. What is {Number, DepartmentName} +? {Number, DepartmentName, CourseName, Classroom, Enrollment Decompose Courses into Courses1(Number, DepartmentName, CourseName, Classroom, Enrollment) and Courses2(Number, DepartmentName, StudentName, Address). Are there any BCNF violations in the two new relations?
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId)
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key?
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}.
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation?
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes.
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID Name FavouriteAdvisorId to decompose.
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID Name FavouriteAdvisorId to decompose. {ID} + is
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID Name FavouriteAdvisorId to decompose. {ID} + is {ID, Name, FavouriteAdvisorId}.
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID Name FavouriteAdvisorId to decompose. {ID} + is {ID, Name, FavouriteAdvisorId}. Schemas for new relations are
Another Example of Decomposition (1) Schema is Students(Id, Name, AdvisorId, AdvisorName, FavouriteAdvisorId) What are the FDs? ID Name FavouriteAdvisorId AdvisorId AdvisorName What is the key? {ID, AdvisorId}. Is there a BCNF violation? Yes. Use ID Name FavouriteAdvisorId to decompose. {ID} + is {ID, Name, FavouriteAdvisorId}. Schemas for new relations are Students1(ID, Name, FavouriteAdvisorId) Students2(ID, AdvisorId, AdvisorName)
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)?
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF.
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)?
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId AdvisorName Does it violate BCNF?
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId AdvisorName to decompose. {AdvisorId} + is
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId AdvisorName to decompose. {AdvisorId} + is {AdvisorId, AdvisorName}.
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId AdvisorName to decompose. {AdvisorId} + is {AdvisorId, AdvisorName}. Schemas for new relations are
Another Example of Decomposition (2) What are the FDs in Student1(ID, Name, FavouriteAdvisorId)? There are none that violate BCNF. What are the FDs in Students2(ID, AdvisorId, AdvisorName)? AdvisorId AdvisorName Does it violate BCNF? Yes. Repeat the decomposition process. Use AdvisorId AdvisorName to decompose. {AdvisorId} + is {AdvisorId, AdvisorName}. Schemas for new relations are Students2(ID, AdvisorId) Students3(AdvisorId, AdvisorName).
Examples (Problem 1 of Handout 3) Apply the BCNF normalisation algorithm to Inventory relation. (Problem 2 of Handout 3) Apply the BCNF normalisation algorithm to Concerts relation. For both problems, try all possible choices of FDs to start the normalisation with. Compare the advantages and disadvantages of the choices.
BCNFs and Two-Attribute Relations True or False: Every two-attribute relation R(A, B) is in BCNF.
BCNFs and Two-Attribute Relations True or False: Every two-attribute relation R(A, B) is in BCNF. The statement is true. Why?
BCNFs and Two-Attribute Relations True or False: Every two-attribute relation R(A, B) is in BCNF. The statement is true. Why? Consider four possible cases: 1. There are no non-trivial FDs. 2. A B is the only non-trivial FD. 3. B A is the only non-trivial FD. 4. Both A B and B A hold in R.
Decomposition into BCNF Suppose R is a relation schema that violates BCNF. We can decompose R into a set S of two or more new relations such that 1. each relation in S is in BCNF and 2. we can recover R from the relations in S, i.e., we can reconstruct R from the relations in S. How does the normalisation algorithm guarantee the second condition? In general, what properties does the decomposition satisfy?
Desirable Properties of a Decomposition 1. Eliminate anomalies. 2. Recover the original relation exactly from the relations it is decomposed into. 3. When we reconstruct the original relation, the result will satisfy the original FDs.
Desirable Properties of a Decomposition 1. Eliminate anomalies. 2. Recover the original relation exactly from the relations it is decomposed into. 3. When we reconstruct the original relation, the result will satisfy the original FDs. BCNF decomposition algorithm:
Desirable Properties of a Decomposition 1. Eliminate anomalies. 2. Recover the original relation exactly from the relations it is decomposed into. 3. When we reconstruct the original relation, the result will satisfy the original FDs. BCNF decomposition algorithm: gives us properties 1 and 2 but not 3. 3NF decomposition algorithm: gives us properties 2 and 3 but not 1. (Discuss in the next class.)
Candidate Normalisation Algorithm Every two-attribute relation is in BCNF.
Candidate Normalisation Algorithm Every two-attribute relation is in BCNF. Can we bring any relation R into BCNF by arbitrarily decomposing it into two-attribute relations?
Candidate Normalisation Algorithm Every two-attribute relation is in BCNF. Can we bring any relation R into BCNF by arbitrarily decomposing it into two-attribute relations? No, since we may not be able to recover R correctly from the decomposition.
Joining Relations R A B a 1 b 1 a 2 b 1 a 2 b 2 S B C b 1 c 1 b 2 c 2 b 2 c 3 = T A B C a 1 b 1 c 1 a 2 b 1 c 1 a 2 b 2 c 2 a 2 b 2 c 3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if
Joining Relations R A B a 1 b 1 a 2 b 1 a 2 b 2 S B C b 1 c 1 b 2 c 2 b 2 c 3 = T A B C a 1 b 1 c 1 a 2 b 1 c 1 a 2 b 2 c 2 a 2 b 2 c 3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if the attributes of T are the union of the attributes of R and S,
Joining Relations R A B a 1 b 1 a 2 b 1 a 2 b 2 S B C b 1 c 1 b 2 c 2 b 2 c 3 = T A B C a 1 b 1 c 1 a 2 b 1 c 1 a 2 b 2 c 2 a 2 b 2 c 3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if the attributes of T are the union of the attributes of R and S, every tuple t T is the join of two tuples r R and s S that agree on the attribute B, i.e., t agrees with r on all the attributes in R and with s on all attributes in S,
Joining Relations R A B a 1 b 1 a 2 b 1 a 2 b 2 S B C b 1 c 1 b 2 c 2 b 2 c 3 = T A B C a 1 b 1 c 1 a 2 b 1 c 1 a 2 b 2 c 2 a 2 b 2 c 3 Let R and S be two relations with one common attribute B. Relation T is the natural join of R and S, denoted R S if and only if the attributes of T are the union of the attributes of R and S, every tuple t T is the join of two tuples r R and s S that agree on the attribute B, i.e., t agrees with r on all the attributes in R and with s on all attributes in S, T contains all tuples formed in this manner.
Recovering Information from a Decomposition Suppose R is a relation schema that violates BCNF. The BCNF decomposition algorithm decomposes R into a set {S 1, S 2,... S k } of new relations such that 1. each relation S i, 1 i k is in BCNF and 2. the decomposition of R into {S 1, S 2,... S k } is a lossless-join decomposition, i.e., R = S 1 S 2... S k.
Recovering Information from a Decomposition Suppose R is a relation schema that violates BCNF. The BCNF decomposition algorithm decomposes R into a set {S 1, S 2,... S k } of new relations such that 1. each relation S i, 1 i k is in BCNF and 2. the decomposition of R into {S 1, S 2,... S k } is a lossless-join decomposition, i.e., R = S 1 S 2... S k. 2.1 Every tuple in R is a tuple in S 1 S 2... S k. 2.2 Every tuple in S 1 S 2... S k is in R.
Example of Lossless-Join Decomposition Relation schema is R(A, B, C). FD is B C.
Example of Lossless-Join Decomposition Relation schema is R(A, B, C). FD is B C. Relations in BCNF are S(A, B) and T (B, C).
Example of Lossless-Join Decomposition Relation schema is R(A, B, C). FD is B C. Relations in BCNF are S(A, B) and T (B, C). Prove that R = S T 1. Every tuple in R is in S T. 2. Every tuple in S T is in R.
Example of Lossless-Join Decomposition Relation schema is R(A, B, C). FD is B C. Relations in BCNF are S(A, B) and T (B, C). Prove that R = S T 1. Every tuple in R is in S T. 2. Every tuple in S T is in R. What if FD were A C and we decomposed R into S and T as above?
Example of Lossless-Join Decomposition Relation schema is R(A, B, C). FD is B C. Relations in BCNF are S(A, B) and T (B, C). Prove that R = S T 1. Every tuple in R is in S T. 2. Every tuple in S T is in R. What if FD were A C and we decomposed R into S and T as above? S T contains tuples not in R! In general, if R s attributes are X Y Z and Y Z holds in R, then R = π X Y (R) π Y Z (R).
The Chase Test for Lossless Join Suppose we have a relation R, a set F of FDs that hold in R, and a decomposition of R into relations S 1, S 2,..., S k. We have forgotten how we decomposed R. Is there a way to check R equals the natural join of S 1, S 2,..., S k? 1. Every tuple in R is a tuple in S 1 S 2... S k. 2. Every tuple in S 1 S 2... S k is in R.
The Chase Test for Lossless Join Suppose we have a relation R, a set F of FDs that hold in R, and a decomposition of R into relations S 1, S 2,..., S k. We have forgotten how we decomposed R. Is there a way to check R equals the natural join of S 1, S 2,..., S k? 1. Every tuple in R is a tuple in S 1 S 2... S k. 2. Every tuple in S 1 S 2... S k is in R. R S A B C D A 1 S 2 S 3 S 1 S 2 S 3 D A C B C D A B C D
The Chase Test for Lossless Join S R A B C D A 1 S 2 S 3 S 1 S 2 S 3 D A C B C D A B C D t 1. Natural join is associate and commutative: For each tuple t in S 1 S 2... S k, projection of t into S i is a tuple in π Si (R), for every 1 i k.
u The Chase Test for Lossless Join S R A B C D A 1 S 2 S 3 S 1 S 2 S 3 D A C B C D A B C D u 1. Natural join is associate and commutative: For each tuple t in S 1 S 2... S k, projection of t into S i is a tuple in π Si (R), for every 1 i k. 2. Every tuple u in R is surely in π S1 (R) π S2 (R)... π Sk (R).
t The Chase Test for Lossless Join S R A B C D A 1 S 2 S 3 S 1 S 2 S 3 D A C B C D A B C D t 1. Natural join is associate and commutative: For each tuple t in S 1 S 2... S k, projection of t into S i is a tuple in π Si (R), for every 1 i k. 2. Every tuple u in R is surely in π S1 (R) π S2 (R)... π Sk (R). 3. How can we show that if the FDs in F hold in R, then every tuple in π S1 (R) π S2 (R)... π Sk (R) is also a tuple in R? Use the Chase test.
t 1 t 2 Steps in Chase Test S R A B C D A 1 S 2 S 3 S 1 S 2 S 3 D A C B C D A B C D t t 3 If a tuple t is in π S1(R) π S2(R)... π Sk (R), then there must be tuples t1, t 2,... t k in R such that t is the join of the projections of t i into S i, for every 1 i k. each ti agrees with t in the attributes in S i but has unknown values for the attributes not in S i. Using the FDs in F, we want to prove that t must be equal to some t i.
t 1 t 2 Steps in Chase Test S R A B C D A 1 S 2 S 3 S 1 S 2 S 3 D A C B C D A B C D t t 3 If a tuple t is in π S1(R) π S2(R)... π Sk (R), then there must be tuples t1, t 2,... t k in R such that t is the join of the projections of t i into S i, for every 1 i k. each ti agrees with t in the attributes in S i but has unknown values for the attributes not in S i. Using the FDs in F, we want to prove that t must be equal to some t i. 1. Draw a tableau to indicate which attributes we know the values of in the tuples t 1, t 2,... t k. 2. Use FDs to equate unknown attributes of these tuples. 3. When no FD can be applied, check if t is one of the tuples in the tableau.
Example of Chase Test Work out following two cases: 1. Decomposition of R(A, B, C, D) into S 1 (A, D), S 2 (A, C) and S 3 (B, C, D) with FDs A B, B C, and CD A. 2. Same decomposition with FD B AD Work out examples in Handout 3: 1. (Problem 1, part 6) Apply Chase test to decomposition of Inventory into Inventory1 and Inventory2. 2. (Problem 1, part 7) Modify one of the attributes in either Inventory1 or Inventory2 to obtain a lossless-join decomposition. Verify using the chase test. 3. (Problem 2, part 8(ii)) Apply Chase test to decomposition of Concerts into Concerts1 and Concerts2. 4. Apply Chase test to decomposition of Concerts into Concerts1 and Concerts3(City, Song, Album) and Concerts4(City, Year).