Chapter 8. Database Design II: Relational Normalization Theory


The E-R approach is a good way to start dealing with the complexity of modeling a real-world enterprise. However, it is only a set of guidelines that requires considerable expertise and intuition, and it can lead to several alternative designs for the same enterprise. Unfortunately, the E-R approach does not include the criteria or tools to help evaluate alternative designs and suggest improvements. In this chapter, we present relational normalization theory, a set of concepts and algorithms that can help with the evaluation and refinement of the designs obtained through the E-R approach.

The main tool used in normalization theory is the notion of functional dependency (and, to a lesser degree, join dependency). Functional dependency is a generalization of the key dependencies in the E-R approach, whereas join dependencies do not have an E-R counterpart. Both types of dependency are used by designers to spot situations in which the E-R design unnaturally places attributes of two distinct entity types into the same relation schema. These situations are characterized in terms of normal forms, whence comes the term normalization theory. Normalization theory forces relations into an appropriate normal form using decompositions, which break up schemas involving unhappy unions of attributes of unrelated entity types. Because of the central role that decompositions play in relational design, the techniques that we are about to discuss are sometimes also called relational decomposition theory.

8.1 THE PROBLEM OF REDUNDANCY

An example is the best way to understand the potential problems with relational designs based on the E-R approach. Consider the CREATE TABLE Person statement (5.1) on page 93. Recall that this relation schema was obtained by direct translation from the E-R diagram in Figure 5.1.
The first indication that something was wrong with this translation was the realization that SSN is not a key of the resulting Person relation. Instead, the key is the combination (SSN, Hobby). In other words, the attribute SSN does not uniquely identify the tuples in the Person relation even though it does uniquely identify the entities in the Person entity set. Not only is this counterintuitive, but it also has a number of undesirable effects on the instances of the Person relation schema. To see this, we take a closer look at the relation instance shown in Figure 5.2, page 93. Notice that John Doe and Mary Doe are both represented by multiple tuples and that their addresses, names, and Ids occur multiple times as well.

Redundant storage of the same information is apparent here. However, wasted space is the least of the problems. The real issue is that, when database updates occur, we must keep all the redundant copies of the same data consistent with each other, and we must do it efficiently. Specifically, we can identify the following problems:

Update anomaly. If John Doe moves to 1 Hill Top Dr., updating the relation in Figure 5.2 requires changing the address in both tuples that describe the John Doe entity.

Insertion anomaly. Suppose that we decide to add Homer Simpson to the Person relation, but Homer's information sheet does not specify any hobbies. One way around this problem might be to add the tuple (…, Homer Simpson, Fox 5 TV, NULL), that is, to fill in the missing Hobby field with NULL. However, Hobby is part of the primary key, and most DBMSs do not allow null values in primary keys. Why? For one thing, DBMSs generally maintain an index on the primary key, and it is not clear how the index should refer to the null value. Assuming that this problem can be solved, suppose that a request is made to insert (…, Homer Simpson, Fox 5 TV, acting). Should this new tuple just be added, or should it replace the existing tuple (…, Homer Simpson, Fox 5 TV, NULL)? A human will most likely choose to replace it, because humans do not normally think of hobbies as a defining characteristic of a person. However, how does a computer know that the tuples with primary keys (…, NULL) and (…, acting) refer to the same entity?
(Recall that the information about which tuple came from which entity is lost in the translation!) Redundancy is at the root of this ambiguity: if Homer were described by at most one tuple, only one course of action would be possible.

Deletion anomaly. Suppose that Homer Simpson is no longer interested in acting. How are we to delete this hobby? We can, of course, delete the tuple that talks about Homer's acting hobby. However, since there is only one tuple that refers to Homer (see Figure 5.2), this throws out perfectly good information about Homer's Id and address. To avoid this loss of information, we can try to replace acting with NULL. Unfortunately, this again raises the issue of nulls in primary-key attributes. Once again, redundancy is the culprit: if only one tuple could possibly describe Homer, the attribute Hobby would not be part of the key.

For convenience, we sometimes use the term update anomalies to refer to all of the above anomaly types.

Figure 8.1 Decomposition of the Person relation shown in Figure 5.2: (a) the Person1 relation, with attributes SSN, Name, and Address, containing one tuple each for John Doe (123 Main St), Mary Doe (7 Lake Dr), and Bart Simpson (Fox 5 TV); (b) the Hobby relation, with attributes SSN and Hobby, containing one tuple per person-hobby pair (stamps, hiking, coins, hiking, skating, acting).

8.2 DECOMPOSITIONS

The problems caused by redundancy, wasted storage and anomalies, can be easily fixed using the following simple technique. Instead of having one relation describe all that is known about persons, we can use two separate relation schemas:

Person1(SSN, Name, Address)
Hobby(SSN, Hobby)        (8.1)

Projecting the relation in Figure 5.2 on each of these schemas yields the result shown in Figure 8.1. The new design has the following important properties:

1. The original relation of Figure 5.2 is exactly the natural join of the two relations in Figure 8.1. In fact, one can prove that this property is not an artifact of our example; that is, under certain natural assumptions,1 any relation, r, over the schema of Figure 5.2 equals the natural join of the projections of r on the two relational schemas in (8.1). This property, called losslessness, is discussed later in this chapter. It means that our decomposition preserves the original information represented by the Person relation.

2. The redundancy present in the original relation of Figure 5.2 is gone, and so are the update anomalies. The only items that are stored more than once are SSNs, which are identifiers of entities of type Person. Thus, changes to addresses, names, or hobbies now affect only a single tuple. Likewise, the removal of Bart Simpson's hobbies from the database does not delete the information about his address and does not require us to rely on null values. The insertion anomaly is also gone, because we can now add people and hobbies independently.

Observe that the new design still has a certain amount of redundancy and that we might still need to use null values in certain cases.
First, since we use SSNs as tuple identifiers, each SSN can occur multiple times, and all of these occurrences must be kept consistent across the database. So consistency maintenance has not been eliminated completely. However, if the identifiers are not dynamic (for instance, SSNs, which do not change frequently), consistency maintenance is considerably simplified. Second, imagine a situation in which we add a person to our Person1 relation and the address is not known. Clearly, even with the new design, we have to insert NULL in the Address field of the corresponding tuple. However, Address is not part of a primary key, so the use of NULL here is not that bad (although we will still have difficulties joining Person1 on the Address attribute).

1 Namely, that SSN uniquely determines the name and address of a person.

Figure 8.2 The ultimate decomposition: four single-attribute relations, SSN, Name (John Doe, Mary Doe, Bart Simpson), Address (123 Main St., 7 Lake Dr., Fox 5 TV), and Hobby (stamps, hiking, coins, skating, acting), with each value appearing exactly once.

It is important to understand that not all decompositions are created equal. In fact, most of them do not make any sense, even though they might do a good job of eliminating redundancy. The decomposition

Ssn(SSN)
Name(Name)
Address(Address)        (8.2)
Hobby(Hobby)

is the ultimate redundancy eliminator. Projecting the relation of Figure 5.2 on these schemas yields a database in which each value appears exactly once, as shown in Figure 8.2. Unfortunately, this new database is completely devoid of any useful information; for instance, it is no longer possible to tell where John Doe lives or who collects stamps as a hobby. This situation is in sharp contrast with the decomposition of Figure 8.1, where we were able to completely restore the information represented by the original relation using a natural join.

The need for schema refinement. The translation of the Person entity type into the relational model indicates that one cannot rely solely on the E-R approach for designing database schemas.
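The contrast between the lossless decomposition (8.1) and the lossy decomposition (8.2) can be illustrated concretely. The following sketch is not from the book; it models relations as sets of attribute/value tuples, and the SSN values are invented placeholders:

```python
# Sketch: lossless vs. lossy decomposition via projection and natural join.
# Relations are sets of sorted (attribute, value) tuples; SSNs are placeholders.
from itertools import product

def project(rel, attrs):
    """Projection: keep only the given attributes, eliminating duplicates."""
    return {tuple((a, t[a]) for a in attrs) for t in rel}

def natural_join(r, s):
    """Natural join: merge every pair of tuples that agree on common attributes."""
    out = set()
    for t, u in product(r, s):
        dt, du = dict(t), dict(u)
        if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
            out.add(tuple(sorted({**dt, **du}.items())))
    return out

person = [
    {"SSN": "s1", "Name": "John Doe", "Address": "123 Main St", "Hobby": "stamps"},
    {"SSN": "s1", "Name": "John Doe", "Address": "123 Main St", "Hobby": "coins"},
    {"SSN": "s2", "Name": "Mary Doe", "Address": "7 Lake Dr", "Hobby": "hiking"},
]
orig = {tuple(sorted(t.items())) for t in person}

# Decomposition (8.1) is lossless: the join restores the original relation.
p1 = project(person, ["SSN", "Name", "Address"])
hb = project(person, ["SSN", "Hobby"])
assert natural_join(p1, hb) == orig

# Decomposition (8.2): with no shared attributes, the "join" degenerates into
# a cross product, so spurious tuples appear and the original is unrecoverable.
pieces = [project(person, [a]) for a in ["SSN", "Name", "Address", "Hobby"]]
joined = pieces[0]
for p in pieces[1:]:
    joined = natural_join(joined, p)
assert orig < joined    # strict subset: the cross product contains bogus tuples
```

On this three-tuple instance, the single-attribute projections join back into 24 tuples, of which only 3 are genuine; the rest are exactly the lost-information effect described above.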
Furthermore, the problems exhibited by the Person example are by no means rare or unique. Consider the relationship HasAccount introduced earlier. A typical translation of HasAccount into the relational model might be

CREATE TABLE HasAccount (
    AccountNumber  INTEGER NOT NULL,
    ClientId       CHAR(20),                    (8.3)
    OfficeId       INTEGER,
    PRIMARY KEY (ClientId, OfficeId),
    FOREIGN KEY (OfficeId) REFERENCES Office )

Recall that a client can have at most one account in an office and hence (ClientId, OfficeId) is a key. Also, an account must be assigned to exactly one office. Careful analysis shows that this requirement leads to some of the same problems that we saw in the Person example. For example, a tuple that records the fact that a particular account is managed by a particular office cannot be added without also recording client information (since ClientId is part of the primary key), which is an insertion anomaly. This (and the dual deletion anomaly) is perhaps not a serious problem, because of the specifics of this particular application, but the update anomaly could present maintenance issues. Moving an account from one office to another involves changing OfficeId in every tuple corresponding to that account. If the account has multiple clients, this might be a problem. We return to this example later in this chapter, because HasAccount exhibits certain interesting properties not found in the Person example. For instance, even though a decomposition of HasAccount might still be desirable, it incurs additional maintenance overhead that the decomposition of Person does not.

The above discussion brings out two key points: (1) decomposition of relation schemas can serve as a useful tool that complements the E-R approach by eliminating redundancy problems; (2) the criteria for choosing the right decomposition are not immediately obvious, especially when we have to deal with schemas that contain many attributes. For these reasons, the purpose of Sections 8.3 through 8.6 is to develop techniques and criteria for identifying relation schemas that are in need of decomposition, as well as to understand what it means for a decomposition not to lose information.
The central tool in developing much of decomposition theory is the functional dependency, which is a generalization of the idea of key constraints. Functional dependencies are used to define normal forms, a set of requirements on relational schemas that are desirable in update-intensive transaction systems. This is why the theory of decompositions is often also called normalization theory. Sections 8.5 through 8.9 develop algorithms for carrying out the normalization process.

8.3 FUNCTIONAL DEPENDENCIES

For the remainder of this chapter, we use a special notation for representing attributes, which is common in relational normalization theory. Capital letters from the beginning of the alphabet (A, B, C, D) represent individual attributes; capital letters from the middle to the end of the alphabet (P, V, W, X, Y, Z) represent sets of attributes. Strings of attribute letters, such as ABCD, denote sets of the respective attributes ({A, B, C, D}); strings of set names, such as XYZ, stand for unions of these sets (X ∪ Y ∪ Z). Although this notation requires some getting used to, it is very convenient and provides a succinct language, which we use in examples and definitions.

A functional dependency (FD) on a relation schema, R, is a constraint of the form X → Y, where X and Y are sets of attributes used in R. If r is a relation instance of R, it is said to satisfy this functional dependency if

    for every pair of tuples, t and s, in r, if t and s agree on all attributes in X, then t and s agree on all attributes in Y.

Put another way, there must not be a pair of tuples in r that have the same values for every attribute in X but different values for some attribute in Y.

Note that the key constraints introduced in Chapter 4 are a special kind of FD. Suppose that key(K) is a key constraint on the relational schema R and that r is a relational instance over R. By definition, r satisfies key(K) if and only if there is no pair of distinct tuples, t, s ∈ r, such that t and s agree on every attribute in K. Therefore, this key constraint is equivalent to the FD K → R, where K is the set of attributes in the key constraint and R denotes the set of all attributes in the schema R.

Keep in mind that functional dependencies are associated with relation schemas, but when we consider whether or not a functional dependency is satisfied we must consider relation instances over those schemas. This is because FDs are integrity constraints on the schema (much like key constraints), which restrict the set of allowable relation instances to those that satisfy the given FDs. Thus, given a schema, R = (R; Constraints), where R is a set of attributes and Constraints is a set of FDs, we are interested in the set of all relation instances over R that satisfy every FD in Constraints. Such relational instances are called legal instances of R.

Functional dependencies and update anomalies. Certain functional dependencies that exist in a relational schema can lead to redundancy in the corresponding relation instances.
Consider the two examples discussed in Sections 8.1 and 8.2: the schemas Person and HasAccount. Each has a primary key, as illustrated by the corresponding CREATE TABLE statements (5.1), page 93, and (8.3), page 214. Correspondingly, there are the following functional dependencies:

Person:     SSN, Hobby → SSN, Name, Address, Hobby
HasAccount: ClientId, OfficeId → AccountNumber, ClientId, OfficeId        (8.4)

These are not the only FDs implied by the original specifications, however. For instance, both Name and Address are defined as single-valued attributes in the E-R diagram of Figure 5.1. This clearly implies that one Person entity (identified by its attribute SSN) can have at most one name and one address. Similarly, the business rules of PSSC (the brokerage firm discussed in Section 5.5) require that every account be assigned to exactly one office, which means that the following FDs must also hold for the corresponding relation schemas:

Person:     SSN → Name, Address
HasAccount: AccountNumber → OfficeId        (8.5)
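The definition of FD satisfaction given in this section can be checked mechanically against an instance. The following is a minimal sketch, not the book's code, in which a relation is a list of dicts and the SSN values are invented placeholders; it verifies the Person FDs from (8.4) and (8.5) on a fragment of the Figure 5.2 instance:

```python
# Sketch: brute-force test of whether a relation instance satisfies an FD X -> Y.
from itertools import combinations

def satisfies(rel, X, Y):
    """True iff any two tuples of rel that agree on all attributes in X
    also agree on all attributes in Y."""
    for t, s in combinations(rel, 2):
        if all(t[a] == s[a] for a in X) and any(t[a] != s[a] for a in Y):
            return False
    return True

# A fragment of the Person instance (SSN values are placeholders):
person = [
    {"SSN": "s1", "Name": "John Doe", "Address": "123 Main St", "Hobby": "stamps"},
    {"SSN": "s1", "Name": "John Doe", "Address": "123 Main St", "Hobby": "coins"},
    {"SSN": "s2", "Name": "Mary Doe", "Address": "7 Lake Dr", "Hobby": "hiking"},
]

assert satisfies(person, {"SSN"}, {"Name", "Address"})   # the FD in (8.5)
assert satisfies(person, {"SSN", "Hobby"}, {"Name"})     # key FD of (8.4)
assert not satisfies(person, {"Address"}, {"Hobby"})     # two hobbies at 123 Main St
```

Note that the test is over a given instance only; satisfying an FD on one instance does not make the FD part of the schema's semantics.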

It is easy to see that the syntactic structure of the dependencies in (8.4) closely corresponds to the update anomalies that we identified for the corresponding relations. For instance, the problem with Person was that, for any given SSN, we could not change the values of the attributes Name and Address independently of whether the corresponding person had hobbies: if the person had multiple hobbies, the change had to occur in multiple rows. Likewise, with HasAccount we cannot change the value of OfficeId (i.e., transfer an account to a different office) without having to look for all clients associated with this account. Since a number of clients might share the same account, there can be multiple rows in HasAccount that refer to the same account; hence, multiple rows in which OfficeId must be changed. Note that in both cases, the attributes involved in the update anomalies appear on the left-hand sides of an FD.

We can see that update anomalies are associated with certain kinds of functional dependencies. Which dependencies are bad? At the risk of giving away the secret, we draw your attention to one major difference between the dependencies in (8.4) and (8.5): the former specify key constraints for their corresponding relations, whereas the latter do not.

However, simply knowing which dependencies cause the anomalies is not enough; we must do something about them. We cannot just abolish the offending FDs, because they are part of the semantics of the enterprise being modeled by the database. They are implicitly or explicitly part of the Requirements Document and cannot be changed without an agreement with the customer. On the other hand, we saw that schema decomposition can be a useful tool. Even though a decomposition cannot abolish a functional dependency, it can make it behave.
For instance, the decomposition shown in (8.1) yields schemas in which the offending FD, SSN → Name, Address, becomes a well-behaved key constraint.

8.4 PROPERTIES OF FUNCTIONAL DEPENDENCIES

Before going any further, we need to learn some mathematical properties of functional dependencies and develop algorithms to test them. Since these properties and algorithms rely heavily on the notational conventions introduced at the beginning of Section 8.3, it might be a good idea to revisit those conventions.

The properties of FDs that we are going to study are based on entailment. Consider a set of attributes R, a set, F, of FDs over R, and another FD, f, on R. We say that F entails f if every relation r over the set of attributes R has the following property: if r satisfies every FD in F, then r satisfies the FD f.

Given a set of FDs, F, the closure of F, denoted F+, is the set of all FDs entailed by F. Clearly, F+ contains F as a subset.2

2 If f ∈ F, then every relation that satisfies every FD in F obviously satisfies f. Therefore, by the definition of entailment, f is entailed by F.

If F and G are sets of FDs, we say that F entails G if F entails every individual FD in G. F and G are said to be equivalent if F entails G and G entails F. We now present several simple but important properties of entailment.

Reflexivity. Some FDs are satisfied by every relation, no matter what. These dependencies all have the form X → Y, where Y ⊆ X, and are called trivial FDs. The reflexivity property states that if Y ⊆ X, then X → Y. To see why trivial FDs are always satisfied, consider a relation, r, whose set of attributes includes all of the attributes mentioned in X → Y. Suppose that t, s ∈ r are tuples that agree on X. Since Y ⊆ X, this means that t and s agree on Y as well. Thus, r satisfies X → Y. We can now relate trivial FDs and entailment: because a trivial FD is satisfied by every relation, it is entailed by every set of FDs! In particular, F+ contains every trivial FD.

Augmentation. Consider an FD, X → Y, and another set of attributes, Z. Let R contain X ∪ Y ∪ Z. Then X → Y entails XZ → YZ. In other words, every relation r over R that satisfies X → Y must also satisfy the FD XZ → YZ. The augmentation property states that if X → Y, then XZ → YZ. To see why this is true, note that if tuples t, s ∈ r agree on every attribute of XZ, then in particular they agree on X. Since r satisfies X → Y, t and s must also agree on Y. They also agree on Z, since we have assumed that they agree on the bigger set of attributes XZ. Thus, if s and t agree on every attribute in XZ, they must agree on YZ. As this is an arbitrarily chosen pair of tuples in r, it follows that r satisfies XZ → YZ.

Transitivity. The set of FDs {X → Y, Y → Z} entails the FD X → Z. The transitivity property states that if X → Y and Y → Z, then X → Z. This property can be established similarly to the previous two (see the exercises).
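Each of these tuple-based arguments can also be spot-checked by brute force. The following small sketch (my own, not the book's) enumerates every two-tuple relation over attributes A, B, C with values drawn from {0, 1} and confirms augmentation and transitivity on each one:

```python
# Sketch: exhaustively verify augmentation and transitivity on all
# two-tuple relations over three binary attributes (64 instances).
from itertools import product, combinations

def satisfies(rel, X, Y):
    """True iff rel satisfies X -> Y: pairs agreeing on X agree on Y."""
    return all(any(t[a] != s[a] for a in X) or all(t[a] == s[a] for a in Y)
               for t, s in combinations(rel, 2))

ATTRS = ("A", "B", "C")
for v1, v2 in product(product((0, 1), repeat=3), repeat=2):
    r = [dict(zip(ATTRS, v1)), dict(zip(ATTRS, v2))]
    # transitivity: if r satisfies A -> B and B -> C, it satisfies A -> C
    if satisfies(r, {"A"}, {"B"}) and satisfies(r, {"B"}, {"C"}):
        assert satisfies(r, {"A"}, {"C"})
    # augmentation: if r satisfies A -> B, it satisfies AC -> BC
    if satisfies(r, {"A"}, {"B"}):
        assert satisfies(r, {"A", "C"}, {"B", "C"})
print("augmentation and transitivity hold on all 64 two-tuple instances")
```

Of course, an exhaustive check over tiny instances is only a sanity test; the proofs above establish the properties for every relation.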
These three properties of FDs are known as Armstrong's axioms,3 and their main use is in the proofs of correctness of various database design algorithms. However, they are also a powerful tool for (real, breathing) human database designers, because they can help them quickly spot problematic FDs in relational schemas. We now show how Armstrong's axioms are used to derive new FDs.

Union of FDs. Any relation, r, that satisfies X → Y and X → Z must also satisfy X → YZ. To show this, we can derive X → YZ from X → Y and X → Z using simple syntactic manipulations defined by Armstrong's axioms. Such manipulations can be easily programmed on a computer, unlike the tuple-based arguments we used to establish the axioms themselves. Here is how it is done:

(a) X → Y        Given
(b) X → Z        Given
(c) X → XY       Adding X to both sides of (a): Armstrong's augmentation rule
(d) XY → YZ      Adding Y to both sides of (b): Armstrong's augmentation rule
(e) X → YZ       Armstrong's transitivity rule applied to (c) and (d)

Decomposition of FDs. In a similar way, we can prove the following rule: every relation that satisfies X → YZ must also satisfy the FDs X → Y and X → Z. This is accomplished by the following simple steps:

(a) X → YZ       Given
(b) YZ → Y       By Armstrong's reflexivity rule, since Y ⊆ YZ
(c) X → Y        By transitivity from (a) and (b)

The derivation of X → Z is similar.

Armstrong's axioms are obviously sound. By sound we mean that any expression of the form X → Y derived using the axioms is actually a functional dependency entailed by the given set. Soundness follows from the fact that we have proved that these inference rules are valid for every relation. It is much less obvious, however, that they are also complete; that is, if a set of FDs, F, entails another FD, f, then f can be derived from F by a sequence of steps, similar to the ones above, that relies solely on Armstrong's axioms! We do not prove this fact here, but a proof can be found in [Ullman 1988].

The soundness and completeness of Armstrong's axioms is not just a theoretical curiosity: this result has considerable practical value, because it guarantees that entailment of FDs (i.e., the question of whether f ∈ F+) can be verified by a computer program. We are now going to develop one such algorithm.

An obvious way to verify entailment of an FD, f, by a set of FDs, F, is to instruct the computer to apply Armstrong's axioms to F in all possible ways. Since the number of attributes mentioned in F and f is finite, this derivation process cannot go on forever.

3 Strictly speaking, these are inference rules, because they derive new rules out of old ones. Only the reflexivity rule can be viewed as an axiom.
When we are satisfied that all possible derivations have been made, we can simply check whether f is among the FDs derived by this process. The completeness of Armstrong's axioms guarantees that f ∈ F+ if and only if f is one of the FDs thus derived.

To see how this process works, consider the following sets of FDs: F = {AC → B, A → C, D → A} and G = {A → B, A → C, D → A, D → B}. We can use Armstrong's axioms to prove that these two sets are equivalent, i.e., that every FD in G is entailed by F, and vice versa. For instance, to prove that A → B is implied by F, we can apply Armstrong's axioms in all possible ways. Most of these attempts will not lead anywhere, but a few will. For instance, the following derivation establishes the desired entailment:

(a) A → C        An FD in F
(b) A → AC       From (a) and Armstrong's augmentation axiom
(c) A → B        From (b), AC → B ∈ F, and Armstrong's transitivity axiom

The FDs A → C and D → A belong to both F and G, so their derivation is trivial. For D → B in G, the computer can try to apply Armstrong's axioms until this FD is derived. After a while, it will stumble upon this valid derivation:

(a) D → A        An FD in F
(b) A → B        Derived previously
(c) D → B        From (a), (b), and Armstrong's transitivity axiom

This shows that every FD in G is entailed by F; showing that every FD in F is entailed by G is done similarly.

Although the simplicity of checking entailment by blindly applying Armstrong's axioms is attractive, the method is not very efficient. In fact, the size of F+ can be exponential in the size of F, so for large database schemas it can take a very long time before the designer ever sees the result. We are therefore going to develop a more efficient algorithm, which is also based on Armstrong's axioms but which applies them much more judiciously.

Checking entailment of FDs. The idea of the new algorithm for verifying entailment is based on attribute closure. Given a set of FDs, F, and a set of attributes, X, we define the attribute closure of X with respect to F, denoted X+F, as follows:

    X+F = { A | X → A ∈ F+ }

In other words, X+F is the set of all those attributes, A, such that X → A is entailed by F. Note that X ⊆ X+F because, if A ∈ X, then, by Armstrong's reflexivity axiom, X → A is a trivial FD that is entailed by every set of FDs, including F. It is important to understand that the closure of F (i.e., F+) and the closure of X (i.e., X+F) are related but different notions: F+ is a set of functional dependencies, whereas X+F is a set of attributes.

The more efficient algorithm for checking entailment of FDs now works as follows: given a set of FDs, F, and an FD, X → Y, check whether Y ⊆ X+F. If so, then F entails X → Y. Otherwise, if Y ⊄ X+F, then F does not entail X → Y.
The correctness of this algorithm follows from Armstrong's axioms. If Y ⊆ X+F, then X → A ∈ F+ for every A ∈ Y (by the definition of X+F). By the union rule for FDs, F thus entails X → Y. Conversely, if Y ⊄ X+F, then there is a B ∈ Y such that B ∉ X+F. Hence, X → B is not entailed by F. But then F cannot entail X → Y; if it did, it would have to entail X → B as well, by the decomposition rule for FDs.

The heart of the above algorithm is a check of whether a set of attributes is contained in X+F. Therefore, we are not done yet: we need an algorithm for computing the closure of X, which we present in Figure 8.3.

Figure 8.3 Computation of the attribute closure X+F.

    closure := X
    repeat
        old := closure
        if there is an FD Z → V ∈ F such that Z ⊆ closure
            then closure := closure ∪ V
    until old = closure
    return closure

The idea behind the algorithm is to enlarge the set of attributes known to belong to X+F by applying the FDs in F. The closure is initialized to X, since we know that X is always a subset of X+F. The soundness of the algorithm can be proved by induction. Initially, closure is X, so X → closure is in F+. Then, assuming that X → closure ∈ F+ at some intermediate step in the repeat loop of Figure 8.3, and given an FD Z → V ∈ F such that Z ⊆ closure, we can use the generalized transitivity rule (see exercise 8.7) to infer that F entails X → closure ∪ V. Thus, if A ∈ closure at the end of the computation, then A ∈ X+F. The converse is also true: if A ∈ X+F, then at the end of the computation A ∈ closure (see exercise 8.8).

Unlike the simple-minded algorithm that uses Armstrong's axioms indiscriminately, the run-time complexity of the algorithm in Figure 8.3 is quadratic in the size of F. In fact, an algorithm for computing X+F that is linear in the size of F is given in [Beeri and Bernstein 1979]. That algorithm is better suited for a computer program, but its inner workings are more complex.

Example (Checking Entailment). Consider a relational schema, R = (R; F), where R = ABCDEFGHIJ and F contains the following FDs: AB → C, D → E, AE → G, GD → H, ID → J. We wish to check whether F entails ABD → GH and ABD → HJ.

First, let us compute ABD+F. We begin with closure = ABD. Two FDs can be used in the first iteration of the loop in Figure 8.3. For definiteness, let us use AB → C, which makes closure = ABDC. In the second iteration, we can use D → E, which makes closure = ABDCE. Now it becomes possible to use the FD AE → G in the third iteration, yielding closure = ABDCEG.
This in turn allows GD → H to be applied in the fourth iteration, which results in closure = ABDCEGH. In the fifth iteration, we cannot apply any new FDs, so closure does not change and the loop terminates. Thus, ABD+F = ABDCEGH. Since GH ⊆ ABDCEGH, we conclude that F entails ABD → GH. On the other hand, HJ ⊄ ABDCEGH, so we conclude that ABD → HJ is not entailed by F. Note, however, that F does entail ABD → H.
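The trace in this example follows directly from the algorithm of Figure 8.3. The following is a Python sketch (mine, not the book's) of the closure computation, the entailment test built on it, and the equivalence test of a pair of FD sets, checked against the examples of this section:

```python
# Sketch: attribute closure (Figure 8.3 style), entailment, and equivalence.
def closure(X, F):
    """Attribute closure X+ w.r.t. F, where each FD is a (lhs, rhs) pair of sets."""
    result = set(X)
    changed = True
    while changed:
        changed = False
        for Z, V in F:
            if set(Z) <= result and not set(V) <= result:
                result |= set(V)
                changed = True
    return result

def entails(F, X, Y):
    """F entails X -> Y iff Y is contained in the closure of X."""
    return set(Y) <= closure(X, F)

def equivalent(F, G):
    """F and G are equivalent iff each entails every FD of the other."""
    return (all(entails(F, X, Y) for X, Y in G) and
            all(entails(G, X, Y) for X, Y in F))

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

# The FDs of the example: AB -> C, D -> E, AE -> G, GD -> H, ID -> J
F = [fd("AB", "C"), fd("D", "E"), fd("AE", "G"), fd("GD", "H"), fd("ID", "J")]
assert closure("ABD", F) == set("ABCDEGH")
assert entails(F, "ABD", "GH")          # ABD -> GH is entailed
assert not entails(F, "ABD", "HJ")      # ABD -> HJ is not
assert entails(F, "ABD", "H")

# The equivalence example from earlier in this section:
F2 = [fd("AC", "B"), fd("A", "C"), fd("D", "A")]
G2 = [fd("A", "B"), fd("A", "C"), fd("D", "A"), fd("D", "B")]
assert equivalent(F2, G2)
```

This naive closure loop rescans F on every pass, giving the quadratic behavior noted above; the linear-time algorithm of [Beeri and Bernstein 1979] avoids the rescans with more bookkeeping.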

Figure 8.4 Testing equivalence of sets of FDs.

    Input:  F, G (sets of FDs)
    Output: true, if F is equivalent to G; false otherwise

    for each f ∈ F do
        if G does not entail f then return false
    for each g ∈ G do
        if F does not entail g then return false
    return true

The above algorithm for testing entailment leads to a simple test for equivalence between a pair of sets of FDs. Let F and G be such sets. To check that they are equivalent, we must check that every FD in G is entailed by F, and vice versa. The algorithm is depicted in Figure 8.4.

8.5 NORMAL FORMS

To eliminate redundancy and potential update anomalies, database theory identifies several normal forms for schemas such that, if a schema is in one of the normal forms, it has certain predictable properties. Originally, [Codd 1970] proposed three normal forms, each eliminating more anomalies than the previous one. The first normal form (1NF), as introduced by Codd, is equivalent to the definition of the relational data model itself. The second normal form (2NF) was an attempt to eliminate some potential anomalies, but it has turned out to be of no practical use, so we do not discuss it. The third normal form (3NF) was initially thought to be the ultimate normal form. However, Boyce and Codd soon realized that 3NF can still harbor undesirable combinations of functional dependencies, so they introduced the so-called Boyce-Codd normal form (BCNF).

Unfortunately, the sobering reality of the computational sciences is that there is no free lunch. Even though BCNF is more desirable, it is not always achievable without paying a price elsewhere. In this section, we define both BCNF and 3NF. Subsequent sections develop algorithms for automatically converting relational schemas that possess various bad properties into sets of schemas in 3NF and BCNF. We also study the trade-offs associated with such conversions.
Toward the end of this chapter, we show that certain types of redundancy are caused not by FDs but by other kinds of dependencies. To deal with this problem, we introduce the fourth normal form (4NF), which further extends BCNF.

The Boyce-Codd normal form. A relational schema, R = (R; F), where R is the set of attributes of R and F is the set of functional dependencies associated with R, is in Boyce-Codd normal form if, for every FD X → Y ∈ F, either of the following is true:

    Y ⊆ X (i.e., the FD is trivial).
    X is a superkey of R.

In other words, the only nontrivial FDs are those in which a key functionally determines one or more attributes. It is easy to see that Person1 and Hobby, the relational schemas given in (8.1), are in BCNF, because the only nontrivial FD is SSN → Name, Address, and it applies to Person1, which has SSN as a key. On the other hand, consider the schema Person defined by the CREATE TABLE statement (5.1), page 93, and the schema HasAccount defined by the SQL statement (8.3), page 214. As discussed earlier, these statements fail to capture some important relationships, which are represented by the FDs in (8.5), page 216. Each of these FDs violates the requirements of BCNF: they are not trivial, and their left-hand sides, SSN and AccountNumber, are not superkeys of their respective schemas.

Note that a BCNF schema can have more than one key. For instance, R = (ABCD; F), where F = {AB → CD, AC → BD}, has two keys, AB and AC. And yet it is in BCNF, because the left-hand side of each of the two FDs in F is a key.

An important property of BCNF schemas is that their instances do not contain redundant information. Since we have been illustrating redundancy problems only through concrete examples, the above statement might seem vague. Exactly what is redundant information? For instance, does an instance of the above BCNF schema R that contains two tuples agreeing on the attributes B, C, and D (but not on A) store redundant information? Superficially it might seem so, because the two tuples agree on all but one attribute. However, having identical values in some attributes of different tuples does not necessarily imply that the tuples are storing redundant information. Redundancy arises when the values of some set of attributes, X, necessarily imply the value that must exist in another attribute, A; that is, a functional dependency: if two tuples have the same values in X, they must have the same value of A.
Redundancy is eliminated if we store the association between X and A only once (in a separate relation) instead of repeating it in all tuples of an instance of the schema R that agree on X. Since R does not have FDs over the attributes BCD, no redundant information is stored. The fact that the tuples in the relation coincide over BCD is coincidental. For instance, the value of attribute D in the first tuple can be changed from 4 to 5 without regard for the second tuple. A DBMS automatically eliminates one type of redundancy: two tuples with the same values in the key fields are prohibited in any instance of a schema. This is a special case: the key identifies an entity and so determines the values of all attributes describing that entity. Because the definition of BCNF precludes nontrivial FDs whose left-hand sides are not superkeys, the relations over BCNF schemas do not store redundant information. As a result, deletion and update anomalies do not arise in BCNF relations.
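The BCNF test is entirely mechanical: compute the closure of each FD's left-hand side and check whether it covers all attributes of the schema. The following is a minimal sketch in Python (the function and variable names are our own, not from the text), applied to the two-key schema R = (ABCD; {AB → CD, AC → BD}) and to Person:

```python
def closure(attrs, fds):
    """Compute the attribute closure attrs+ under a set of FDs.
    Each FD is a pair (lhs, rhs) of attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If all of lhs is already in the closure, rhs follows as well.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_bcnf(schema_attrs, fds):
    """BCNF: every FD in fds is trivial or has a superkey on its left-hand side."""
    for lhs, rhs in fds:
        if rhs <= lhs:                               # trivial FD (Y subset of X)
            continue
        if closure(lhs, fds) != set(schema_attrs):   # X is not a superkey
            return False
    return True


# R = (ABCD; {AB -> CD, AC -> BD}) from the text: two keys, yet in BCNF.
R = {"A", "B", "C", "D"}
F = [({"A", "B"}, {"C", "D"}), ({"A", "C"}, {"B", "D"})]
print(is_bcnf(R, F))        # True

# Person(SSN, Name, Address, Hobby) with SSN -> Name, Address: not in BCNF,
# because the closure of SSN does not include Hobby.
Person = {"SSN", "Name", "Address", "Hobby"}
Fp = [({"SSN"}, {"Name", "Address"})]
print(is_bcnf(Person, Fp))  # False
```

The brute-force closure loop runs in time polynomial in the size of F, which is more than adequate for design-time checks.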

Relations with more than one key can still have insertion anomalies. To see this, suppose that an association over the attributes ABD and another over ACD are added to our relation. Because the value of the attribute C in the first new tuple and of B in the second is unknown, we fill in the missing information with NULL. However, now we cannot tell whether the two newly added tuples represent the same association; it all depends on the real values behind the nulls. A practical solution to this problem, as adopted by the SQL standard, is to designate one key as primary and to prohibit null values in its attributes.

The Third Normal Form. A relational schema, R = (R; F), where R is the set of attributes of R and F is the set of functional dependencies associated with R, is in third normal form if, for every FD X → A ∈ F, any one of the following is true:

- A ∈ X (i.e., this is a trivial FD).
- X is a superkey of R.
- A ∈ K for some key K of R.

Observe that the first two conditions in the definition of 3NF are identical to the conditions that define BCNF. Thus, 3NF is a relaxation of BCNF's requirements: every schema in BCNF must also be in 3NF, but the converse is not true in general. For instance, the relation HasAccount (8.3) on page 214 is in 3NF because the only FD that is not based on a key constraint is AccountNumber → OfficeId, and OfficeId is part of the key. However, this relation is not in BCNF, as shown previously.⁴ If you are wondering about the intrinsic merit of the third condition in the definition of 3NF, the answer is that there is none. In a way, 3NF was discovered by mistake in the search for what we now call BCNF! The reason for its remarkable survival is that it was found later to have some desirable algorithmic properties that BCNF does not have. We discuss these issues in subsequent sections.

Recall from Section 8.2 that relation instances over HasAccount might store redundant information.
Now we can see that this redundancy arises because of the functional dependency that relates AccountNumber and OfficeId and that is not implied by key constraints.

⁴ In fact, HasAccount is the smallest possible example of a 3NF relation that is not in BCNF (see Exercise 8.5).
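The third condition of 3NF requires knowing the keys of the schema, which for small schemas can be found by brute force over attribute subsets. A sketch in Python (the helper names are our own; this is an illustration, not an optimized algorithm), applied to HasAccount(AccountNumber, ClientId, OfficeId) with FDs (8.6) and (8.7):

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure under FDs given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def keys_of(schema_attrs, fds):
    """All minimal keys, found by checking every subset of the attributes."""
    attrs = sorted(schema_attrs)
    superkeys = [set(c)
                 for n in range(1, len(attrs) + 1)
                 for c in combinations(attrs, n)
                 if closure(c, fds) == set(schema_attrs)]
    # A key is a superkey none of whose proper subsets is a superkey.
    return [k for k in superkeys if not any(s < k for s in superkeys)]


def is_3nf(schema_attrs, fds):
    """3NF: each FD X -> Y is trivial, or X is a superkey, or Y - X is all prime."""
    prime = set().union(*keys_of(schema_attrs, fds))   # attributes in some key
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(schema_attrs):     # X is a superkey
            continue
        if not (rhs - lhs) <= prime:                   # a non-prime attribute escapes
            return False
    return True


# HasAccount: the keys are {ClientId, OfficeId} and {AccountNumber, ClientId},
# so every attribute is prime and the schema is in 3NF (though not in BCNF).
HasAccount = {"AccountNumber", "ClientId", "OfficeId"}
F = [({"ClientId", "OfficeId"}, {"AccountNumber"}),
     ({"AccountNumber"}, {"OfficeId"})]
print(is_3nf(HasAccount, F))  # True
```

The subset enumeration is exponential in the number of attributes, which is acceptable here because normalization is performed on individual relation schemas, not on entire databases.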

For another example, consider the schema Person discussed earlier. This schema violates the 3NF requirements because, for example, the FD SSN → Name is not based on a key constraint (SSN is not a superkey) and Name does not belong to a key of Person. However, the decomposition of this schema into Person1 and Hobby in (8.1), page 213, yields a pair of schemas that are in both 3NF and BCNF.

8.6 PROPERTIES OF DECOMPOSITIONS

Since there is no redundancy in BCNF schemas and redundancy in 3NF is limited, we are interested in decomposing a given schema into a collection of schemas, each of which is in one of these normal forms. The main thrust of the discussion in the previous section was that 3NF does not completely solve the redundancy problem. Therefore, at first glance, there appears to be no justification for considering 3NF as a goal for database design. It turns out, however, that the maintenance problems associated with redundancy do not show the whole picture. As we will see, maintenance costs are also associated with integrity constraints,⁵ and 3NF decompositions sometimes have better properties in this regard than do BCNF decompositions. Our first step is to define these properties.

Recall from Section 8.2 that not all decompositions are created equal. For instance, the decomposition of Person shown in (8.1) is considered good, while the one in (8.2) makes no sense. Is there an objective way to tell which decompositions make sense and which do not, and can this objective way be explained to a computer? The answer to both questions is yes. The decompositions that make sense are called lossless. Before tackling this notion, we need to be more precise about what we mean by a decomposition.
A decomposition of a schema, R = (R; F), where R is the set of attributes of the schema and F is its set of functional dependencies, is a collection of schemas R1 = (R1; F1), R2 = (R2; F2), ..., Rn = (Rn; Fn) such that the following conditions hold:

1. R = R1 ∪ R2 ∪ ··· ∪ Rn
2. F entails Fi for every i = 1, ..., n.

The first part of the definition is clear: a decomposition should not introduce new attributes, and it should not drop attributes found in the original schema. The second part says that a decomposition should not introduce new functional dependencies (but may drop some). We discuss the second requirement in more detail later.

The decomposition of a schema naturally leads to a decomposition of relations over it. A decomposition of a relation, r, over schema R is a set of relations

r1 = πR1(r), r2 = πR2(r), ..., rn = πRn(r)

⁵ For example, a particular integrity constraint in the original table might be checkable in the decomposed tables only by taking the join of these tables, which results in significant overhead at run time.

where π is the projection operator. It can be shown (see Exercise 8.9) that, if r is a valid instance of R, then each ri satisfies all FDs in Fi, and thus each ri is a valid relation instance over the schema Ri. The purpose of a decomposition is to replace the original relation, r, with a set of relations r1, ..., rn over the schemas that constitute the decomposed schema. In view of the above definitions, it is important to realize that schema decomposition and relation instance decomposition are two different but related notions.

Applying the above definitions to our running example, we see that splitting Person into Person1 and Hobby (see (8.1) and Figure 8.1) yields a decomposition. Splitting Person as shown in (8.2) and Figure 8.2 is also a decomposition in the above sense. It clearly satisfies the first requirement for being a decomposition. It also satisfies the second, since only trivial FDs hold in (8.2), and these are entailed by every set of dependencies. This last example shows that the above definition of a decomposition does not capture all of the desirable properties of a decomposition because, as you may recall from Section 8.2, the decomposition in (8.2) makes no sense. In the following, we introduce additional desirable properties of decompositions.

Lossless and Lossy Decompositions

Consider a relation, r, and its decomposition, r1, ..., rn, as defined above. Since after the decomposition the database no longer stores the relation r and instead maintains its projections r1, ..., rn, the database must be able to reconstruct the original relation r from these projections. Not being able to reconstruct r means that the decomposition does not represent the same information as the original database (imagine a bank losing the information about who owns which account or, worse, giving those accounts to the wrong owners!).
In principle, one can use any computational method that guarantees reconstruction of r from its projections. However, the natural and, in most cases, practical method is the natural join. We thus assume that r is reconstructible if and only if

r = r1 ⋈ r2 ⋈ ··· ⋈ rn

Reconstructibility must be a property of the schema decomposition and not of a particular instance over this schema. At the database design stage, the designer manipulates schemas, not relations, and any transformation performed on a schema must guarantee that reconstructibility holds for all of its valid relation instances. This discussion leads to the following notion.

A decomposition of schema R = (R; F) into a collection of schemas R1 = (R1; F1), R2 = (R2; F2), ..., Rn = (Rn; Fn) is lossless if, for every valid instance r of schema R,

r = r1 ⋈ r2 ⋈ ··· ⋈ rn

where r1 = πR1(r), r2 = πR2(r), ..., rn = πRn(r).

A decomposition is lossy otherwise. In plain terms, a lossless schema decomposition is one that guarantees that any valid instance of the original schema can be reconstructed from its projections on the individual schemas of the decomposition. Note that r ⊆ r1 ⋈ r2 ⋈ ··· ⋈ rn holds for any decomposition (Exercise 8.10), so losslessness really just states the opposite inclusion.

The fact that r ⊆ r1 ⋈ r2 ⋈ ··· ⋈ rn holds no matter what may seem confusing at first. If we can get more tuples by joining the projections of r, why is such a decomposition called lossy? After all, we gained more tuples, not fewer! To clarify this issue, observe that what we might lose here are not tuples but rather information about which tuples are the right ones. Consider, for instance, the decomposition (8.2) of schema Person. Figure 8.2 presents the corresponding decomposition of a valid relation instance over Person shown in Figure 5.2. However, if we now compute the natural join of the relations in the decomposition (which becomes a Cartesian product, since these relations do not share attributes), we will not be able to tell who lives where and who has what hobbies. The relationship between names and SSNs is also lost. In other words, when reconstructing the original relation, getting more tuples is as bad as getting fewer: we must get exactly the set of tuples in the original relation.

Now that we are convinced of the importance of losslessness, we need an algorithm that a computer can use to verify this property: the definition of losslessness does not provide an effective test, since it tells us to try every possible valid relation instance, which is not feasible. A general test of whether a decomposition into n schemas is lossless exists, but it is somewhat complex; it can be found in [Beeri et al. 1981]. However, there is a much simpler test that works for binary decompositions, that is, decompositions into a pair of schemas.
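The phenomenon of spurious tuples can be seen concretely with a few lines of Python over a toy Person-like instance (the data values and helper names below are our own invention for illustration):

```python
def project(rel, attrs):
    """pi_attrs(rel): rel is a set of rows, each row a frozenset of (attr, value)."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rel}


def natural_join(r1, r2):
    """Join rows that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out


def rel(*rows):
    """Build a relation instance from dict-shaped rows."""
    return {frozenset(row.items()) for row in rows}


# A toy instance over (SSN, Name, Hobby) with invented values.
r = rel({"SSN": 111, "Name": "Joe", "Hobby": "chess"},
        {"SSN": 111, "Name": "Joe", "Hobby": "golf"},
        {"SSN": 222, "Name": "Ann", "Hobby": "chess"})

# Good split: the projections share SSN, which determines Name.
good = natural_join(project(r, {"SSN", "Name"}), project(r, {"SSN", "Hobby"}))
print(good == r)          # True: the original instance is reconstructed exactly

# Bad split: the projections share no attributes, so the join degenerates into
# a Cartesian product containing a spurious tuple (Ann paired with golf).
bad = natural_join(project(r, {"SSN", "Name"}), project(r, {"Hobby"}))
print(len(r), len(bad))   # 3 4
```

The lossy join contains every original tuple plus one spurious one, illustrating that gaining tuples is exactly how information about the correct associations is lost.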
This test can also establish the losslessness of a decomposition into more than two schemas, provided that the decomposition was obtained by a series of binary decompositions. Since most decompositions are obtained in this way, the simple binary test introduced below is sufficient for most practical purposes.

Testing the losslessness of a binary decomposition. Let R = (R; F) be a schema and let R1 = (R1; F1), R2 = (R2; F2) be a binary decomposition of R. This decomposition is lossless if and only if either of the following is true:

- (R1 ∩ R2) → R1 ∈ F+.
- (R1 ∩ R2) → R2 ∈ F+.
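The binary test reduces to a single attribute-closure computation over the shared attributes. A small sketch (the helper names are our own), using the Person decomposition from the text:

```python
def closure(attrs, fds):
    """Attribute closure under FDs given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def lossless_binary(attrs1, attrs2, fds):
    """Lossless iff (R1 intersect R2) -> R1 or (R1 intersect R2) -> R2 is in F+."""
    common_closure = closure(set(attrs1) & set(attrs2), fds)
    return set(attrs1) <= common_closure or set(attrs2) <= common_closure


# Person decomposed into Person1(SSN, Name, Address) and Hobby(SSN, Hobby),
# with F = {SSN -> Name, Address}: the shared attribute SSN is a key of Person1.
F = [({"SSN"}, {"Name", "Address"})]
print(lossless_binary({"SSN", "Name", "Address"}, {"SSN", "Hobby"}, F))   # True

# A split whose parts share no attributes can never pass the test.
print(lossless_binary({"Name", "Address"}, {"SSN", "Hobby"}, F))          # False
```

Testing membership of an FD in F+ this way avoids materializing F+ itself, which can be exponentially large.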

Figure 8.5 Tuple structure in a lossless binary decomposition: a row of r1 combines with exactly one row of r2. [Diagram of r1 over R1 and r2 over R2, joined over the shared attributes R1 ∩ R2.]

To see why this is so, suppose that (R1 ∩ R2) → R2 ∈ F+. Then R1 is a superkey of R, since the values in a subset of the attributes of R1 determine the values of all attributes of R2 (and, obviously, R1 functionally determines R1). Let r be a valid relation instance for R. Since R1 is a superkey of R, every tuple in r1 = πR1(r) extends to exactly one tuple in r. Thus, as depicted in Figure 8.5, the cardinality (the number of tuples) of r1 equals the cardinality of r, and every tuple in r1 joins with exactly one tuple in r2 = πR2(r) (if more than one tuple in r2 joined with a tuple in r1, (R1 ∩ R2) → R2 would not be an FD). Therefore, the cardinality of r1 ⋈ r2 equals the cardinality of r1, which in turn equals the cardinality of r. Since r must be a subset of r1 ⋈ r2, it follows that r = r1 ⋈ r2. Conversely, if neither of the above FDs holds, it is easy to construct a relation r such that r ≠ r1 ⋈ r2. Details of this construction are left to an exercise.

The above test can now be used to substantiate our intuition that the decomposition (8.1) of Person into Person1 and Hobby is a good one. The intersection of the attributes of Hobby and Person1 is {SSN}, and SSN is a key of Person1. Thus, this decomposition is lossless.

Dependency-Preserving Decompositions

Consider the schema HasAccount once again. Recall that it has attributes AccountNumber, ClientId, and OfficeId, and that its FDs are

ClientId, OfficeId → AccountNumber     (8.6)
AccountNumber → OfficeId     (8.7)

According to the losslessness test, the following decomposition is lossless:

AcctOffice = (AccountNumber, OfficeId; {AccountNumber → OfficeId})     (8.8)
AcctClient = (AccountNumber, ClientId; { })

because AccountNumber (the intersection of the two attribute sets) is a key of the first schema, AcctOffice. Even though decomposition (8.8) is lossless, something seems to have fallen through the cracks: AcctOffice hosts the FD (8.7), but AcctClient's associated set of FDs is empty. This leaves the FD (8.6), which exists in the original schema, homeless. Neither of the two schemas in the decomposition has all of the attributes needed to house this FD; furthermore, the FD cannot be derived from the FDs that belong to the schemas AcctOffice and AcctClient. In practical terms, this means that even though decomposing relations over the schema HasAccount into relations over AcctOffice and AcctClient does not lead to information loss, it might incur a cost for maintaining the integrity constraint that corresponds to the lost FD. Unlike the FD (8.7), which can be checked locally, in the relation AcctOffice, verification of the FD (8.6) requires computing the join of AcctOffice and AcctClient before the checking of the FD can even begin. In such cases, we say that the decomposition does not preserve the dependencies in the original schema. We now define what this means.

Consider a schema, R = (R; F), and suppose that R1 = (R1; F1), R2 = (R2; F2), ..., Rn = (Rn; Fn) is a decomposition. F entails each Fi by definition, so F entails F1 ∪ ··· ∪ Fn. However, the definition of a decomposition does not require that the two sets of dependencies be equivalent, that is, that F1 ∪ ··· ∪ Fn must also entail F. This reverse entailment is what is missing in the above example: the FD set of HasAccount, which consists of dependencies (8.6) and (8.7), is not entailed by the union of the dependencies in decomposition (8.8), which consists of the single dependency AccountNumber → OfficeId. Because of this, (8.8) is not a dependency-preserving decomposition.
Formally, R1 = (R1; F1), R2 = (R2; F2), ..., Rn = (Rn; Fn) is said to be a dependency-preserving decomposition of R = (R; F) if and only if it is a decomposition and the sets of FDs F and F1 ∪ ··· ∪ Fn are equivalent.

The fact that an FD, f, in F is not in any Fi does not mean that the decomposition is not dependency preserving, since f might be entailed by F1 ∪ ··· ∪ Fn. In this case, maintaining f as a functional dependency requires no extra effort: if the FDs in F1 ∪ ··· ∪ Fn are maintained, f will be also. It is only when f is not entailed by F1 ∪ ··· ∪ Fn that the decomposition is not dependency preserving, and maintenance of f then requires a join.

In view of this definition, the above decomposition of HasAccount is not dependency preserving, and this is precisely what is wrong with it. The dependencies that exist in the original schema but are lost in the decomposition become inter-relational constraints that cannot be maintained locally. Each time a relation in

the decomposition is changed, satisfaction of the inter-relational constraints can be checked only after the reconstruction of the original relation. To illustrate, consider the decomposition of HasAccount in Figure 8.6.

Figure 8.6 Decomposition of the HasAccount relation. [Instances of HasAccount and of its projections AcctOffice and AcctClient, with accounts B123 and A908 held at offices SB01 and MN08, respectively.]

If we now add the tuple ⟨B567, SB01⟩ to the relation AcctOffice and a tuple with account number B567 to AcctClient, the two relations will still satisfy their local FDs (in fact, we see from (8.8) that only AcctOffice has a dependency to satisfy). In contrast, the inter-relational FD (8.6) is not satisfied after these updates, but this is not immediately apparent. To verify this, we must join the two relations, as depicted in Figure 8.7.

Figure 8.7 HasAccount and its decomposition after the insertion of several rows. [The same instances after account B567 is added at office SB01 for the same client that holds B123.]

We now see that constraint (8.6) is violated by the first two tuples in the updated HasAccount relation.
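The definition suggests a direct check: a decomposition preserves dependencies if and only if the union of its FD sets entails every FD of the original schema, and entailment of a single FD can be decided by attribute closure. A sketch in Python (the helper names are ours), applied to HasAccount and decomposition (8.8):

```python
def closure(attrs, fds):
    """Attribute closure under FDs given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def entails(fds, lhs, rhs):
    """fds entails lhs -> rhs iff rhs is contained in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


def dependency_preserving(original_fds, union_of_fi):
    """True iff the union of the decomposition's FD sets entails every original FD.
    (The reverse entailment holds by the definition of a decomposition.)"""
    return all(entails(union_of_fi, lhs, rhs) for lhs, rhs in original_fds)


# HasAccount's FDs (8.6) and (8.7) versus decomposition (8.8), whose only
# FD is AccountNumber -> OfficeId (AcctClient carries no FDs at all).
F = [({"ClientId", "OfficeId"}, {"AccountNumber"}),   # (8.6)
     ({"AccountNumber"}, {"OfficeId"})]               # (8.7)
F_decomposed = [({"AccountNumber"}, {"OfficeId"})]
print(dependency_preserving(F, F_decomposed))  # False: (8.6) is lost
```

Note that this check takes the decomposition's FD sets Fi as given, exactly as the definition above does; computing the projection of F onto each subschema is itself a harder problem that we do not address here.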


More information

Database Constraints and Design

Database Constraints and Design Database Constraints and Design We know that databases are often required to satisfy some integrity constraints. The most common ones are functional and inclusion dependencies. We ll study properties of

More information

2. Basic Relational Data Model

2. Basic Relational Data Model 2. Basic Relational Data Model 2.1 Introduction Basic concepts of information models, their realisation in databases comprising data objects and object relationships, and their management by DBMS s that

More information

Normalisation. Why normalise? To improve (simplify) database design in order to. Avoid update problems Avoid redundancy Simplify update operations

Normalisation. Why normalise? To improve (simplify) database design in order to. Avoid update problems Avoid redundancy Simplify update operations Normalisation Why normalise? To improve (simplify) database design in order to Avoid update problems Avoid redundancy Simplify update operations 1 Example ( the practical difference between a first normal

More information

Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina

Normalisation to 3NF. Database Systems Lecture 11 Natasha Alechina Normalisation to 3NF Database Systems Lecture 11 Natasha Alechina In This Lecture Normalisation to 3NF Data redundancy Functional dependencies Normal forms First, Second, and Third Normal Forms For more

More information

Chapter 10. Functional Dependencies and Normalization for Relational Databases. Copyright 2007 Ramez Elmasri and Shamkant B.

Chapter 10. Functional Dependencies and Normalization for Relational Databases. Copyright 2007 Ramez Elmasri and Shamkant B. Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Chapter Outline 1 Informal Design Guidelines for Relational Databases

More information

Relational Normalization Theory (supplemental material)

Relational Normalization Theory (supplemental material) Relational Normalization Theory (supplemental material) CSE 532, Theory of Database Systems Stony Brook University http://www.cs.stonybrook.edu/~cse532 2 Quiz 8 Consider a schema S with functional dependencies:

More information

Theory of Relational Database Design and Normalization

Theory of Relational Database Design and Normalization Theory of Relational Database Design and Normalization (Based on Chapter 14 and some part of Chapter 15 in Fundamentals of Database Systems by Elmasri and Navathe) 1 Informal Design Guidelines for Relational

More information

Database Sample Examination

Database Sample Examination Part 1: SQL Database Sample Examination (Spring 2007) Question 1: Draw a simple ER diagram that results in a primary key/foreign key constraint to be created between the tables: CREATE TABLE Salespersons

More information

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement

More information

KNOWLEDGE FACTORING USING NORMALIZATION THEORY

KNOWLEDGE FACTORING USING NORMALIZATION THEORY KNOWLEDGE FACTORING USING NORMALIZATION THEORY J. VANTHIENEN M. SNOECK Katholieke Universiteit Leuven Department of Applied Economic Sciences Dekenstraat 2, 3000 Leuven (Belgium) tel. (+32) 16 28 58 09

More information

DATABASE NORMALIZATION

DATABASE NORMALIZATION DATABASE NORMALIZATION Normalization: process of efficiently organizing data in the DB. RELATIONS (attributes grouped together) Accurate representation of data, relationships and constraints. Goal: - Eliminate

More information

Functional Dependencies

Functional Dependencies BCNF and 3NF Functional Dependencies Functional dependencies: modeling constraints on attributes stud-id name address course-id session-id classroom instructor Functional dependencies should be obtained

More information

Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications. Overview - detailed. Goal. Faloutsos CMU SCS 15-415

Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications. Overview - detailed. Goal. Faloutsos CMU SCS 15-415 Faloutsos 15-415 Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #17: Schema Refinement & Normalization - Normal Forms (R&G, ch. 19) Overview - detailed DB design

More information

BCA. Database Management System

BCA. Database Management System BCA IV Sem Database Management System Multiple choice questions 1. A Database Management System (DBMS) is A. Collection of interrelated data B. Collection of programs to access data C. Collection of data

More information

An Algorithmic Approach to Database Normalization

An Algorithmic Approach to Database Normalization An Algorithmic Approach to Database Normalization M. Demba College of Computer Science and Information Aljouf University, Kingdom of Saudi Arabia bah.demba@ju.edu.sa ABSTRACT When an attempt is made to

More information

Normalisation in the Presence of Lists

Normalisation in the Presence of Lists 1 Normalisation in the Presence of Lists Sven Hartmann, Sebastian Link Information Science Research Centre, Massey University, Palmerston North, New Zealand 1. Motivation & Revision of the RDM 2. The Brouwerian

More information

Reading 13 : Finite State Automata and Regular Expressions

Reading 13 : Finite State Automata and Regular Expressions CS/Math 24: Introduction to Discrete Mathematics Fall 25 Reading 3 : Finite State Automata and Regular Expressions Instructors: Beck Hasti, Gautam Prakriya In this reading we study a mathematical model

More information

Testing LTL Formula Translation into Büchi Automata

Testing LTL Formula Translation into Büchi Automata Testing LTL Formula Translation into Büchi Automata Heikki Tauriainen and Keijo Heljanko Helsinki University of Technology, Laboratory for Theoretical Computer Science, P. O. Box 5400, FIN-02015 HUT, Finland

More information

Objectives of Database Design Functional Dependencies 1st Normal Form Decomposition Boyce-Codd Normal Form 3rd Normal Form Multivalue Dependencies

Objectives of Database Design Functional Dependencies 1st Normal Form Decomposition Boyce-Codd Normal Form 3rd Normal Form Multivalue Dependencies Objectives of Database Design Functional Dependencies 1st Normal Form Decomposition Boyce-Codd Normal Form 3rd Normal Form Multivalue Dependencies 4th Normal Form General view over the design process 1

More information

Question 1. Relational Data Model [17 marks] Question 2. SQL and Relational Algebra [31 marks]

Question 1. Relational Data Model [17 marks] Question 2. SQL and Relational Algebra [31 marks] EXAMINATIONS 2005 MID-YEAR COMP 302 Database Systems Time allowed: Instructions: 3 Hours Answer all questions. Make sure that your answers are clear and to the point. Write your answers in the spaces provided.

More information

Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010

Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010 Jordan University of Science & Technology Computer Science Department CS 728: Advanced Database Systems Midterm Exam First 2009/2010 Student Name: ID: Part 1: Multiple-Choice Questions (17 questions, 1

More information

OPRE 6201 : 2. Simplex Method

OPRE 6201 : 2. Simplex Method OPRE 6201 : 2. Simplex Method 1 The Graphical Method: An Example Consider the following linear program: Max 4x 1 +3x 2 Subject to: 2x 1 +3x 2 6 (1) 3x 1 +2x 2 3 (2) 2x 2 5 (3) 2x 1 +x 2 4 (4) x 1, x 2

More information

RELATIONAL DATABASE DESIGN

RELATIONAL DATABASE DESIGN RELATIONAL DATABASE DESIGN g Good database design - Avoiding anomalies g Functional Dependencies g Normalization & Decomposition Using Functional Dependencies g 1NF - Atomic Domains and First Normal Form

More information

SUBGROUPS OF CYCLIC GROUPS. 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by

SUBGROUPS OF CYCLIC GROUPS. 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by SUBGROUPS OF CYCLIC GROUPS KEITH CONRAD 1. Introduction In a group G, we denote the (cyclic) group of powers of some g G by g = {g k : k Z}. If G = g, then G itself is cyclic, with g as a generator. Examples

More information

INCIDENCE-BETWEENNESS GEOMETRY

INCIDENCE-BETWEENNESS GEOMETRY INCIDENCE-BETWEENNESS GEOMETRY MATH 410, CSUSM. SPRING 2008. PROFESSOR AITKEN This document covers the geometry that can be developed with just the axioms related to incidence and betweenness. The full

More information

MCQs~Databases~Relational Model and Normalization http://en.wikipedia.org/wiki/database_normalization

MCQs~Databases~Relational Model and Normalization http://en.wikipedia.org/wiki/database_normalization http://en.wikipedia.org/wiki/database_normalization Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy. Normalization usually involves

More information

C H A P T E R Regular Expressions regular expression

C H A P T E R Regular Expressions regular expression 7 CHAPTER Regular Expressions Most programmers and other power-users of computer systems have used tools that match text patterns. You may have used a Web search engine with a pattern like travel cancun

More information

Information, Entropy, and Coding

Information, Entropy, and Coding Chapter 8 Information, Entropy, and Coding 8. The Need for Data Compression To motivate the material in this chapter, we first consider various data sources and some estimates for the amount of data associated

More information

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model The entity-relationship (E-R) model is a a data model in which information stored

More information

Automata and Formal Languages

Automata and Formal Languages Automata and Formal Languages Winter 2009-2010 Yacov Hel-Or 1 What this course is all about This course is about mathematical models of computation We ll study different machine models (finite automata,

More information

Graham Kemp (telephone 772 54 11, room 6475 EDIT) The examiner will visit the exam room at 15:00 and 17:00.

Graham Kemp (telephone 772 54 11, room 6475 EDIT) The examiner will visit the exam room at 15:00 and 17:00. CHALMERS UNIVERSITY OF TECHNOLOGY Department of Computer Science and Engineering Examination in Databases, TDA357/DIT620 Tuesday 17 December 2013, 14:00-18:00 Examiner: Results: Exam review: Grades: Graham

More information

Module 5: Normalization of database tables

Module 5: Normalization of database tables Module 5: Normalization of database tables Normalization is a process for evaluating and correcting table structures to minimize data redundancies, thereby reducing the likelihood of data anomalies. The

More information

Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division

Answer Key. UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division Answer Key UNIVERSITY OF CALIFORNIA College of Engineering Department of EECS, Computer Science Division CS186 Fall 2003 Eben Haber Midterm Midterm Exam: Introduction to Database Systems This exam has

More information

You know from calculus that functions play a fundamental role in mathematics.

You know from calculus that functions play a fundamental role in mathematics. CHPTER 12 Functions You know from calculus that functions play a fundamental role in mathematics. You likely view a function as a kind of formula that describes a relationship between two (or more) quantities.

More information

WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT?

WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT? WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT? introduction Many students seem to have trouble with the notion of a mathematical proof. People that come to a course like Math 216, who certainly

More information

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24

LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24 LiTH, Tekniska högskolan vid Linköpings universitet 1(7) IDA, Institutionen för datavetenskap Juha Takkinen 2007-05-24 1. A database schema is a. the state of the db b. a description of the db using a

More information

1 Uncertainty and Preferences

1 Uncertainty and Preferences In this chapter, we present the theory of consumer preferences on risky outcomes. The theory is then applied to study the demand for insurance. Consider the following story. John wants to mail a package

More information

Normalization. CIS 331: Introduction to Database Systems

Normalization. CIS 331: Introduction to Database Systems Normalization CIS 331: Introduction to Database Systems Normalization: Reminder Why do we need to normalize? To avoid redundancy (less storage space needed, and data is consistent) To avoid update/delete

More information

TOPOLOGY: THE JOURNEY INTO SEPARATION AXIOMS

TOPOLOGY: THE JOURNEY INTO SEPARATION AXIOMS TOPOLOGY: THE JOURNEY INTO SEPARATION AXIOMS VIPUL NAIK Abstract. In this journey, we are going to explore the so called separation axioms in greater detail. We shall try to understand how these axioms

More information

Quiz 3: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354

Quiz 3: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354 Quiz 3: Database Systems I Instructor: Hassan Khosravi Spring 2012 CMPT 354 1. [10] Show that each of the following are not valid rules about FD s by giving a small example relations that satisfy the given

More information

Design Theory for Relational Databases: Functional Dependencies and Normalization

Design Theory for Relational Databases: Functional Dependencies and Normalization Design Theory for Relational Databases: Functional Dependencies and Normalization Juliana Freire Some slides adapted from L. Delcambre, R. Ramakrishnan, G. Lindstrom, J. Ullman and Silberschatz, Korth

More information

Theory I: Database Foundations

Theory I: Database Foundations Theory I: Database Foundations 19. 19. Theory I: Database Foundations 07.2012 1 Theory I: Database Foundations 20. Formal Design 20. 20: Formal Design We want to distinguish good from bad database design.

More information

Lecture 6. SQL, Logical DB Design

Lecture 6. SQL, Logical DB Design Lecture 6 SQL, Logical DB Design Relational Query Languages A major strength of the relational model: supports simple, powerful querying of data. Queries can be written intuitively, and the DBMS is responsible

More information

Boyce-Codd Normal Form

Boyce-Codd Normal Form 4NF Boyce-Codd Normal Form A relation schema R is in BCNF if for all functional dependencies in F + of the form α β at least one of the following holds α β is trivial (i.e., β α) α is a superkey for R

More information

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design Chapter 7: Relational Database Design Pitfalls in Relational Database Design Decomposition Normalization Using Functional Dependencies Normalization Using Multivalued Dependencies Normalization Using Join

More information

11 Ideals. 11.1 Revisiting Z

11 Ideals. 11.1 Revisiting Z 11 Ideals The presentation here is somewhat different than the text. In particular, the sections do not match up. We have seen issues with the failure of unique factorization already, e.g., Z[ 5] = O Q(

More information

Chapter 23. Database Security. Security Issues. Database Security

Chapter 23. Database Security. Security Issues. Database Security Chapter 23 Database Security Security Issues Legal and ethical issues Policy issues System-related issues The need to identify multiple security levels 2 Database Security A DBMS typically includes a database

More information

Notes on Complexity Theory Last updated: August, 2011. Lecture 1

Notes on Complexity Theory Last updated: August, 2011. Lecture 1 Notes on Complexity Theory Last updated: August, 2011 Jonathan Katz Lecture 1 1 Turing Machines I assume that most students have encountered Turing machines before. (Students who have not may want to look

More information

The process of database development. Logical model: relational DBMS. Relation

The process of database development. Logical model: relational DBMS. Relation The process of database development Reality (Universe of Discourse) Relational Databases and SQL Basic Concepts The 3rd normal form Structured Query Language (SQL) Conceptual model (e.g. Entity-Relationship

More information

Relational Normalization: Contents. Relational Database Design: Rationale. Relational Database Design. Motivation

Relational Normalization: Contents. Relational Database Design: Rationale. Relational Database Design. Motivation Relational Normalization: Contents Motivation Functional Dependencies First Normal Form Second Normal Form Third Normal Form Boyce-Codd Normal Form Decomposition Algorithms Multivalued Dependencies and

More information

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2.

a 11 x 1 + a 12 x 2 + + a 1n x n = b 1 a 21 x 1 + a 22 x 2 + + a 2n x n = b 2. Chapter 1 LINEAR EQUATIONS 1.1 Introduction to linear equations A linear equation in n unknowns x 1, x,, x n is an equation of the form a 1 x 1 + a x + + a n x n = b, where a 1, a,..., a n, b are given

More information

[Refer Slide Time: 05:10]

[Refer Slide Time: 05:10] Principles of Programming Languages Prof: S. Arun Kumar Department of Computer Science and Engineering Indian Institute of Technology Delhi Lecture no 7 Lecture Title: Syntactic Classes Welcome to lecture

More information

Lecture 2 Normalization

Lecture 2 Normalization MIT 533 ระบบฐานข อม ล 2 Lecture 2 Normalization Walailuk University Lecture 2: Normalization 1 Objectives The purpose of normalization The identification of various types of update anomalies The concept

More information

Introduction to Database Systems. Chapter 4 Normal Forms in the Relational Model. Chapter 4 Normal Forms

Introduction to Database Systems. Chapter 4 Normal Forms in the Relational Model. Chapter 4 Normal Forms Introduction to Database Systems Winter term 2013/2014 Melanie Herschel melanie.herschel@lri.fr Université Paris Sud, LRI 1 Chapter 4 Normal Forms in the Relational Model After completing this chapter,

More information