Multi-Valued Relationship Attributes in Extended Entity Relationship Model and Their Mapping to Relational Schema




Tauqeer Hussain*, Shafay Shamail, Mian M. Awais
Department of Computer Science, Lahore University of Management Sciences (LUMS), Lahore, Pakistan
* Corresponding author

Abstract

Conceptual modeling is one of the most important phases in designing database applications. The success of this design relies heavily on how clearly the real world requirements are represented in the conceptual model. To date, the Extended Entity Relationship (EER) model, extended from the traditional Entity Relationship (ER) model, is a widely used modeling technique during the conceptual modeling phase. This paper identifies semantic ambiguities that are still present in the EER model, leading to incorrect knowledge representation and eventually to incorrect design of the relational database schema. These ambiguities are identified in the case of many-to-many relationships which have their own attributes. This paper shows that mapping such relationships to a relational database schema generates relations whose primary keys cannot guarantee unique tuples for real world data, thus violating the definition of a primary key. In addition, it shows that these relations may not satisfy second normal form. A number of such cases are elaborated and a new concept of multi-valued relationship attribute is introduced that can successfully represent these real world constraints. For this concept, a diagrammatic notation for use in ER diagrams is introduced. A mapping algorithm to transform the corresponding EER model to a relational database schema is also defined. This concept of multi-valued relationship attribute and its mapping to relational schema generate relations which satisfy higher normal forms.

Keywords: Extended ER model, conceptual modeling, EER-to-Relational mapping, relationship attribute, normalization

1. Introduction

Designing a good database is one of the most important steps of the systems design phase, and it provides a strong foundation for the success of database applications. A database design methodology for relational databases is defined in three steps: 1) Conceptual modeling: the data requirements are conceptualized using a conceptual model representing the semantics of the real world, 2) Mapping: the conceptual model is transformed into a set of candidate relations, and 3) Normalization: the candidate relations are further refined to remove data redundancy and to achieve a higher degree of data integrity [Teorey86]. The most demanding and challenging step in this design methodology is conceptual modeling, whereas the later steps are merely transformations [Engels92].

For conceptual modeling, the entity-relationship model [Chen76] has been successfully used for traditional database applications because of its ease of understanding and its convenience in representation [Engels92]. However, the task of capturing the semantics of data is a never-ending one [Codd79]. The ER model lacked modeling constructs like specialization (or generalization) to represent more complex requirements, especially those needed in applications of newer database technology [Teorey86]. In the EER model, important enhancements to the traditional ER model were suggested in the literature [Elmasri80, Hammer81, Elmasri85, Teorey86, Gogolla91]. These enhancements introduced the concepts of subclass and superclass, class/subclass relationships, category (a representation of the union of different entity types), and the related concepts of generalization and specialization [Elmasri04]. Thalheim (2000) notes the shortcomings of the ER literature: the use of ER concepts often lacks a clear statement of the intended semantics, applies different semantics to the same concept, and mixes semantics of different constructs.

In this paper, various situations are presented to show that the existing EER model does not have a clear representation for relationship attributes. This results in relations which may not satisfy certain normal forms. A solution proposed in the literature is to introduce artificial constructs (weak entity types and more relationship types) [Thalheim00] which do not exist in the real world. This makes the schema less understandable [Thalheim00]. So, it is a problem not only of accurate knowledge representation but also of generating a normalized database schema. Analysis of this problem reveals that the definition of relationship attribute should be clarified in terms of the two concepts introduced in this paper: Single-Valued Relationship Attribute (SVRA) and Multi-valued Relationship Attribute (MVRA).

In the next section, we formally define the concepts of SVRA and MVRA. We also define situations where a single MVRA is required on a relationship type, where multiple MVRAs are required, and where SVRA and MVRA together represent a real world situation. We also introduce the diagrammatic notation that can be used in an ER diagram corresponding to these concepts. Finally, we define an algorithm which provides an EER-to-relational mapping and establish that the resulting relations satisfy those normal forms which are otherwise violated. The underlying assumption is that we are using an EER diagram for conceptual modeling and that the database schema required is for a relational database.

2. Multi-valued Relationship Attributes

This paper introduces the concept of a multi-valued relationship attribute, which is primarily different from nested attributes [Thalheim00], complex attributes [Elmasri04], and multi-valued attributes of entity types [Elmasri04]. This new concept solves conceptual differences as posed in the problems presented in the following sections. We now present various scenarios for a common real world example of a sales system.
In this system, CUSTOMER and PRODUCT are identified as entity types having appropriate attributes, and PURCHASES is identified as a relationship type between these two entity types.

2.1. Many-to-Many (M:N) Relationship Type Attributes

Scenario 1: Consider a situation where a customer may purchase a number of products and a product can be purchased by a number of customers. We are interested in keeping track of the date when a customer purchases a product.

This situation is represented in Fig. 1 using EER model notation. In order to keep track of the date when a customer purchases a product, Date is marked as an attribute on the relationship type. This model can be viewed in terms of a semantic net model as shown in Fig. 2, where for example the relationship instance r1 relating customer (id) 29 to the product (code) P-101 has the value 14-Jan-2003 for its Date attribute.

[Fig. 1: An M:N relationship type with a relationship attribute (Date on PURCHASES)]

[Fig. 2: Relationship instances with SVRA (customers 29, 35, 12; products P-101, A-345; relationship instances r1 to r4; r1 carries Date 14-Jan-2003)]

When this part of the EER model is mapped to a relational database schema following the mapping algorithm given by Elmasri and Navathe [Elmasri04], the following relations are created:

CUSTOMER (Cust, …)
PRODUCT (Prod, …)
PURCHASES (Cust f.k., Prod f.k., Date), primary key: (Cust, Prod)

In the original notation the primary key of a relation is underlined; it is listed explicitly here. A foreign key is represented by an attribute with the subscript f.k. For relations CUSTOMER and PRODUCT, Cust and Prod are the primary keys respectively, whereas, for the relation PURCHASES created for the M:N relationship type, the primary keys of the relations corresponding to the participating entity types, namely Cust and Prod, become foreign keys and together form the primary key of relation PURCHASES.

This model and its mapping (as per the algorithm given by Elmasri and Navathe [Elmasri04]) work fine as long as a customer purchases a product only on a single date, that is, only a single date is defined for each relationship instance of Fig. 2. In terms of functional dependency, this constraint can be written as:

$\text{Cust}, \text{Prod} \rightarrow \text{Date}$

But this is an unrealistic constraint for most real world situations, where a customer is not bound to purchase a product only once. This leads us to Scenario 2, where we discuss the situation of a relationship attribute having more than one value.

2.2. MVRA

Scenario 2: A customer purchases the same product on different dates and we want to keep track of all such purchases.

This scenario is represented in Fig. 3, where the relationship instance r1 has two values for the attribute Date, 14-Jan-2003 and 12-May-2003, against two purchases of the same product by the same customer. This requires a relationship attribute which can have more than one value (defined as a multi-valued relationship attribute in Definition 1) to be differentiated from one which may have at most one value for a relationship instance (defined as a single-valued relationship attribute in Definition 2).

Definition 1: A Multi-valued Relationship Attribute (MVRA) is a relationship attribute which may have more than one value for a relationship instance of the relationship set.

Definition 2: A Single-Valued Relationship Attribute (SVRA) is a relationship attribute which cannot have more than one value for a relationship instance of the relationship set.

A relationship attribute (single-valued or multi-valued) can now be defined mathematically as:

Definition 3: An attribute A of relationship type R whose value set is V is a function from R to the power set P(V) of V:

$A : R \rightarrow P(V)$

This definition covers single-valued and multi-valued relationship attributes, as well as nulls. A null value is represented by the empty set.
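Definition 3 can be illustrated with a small Python sketch in which a relationship attribute is a mapping from relationship instances to sets of values. This is only an illustration under stated assumptions: the names `date_attr` and `satisfies_svra` are hypothetical, the dates for r1 follow the Fig. 3 example, and the second instance is invented to show a null.

```python
from typing import Dict, FrozenSet, Tuple

# A relationship instance of PURCHASES, identified here by
# (customer id, product code). This encoding is an assumption.
RelationshipInstance = Tuple[int, str]
Attribute = Dict[RelationshipInstance, FrozenSet[str]]

# Definition 3: A maps every relationship instance r to a subset of V.
date_attr: Attribute = {
    # r1: the two purchase dates from the Fig. 3 example
    (29, "P-101"): frozenset({"14-Jan-2003", "12-May-2003"}),
    # hypothetical instance whose Date is null (the empty set)
    (35, "A-345"): frozenset(),
}


def satisfies_svra(attr: Attribute) -> bool:
    """Definition 2: no relationship instance has more than one value."""
    return all(len(values) <= 1 for values in attr.values())


# The data above needs an MVRA, because r1 carries two dates.
needs_mvra = not satisfies_svra(date_attr)
```

With the Scenario 1 data, where each instance has a single date, `satisfies_svra` would return `True` and the functional dependency Cust, Prod → Date would hold.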
For single-valued relationship attributes, A(r) is always a singleton for each relationship instance r of the set R, whereas there is no such restriction for an MVRA. Here A(r) refers to the value of attribute A for relationship instance r.

Based upon the concept and definitions given above, a new notation is proposed to represent the concept of MVRA in the EER diagram. The name of the attribute is written in set notation, i.e. in braces, within the oval (the symbol for an attribute), corresponding to the idea that this attribute may have a set of values. The Date attribute in Fig. 1 now changes to {Date} in Fig. 4 for the revised situation.

[Fig. 3: Relationship instances with MVRA (customers 29, 35, 12; products P-101, A-345; relationship instances r1 to r4; r1 carries {14-Jan-2003, 12-May-2003})]

[Fig. 4: An M:N relationship type with an MVRA ({Date} on PURCHASES)]

Consequently, the EER-to-relational mapping algorithm given by Elmasri and Navathe [Elmasri04] should also be modified to take care of this situation, because otherwise there will be multiple tuples in the relation PURCHASES with identical values of Cust and Prod, violating the primary key constraint for this relation. This modified algorithm for mapping an M:N relationship type is presented in Algorithm 1 below:

ALGORITHM 1: Mapping an M:N relationship type to a relational schema

For every binary M:N relationship type R between entity types $E_1$ and $E_2$, having a set of multi-valued relationship attributes {MVRA}:

1. Create a new relation S to represent R such that:
   $\mathrm{Attr}(S) = \{PK(E_1)\} \cup \{PK(E_2)\} \cup \mathrm{Attr}(R) \cup \{MVRA\}$
   where $PK(E_i)$ is the primary key of the relation created for entity type $E_i$, $\mathrm{Attr}(R)$ is the set of simple attributes (or simple components of composite attributes) of R, and {MVRA} is the set of multi-valued relationship attributes of R.
2. $\{PK(S)\} = \{PK(E_1)\} \cup \{PK(E_2)\} \cup \{MVRA\}$

It should be noted that $PK(E_1)$ and $PK(E_2)$ are foreign keys in relation S. Step 2 of this algorithm specifies that the MVRA should also be marked as part of the primary key of relation S along with $PK(E_1)$ and $PK(E_2)$. Applying this algorithm, we get the following relation in our schema:

PURCHASES (Cust f.k., Prod f.k., Date), primary key: (Cust, Prod, Date)

It is interesting to note that the given scenario could be modeled by introducing an artificial entity type PURCHASE and by adding its relationship types with CUSTOMER and PRODUCT; however, the introduction of these artificial constructs produces a complex conceptual model which is less explicable [Thalheim00]. Now, let us extend this scenario a bit further.

2.3. Multiple MVRAs

Scenario 3: Apart from the date, we also want to keep track of the quantity of a product purchased in every instance.

In this scenario, the relationship type PURCHASES will have two attributes, Date and Quantity. The question then arises: is each of these two attributes an MVRA? According to the definition of MVRA given above, the answer is yes, because each of these attributes may have multiple values for a single relationship instance present in the relationship set. This implies that, as per Algorithm 1, the relation PURCHASES will have both Date and Quantity as part of the primary key:

PURCHASES (Cust f.k., Prod f.k., Date, Quantity), primary key: (Cust, Prod, Date, Quantity)

2.4. MVRA and SVRA

Scenario 4: We have an additional constraint that a customer always purchases a particular product in the same quantity.

The attribute Quantity, in this case, is no longer an MVRA but an SVRA, because it always has only a single value for each relationship instance. The ER model for this situation is given in Fig. 5.

[Fig. 5: An M:N relationship type with an MVRA ({Date}) and a relationship attribute]

Applying Algorithm 1 to Fig. 5 for its transformation to a relational schema, a relation PURCHASES is created having attributes Cust and Prod as foreign keys, and the attributes Date and Quantity. Since Date is an MVRA, the primary key for this relation comprises Cust, Prod, and Date. This solution, however, violates second normal form (2NF) in this case, due to the existence of the following functional dependency, in which Quantity depends on only part of the primary key:

$\text{Cust}, \text{Prod} \rightarrow \text{Quantity}$

This requires further refinement of the mapping algorithm (Algorithm 1), which is given below as Algorithm 2:

ALGORITHM 2: Mapping an M:N relationship type generating a normalized relational database schema

For every binary M:N relationship type R between entity types $E_1$ and $E_2$:

1. Create a new relation S to represent R such that:
   $\mathrm{Attr}(S) = \{PK(E_1)\} \cup \{PK(E_2)\} \cup \{SVRA\}$
   where $PK(E_i)$ is the primary key of the relation created for entity type $E_i$, and {SVRA} is the set of all single-valued relationship attributes of R.
2. $\{PK(S)\} = \{PK(E_1)\} \cup \{PK(E_2)\}$
3. If there exists an MVRA of R, then:
   a. create a new relation T such that:
      $\mathrm{Attr}(T) = \{PK(E_1)\} \cup \{PK(E_2)\} \cup \{MVRA\}$
   b. $\{PK(T)\} = \{PK(E_1)\} \cup \{PK(E_2)\} \cup \{MVRA\}$
   where {MVRA} is the set of all multi-valued relationship attributes of R.

Applying this algorithm to the relationship type PURCHASES, we get the following relations:

PURCHASES1 (Cust, Prod, Quantity), primary key: (Cust, Prod)
PURCHASES2 (Cust, Prod, Date), primary key: (Cust, Prod, Date)

Each of these relations now satisfies 2NF and, if no other functional dependency violation occurs, 3NF and BCNF are also satisfied. It should be noted that the concept of multi-valued relationship attribute is not exclusive to M:N relationship types. It applies equally well to other relationship types.

3. Conclusion

In this paper we highlighted the deficiency of the EER model in semantic representation for many-to-many relationship types. This deficiency results in an unclear conceptual model and in a poor database design having violations of key constraints and of second normal form. In order to eliminate this deficiency, a new concept of multi-valued relationship attribute was formally defined. It was shown with various examples that the new concept of MVRA resolves the semantic and normalization problems. For this new concept, an ER diagram notation and a mapping algorithm for its transformation to relational schema were also devised. It was demonstrated that the relations created using this algorithm satisfy the normal forms which were otherwise violated. In our future research, we intend to prove formally that, if no other violations occur, the relations produced by the solution presented for the above stated problems satisfy normal forms up to 4NF.

4. References

[Chen76] Chen, P.P. (1976) The entity relationship model: towards a unified view of data. ACM Transactions on Database Systems, 1(1), pp. 9-36
[Codd79] Codd, E.F. (1979) Extending the database relational model to capture more meaning. ACM Transactions on Database Systems, 4(4), pp. 397-434
[Elmasri80] Elmasri, R., Wiederhold, G. (1980) Structural properties of relationships and their representation. NCC, AFIPS, 49
[Elmasri85] Elmasri, R., Weeldreyer, J., Hevner, A. (1985) The category concept: An extension to the entity-relationship model. International Journal on Data and Knowledge Engineering, 1(1)
[Elmasri04] Elmasri, R., Navathe, S. (2004) Fundamentals of database systems, 4th Ed. Pearson Education Inc.
[Engels92] Engels, G., Gogolla, M., Hohenstein, U., Hülsmann, K., Löhr-Richter, P., Saake, G., Ehrich, D. (1992) Conceptual modelling of database applications using an extended ER model. Data and Knowledge Engineering, 9(2), pp. 157-204
[Gogolla91] Gogolla, M., Hohenstein, U. (1991) Towards a semantic view of an extended entity-relationship model. ACM Transactions on Database Systems, 16(3), pp. 369-416
[Hammer81] Hammer, M., McLeod, D. (1981) Database description with SDM: a semantic data model. ACM Transactions on Database Systems, 6(3), pp. 369-416
[Teorey86] Teorey, T., Yang, D., Fry, J. (1986) A logical design methodology for relational databases using the extended entity relationship model. ACM Computing Surveys, 18(2), pp. 197-222
[Thalheim00] Thalheim, Bernhard (2000) Entity-Relationship Modeling: Foundations of Database Technology, Springer-Verlag
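
As an illustration of Algorithm 2 (Section 2.4), the following Python sketch derives the relations S and T from the primary keys of the participating entity types and the two sets of relationship attributes. It is a minimal sketch, not the authors' implementation: the `Relation` type, the function name `map_mn_relationship`, and the rule of suffixing the relation names with 1 and 2 (mirroring PURCHASES1 and PURCHASES2) are assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Relation(NamedTuple):
    """Hypothetical container for a relation schema."""
    name: str
    attributes: Tuple[str, ...]
    primary_key: Tuple[str, ...]


def map_mn_relationship(
    name: str,
    pk_e1: Sequence[str],
    pk_e2: Sequence[str],
    svras: Sequence[str],
    mvras: Sequence[str],
) -> List[Relation]:
    """Map a binary M:N relationship type following Algorithm 2."""
    # PK(E1) and PK(E2) become foreign keys in every generated relation.
    foreign_keys = tuple(pk_e1) + tuple(pk_e2)

    # Steps 1 and 2: S holds the foreign keys plus all SVRAs,
    # and its primary key is the foreign keys only.
    s_name = name + "1" if mvras else name  # naming rule is an assumption
    relations = [Relation(s_name, foreign_keys + tuple(svras), foreign_keys)]

    # Step 3: if R has MVRAs, T holds the foreign keys plus all MVRAs,
    # and every attribute of T is part of its primary key.
    if mvras:
        t_attributes = foreign_keys + tuple(mvras)
        relations.append(Relation(name + "2", t_attributes, t_attributes))
    return relations


# Scenario 4: Quantity is an SVRA and Date is an MVRA.
purchases = map_mn_relationship(
    "PURCHASES", ["Cust"], ["Prod"], svras=["Quantity"], mvras=["Date"]
)
```

For Scenario 4 this yields one relation keyed on (Cust, Prod) that carries Quantity and a second relation keyed on (Cust, Prod, Date), so Quantity no longer depends on only part of a key.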