Mapping UML Class Diagrams into Object-Relational Schemas



Similar documents
OBJECT ORIENTED EXTENSIONS TO SQL

Aggregation and Composition in Object - Relational Database Design

Schema Evolution in SQL-99 and Commercial (Object-)Relational DBMS

Chapter 9: Object-Based Databases

Complex Data and Object-Oriented. Databases

Generating Enterprise Applications from Models

OBJECTS AND DATABASES. CS121: Introduction to Relational Database Systems Fall 2015 Lecture 21

Chapter 7 Data Modeling Using the Entity- Relationship (ER) Model

A Tool for Generating Relational Database Schema from EER Diagram

IT2305 Database Systems I (Compulsory)

Data Modeling Basics

Tool Support for Software Variability Management and Product Derivation in Software Product Lines

2 George Feuerlicht and Jaroslav Pokorný and Karel Richta

IT2304: Database Systems 1 (DBS 1)

Part VI. Object-relational Data Models

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

How To Design Software

CSC 742 Database Management Systems

Lecture 12: Entity Relationship Modelling

Implementing XML Schema inside a Relational Database

Database Modelling in UML

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

Chapter 8 The Enhanced Entity- Relationship (EER) Model

Fundamentals of Database Design

IV. The (Extended) Entity-Relationship Model

14 Databases. Source: Foundations of Computer Science Cengage Learning. Objectives After studying this chapter, the student should be able to:

USING UML FOR OBJECT-RELATIONAL DATABASE SYSTEMS DEVELOPMENT: A FRAMEWORK

The Entity-Relationship Model

Overview RDBMS-ORDBMS- OODBMS

Handling Spatial Objects in a GIS Database -Relational v Object Oriented Approaches

Chapter 2: Entity-Relationship Model. Entity Sets. " Example: specific person, company, event, plant

Java (12 Weeks) Introduction to Java Programming Language


How To Create A Table In Sql (Ahem)

Database Design Methodology

SQL:2003 Has Been Published

Building Views and Charts in Requests Introduction to Answers views and charts Creating and editing charts Performing common view tasks

IRA 423/08. Designing the SRT control software: Notes to the UML schemes. Andrea Orlati 1 Simona Righini 2

Object Oriented Databases. OOAD Fall 2012 Arjun Gopalakrishna Bhavya Udayashankar

Managing a Class Hierarchy with SQL in Matisse

2. Conceptual Modeling using the Entity-Relationship Model

How To Improve Performance In A Database

From Object Oriented Conceptual Modeling to Automated Programming in Java

ER modelling, Weak Entities, Class Hierarchies, Aggregation

Once the schema has been designed, it can be implemented in the RDBMS.

A basic create statement for a simple student table would look like the following.

Tutorial on Relational Database Design

Demystified CONTENTS Acknowledgments xvii Introduction xix CHAPTER 1 Database Fundamentals CHAPTER 2 Exploring Relational Database Components

Data Modeling. Database Systems: The Complete Book Ch ,

Handling Unstructured Data Type in DB2 and Oracle

A brief overview of developing a conceptual data model as the first step in creating a relational database.

Designing a Family Tree Metamodel Case Study

Unified XML/relational storage March The IBM approach to unified XML/relational databases

Files. Files. Files. Files. Files. File Organisation. What s it all about? What s in a file?

Object Relational Database Mapping. Alex Boughton Spring 2011


USING OBJECT-RELATIONAL DATABASE TECHNOLOGY TO SOLVE PROBLEMS IN DATABASE DEVELOPMENT

Modern Systems Analysis and Design

Object-Based Databases

Contents RELATIONAL DATABASES

A Migration Methodology of Transferring Database Structures and Data

Database Design and Database Programming with SQL - 5 Day In Class Event Day 1 Activity Start Time Length

A Rational Software Whitepaper

Chapter 2: Entity-Relationship Model. E-R R Diagrams

Information Systems Modelling Information Systems I: Databases

Instant SQL Programming

Lesson 8: Introduction to Databases E-R Data Modeling

The Relational Model. Ramakrishnan&Gehrke, Chapter 3 CS4320 1

Requirement Analysis & Conceptual Database Design. Problem analysis Entity Relationship notation Integrity constraints Generalization

XV. The Entity-Relationship Model

A terminology model approach for defining and managing statistical metadata

virtual class local mappings semantically equivalent local classes ... Schema Integration

Conceptual Design Using the Entity-Relationship (ER) Model

The Relational Model. Why Study the Relational Model? Relational Database: Definitions. Chapter 3

The Sleepy Cat Database and Semantic Data Models in Java

WebSphere Business Monitor

Chapter 2. Data Model. Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel

Database Design Overview. Conceptual Design ER Model. Entities and Entity Sets. Entity Set Representation. Keys

Duration Vendor Audience 5 Days Oracle End Users, Developers, Technical Consultants and Support Staff

Oracle8i Spatial: Experiences with Extensible Databases

6. SQL/XML. 6.1 Introduction. 6.1 Introduction. 6.1 Introduction. 6.1 Introduction. XML Databases 6. SQL/XML. Creating XML documents from a database

Lecture Notes INFORMATION RESOURCES

4 Logical Design : RDM Schema Definition with SQL / DDL

Part 7: Object Oriented Databases

Microsoft s new database modeling tool: Part 1

A Framework for Data Migration between Various Types of Relational Database Management Systems

CS352 Lecture - Object-Based Databases

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

Java Application Developer Certificate Program Competencies

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications

Relational model. Relational model - practice. Relational Database Definitions 9/27/11. Relational model. Relational Database: Terminology

Bridge from Entity Relationship modeling to creating SQL databases, tables, & relations

Database System Concepts

PowerDesigner WarehouseArchitect The Model for Data Warehousing Solutions. A Technical Whitepaper from Sybase, Inc.

RNDr. Michal Kopecký, Ph.D. Department of Software Engineering, Faculty of Mathematics and Physics, Charles University in Prague

Keywords: Regression testing, database applications, and impact analysis. Abstract. 1 Introduction

Databases and BigData

The Benefits of Data Modeling in Business Intelligence

SQL Server An Overview

Transcription:

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 65 Mapping UML Class Diagrams into Object-Relational Schemas María Fernanda Golobisky and Aldo Vecchietti Departamento de Sistemas Universidad Tecnológica Nacional. Facultad Regional Santa Fe, Argentina (CIDISI) Centro de Investigación y Desarrollo en Ingeniería en Sistemas de Información INGAR Instituto de Desarrollo y Diseño e-mails: {mfgolo, aldovec}@ceride.gov.ar Abstract. Recently, the SQL:2003 standard has been completed by introducing corrections and new capabilities to the SQL:1999. With databases complying with the two standards more complex applications can be developed. For database design, in the literature it is possible to find articles about the mapping steps from an entity-relationship model and from UML class diagrams to relational schemas, with less frequency can be found also some efforts for transforming UML class diagrams into object-relational schemas, but they are focused in a portion of the whole problem. In this article we present several mappings functions from UML class diagrams into objects of the objectrelational schema approach related to the SQL:2003 standard, and from these objects into tables where the objects persist. The main objective of this paper is to provide a formalization of the steps involved in the design of objectrelational schemas by starting with an UML class diagram. 1. Introduction In the design and implementation of information systems, in the database design cycle, an important topic to address is the mapping of conceptual database designs to specific database implementation models. For example, for relational database systems it is very common to generate a relational schema with primary, foreign keys and other constraints by mapping an Extended-Entity Relationship (EER) conceptual model. Given the advances of the object-based technologies not only for modeling applications but also in database technologies, an important research activity has emerged to cover the use of object-based models for the design of database systems. Object-Relational Database Management Systems (ORDBMS), based on the SQL:1999 [11] and SQL:2003 standards [12], are a recent technology including object concepts into relational systems, they offer a much better representation of the real world than the tabular form used in Relational Database Management Systems (RDBMS) based on the previous standards (SQL-92, SQL-89). ORDBMS are supposed to be the new wave for developing applications. For representing data from the real world into objects, class diagram of the Unified Modeling Language (UML) has become a standard of the Information Technology (IT) industry. It is the intention of this paper to formalize a transformation mechanism from UML class

66 M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 diagrams into objects of an ORDBMS. Many papers can be found in the open literature related to the transformation from objects models into relational tables (SQL 92). Recent examples of these transformations can be read in [1], [2], [3] and [9]. Articles about the transformations from objects to object-relational databases are not as frequent. Mok and Paper [7] have presented the transformation of UML models into ORDBMS. Their work was concerned about static and dynamic aspects. In the static aspects they have concentrated the efforts on transforming UML class diagrams into normalized nested tables. They have presented an algorithm to cover this objective. No formalized procedures for more general mappings are presented in this work. Marcos et al. [6] have listed some guidelines about the transformation of UML class diagrams into objects of the SQL:1999 standard and then to Oracle 8i. A short explanation about those transformations is also given, no further analysis is made and this approach can be considered as guidelines for the design of conceptual schema of object-relational databases. Liu et al. [5] have proposed an approach for implementing ORDBMS s by using a heterogeneous database system architecture. Although at that time the SQL:1999 standard was not completed yet, their work is valuable in the sense that they have made some definitions in order to formalize the design of the ORDBMS. Golobisky et al. [4] have presented a transformation analysis about generalizationspecialization relationships modeled by UML class diagram. They have shown several transformations into an ORDBMS. In the analysis they have taken into account several important characteristics for applications linked to an ORDBMS like: implementation flexibility, access mechanisms (index structures) and constraints, navigability thorough the structures. The authors have implemented some examples by using Oracle 9i ORDBMS. The goal of this work is to formalize the steps and paths involved in the transformation from UML class diagrams into object-relational schemas based on the SQL:2003 standard. We considered the SQL:2003 standard because it added to SQL:1999 [11] the capability of defining multisets which will be used in the transformation proposed. In the creation of the conceptual schema it will be considered mappings into object-relational schema objects (Object Relational Layer) and into object-relational tables (Object Relational Persistent Layer). 2. Mapping Layers There are three layers involved in the transformation from UML class diagrams into persistent objects. The first one corresponds to the UML classes and relationships, the second is the object-relational layer composed of the objects proposed by the SQL:2003 standard designed to implement classes and relationships, and the third, the object-relational persistent layer composed of typed tables with keys, constraints, triggers (relational approach) and other elements added by the new SQL standards (1999/2003) like object identifiers (OID. The layers involved in the transformations and the elements composing them are shown in Fig. 1.

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 67 UML Class Diagrams Class State (attributes) Operations Relationships Objects to relational tables mapping (1) Object-Relational Persistent Layer Tables Keys Constraints OID Objects to object-relational components mapping (2) Object-Relational Layer Row Type Collection Type (Array) Reference Type (Ref) User Defined Type (UDT) Object-relational components to relational tables mapping (3) Figure 1. Layers involved in mapping objects into Object-Relational Database Systems One important characteristic of the ORDBMS is the capability of having pure relational transformations (1) complying with SQL 92 standard; or object-relational (OR) transformations [(2) and (3)], according to the new standards. Transformations (1) are well discussed in the literature and it is not the objective of this article. The goal of this paper is to propose a formalization of the mapping procedure from UML class diagrams into objects of the OR layer based on the SQL:2003 (2) and from OR layer to object-relational persistent tables (3). In this formalization process we start with the definition of the components of each layer, in terms of their characteristics and properties, and then we define mapping functions between those components, such that we can specify the paths from the objects of the first layer to the second and from the second to the third. 3. The UML Class Diagrams Layer A class diagram modeling the purchase order administration in a business company is show in Fig. 2. This model will be used in the examples presented along the paper. Although the UML class diagrams have many elements to model, we have selected the more commonly used for a database schema design, which are: Classes (C) Attributes: single (SA), composed(ca), multivalued(ma) Operations (O) Relationships (R) between classes: aggregation (AG), composition (CM), n- ary association (NAS), binary association (BAS), association class (AC), generalization-specialization (GS) The definitions of the elements of the UML class diagrams with the purpose of transforming it to an OR schema are presented below.

68 M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 Purchase Order order number shipping date tocity tostreet tozip 1 0..* 1 Customer customer number street city zip code phone customer name 1..15 1 Customer_Association identification description customer number percentage 1..20 OrderLineItem line number product quantity Person personid discount Company type taxes 1..* 1 Stock quantity date Products product number description price 0..* 1..15 Store location capacity street city zip code Figure 2. Class Diagram for Purchase Order Application Definition 1. A Class can be defined as C=(identity, state, behavior, relationship). Considering the perspective of mapping it into an object-relational schema, it is more convenient to decompose state and behavior in its individual components, C=(Identity, C, SA, CA, MA, O, R), where: identity is the name assigned to the class; C is a finite set of classes due to a class can be composed of other classes; SA is a finite set of single attributes; CA is a finite set of composed attributes; MA is a finite set of multivalued attributes; O is a finite set of operations and R is a finite set of relationships where the class is participating. For example the Purchase Order Class of the diagram presented can be represented as: C=( Purchase Order, order number, shipping date, tocity, toaddress, tozip, CM, BAS) where CM corresponds to a composition relationship and BAS a Binary Association relationship where Purchase Order participates. Definition 2. An Aggregation Relationship (AG) is a binary association specifying a whole-part relationship, it can be defined as AG=(W, P, MW, MP), where: W is the whole composed of 0..n objects of the part class P; P is the part whose objects compose W; MW is the multiplicity of the whole class, it can takes any value from 0 to n; MP is the multiplicity of the part class, it can also takes any value from 0 to n.

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 69 In the aggregation a part may belong to more than one aggregate, and it may exist independently of the aggregate. This characteristic is very important to distinguish aggregation from composition, which is a stronger whole-part relationship. Another characteristic of this relationship is that there are no cycles, meaning that an object can not be direct or indirectly part of itself [10]. For example the aggregation between Customer and Customer Association can be defined as: AG=( Customer Association, Customer, 1, 1..15) Definition 3. A Composition Relationship (CM) is an association that specifies a whole-part relationship, but stronger than aggregation because the lifetime of the part depends on the lifetime of the whole. Besides, the part must belong to only one composite [10]. CM can be defined as CM=(W, P, 1, MP), where: W is the whole class which is composed of 0..n objects of the part ; P is the part class whose objects compose the whole W; 1 is the multiplicity of the whole and MP is the cardinality of the part class. It can take any value from 0 to n. For example by completing the definition made of Purchase Order : C=( Purchase Order, order number, shipping date, tocity, toaddres, tozip, CM=( Purchase Order, OrderLineItem, 1,1..20), BAS) Definition 4. A N-ary Association (NAS) is an association among three or more classes. A binary association (which is the most common association) is a special case with its own simpler notation and certain additional properties [10], which will be defined later. A n-ary association (NAS) can be defined NAS=(C1, C2,, Cn), where C1, C2,,Cn are the name of the classes related by the association. The multiplicity for an n-ary association is not a very obvious property and in general it is assumed to be many [10]. Definition 5. A Binary Association (BAS) is an association with exactly two association ends. This association is particularly useful for specifying navigation paths from object to object [10]. Binary association can be defined as BAS=(C1, C2, MC1, MC2), where: class C1 is related by means of the association to C2; class C2 is related by means of the association to C1; MC1 is the multiplicity (cardinality) of C1 in the association, it can take any value from 0 to n; MC2 is the multiplicity (cardinality) of C2 in the association, it can take any value from 0 to n. Completing the definition of Purchase Order : C=( Purchase Order, order number, shipping date, tocity, toaddres, tozip, CM=( Purchase Order, OrderLineItem, 1,1..20), BAS=( Purchase Order, Customer,0..*, 1)) Definition 6. An Association Class (CA) is an association that is also a class. It connects two or more classes and possesses its own attributes and operations. The attributes corresponds to the relationship not to the classes referenced by the association class. The particularity is that the association class has references to the classes for each association end [10]. CA can be defined as CA=(identity, Ref(C1), Ref(C2),..., Ref(Cn), SA, CA, MA, O), where: identity is the name assigned to the association class; Ref(C1), Ref(C2),, Ref(Cn) are references to the corresponding classes related participating of the association class; SA is a finite set of single

70 M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 attributes corresponding to the relationship itself; CA is a finite set of composed attributes corresponding to the relationship itself; MA is a finite set of multivalued attributes corresponding to the relationship itself; O is a finite set of operations for the association class. In our example Stock is an association class, which can be defined as: C=( Stock, Ref(Products), Ref(Store), quantity, date) Definition 7. A Generalization-Specialization Relationship (GS) is a relationship between the top class of a hierarchy, called the superclass, and the lower level classes in the hierarchy, called the subclasses. The subclasses have the properties of the parent but also have additional properties peculiar to the child [10]. In this sense, the definition for a GS can be specified as GS=(SPC, SBC 1, SBC 2,,SBC n ), where SPC corresponds to the definition of the superclass, and SBC 1, SBC 2,,SBC n are the subclasses in the immediate lower level of the hierarchy. In Fig. 2 we have this relationship type among Customer, Person and Company. GS=(Customer, Person, Company) 4. The Object-Relational (SQL:1999) Layer Components of this layer are mainly defined in the SQL:2003 standard [12]. Only the most relevant parts of the document for this paper are considered, which are: row type, collection types (arrays and multisets), reference type and user-defined type. Complete the type definition some new built-in single data types such as binary large objects (BLOB), character large objects (CLOB), boolean type, etc. Definition 8. A Row Type is defined as a set of pairs RT=(F 1 :T 1, F 2 :T 2,,F n T n ) where F i is the name of a field in the row type and T i is a built-in data type such as integer, real, character, LOB, etc. Definition 9. The Array Type (AT) is an ordered collection of elements of any type admissible, except array type, with a maximal length. AT can be defined as AT=(ET, ML), where: ET is the element type of the array, which can be: a built-in data type, a reference type or a structured type (row type, UDT). ML is the maximum length of the array type. Definition 10. The Multiset Type (MT) is an unordered collection of elements of any type admissible, without syntax for a maximal length. MT can be defined as MT=(ET), where: ET is the element type of the array, which can be: a built-in data type, a reference type or a structured type (row type, UDT). Definition 11. A Reference Type called Ref(T) is also a data type which contains the reference values (OID) of T. T is a row in a typed table (rows in a table containing user defined type rows).

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 71 Definition 12. A User-Defined Type (UDT) is a structured type having state, behavior and relationship to other types. In this sense, an UDT in SQL:2003 is the equivalent concept to a class in UML. The UDT can be defined as UDT=(identity, BIT, UDT, RT, AT, MT, Ref, MM, Q), where: identity is the name assigned to the UDT; BIT is a finite set of built-in data type; UDT is a finite set of UDT due to an UDT can be defined in terms of other UDTs; RT is a finite set of row type; AT is a finite set of array type; MT is a finite set of multiset type; MM is a finite set of member methods; Ref is a finite set of reference type; Q is a finite set of qualifiers like FINAL/NOTFINAL, INSTANTIABLE/NONINSTANTIABLE, UNDER, etc., which give some other important characteristics to the UDT. For example the qualifier UNDER allows the definition of an UDT s hierarchy. 5. The Object-Relational Persistent Layer This layer is related to the logical containers of the data, which in the object-relational approach will be in some sort of a two dimensional table like the relational system. Unlike the relational system, in the OR databases the columns are not restricted to built-in data types. In the object-relational approach to store data it is needed to create a table by using the corresponding SQL CREATE TABLE command like the relational system does. It must be remarked that the persistent layer presented at this point is a logical representation of the data instead of a physical one, because we are not including any physical aspect about how the rows and columns of the tables will be stored in the storage media. The data repositories of an object-relational database can be defined as follows: Definition 13. A Typed Table T can be defined as T=(identity, UDT, SC), where: identity is the name assigned to the table, UDT is the corresponding type from which the table is generated, SC is a finite set of constraints definitions like primary, unique and/or foreign key, not null, check constraints. The definition of primary and unique key must be considered at this point because they define access paths to the objects from which the query optimizer can take advantage for performing a query. Until the Typed Table is not created any physical support for object persistency is generated and the objects (UDTs occurrences) can not be created and stored. 6. Mapping Functions To complete the transformations it is necessary to define several functions to translate the components from one layer to the other. Definition 14. A mapping function f from A to B is a binary relation between A and B. A function f:a B can be regarded as mapping elements from A into elements of B. Each element in A maps to exactly one element in B.

72 M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 6.1. Mapping Functions from UML to OR Layer In this section mapping functions from the UML class diagrams layer to the objectrelational layer will be presented. In the description of these functions will frequently appears ellipsis signs ( ). The ellipsis replaces some definitions that are not relevant for the matter described. 6.1.1. Transforming Classes A class in the UML layer can be transformed to an UDT in the OR layer by defining a mapping function f 1 :C UDT, which implicitly defines the following transformations between the individual components of the Class and the UDT: Class UDT Single attribute Built-in type Composed attribute Row type Multivalued attribute Array type, multiset type Operation Member method The relationships can not be transformed in the same way than the other elements of the class because several alternatives mapping functions can be applied according to the characteristics of the relationships and the classes involved on it. Mapping functions for the relationships are proposed in the next sections. Only the most evident mapping functions are presented according to the characteristic of the relationship. Other transformations can be formulated but they are beyond of the scope of this paper. 6.1.2. Transforming Aggregation Relationships For mapping functions definition for aggregation, and considering that in this relationship the participating classes have independent existence, then the more appropriate representation is by means of object references. For an aggregation since it is a binary relationship, two mapping functions are needed, one for the whole and one for the part. The exact mapping function for an aggregation depends on the multiplicity of the classes participating on the relationship, as follows: For a multiplicity of 1, the mapping function for classes with that multiplicity is defined as follows: f2:ag=(,1,...) UDT=(,Ref(UDT( )), ) which means that for the aggregation where the classes participate with a multiplicity of 1, corresponds to include a reference on the corresponding UDT to represent the relationship. For example, if the whole has a fixed structure (multiplicity 1), then the mapping function is: f2:ag=(w,,1, ) UDT=(P,...,Ref(W), ) and in the Object Relational Layer, in the UDT corresponding to the part (identified by P) we must include a reference to the whole UDT (Ref(W)). When the multiplicity of the classes involved in the aggregation is well known having a maximal number say n, then the aggregation can be mapped to an array of references: f3:ag=(...,n, ) UDT=(,AT(Ref(UDT(...)),n),...)

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 73 For example, if the part participates in the aggregation with a known multiplicity of n, then the mapping function will be: f3:ag=(,p,,n) UDT=(W,...,AT(Ref(P),n),...) where the UDT defined for the whole class (W) must include an array type (AT) with references to objects of the part class (P) with a maximal dimension of n. When the multiplicity is not know and can have very different values between their extremes (some objects with a low multiplicity and some other with a very high multiplicity), then a more appropriate transformation for the case is to define a multiset of references to the other UDT in the relationship, as follows: f4:ag=(,*,.) UDT=(.,UDT=(,MT(Ref(UDT),...) For example, if the part participates in the aggregation with an unknown multiplicity of *, then the mapping function will be: f4:ag=(,p,,*) UDT=(W,...,MT(Ref(P)),...) where the UDT defined for the whole class (W) must include a multiset of references to objects of the part class (P). Note that by defining a multiset no maximal dimension is required. 6.1.3. Transforming Composition relationships As was pointed out, composition is a much stronger relationship than aggregation, because the life of the part depends of the whole, then instead of using references, for this transformation it is convenient to include the objects of the part into the whole object and, again, the exact mapping function for a composition depends on the multiplicity of the part participating on the relationship, as follows: For a multiplicity of 1 for the part, the mapping function for the composition is defined as follows: f5:cm=(w,p,1,1) UDT=(W,,UDT(P, ), ) meaning that for UDT defined for the part class (P) with a multiplicity of 1 corresponds to include it into the whole UDT (W). When the multiplicity of the part involved in the composition is well known with a maximal number say n, then the composition can be mapped to an array of UDT of the part (P) included into the UDT of the whole, as follows: f6: CM=(W,P,1,n) UDT=(W,...,AT(UDT(P, ),n), ) When the multiplicity is not know, then a more appropriate transformation for this case is to include a multiset of the UDT for the part into the UDT of whole, as follows: f7: CM=(W,P,1,*) UDT=(W,...,MT(UDT(P, )), ) Again, in this way no maximal dimension is needed and the whole can have as many parts as needed. 6.1.4. Transforming Binary Associations The mapping function for binary associations is similar to aggregation. The main difference between them is that aggregation considers only whole-part relationships, while association is a more general association type. One thing that is particular for association is the navigability. If navigability is given in both directions, it is needed

74 M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 the transformation of both end of the association. If navigability is given in one direction only, then only one mapping is needed in the class which is the entry point of the traversal path. Besides navigability, the exact mapping function for a binary association depends on the multiplicity of the classes participating on the relationship, as follows: Suppose C1 participates in the binary association with a multiplicity of 1, then mapping function for C2 is defined as follows: f8:bas=(c1,c2,1, ) UDT=(C2,,Ref(C1), ) Now, if the multiplicity of the class C1 involved in the association is well known having a maximal number say n, then the aggregation can be mapped to an array of references with maximal size n to C1 in the UDT defined for C2: f9:bas=(c1,c2,n, ) UDT=(C2,,AT(Ref(C1),n),...) When the multiplicity of C1 is not exactly known and can take very different values between their extremes (some objects with a low multiplicity and some other with a very high multiplicity), then a more appropriate transformation for the case is to define a multiset of references to C1 on the UDT defined for C2, as follows: f10:bas=(c1,c2,*, ) UDT=(C2,MT(Ref(C1)), ) 6.1.5. Transforming N-ary Associations N-ary Associations involves three or more classes participating into the relationship with a very wide multiplicity range. Since it is needed to have together all the objects participating into the relationship, the mapping function for this transformation is to create a new class, probably having a name equal to the rolename of the association, with references to the classes s objects participating into it. Having this concept in mind, the mapping function can be defined as: f11:nas=(c1,c2,, Cn) UDT=( name, Ref(C1), Ref(C2),, Ref(Cn)) 6.1.6. Transforming Association Class Since an association class is both an association and a class, the mapping function is like the one defined for a class plus the references to the classes to which the association class is linked. The mapping function is as follows: f12: AC=(identity, Ref(C1), Ref(C2)., Ref(Cn), SA, CA, MA, O) UDT=( name, Ref(C1), Ref(C2),, Ref(Cn),BIT, RT, AT, UDT, MM) where the UDT defined for the association class contains the references to the classes related by the association plus the built-in (BIT) data types, row types, array types, UDTs, and member methods of the association class. Mapping functions f8-f11 must be corrected when they involve an association class. In the relationship with an association class each UDT must have references to the association class, instead of having a reference to the class to which is related. By this way, classes participating in the relationship will be linked indirectly, thorough the association class. For example, mapping function f9 must be corrected to f9 : f9 :BAS=(C1,C2,n,, AC) UDT=(C2,,AT(Ref(AC),n),...) where AC means that existence of an association class because of the relationship and, hence, instead of referencing C1 directly, the association class is taking its place.

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 75 6.1.7. Transforming Generalization-Specialization Relationships The mapping function consists on the creation of one UDT for each type in the class hierarchy by defining on it all attributes and methods of each class. In this sense, the number of UDT generated by the mapping function will be the number of classes in the hierarchy. f13:gs=(spc,sbc 1, SBC 2,,,SBC n ) UDT=( SPC, BIT, UDT, RT, AT, Ref, MM, Q), UDT=( SBC1, BIT, UDT, RT, AT, MT, Ref, MM, Q), UDT=( SBC2, BIT, UDT, RT, AT, MT, Ref, MM, Q),, UDT=( SBCn, BIT, UDT, RT, AT, MT, Ref, MM, Q) where for each UDT on the hierarchy, BIT, UDT, RT, AT, MT, Ref, MM, Q corresponds to the built-in data types, user-defined types, row types, array types, multiset types, references types, member methods and qualifiers respectively, representing their state and behavior. The qualifiers needed for this transformation are, in general, UNDER to define which subclass is of the superclass type, if they are INSTANTIABLE or not (abstract class, no objects can be created from it), if they are FINAL (can not have subclasses depending on it) or not. 6.2. Mapping Functions from Object-Relational Layer to Object-Relational Persistent Layer This mapping function can be described in the same terms than transforming a class of the UML layer to an UDT in the OR layer. In this case, each UDT in the OR layer will be mapped into a table in the OR persistent layer by defining a mapping function f 14 :UDT T, which implicitly defines the transformation between the individual components of the UDT and the table: identity, built-in type, UDT type, row type, array type, multiset type, ref type and member method. Complete this transformation the set of constraints (SC) which is particular of the table definition and must be defined on the Object-Relational Persistent Layer. Constraints do not have an equivalent on the OR Layer. 7. Mapping the Example from UML to Object-Relational Layer In this section, the mapping functions from the UML layer to the OR layer, corresponding to the example of Fig. 2, are given. The underlying parts of the mapping functions correspond to the transformation of relationships. The names enclosed by quotation marks are the identities assigned. Numbers correspond to multiplicity. The names without quotations marks correspond to names given to builtin attributes. C=( Purchase Order, order number, shipping date, tocity, toaddres, tozip, CM=( Purchase Order, OrderLineItem, 1,1..20), BAS=( Purchase Order, Customer,0..*, 1)) UDT=( Purchase Order, order number, shipping date, tocity, toaddress, tozip, AT(UDT=( OrderLineItem, ),20), Ref(Customer));

76 M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 GS=(SPC=( Customer, customer number, customer name, street, city, zipcode, phone, BAS=( Purchase Order, Customer,0..*, 1), AG=( Customer Association, Customer, 1, 1..15)), SBC 1 =( Person, personid, discount), SBC 2 =( Company, type,taxes)) UDT=( Customer, customer number, customer name, street, city, zipcode, phone, MT(Ref(Purchase Orders), Ref(Customer Association))), UDT=( Person, personid, discount, UNDER Customer ), UDT=( Company, type, taxes, UNDER Customer )); C=( Customer Association, identification, description, AG=( Customer Association, Customer, 1, 1..15)) UDT=( Customer Association, identification, description, AT(Ref(Customer,15))); C=( OrderLineItem, line number, product, quantity, BAS( OrderLineItem, Products, 0..*, 1)) UDT=( OrderLineItem, line number, product, quantity, Ref(Products)); C=( Products, product number, description, price, BAS( Products, Store, 0..*, 0..*)) UDT=( Products, product number, description, price, AT(Ref(Stock))); C=( Store, location, capacity, street, city, zipcode, BAS( Products, Store, 0..*, 0..*)) UDT=( Store, location, capacity, street, city, zipcode, MT(Ref(Stock))); C=( Stock, Ref(Products), Ref(Store), quantity, date) UDT=( Stock, Ref(Products), Ref(Store), quantity, date) 8. Mapping the Example from UML to Object-Relational Layer in ORACLE 10g In this section is described the schema definition of the purchase order example using the commercial Database Oracle 10g. Note that Oracle 10g [8] differs in some issues to the SQL:2003 standard as follows: a. The UDT definition is made by defining an OBJECT type. b. It is not possible to directly define an array in an object, it must be previously defined as a type. c. The implementation of a multiset is performed by defining nested tables. This is done by creating a TABLE type, and then including this type in the object definition (UDT). CREATE or REPLACE TYPE orderlineitem_ob as OBJECT ( line_number NUMBER(2), product NUMBER(5), quantity NUMBER(3)); CREATE or REPLACE TYPE orderlineitem_va as VARRAY(20) of orderlineitem_ob;

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 77 CREATE or REPLACE TYPE purchaseorder_ob as OBJECT ( order_number NUMBER(6),shipdate DATE, tocity VARCHAR2(25), tostreet VARCHAR2(30), tozip VARCHAR2(8), orderlines orderlineitem_va); CREATE or REPLACE TYPE customerassociation_ob as OBJECT ( id VARCHAR2(6), description VARCHAR2(60)); CREATE or REPLACE TYPE refpurchaseorder_ob as OBJECT ( reforder REF purchaseorder_ob); CREATE or REPLACE TYPE refpurchaseorder_tab as TABLE OF refpurchaseorder_ob; CREATE or REPLACE TYPE customer_ob as OBJECT ( customer_number NUMBER(5), customer_name VARCHAR2(50), street VARCHAR2(30), city VARCHAR2(30), zip_code VARCHAR2(8), phone NUMBER(15), reforders refpurchaseorder_tab, refcustassoc REF customerassociation_ob) NOT FINAL; CREATE or REPLACE TYPE product_ob as OBJECT ( product_number NUMBER(5), description VARCHAR2(50), price NUMBER(4)); CREATE or REPLACE TYPE store_ob as OBJECT ( location VARCHAR2(20), capacity VARCHAR2(10), street VARCHAR2(30), city VARCHAR2(30), zip_code VARCHAR2(8)); CREATE or REPLACE TYPE stock_ob as OBJECT ( quantity NUMBER(10), fecha DATE, refproducts REF product_ob, refstores REF store_ob); CREATE or REPLACE TYPE refstock_ob as OBJECT ( refstock REF stock_ob); CREATE or REPLACE TYPE refstock_tab as TABLE OF refstock_ob; CREATE or REPLACE TYPE refstock_va as VARRAY (15) of REF stock_ob; ALTER TYPE purchaseorder_ob ADD ATTRIBUTE (refcustomer REF customer_ob) CASCADE; ALTER TYPE orderlineitem_ob ADD ATTRIBUTE (refproduct REF product_ob) CASCADE;

78 M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 CREATE or REPLACE TYPE customer_va as VARRAY(15) of REF customer_ob; ALTER TYPE customerassociation_ob ADD ATTRIBUTE ( customers customer_va) CASCADE; ALTER TYPE products_ob ADD ATTRIBUTE (refstocks refstock_va) CASCADE; ALTER TYPE store_ob ADD ATTRIBUTE (refstocks refstock_tab) CASCADE; CREATE or REPLACE TYPE person_ob UNDER customer_ob ( personid NUMBER(9), discount NUMBER(2)) NOT FINAL; CREATE or REPLACE TYPE company_ob UNDER customer_ob ( type NUMBER(2), taxes VARCHAR2(20)) NOT FINAL; 9. Mapping the Example from Object-Relational Layer into the persistent layer into ORACLE 10g CREATE TABLE customerassociation_t OF customerassociation_ob (Primary Key id); CREATE TABLE customers_t OF customer_ob (Primary Key customer_number) NESTED TABLE reforders STORE as reforders_t; CREATE TABLE purchaseorder_t OF purchaseorder_ob (Primary Key order_number) NESTED TABLE refstocks STORE as refstocks_t; CREATE TABLE stores_t OF stores_ob(primary Key location); CREATE TABLE stock_t OF stock_ob; CREATE TABLE products_t OF product_ob (Primary Key product_number); Note that there are less typed tables defined than object types (UDTs). For the generalization-specialization relationship we have defined a unique table because of the substitutability property [4] which states that a column or row defined to be of type t can contain instances of t and any of its subtypes.

M.F. Golobisky, A. Vecchietti/Proc. of Argentine Symposium on Software Engineering (2005) 65-79 79 10. Conclusions In this article we have been proposed mapping functions from the UML class diagrams to the object-relational schema objects related to the SQL:2003 standard.the main contribution of this paper is the formalization, by means of definitions and mapping functions, of the steps involved in the design of objectrelational databases. Starting with an UML class diagram, and using the definitions and functions provided in this work, it is possible to arrive to the set of tables compliant to the schema design of an ORDBMS. Mapping functions presented here are the most direct transformations according to the characteristics of the components involved on it. They do not constitute a complete set of all possible transformations between UML class diagrams and the object-relational layer. Other mapping functions can also be given and their study is part of the future work. By using this approach it is possible to develop a framework to allow the automatic generation of an OR schema design, which will also be part of the future work. References 1. Cabibbo, L.: Objects Meet Relations: On the Transparent Management of Persistent Objects. In: Proceedings of 16th International Conference on Advanced Information System Engineering (CAISE 2004), Riga Latvia (2004) 2. Cabibbo, L., Carosi, A.: Managing Inheritance Hierarchies in Object/Relational Mapping Tools. In: Proceedings of 17th International Conference on Advanced Information System Engineering (CAISE 2005), Porto Portugal (2005) 3. Fong, J.: Translating Object-Oriented Database Transactions into Relational Transactions. Information and Software Technology, Vol. 44, Issue 1 (2002) 41-51 4. Golobisky, M..F., Fiszman, F., Vecchietti, A.: Análisis de Transformaciones de Generalización-Especialización en Bases de Datos Objeto-Relacionales. In: Proceedings of the Argentinian Symposium on Technology, 33 JAIIO, Córdoba Argentina (2004) 5. Liu, C., Orlowska, M., Li, H.: Realizing Object-Relational Databases by Mixing Tables with Objects. In: Proceedings of the Object Oriented Information Systems (OOIS 1997), Brisbane Australia (1997) 6. Marcos, E., Vela, B., Cavero, J.M., Cáceres, P.: Aggregation and Composition in Object- Relational Database Design. In: Proceedings of the Fifth East-European Conference on Advances in Databases and Information Sytems, Vilnius Lithuania (2001) 7. Mok, W.Y., Paper, D.V.: On Transformations from UML Models to Object-Relational Databases. In: Proceedings of the 34th Hawaii International Conference on System Sciences, Maui Hawaii (2001) 8. Oracle Database Application Developer s Guide - Object-Relational Features, 10g Release 1 (10.1). Part No. B10799-01 (2003) 9. Philippi, S.: Model Driven Generation and Testing of Object-Relational Mappings. Journal of Systems and Software (2004) 10. Rumbaugh, J., Jacobson, I., Booch, G.: The Unified Modeling Language Reference Manual, Addisson-Wesley, Object Technology Series (1999) 11. Melton, J.; A.R. Simon y J. Gray, 2001. SQL: 1999 - Understanding Relational Language Components. Morgan Kaufmann Publishers; 1st edition. May 23, 2001. 928 p. 12. Melton, J. ISO/IEC 9075-2:2003 (SQL/Foundation), ISO standard (2003)