Applying MDA and universal data models for data warehouse modeling

MARIS KLIMAVICIUS
Department of Applied Computer Science
Riga Technical University
Meza iela 1/3-506, LV-1048, Riga
LATVIA
maris.klimavicius@gmail.com

ULDIS SUKOVSKIS
Department of Applied Computer Science
Riga Technical University
Meza iela 1/3-506, LV-1048, Riga
LATVIA
uldis.sukovskis@cs.rtu.lv

Abstract: - Business process monitoring provides an invaluable means for an enterprise to adapt to changing conditions. A data warehouse stores the process data that is the foundation for business process monitoring applications. Developing such applications with traditional methods is challenging because of the complexity of integrating business processes with existing information systems. Different modeling approaches have been proposed to overcome the design pitfalls of data warehouse system development. Model driven architecture, on the other hand, is an approach that develops applications from domain-specific models to platform-sensitive models and bridges the gap between business processes and information technology. Model driven architecture is a standard framework for software development that addresses the complete life cycle of designing, deploying, integrating, and managing applications by using models. The authors propose to use the model driven approach for data warehouse development. In addition, the concept of universal data models is introduced in order to ease data warehouse development by providing standard data objects. This paper introduces the overall concept of applying model driven architecture and universal data models to the data warehouse development process.

Key-Words: - Data warehouse, Business process modeling, Model driven architecture, Universal data models

1 Introduction
During the last ten years, interest in analyzing data has increased significantly because of the competitive advantages that information can provide for the decision-making process.
Data warehouse systems represent a single source of information for analyzing the development and results of an enterprise organization in a changing environment. The data in the data warehouse describes events and the status of business processes, products and services, goals and organizational units. Nowadays, a key to survival in the business world is the ability to analyze, plan and react to changing business conditions as fast as possible. However, the ability to change is bound by many constraints, such as staff knowledge and business supporting systems. Business operations depend on enterprise information systems, which means that changes in business processes require changes in the supporting information systems. A change in operational information systems in turn requires changes in the data warehouse, which is a central repository of atomic and summarized data from different operational systems. Data warehouses integrate data from multiple heterogeneous information sources and transform them into a multidimensional representation for decision support applications.

Research in the field of data warehousing and online analytical processing has produced important technologies for the design, management, and use of information systems for decision support. Much of the interest and success in this area can be attributed to the need for software and tools to improve data management and analysis. However, despite the continued success and maturing of the field, much research remains to be done across many different areas of data warehousing. In particular, data warehousing applications require improved and standardized conceptual modeling techniques as well as novel approaches to dealing with data quality issues. The data that needs to be stored in the warehouse is becoming more and more complex in both structure and semantics, while the analysis must keep up with the demands of new applications.
Therefore, considerable effort is still needed to develop advanced methods and standards for a data warehouse development framework. The proposed approach is based on the idea that the requirements for a data warehouse can be elicited from business process models [12].

ISBN: 978-960-6766-63-3 332 ISSN: 1790-5117
2 Related work
Different approaches for the conceptual and logical design of data warehouse systems have been proposed in the last few years. In this section, the authors briefly discuss some of the important approaches. While the standardization of metadata is discussed in numerous domains, resulting in a number of different metadata standards, the specific requirements of data warehousing solutions are usually addressed insufficiently [1].

In [2], the authors present the multidimensional model, a logical model for OLAP (On-Line Analytical Processing) systems, and show how it can be used in the design of multidimensional databases. The authors also propose a general design method aimed at building a multidimensional schema starting from an operational database described by an ER schema. Although the design steps are described in a logical and coherent way, the data warehouse design is based only on the operational data sources, which we consider insufficient because end user requirements and business processes are very important in data warehouse design.

In [14], the authors present an approach to business metadata that is based on the relationship between the data warehouse data and the structure and behavior of the enterprise. They use models to derive business metadata, which forms an additional level of abstraction on top of the data-oriented data warehouse structure. The authors of the present paper also establish a relationship between an organization's processes and the related data, but they use business processes and MDA to accomplish it.

There are also several works which address model driven architecture as the solution for data warehouse implementation. One of the first works to align the design of the data warehouse with the general MDA paradigm is the model driven data warehouse [3]. This approach is based on the Common Warehouse Metamodel [4], which is a platform-independent metamodel definition for interchanging data warehouse specifications between different platforms and tools.
However, the Common Warehouse Metamodel is too generic to represent all peculiarities of multidimensional modeling in a conceptual model and too complex to be handled by both business users and designers [5].

In [6], the authors describe how to align the whole data warehouse development process to MDA. They define multidimensional model driven architecture, an approach for applying the MDA framework to one of the stages of data warehouse development: multidimensional modeling. They also describe how to build the different MDA artifacts by using extensions of the UML. In this approach, transformations between models are clearly and formally established by using the Query/View/Transformation approach. However, the authors do not address the requirements gathering stage: requirements are specified in the CIM stage, which is performed manually.

3 Data warehouse development
Most techniques that organizations use to build a data warehousing system follow either a top-down or a bottom-up development approach. In the top-down approach [8], an enterprise data warehouse is built in an iterative manner, business area by business area, and the underlying dependent data marts are created as required from the enterprise data warehouse content. In the bottom-up approach [9], independent data marts are created with a view to integrating them into an enterprise data warehouse at some time in the future. There is still a lot of discussion about the similarities and differences among these architectures, but despite the differences there are two main, very closely connected steps in data warehouse development: requirements gathering and information modeling (design). Figure 1 shows a typical data warehouse architecture, which applies to either approach. Basically, a data warehouse has five layers.
These layers can be defined as follows:
- Source layer: operational information systems.
- Integration layer: extraction, transformation and loading of data into the data warehouse.
- Data warehouse layer: central data storage.
- Data mart layer: data customized according to the needs of users.
- Application layer: applications for end users to analyze data.

Fig.1. Data warehouse architecture

In this paper the authors address the development of the central data warehouse component, the data warehouse layer.
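The way data moves through the five layers can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the record fields, the cleaning rules and the function names (`integrate`, `build_data_mart`, `report`) are hypothetical and are not part of the framework described in this paper.

```python
from collections import defaultdict

# Source layer: records as they might arrive from operational systems.
# The field names and values are hypothetical.
SOURCE_ROWS = [
    {"invoice_id": 1, "customer": " acme ", "amount": "100.50"},
    {"invoice_id": 2, "customer": "Acme", "amount": "49.50"},
    {"invoice_id": 3, "customer": "Globex", "amount": "20.00"},
]


def integrate(rows):
    """Integration layer: extract, transform (clean names, cast amounts), load."""
    return [
        {
            "invoice_id": row["invoice_id"],
            "customer": row["customer"].strip().title(),
            "amount": float(row["amount"]),
        }
        for row in rows
    ]


def build_data_mart(warehouse_rows):
    """Data mart layer: summarize atomic warehouse data for one user group."""
    totals = defaultdict(float)
    for row in warehouse_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


def report(mart):
    """Application layer: present the summarized data to end users."""
    return [f"{customer}: {total:.2f}" for customer, total in sorted(mart.items())]


# Data warehouse layer: central storage of the cleaned, atomic data.
warehouse = integrate(SOURCE_ROWS)
mart = build_data_mart(warehouse)
lines = report(mart)
```

In this sketch the warehouse keeps the atomic invoice rows, while the data mart holds only the per-customer totals, which mirrors the split between central storage and customized data described above.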
4 Concept of Model Driven Architecture
The idea of Model Driven Architecture was introduced by the OMG (Object Management Group) as an approach to system specification and interoperability and is inspired by the use of several formal models. The key concepts of the MDA are the default viewpoints on a system that it specifies: computation independent, platform independent, platform specific, and code.

[Diagram: CIM, PIM, several PSMs, each PSM with its own CODE]
Fig.2. Model driven architecture

In MDA, platform-independent models (PIM) are initially expressed in a platform-independent modelling language. The platform-independent model is subsequently translated to a platform-specific model (PSM) by mapping the PIM to some implementation language or platform using formal rules.

CIM (computation-independent model): A CIM is also often referred to as a business model because it uses a vocabulary that is familiar to the subject matter experts. It presents exactly what the system is expected to do, but hides all information technology related specifications to remain independent of how that system will be implemented.

PIM (platform-independent model): A PIM has a sufficient degree of independence to enable its mapping to one or more platforms. This is commonly achieved by defining a set of services in a way that abstracts out technical details. This means that the PIM does not contain any information specific to the platform or the technology that is used to realize it.

PSM (platform-specific model): A PSM combines the specifications in the PIM with the details required to stipulate how a system uses a particular type of platform. If the PSM does not include all of the details necessary to produce an implementation for that platform, it is considered abstract.

5 Concept of universal data models
The concept of universal data models was introduced by Len Silverston [7] as an approach to system modeling and is inspired by the use of proven components.
A universal data model is a template data model that can be used as a building block to start development of the logical data model or the data warehouse data model. Effective methods for incorporating the universal data models can be summarized as follows:
- Develop the enterprise data model by customizing and adding to the universal data models, using the business terms that are commonly known in the enterprise and adding appropriate information requirements.
- Build the appropriate logical data models for each project according to the business requirements for that specific application.
- Create the necessary physical database designs based on the logical data model and the technical requirements.
- Customize the database design to the appropriate target DBMS (database management system).

One of the key information issues today is how to develop integrated systems that facilitate consistent information use by the enterprise. When projects develop their database designs independently of an overall model, the same information items are often implemented in separate tables and sometimes with different meanings, leading to redundant, inconsistent data and non-integrated systems. The universal data models can be used to start an enterprise data model effort, providing the enterprise with a "road map" of its information and showing how information relates to other information. This approach can lead to much more data consistency, better data quality, and ultimately to better information that can be used to improve the operations of the enterprise.

Universal data models can also serve as the basis for a data warehouse design and implementation. If universal data models are suitable for enterprise application development, then these data structures are also valid for data warehouse development. An example of a universal data model of invoices and invoice items is shown in Figure 3.
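The first step, customizing a universal data model with enterprise terms and additional information requirements, can be sketched in Python as follows. This is a minimal illustration rather than the method of [7]: the `Attribute` and `Entity` classes and the `customize` function are hypothetical, the template loosely follows the INVOICE entity of Figure 3, and the renamed attribute (`CUSTOMER NOTE`) and added attribute (`PAYMENT TERMS`) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    mandatory: bool = True


@dataclass
class Entity:
    name: str
    key: list         # names of identifying attributes
    attributes: list  # non-key Attribute objects


# Template entity, loosely following INVOICE in Fig. 3.
UNIVERSAL_INVOICE = Entity(
    name="INVOICE",
    key=["INVOICE ID"],
    attributes=[
        Attribute("INVOICE DATE"),
        Attribute("MESSAGE", mandatory=False),
        Attribute("DESCRIPTION", mandatory=False),
    ],
)


def customize(template, renames=None, extra=()):
    """Return an enterprise-specific copy; the template itself is not modified."""
    renames = renames or {}
    attributes = [
        Attribute(renames.get(attr.name, attr.name), attr.mandatory)
        for attr in template.attributes
    ]
    attributes.extend(extra)
    return Entity(template.name, list(template.key), attributes)


# Hypothetical enterprise adjustments: one business term and one new requirement.
enterprise_invoice = customize(
    UNIVERSAL_INVOICE,
    renames={"MESSAGE": "CUSTOMER NOTE"},
    extra=[Attribute("PAYMENT TERMS", mandatory=False)],
)
```

Keeping the template immutable and deriving a copy means that several projects can start from the same universal model, which supports the consistency argument made above.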
[Diagram: entities of the invoice item model; # marks identifying attributes, * mandatory attributes, - optional attributes]
INVOICE: # INVOICE ID, * INVOICE DATE, - MESSAGE, - DESCRIPTION
INVOICE ITEM: # INVOICE ITEM SEQ ID, * TAXABLE FLAG, - QUANTITY, - AMOUNT, - ITEM DESCRIPTION
SALES INVOICE ITEM
PURCHASE INVOICE ITEM
INVOICE ITEM TYPE: # INVOICE ITEM TYPE ID, * DESCRIPTION
PRODUCT: # PRODUCT ID, * NAME, - INTRODUCTION DATE, - SALES DISCONTINUATION DATE, - SUPPORT DISCONTINUATION DATE, - COMMENT
PRODUCT FEATURE: # PRODUCT FEATURE ID, * DESCRIPTION
INVENTORY ITEM: # INVENTORY ITEM ID
SERIALIZED INVENTORY ITEM: * SERIAL NUM
NON SERIALIZED INVENTORY ITEM: - QUANTITY ON HAND
Relationship labels: adjusted by, the adjustment for, described by, the description for, billed via, the change for, sold with, sold for, part of, composed of

Fig.3. Universal data model of invoice item

Invoices, like shipments and orders, may have many items showing the detailed information about the goods or services that are charged to parties. The items on an invoice may be for products, features, work efforts, time entries, or adjustments such as sales tax, shipping and handling charges, fees, and so on.

6 Alignment of MDA and universal data models to data warehouse development framework
The authors have described the MDA approach and the universal data models concept in the previous sections. The purpose of this section is to combine these approaches into a data warehouse development framework. MDA presents computation independent, platform independent, and platform specific viewpoints. Following these considerations, the authors present an MDA oriented data warehouse development framework. The MDA viewpoints can be represented in the data warehouse development framework as follows:

CIM defines the requirements for the data warehouse. It is a viewpoint of the data warehouse from the business process perspective. Business processes have a crucial role in data warehouse development. Business processes and universal data models bridge the gap between those who are experts in the domain and process and those who are experts in the design and construction of the data warehouse.
PIM defines the data warehouse from a conceptual viewpoint. The major aim at this level is to represent the main data warehouse architecture, that is, the logical data warehouse data structures with appropriate attributes, without taking into account any specific technology.

PSM defines the data warehouse design from the view of a certain platform. For example, a data warehouse can be implemented according to different platforms, such as the Common Warehouse Metamodel (CWM) standard or SQL statements for some particular warehouse platform.

CODE defines the implementation code.

6.1 CIM implementation
A business process model serves as the basis for the CIM. Below is the business process diagram illustrating the seller-initiated invoice transaction process. This is not the only way in which the process may occur; however, it represents a primary process. Intermediaries, including routing hubs and/or networks, may be involved if necessary.

Fig.4. MDA approach of data warehouse development

Fig.5. Invoice transaction business process

6.2 PIM implementation
A platform independent model is a view of a system from the platform independent viewpoint [4]. This means that the model describes the system while hiding the details necessary for a particular platform. From the perspective of data warehouse development, this view is the logical data warehouse data model. Platform
independent model consists of an integrated view of the business process model and the appropriate universal data model.

[Diagram: process elements Reconciliation Process, Invoice and Create invoice, linked to the data structures below; # marks identifying attributes, * mandatory attributes, - optional attributes]
CUSTOMER: # CUSTOMER ID, # SNAPSHOT DATE, * CUSTOMER NAME, - AGE, - MARITAL STATUS, * CREDIT RATING
INVOICE: # INVOICE ID, # INVOICE ITEM SEQ ID, * INVOICE DATE, * CUSTOMER ID, * BILL TO ADDRESS ID, * ORGANIZATION ID, * ORG ADDRESS ID, * PRODUCT ID, * QUANTITY, * AMOUNT, * EXTENDED AMOUNT, - PRODUCT COST, * LOAD DATE
SUPPLIER: # SUPPLIER ID, # ADDRESS ID, * SUPPLIER NAME, - POSTAL CODE, * LOAD DATE

Fig. 6. Example of logical data structure of data warehouse

6.3 PSM implementation
A platform specific model is a view of a system from the platform specific viewpoint. This model represents the platform independent model from the perspective of how it will be implemented on the chosen platform. The platform independent model might be implemented in different ways, for example as an XML description of the data warehouse data structures.

7 Conclusion
In this paper the authors have introduced an MDA oriented framework for data warehouse development. This framework addresses the design of the data warehouse system by aligning every development stage of the data warehouse with the different MDA viewpoints. The authors introduced the use of universal data models in the MDA oriented framework. The use of universal data models speeds up and facilitates the development of a data warehouse system. This approach is useful when a process oriented data warehouse is developed. The authors consider that the advantages of the approach lie in combining the model driven and universal data model approaches in one data warehouse development framework. Both MDA and universal data models are designed to accelerate software development. The authors plan to evolve this approach to include transformation between the different viewpoints of MDA. The aim is to develop a fully automated transformation process.
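As an illustration of what a PIM-to-PSM transformation of the kind discussed in Section 6.3 might look like, the following Python sketch maps the SUPPLIER structure of Fig. 6 to an XML description and to a SQL statement. It is a sketch under stated assumptions, not the transformation the authors plan to develop: the dictionary layout, the XML element names (`table`, `column`), the functions `to_xml` and `to_sql`, and the single default column type are all hypothetical, since the logical model carries no data types.

```python
import xml.etree.ElementTree as ET

# PIM fragment: the SUPPLIER structure from Fig. 6.
# Roles follow the figure notation: key (#), mandatory (*), optional (-).
SUPPLIER_PIM = {
    "name": "SUPPLIER",
    "attributes": [
        ("SUPPLIER ID", "key"),
        ("ADDRESS ID", "key"),
        ("SUPPLIER NAME", "mandatory"),
        ("POSTAL CODE", "optional"),
        ("LOAD DATE", "mandatory"),
    ],
}


def to_xml(pim):
    """PSM variant 1: an XML description (element names are assumptions)."""
    table = ET.Element("table", name=pim["name"])
    for attr_name, role in pim["attributes"]:
        ET.SubElement(table, "column", name=attr_name, role=role)
    return ET.tostring(table, encoding="unicode")


def to_sql(pim, default_type="VARCHAR(100)"):
    """PSM variant 2: a CREATE TABLE statement.

    The PIM has no data types, so one placeholder type is assumed for all columns.
    """
    def ident(name):
        return name.replace(" ", "_")

    columns = []
    for attr_name, role in pim["attributes"]:
        nullability = "NULL" if role == "optional" else "NOT NULL"
        columns.append(f"  {ident(attr_name)} {default_type} {nullability}")
    keys = ", ".join(ident(n) for n, role in pim["attributes"] if role == "key")
    columns.append(f"  PRIMARY KEY ({keys})")
    return f"CREATE TABLE {ident(pim['name'])} (\n" + ",\n".join(columns) + "\n);"


xml_psm = to_xml(SUPPLIER_PIM)
sql_psm = to_sql(SUPPLIER_PIM)
```

Both outputs are derived from the same platform independent structure, so changing the PIM once changes every platform specific description generated from it.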
8 Acknowledgments
This work has been partly supported by the European Social Fund within the National Programme "Support for the carrying out doctoral study programme's and post-doctoral researches", project "Support for the development of doctoral studies at Riga Technical University".

References:
[1] Staudt, M., Vaduva, A., & Vetterli, T., Metadata Management and Data Warehousing (Technischer Report 99.04, Institut für Informatik). Zürich: Universität Zürich, 1999.
[2] Cabibbo L., Torlone R. A Logical Approach to Multidimensional Databases. In: Proc. of the 6th Intl. Conf. on Extending Database Technology (EDBT 98). Volume 1377 of LNCS, pp. 183-197. Valencia, Spain, 1998.
[3] Poole J. Model Driven Data Warehouse (MDDW). www.cwmforum.org/pooleintegrate2003.pdf, 2003.
[4] OMG Common Warehouse Metamodel (CWM) Specification 1.0.1. http://www.omg.org/cgi-bin/doc?formal/03-03-02, 2002.
[5] Medina E., Trujillo J. A Standard for Representing Multidimensional Properties: The Common Warehouse Metamodel (CWM). In Proceedings of the 6th East-European Conference on Advances in Databases and Information Systems (ADBIS 02), volume 2435 of Lecture Notes in Computer Science, pages 232-247, Bratislava, Slovakia, September. Springer-Verlag, 2002.
[6] J.Mazón, J.Trujillo, An MDA approach for the development of data warehouses, 1st issue, Vol. 45, Elsevier Science Publishers, 2008.
[7] L.Silverston, The Data Model Resource Book, Revised Edition, Volume 1, Wiley, 2001.
[8] W.H.Inmon, Building the Data Warehouse, 4th edition, Wiley, 2005.
[9] R.Kimball, L.Reeves, M.Ross, W.Thornthwaite, The Data Warehouse Lifecycle Toolkit, John Wiley & Sons, 1998.
[10] S.Kent, Model Driven Engineering, Lecture Notes in Computer Science, Vol. 2335, Springer, 2002.
[11] Marco, D., & Jennings, M., Universal Meta Data Models. New York et al.: Wiley Publishing, 2004.
[12] M.Klimavicius, U.Sukovskis, Business process driven data warehouse development, Scientific Proceedings of Riga Technical University, Computer Science Series, Applied Computer Systems, 6th issue, Vol. 22, Riga, Latvia, RTU, 2005.
[13] M.Klimavicius, Towards Development of Solution for Business Process-Oriented Data Analysis, Proceedings of World Academy of Science, Engineering and Technology, Volume 27, Cairo, Egypt, 2008.
[14] V.Stefanov and B.List, Business Metadata for the Data Warehouse - Weaving Enterprise Goals and Multidimensional Models, International Workshop on Models for Enterprise Computing at the 10th International Enterprise Distributed Object Computing Conference, Hong Kong, China, 2006.