TECHNICAL UNIVERSITY OF CLUJ-NAPOCA
FACULTY OF ELECTRICAL ENGINEERING

Eng. Marius-Ștefan MUJI

PHD THESIS (abstract)

CONTRIBUTIONS TO THE DEVELOPMENT OF DATABASE-DRIVEN INFORMATION SYSTEMS

Scientific advisor: Prof. Eng. Radu CIUPA, PhD

2011

Information systems are part of our everyday life. At the current state of technology, the end users of information systems are completely isolated from the technical details of the system by modern user interfaces, which usually provide a graphical representation with intuitive and friendly functions. The developers of information systems, on the other hand, are still forced to specify the system at lower abstraction levels, corresponding to the technologies employed. Learning and using these technologies requires a substantial amount of effort, which could be better spent on activities such as analysis and design.

When the scope of the information system is clearly defined and restricted to a certain domain, it becomes more feasible to develop the system using a vocabulary that is closer to the application domain. The technologies that make this possible are called domain-specific languages (DSL) [1] [2] [3]. Their use provides major advantages during the development process, related primarily to the reduction of the semantic gap between the user requirements and the technical specifications. These technologies also address the development time and the maintainability of the system.

Currently, the scientific community is investing significant effort in the convergence of the research fields of domain-specific languages and model-driven engineering (MDE) [4] [5]. Moreover, it is considered that the development of DSLs can be seen as a generalization of MDE, which, in turn, can be seen as a generalization of model-driven architecture (MDA) [6] [7] [8].

Database-driven systems are an important category of information systems which, due to their specific architectures and data models, are usually built with domain-specific technologies (e.g., the SQL language) [9] [10].

The major objective of this PhD thesis is to contribute to the development of database-driven information systems, mainly by raising the level of abstraction of their technical specifications. To reach this goal, the thesis attempts to:
- identify the general objectives of information systems, starting from their base components (users, data, procedures, and technology);
- analyze the current state of the art in the field;
- identify possible development areas;
- propose solutions in the respective areas.

The first chapter briefly introduces the targeted domains and the main objectives of the thesis.

The second chapter identifies five general objectives of information systems, which have always been reflected in the architectures of this type of system and in the various development technologies.
The five objectives are:
1. The declarative specification of the system;
2. The possibility to accept new users without affecting the existing users of the system;
3. The possibility to change the technology without affecting the specification of the system's data;
4. The possibility to define (specify) the information system at an abstraction level as high as possible;
5. The system's capacity to ensure the information quality required by the users.

Of these five objectives, the last is the only one that depends on the specific requirements of the application domain. The others depend mostly on the architectures and the data models used by the information system. However, when the boundaries of the application domain are clearly defined, the fourth objective can be reached more easily, through domain-specific languages. For this reason, the thesis considers two distinct research directions:
- the first is related to the architectures and the data models used in general-purpose database-driven development;
- the second is related to domain-specific languages in database-driven systems.

The third and fourth chapters are dedicated to the first direction. The third chapter proposes a new architecture for database-driven systems, which integrates the classification criteria identified for the existing architectures (e.g., the ANSI three-level architecture, the four-level architecture, and the four-level architecture extended by Halpin [11]). The main advantage of the proposed architecture is that it provides a distinct representation of the user views and of the external level of the system. It therefore provides an explicit representation of the user views not only at the external level of the system, but also at the conceptual, logical, and physical levels (see Table 1).

                    COMMUNITY VIEW (CV)        USER VIEWS (UV)
EXTERNAL LEVEL      Domain Vocabulary          Data Use Cases (user interface)
CONCEPTUAL LEVEL    Conceptual Schema / CV     Conceptual Schema / UV
LOGICAL LEVEL       Logical Schema / CV        Logical Schema / UV
PHYSICAL LEVEL      Physical Schema / CV       Physical Schema / UV

Table 1: The simplified version of the proposed architecture for database-driven information systems

The architecture provides a framework for the declarative development of the entire information system, not only of the community view. Using this architecture, it becomes clear that the user views can be specified declaratively, using the same approach as for the community view of the system. In this case, the relational model provides the theoretical grounds for a complete, declarative, data-oriented approach to development, even if current technologies (i.e., relational database management systems) are limited in their declarative support of the theoretical model [12] [13] [14]. The current approach to the development of the user views is determined by the process-orientation bias of the development teams.

Chapter four of the thesis defines a new data model, inspired by and derived from the relational model, whose aim is to provide the theoretical support for a declarative, data-oriented approach to the development of the user views. Because it is designed specifically for the development of user interfaces (UI), the proposed model is called the presentation model. It is important to emphasize that the presentation model is not intended to overlap with the relational model; it is complementary to it and fully compatible with it. For data structuring purposes, the presentation model takes from the relational model the valuable idea of a unique (essential) data constructor, and adds an ordering relation and a current element to the relation. The resulting ordered collection of tuples is no longer a (mathematical) set, but it can represent the two semantic features (order and current position) that are necessary in a presentation context.

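The array constructor described above can be illustrated with a minimal Python sketch. It is not the formal definition given in the thesis: the class name `PresentationArray` and the helpers `relation_to_array` and `array_to_relation` are hypothetical, and the method names `get_tuple`, `get_current`, `set_current`, and `get_att_val` are borrowed from the operator list in Table 2, with signatures and semantics that are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Iterable, List, Optional, Tuple

Row = Dict[str, Any]  # one tuple of a relation: attribute name -> value


@dataclass
class PresentationArray:
    """Hypothetical sketch: an ordered collection of tuples plus a current position."""

    tuples: List[Row]
    current: Optional[int] = None  # index of the current element, None if empty

    def __post_init__(self) -> None:
        # Assumption: a non-empty array starts positioned on its first tuple.
        if self.tuples and self.current is None:
            self.current = 0

    def get_tuple(self, index: int) -> Row:
        return self.tuples[index]

    def get_current(self) -> Row:
        if self.current is None:
            raise LookupError("empty array has no current element")
        return self.tuples[self.current]

    def set_current(self, index: int) -> None:
        if not 0 <= index < len(self.tuples):
            raise IndexError("current position out of range")
        self.current = index

    def get_att_val(self, attribute: str) -> Any:
        # Value of one attribute of the current tuple.
        return self.get_current()[attribute]


def relation_to_array(relation: Iterable[Row], order_by: str) -> PresentationArray:
    """Add an ordering (and a current element) to an unordered relation."""
    return PresentationArray(sorted(relation, key=lambda row: row[order_by]))


def array_to_relation(array: PresentationArray) -> FrozenSet[Tuple[Tuple[str, Any], ...]]:
    """Forget order and current position, leaving a plain set of tuples."""
    return frozenset(tuple(sorted(row.items())) for row in array.tuples)
```

In this sketch, two arrays built from the same rows with different orderings map back to the same set, which mirrors the point that order and current position are presentation features added on top of the relation rather than part of it.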
Table 2 shows that the presentation model was defined as close as possible to the relational model, which ensures better compatibility between the logical schema of the community view and the logical schema of the user views (eliminating the impedance mismatch that current technologies need to address).

                        RELATIONAL MODEL                            PRESENTATION MODEL
Architectural level     The Logical Schema of the Community View    The Logical Schema of the User Views
Data structures         The Relation (Table)                        The Array
                        (unique/essential data constructor)         (unique/essential data constructor)
Integrity constraints   Constraint categories:                      Constraint categories:
                        - Attribute constraints                     - Attribute constraints
                        - Tuple constraints                         - Tuple constraints
                        - Relation constraints                      - Relation constraints
                        - Database constraints                      - Array constraints
                        - State transition constraints              - User View constraints
                          of the database                           - State transition constraints of the user view
                                                                    - State transition constraints of the database
                                                                      of the system:
                                                                        o Update constraints
                                                                        o Refresh constraints
Operators               Set related: ∪, ∩, \, ×, etc.               Set related: ∪, ∩, \, ×, etc.
                        Relational: projection, restriction,        Relational: projection, restriction,
                        join, etc.                                  join, etc.
                                                                    Array specific: get_att_val, get_tuple,
                                                                    get_current, set_current, A2T, T2A, etc.

Table 2: The presentation model compared with the relational model

Chapter five of the thesis focuses on the second research direction, related to domain-specific languages in database-driven systems. Two applications are presented, one from the electrical engineering domain and one from the medical domain. In both cases, the end users (i.e., the domain specialists) are able to extend the definition of the information system at their own (external) abstraction level, without any intervention from the IS developers. The novelty of the approach is that, although both applications have DSL-specific functionality, their architecture is a typical database system architecture. The high degree of generality required of DSL technologies is ensured exclusively through the database design. This eliminates two important problems encountered with the majority of DSL technologies: the expert-level programming knowledge needed to develop this type of application, and the difficulties of integration with the other modules of the information system. For both applications, the integration with the rest of the system is realized directly at the database level, and the expertise required of the development team is the usual database systems development expertise.

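One common way to let end users extend a system purely through data, with no schema change by a developer, is a metadata-driven (entity-attribute-value style) design. The sketch below uses Python's standard `sqlite3` module to illustrate the general idea only. The abstract does not describe the database design actually used in the thesis, so the table names `form_field` and `field_value`, the function names, and the sample field names are all hypothetical.

```python
import sqlite3
from typing import Dict, Optional


def create_schema(conn: sqlite3.Connection) -> None:
    """Fixed schema, designed once; it never changes when users add fields."""
    conn.executescript(
        """
        CREATE TABLE form_field (
            field_id   INTEGER PRIMARY KEY,
            form_name  TEXT NOT NULL,
            field_name TEXT NOT NULL,
            UNIQUE (form_name, field_name)
        );
        CREATE TABLE field_value (
            record_id INTEGER NOT NULL,
            field_id  INTEGER NOT NULL REFERENCES form_field (field_id),
            value     TEXT,
            PRIMARY KEY (record_id, field_id)
        );
        """
    )


def add_field(conn: sqlite3.Connection, form_name: str, field_name: str) -> int:
    """An end user extends a form by inserting a row, not by altering a table."""
    cur = conn.execute(
        "INSERT INTO form_field (form_name, field_name) VALUES (?, ?)",
        (form_name, field_name),
    )
    return cur.lastrowid


def set_value(conn: sqlite3.Connection, record_id: int, field_id: int, value: str) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO field_value (record_id, field_id, value) VALUES (?, ?, ?)",
        (record_id, field_id, value),
    )


def read_record(conn: sqlite3.Connection, form_name: str, record_id: int) -> Dict[str, Optional[str]]:
    """Return every field currently defined for the form, with None where unset."""
    rows = conn.execute(
        """
        SELECT f.field_name, v.value
        FROM form_field AS f
        LEFT JOIN field_value AS v
               ON v.field_id = f.field_id AND v.record_id = ?
        WHERE f.form_name = ?
        ORDER BY f.field_id
        """,
        (record_id, form_name),
    ).fetchall()
    return dict(rows)
```

Because the definition of a form lives in ordinary rows, other modules can integrate with it through plain SQL at the database level, which is the kind of integration the paragraph above refers to. The trade-off of this particular pattern is that values are stored untyped, so type and integrity checks must be expressed separately.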
The first part of the chapter describes the first application, a healthcare information system [15] [16] that allows the medical team to continually change the structure of the standardized files used on a daily basis to complete the electronic health record (EHR) of every patient.

The second part of the chapter describes a data mining tool used for the short-circuit analysis of the historical data generated by a SCADA system. As in the healthcare application, the specialists in the field are allowed to generate ad-hoc queries at the external level of the system, without requiring any changes to the system's specifications at the conceptual, logical, or physical level. It has to be emphasized that SCADA systems offer great opportunities for the development of this kind of highly specialized DSL tool, due to the stability of the business rules, which is determined by their scientific, mathematical foundation. The described application can also be easily reused for many other purposes, and has great potential for complex systems with many integrated modules (e.g., smart grids) that need to bridge the semantic gap between the end users and the various technologies employed.

REFERENCES - selection

[1] Debasish Ghosh, "DSL for the uninitiated," Communications of the ACM, vol. 54, no. 7, pp. 44-50, July 2011.
[2] Naveneetha Vasudevan and Laurence Tratt, "Comparative Study of DSL Tools," Electronic Notes in Theoretical Computer Science (ENTCS), vol. 264, no. 5, pp. 103-121, July 2011.
[3] Arie van Deursen, Paul Klint, and Joost Visser, "Domain-specific languages: an annotated bibliography," ACM SIGPLAN Notices, vol. 35, no. 6, pp. 26-36, June 2000.
[4] Frederic Jouault et al., "Inter-DSL coordination support by combining megamodeling and model weaving," in SAC '10, Sierre, 2010, pp. 2011-2018.
[5] Jean Bezivin, Mikael Barbero, and Frederic Jouault, "On the Applicability Scope of Model Driven Engineering," in MOMPES '07, Braga, 2007, pp. 3-7.
[6] Ivan Kurtev, Jean Bezivin, Frederic Jouault, and Patrick Valduriez, "Model-based DSL frameworks," in OOPSLA '06, Portland, 2006, pp. 602-616.
[7] Jean Bezivin and Olivier Gerbe, "Towards a Precise Definition of the OMG/MDA Framework," in 16th IEEE International Conference on Automated Software Engineering, 2001, p. 273.
[8] OMG. (2000) MDA Specifications. [Online]. http://www.omg.org/mda/specs.htm
[9] International Organization for Standardization. (2008) ISO/IEC 9075-1:2008 (SQL/Framework). Document.
[10] Marjan Mernik, Jan Heering, and Anthony M. Sloane, "When and how to develop domain-specific languages," ACM Computing Surveys (CSUR), vol. 37, no. 4, pp. 316-344, December 2005.
[11] Terry Halpin and Tony Morgan, Information Modeling and Relational Databases, 2nd ed. Morgan Kaufmann, 2008.
[12] C. J. Date, Date on Database: Writings 2000-2006. Apress, 2006.
[13] C. J. Date, What Not How: The Business Rules Approach to Application Development. Addison-Wesley, 2000.
[14] Lex de Haan and Toon Koppelaars, Applied Mathematics for Database Professionals. Apress, 2007.
[15] M. Muji et al., "Database Design Patterns for Healthcare Information Systems," in International Conference on Advancements of Medicine and Health Care through Technology, vol. 26, Cluj-Napoca, 2009, pp. 63-66.
[16] M. Muji et al., "Information Systems as Tools for Medical Research Activities," Acta Electrotehnica, vol. 48, no. 4, pp. 47-50, 2007.
[17] C. J. Date, An Introduction to Database Systems, 8th ed. Addison-Wesley, 2003.
[18] Joseph S. Valacich, Joey F. George, and Jeffrey A. Hoffer, Essentials of Systems Analysis and Design. Prentice-Hall, 2001.
[19] William J. Lewis, Data Warehousing and E-Commerce. Prentice Hall PTR, 2001.
[20] William J. Lewis, "E-Commerce Vs. Data Management," The Data Administration Newsletter TDAN.com, January 2002.
[21] Ronald G. Ross, Business Rule Concepts - Getting to the Point of Knowledge. Business Rule Solutions, LLC, 2005.
[22] Marius S. Muji and Dorin Bica, "Information Systems Architects: Business Analysts or IT Engineers," in Balkan Region Conference on Engineering and Business Education & International Conference on Engineering and Business Education, vol. I, Sibiu, 2009, pp. 236-239.
[23] Marius Muji, "Application Development in Database-Driven Information Systems," Acta Universitatis Sapientiae - Electrical and Mechanical Engineering, vol. 2, pp. 63-72, 2010.
[24] Marius Muji, "A Data-Oriented Approach for Application Development," in The 4th edition of the Interdisciplinarity in Engineering International Conference (INTER-ENG 2009), Tg. Mures, 2009.
[25] M. Muji et al., "Best Practices in the Design and Development of Health Care Information Systems," Acta Electrotehnica, vol. 48, no. 4, pp. 51-54, 2007.
[26] D. Bica, C. Moldovan, and M. Muji, "Power engineering education using NEPLAN software," in Universities Power Engineering Conference, 2008. UPEC 2008. 43rd International, Padova, 2008, pp. 1-3.