A Metadata-based Approach to Leveraging the Information Supply of Business Intelligence Systems




University of Augsburg
Prof. Dr. Hans Ulrich Buhl
Research Center Finance & Information Management
Department of Information Systems Engineering & Financial Management

Discussion Paper WI-325

A Metadata-based Approach to Leveraging the Information Supply of Business Intelligence Systems
by Benjamin Mosig, Maximilian Röglinger

December 2011

in: Proceedings of the 31st International Conference on Conceptual Modeling, Florence, Italy, October, 2012, p. 537-542

Universität Augsburg, D-86135 Augsburg
Visitors: Universitätsstr. 12, 86159 Augsburg
Phone: +49 821 598-4801 (Fax: -4899)
www.fim-online.eu

Benjamin Mosig and Maximilian Röglinger
FIM Research Center, University of Augsburg, Universitätsstraße 12, 86159 Augsburg, Germany
{benjamin.mosig,maximilian.roeglinger}@wiwi.uni-augsburg.de

Abstract. Ensuring adequate information provision continues to be a key challenge of corporate decision making and of the usage of business intelligence systems. The situation is becoming increasingly paradoxical: whereas decision makers struggle to specify their information requirements and spend much time on obtaining the information they believe they require, the amount of information supplied by business intelligence systems grows at a speed that makes it hard to keep track of. Thus, it is very likely that the required information or suitable alternatives are available, but neither found nor used. Instead, manual searching causes considerable opportunity cost. Existing approaches to information requirements analysis pay attention to incorporating information supply, but do not provide means for leveraging it in a systematic and IT-supported manner. As a first step to close this research gap, we propose a metadata-based approach consisting of a procedure model and a formalism that help identify a suitable subset of the information supplied by an existing business intelligence system. The formalism is specified using set theory and first-order logic to provide a general foundation that may be integrated into different conceptual modelling approaches.

Keywords: Business intelligence, Data warehouse, Metadata, Information provision, Information supply.

1 Motivation

Ensuring adequate information provision has been a recurring topic over the last decades. Numerous approaches to information requirements analysis and various types of management support systems, most recently business intelligence (BI) systems, have been proposed to increase the clarity in corporate decision making [5]. Despite these efforts, decision makers still face information overload with negative consequences such as mental stress, loss of clarity, and reduced decision quality [1]. A particular problem in recent times is that the convenient access to BI systems and the high storage capacity of underlying data warehouses entice companies into accumulating large amounts of information [4]. As BI systems are historically grown and have been subject to uncontrolled growth in many organizations, it is hard to keep track of the information they supply. Academics approvingly report that "not missing information [is] the primary problem" and that "all information is available somewhere" [6] in most companies. Against this backdrop, it is very likely that the required information or suitable alternatives are available within an organization, but neither found nor used. The potential of existing information supply to satisfy information requirements is not sufficiently tapped [6].

As mentioned, the literature contains numerous approaches dedicated to the elicitation and specification of information requirements, particularly for the development of data warehouses and BI systems [3, 5]. With few exceptions, the proposed approaches pay attention to incorporating existing information supply. The approaches share several characteristics: First, most activities related to leveraging existing information supply require manual effort and are hardly IT-supported. Second, the approaches center on informal or semi-formal concepts, which makes it difficult to cover large amounts of existing information supply systematically. Third, some approaches deal with the initial development of a data warehouse or BI system and thus focus on the information supply of operational information systems. Fourth, the approaches provide no explicit means for coping with decision makers' struggles when specifying information requirements.

Despite the value of the presented approaches, there is a need for additional support to leverage the information supply of existing BI systems. This leads to the following research question: How can the information supply of existing BI systems be leveraged in a systematic and IT-supported manner?

We address the research question by proposing a metadata-based approach consisting of a procedure model and a formalism that complement the approaches discussed above and help identify a suitable subset of the information supplied by an existing BI system. We rely on metadata because they play an important role in BI systems and have the potential to structure large amounts of data [2]. Following an axiomatic and deductive research approach, we first sketch the general setting and explicate our assumptions (section 2). We then derive the procedure model on this foundation (section 3). The paper concludes with a discussion (section 4). The formalism is specified using set theory and first-order logic. It has been omitted due to space restrictions, but can be requested from the authors.

2 General Setting

Our unit of analysis is a single historically grown BI system that is based on a data warehouse as informational infrastructure. The data warehouse is based on a multi-dimensional data schema whose core elements on schema level are measures and dimensions. We treat all dimensions as orthogonal and abstract from structural abnormalities such as parallel hierarchies [3]. While an examination on the schema level is reasonable in the context of conceptual modeling, information requirements analysis extends to the instance level because information requirements typically relate to the actual values of measures and hierarchic levels. We assume:

(A.1) The multi-dimensional data schema consists of measures and dimensions. Each dimension includes hierarchic levels.

To incorporate metadata into the procedure model and the formalism, information requirements need to be split into two parts: the first part includes requirements that directly relate to the core elements of the multi-dimensional data schema, and the second part comprises requirements that relate to meta-attributes (see Table 1).

Table 1. Considered components of the information requirements

Related to the core elements of the multi-dimensional data schema
  Schema level:
    Requirements regarding measures
    Requirements regarding dimensions
    Requirements regarding hierarchic levels
  Instance level:
    Requirements regarding the domain of selected measures
    Requirements regarding the domain of selected hierarchic levels
Related to additional meta-attributes
    Requirements regarding the value of a meta-attribute for each single selected measure
    Requirements regarding the value of a meta-attribute for all selected measures

The first part of the information requirements helps specify requirements where the decision makers know precisely which combinations of measures and dimensions they need. These requirements can be elicited using the existing approaches to information requirements analysis.
As known from conceptual modeling, there is a dependency between requirements on schema level and on instance level. That is, requirements regarding the instance level of measures or hierarchic levels relate to the domains of the measures or hierarchic levels selected on schema level. Requirements belonging to the second part of the information requirements relate to additional meta-attributes. What is special about using meta-attributes is that usually multiple subsets of the information supply exist that meet the related requirements. Requirements can be defined at two distinct reference levels (see Table 1). Either each single selected measure has to fulfill a requirement individually (reference level: each single measure) or all selected measures together have to fulfill a requirement (reference level: all measures). Moreover, the collection effort of all selected measures must not exceed a defined limit (meta-attribute: collection effort). We assume:

(A.2) Each measure features the same meta-attributes. Meta-attributes are only assigned to measures.

(A.3) The information requirements $I^{req} = \{F^{model,schema}, F^{model,instance}, F^{meta}\}$ comprise requirements related to the core elements of the multi-dimensional data schema on schema level, $F^{model,schema} = f^{model,schema}_1 \wedge \dots \wedge f^{model,schema}_s$, requirements related to the core elements of the multi-dimensional data schema on instance level, $F^{model,instance} = f^{model,instance}_1 \wedge \dots \wedge f^{model,instance}_t$, and requirements related to meta-attributes, $F^{meta} = f^{meta}_1 \wedge \dots \wedge f^{meta}_u$. All requirements are specified in first-order logic. $F^{model,schema}$ only contains requirements that can be covered by the information supply.

(A.4) The subset of $I^{supply}$ that meets all requirements related to the core elements of the multi-dimensional data schema on schema level ($F^{model,schema} = T$) is denoted by $I^{selected,model,schema}$. The subsets of $I^{supply}$ that meet all requirements related to meta-attributes ($F^{meta} = T$) are denoted as set family $(I^{selected,meta})_v$, where $V$ is an index set, $|V|$ is the number of different sets, and $v \in V$.

Due to the logical AND operator ($\wedge$) in (A.3), $F^{model,schema}$, $F^{model,instance}$, and $F^{meta}$ only evaluate to true ($T$) if all respective requirements are met. Each of the set unions $I^{selected,model,schema} \cup (I^{selected,meta})_v$ with $v \in V$, subsequently referred to as $I_v$, is a feasible alternative containing the required information on schema level. $F^{model,instance}$ has not been considered so far, as the enclosed requirements can partly be formulated after one of the $I_v$ has been selected.
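As a rough illustration of (A.3) and (A.4), the following Python sketch enumerates feasible alternatives for a small, made-up information supply. It is not the authors' formalism, which is omitted from the paper. The measure names, the meta-attribute `audited`, the limit of 6 on the collection effort, and the simplification that each schema-level requirement names exactly one required measure are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical information supply: measure name -> meta-attribute values (A.2).
I_SUPPLY = {
    "revenue":      {"collection_effort": 2, "audited": True},
    "gross_margin": {"collection_effort": 3, "audited": True},
    "headcount":    {"collection_effort": 1, "audited": False},
    "churn_rate":   {"collection_effort": 4, "audited": True},
}

# Simplified F^{model,schema}: each requirement demands one named measure.
REQUIRED_MEASURES = {"revenue"}

# F^{meta}: predicates over a candidate subset of measures.
F_META = [
    # Reference level "each single measure": every measure must be audited.
    lambda subset: all(I_SUPPLY[m]["audited"] for m in subset),
    # Reference level "all measures": total collection effort is limited.
    lambda subset: sum(I_SUPPLY[m]["collection_effort"] for m in subset) <= 6,
]


def holds(requirements, subset):
    """Logical AND over all requirements, as in (A.3)."""
    return all(f(subset) for f in requirements)


def feasible_alternatives(supply, required, f_meta):
    """Build every I_v as the union of the schema-level selection and one
    subset of the supply that satisfies all meta-attribute requirements."""
    if not set(required) <= set(supply):
        raise ValueError("schema-level requirements must be covered by the supply")
    selected_schema = frozenset(required)
    names = sorted(supply)
    alternatives = set()
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if holds(f_meta, combo):
                alternatives.add(selected_schema | frozenset(combo))
    return alternatives


alternatives = feasible_alternatives(I_SUPPLY, REQUIRED_MEASURES, F_META)
for alt in sorted(alternatives, key=sorted):
    print(sorted(alt))
```

The exhaustive enumeration grows exponentially with the number of measures, so it only serves to make the definitions concrete; a real implementation would need a more selective search.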
In order to determine which of the $I_v$ should be selected, we assume that the decision makers assess the utility and disutility of each alternative.

(A.5) Decision makers strive to maximize the net benefit they receive from the selected subset of the information supply. Each measure features a subjective utility value $u(m_p)$ and disutility value $d(m_p)$. The utility and disutility values of a particular subset of the information supply $I_v$ are calculated from the values of the measures it contains, and the overall net benefit of $I_v$ is calculated from these two values.

Finally, we need to know how decision makers specify their information requirements. Based on our experience from related industry projects, we assume:

(A.6) Decision makers base their information requirements primarily on measures. Moreover, decision makers are able to specify information requirements related to the core elements of the multi-dimensional data schema and requirements related to meta-attributes independently of one another.
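The formulas belonging to (A.5) are not reproduced here. One plausible reading, assumed purely for illustration and not confirmed by the source, is an additive aggregation over the measures of an alternative. The symbols $U$, $D$, and $U^{net}$ are chosen here to be consistent with the per-measure values $u(m_p)$ and $d(m_p)$:

```latex
\begin{align}
U(I_v) &= \sum_{m_p \in I_v} u(m_p), \\
D(I_v) &= \sum_{m_p \in I_v} d(m_p), \\
U^{net}(I_v) &= U(I_v) - D(I_v), \\
v^{*} &= \arg\max_{v \in V} U^{net}(I_v).
\end{align}
```

Under this assumption, an alternative with two measures whose utilities are 5 and 4 and whose disutilities are 1 and 1.5 would have $U = 9$, $D = 2.5$, and a net benefit of $6.5$.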

3 Procedure Model

Based on the elaborations concerning the general setting, we are able to derive properties of a procedure model for leveraging the information supply of existing BI systems. The overall procedure model is shown in Fig. 1.

First, the procedure model can start with the simultaneous specification of requirements from $F^{model,schema}$ regarding measures as well as $F^{meta}$. This is because decision makers base their information requirements primarily on measures (see A.6) and meta-attributes are only assigned to measures (see A.2). This results in steps 1 and 2 of the procedure model. Second, the utility and disutility of the selected measures can be assessed directly afterwards, as only measures are assessed (see A.5). This results in steps 3 and 4 of the procedure model. Third, due to the interdependency of requirements on schema level and on instance level, the requirements regarding dimensions and hierarchic levels from $F^{model,schema}$ have to be specified first. After that, $F^{model,instance}$ can be formulated when it comes to report parameterization. This results in steps 5 and 6 of the procedure model. The chosen ordering of the steps is also reasonable because labor-intensive effort is reduced: decision makers would otherwise have to assess the (dis-)utility of measures, dimensions, and hierarchic levels that are not implemented.

Fig. 1. Procedure model for leveraging the information supply of existing BI systems. The figure shows the overall task "Determine information requirements" with the following steps:
1. Specify requirements related to measures of the multi-dimensional data schema on schema level
2. Specify requirements related to additional meta-attributes
3. Assess utility $u(m_p)$ and disutility $d(m_p)$ of all measures that are part of at least one alternative $I_v$
4. Select the alternative with the highest net benefit $U^{net*}(I_v)$
5. Specify requirements related to dimensions and hierarchic levels on schema level
6. Specify requirements related to dimensions and hierarchic levels on instance level
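The assessment and selection steps of the procedure model can be sketched in a few lines of Python. This is a hypothetical illustration, not the authors' implementation (which the paper describes as pending). The function names, the example values, and the additive net benefit (sum of utilities minus sum of disutilities) are assumptions.

```python
def measures_to_assess(alternatives):
    """Scope of the assessment: only measures that occur in at least one
    alternative I_v, so unrelated measures never need to be rated."""
    return set().union(*alternatives)


def net_benefit(alternative, utility, disutility):
    """Assumed additive net benefit: sum of u(m_p) minus sum of d(m_p)."""
    return sum(utility[m] - disutility[m] for m in alternative)


def select_alternative(alternatives, utility, disutility):
    """Pick the alternative with the highest net benefit."""
    return max(alternatives, key=lambda alt: net_benefit(alt, utility, disutility))


# Hypothetical alternatives, e.g. the outcome of the specification steps.
alternatives = [
    frozenset({"revenue"}),
    frozenset({"revenue", "gross_margin"}),
    frozenset({"revenue", "churn_rate"}),
]

# Subjective ratings supplied by the decision makers (made-up values).
utility = {"revenue": 5.0, "gross_margin": 4.0, "churn_rate": 2.0}
disutility = {"revenue": 1.0, "gross_margin": 1.5, "churn_rate": 3.0}

print(sorted(measures_to_assess(alternatives)))
best = select_alternative(alternatives, utility, disutility)
print(sorted(best), net_benefit(best, utility, disutility))
```

Because ratings are only requested for measures returned by `measures_to_assess`, a measure in the supply that appears in no alternative is never assessed, which mirrors the effort argument made above.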

4 Discussion

The proposed approach is beset with limitations that need to be taken into account when applying it in industry settings. Other limitations motivate future research endeavors:

1. Prior to application, appropriate meta-attributes have to be identified and, if not already available, filled with values. While this may be quite costly for a single use case, it is worth the effort in case of repeated applications for multiple groups of decision makers. As Stroh et al. [5] point out, information requirements analysis is not a one-time project, but a continual process. The proposed formalization based on metadata is a first step in this direction since it enables the realization of meaningful automation potential.
2. The approach restricts itself to considering existing information supply, which will in general not fully satisfy a decision maker's information requirements. In this case, the remaining parts of the information requirements have to be covered using existing approaches to information requirements analysis.
3. Although IT support plays an important role, an implementation is pending. It is already part of ongoing research and will be useful for practical application and evaluation.
4. The information requirements are currently treated as constant. While this is approximately appropriate for standard reporting and well-structured problems, it is not always the case in a complex and disruptive business environment.

Despite its limitations, the proposed approach is a first step to address the research gap of leveraging the information supply of existing BI systems and towards an enhanced usage of metadata in the context of BI systems.

References

1. Bawden, D; Robinson, L: The dark side of information: overload, anxiety and other paradoxes and pathologies. Journal of Information Science 35(2), 180-191 (2009)
2. Foshay, N; Mukherjee, A; Taylor, A: Does Data Warehouse End-User Metadata Add Value? Communications of the ACM 50(11), 70-77 (2007)
3. Kimball, R; Ross, M; Thornthwaite, W; Mundy, J; Becker, B: The Data Warehouse Lifecycle Toolkit. 2nd edn. John Wiley & Sons, Indianapolis (2008)
4. Oppenheim, C: Managers' use and handling of information. International Journal of Information Management 17(4), 239-248 (1997)
5. Stroh, F; Winter, R; Wortmann, F: Method Support of Information Requirements Analysis for Analytical Information Systems. State of the Art, Practice Requirements, and Research Agenda. Business & Information Systems Engineering 3(1), 33-43 (2011)
6. Winter, R; Strauch, B: A Method for Demand-driven Information Requirements Analysis in Data Warehousing Projects. In: Proceedings of the 36th Hawaii International Conference on System Sciences (HICSS'03), 231.1 (2003)