Metadata Management for Data Warehouse Projects
Stefano Cazzella
Datamat S.p.A.

Abstract

Metadata management has been identified as one of the major critical success factors in data warehousing projects, but the industrialization process of metadata management has not yet reached a satisfying level of maturity from either a methodological or a technological point of view; therefore every project has to build its own metadata management system following the methodology that best fits its requirements. This paper describes the metadata management methodology designed during an e-government data warehouse project and shows how it can be applied by means of a real case study. The methodology we have devised and effectively put into action covers the entire process of building and running a metadata management system in a data warehouse project; its main objective is to exploit metadata to create a homogeneous, structured and integrated vision of the data warehouse system in order to produce valuable metadata-based services for the end users.

1. Introduction

Metadata is a wide and very pervasive subject matter that has been rapidly gaining interest in recent years. In this paper we restrict our focus to metadata management in data warehousing contexts; nevertheless, some of the considerations and approaches proposed here can be generalized to other application fields. Metadata has been recognized as a relevant topic since the beginning of data warehousing, and it is still considered one of the major critical success factors in data warehousing projects [1][2][3]. Indeed, metadata management is crucial to improve overall project quality and flexibility, as well as to increase the usability of the system for the end users.
Despite the attention paid by all the players in the data warehouse industry (technology vendors, consulting companies, etc.), the industrialization process of metadata management has not yet reached a satisfying level of maturity from either a methodological or a technological point of view [3]. Today we still do not have a ready-to-go metadata management solution that can be applied as-is in every data warehouse project. Conversely, almost every project has to build its own metadata management system from scratch or by integrating different commercial tools with custom pieces of software [4][5]. These customized systems are often realized to accomplish specific and ambitious requirements generated by the increasing expectations related to metadata management. Usually these requirements grow over time (iteration by iteration) as the first results in exploiting metadata are achieved. The main goal that should be pursued by a metadata manager is to create a structured, integrated and homogeneous vision of the whole data warehousing system using the deliverables produced during the life cycle of every project iteration. This goal is the foundation that makes many metadata requirements reachable. If you have a structured, integrated and homogeneous vision of the whole system, you can analyze the system itself and provide useful information about it to all the actors involved in the data warehouse project, from the data warehouse administrator to the end users of the system. More and more often the data warehouse architecture includes an Information Portal, the web interface used to deliver information and analytical services (e.g. OLAP, statistical analysis, KDD, etc.) to the end users of the data warehousing system.
This portal is an ideal companion for the metadata management system; exploiting its web-based technologies, it simplifies the integration between data and metadata, realizing the well-known adage: data + metadata = information. Indeed, metadata can be effectively exploited to help users easily access, use and understand the business data provided by the data warehousing system [6]; support functionalities like these dramatically increase the success probability of the project, since that probability is closely correlated to the real value the users assign to the system. Many commercial tools and methodologies try to cope with metadata management from a technological perspective: they focus their attention on the metadata repository and its logical structure. The same issue
could be addressed in a more effective and easier way by moving from a logical to a conceptual representation of metadata. Reasoning at a conceptual abstraction level gives you the flexibility needed to manage metadata in a doubly dereferenced way, working dynamically on both the metamodel and the metadata. This approach has been largely used in the formulation of the methodology and in the development of the metadata management system shown in this paper.

2. The methodology

The metadata management methodology we have devised identifies the main goals and the macro activities that should be part of the metadata management process of a data warehouse project. The methodology is independent of any technological issue and can therefore be implemented using a COTS approach, a home-made one or a mix of them. It is independent of the architecture of the metadata management system, too. Our methodology is structured in four phases:

1. Metadata sources identification. In this phase all the potential metadata sources involved in the development process of the data warehouse system are identified. Every source is analyzed to define its metamodel.
2. Metamodel integration. In this phase we integrate the metamodels defined in the previous phase to get a unique model that is able to represent the whole system. The integrated metamodel is the composition of the metamodels of the selected metadata sources.
3. Metadata integration. In this phase the metadata produced by every source are integrated to recompose a homogeneous view of the data warehouse system according to the integrated metamodel built up in the second phase.
4. Metadata analysis and publication. In this phase the integrated metadata are used to carry out analyses of the data warehouse system (e.g. cross-impact analysis, data lineage, etc.) for the technical users and to create advanced functionalities that help the end users to better exploit the system and the information it provides.
Usually a data warehouse system is developed iteratively to reduce risks and deal with the complexity of a data warehouse project [7]; this methodology offers the same flexibility. The four phases of the methodology can be applied iteratively to direct a gradual broadening of the metamodel and, consequently, of the metadata handled by the system. This is useful whether you plan to smoothly add metadata management to an already running data warehouse system, or whether you introduce new metadata sources (e.g. you change your production process by adopting a new CASE tool or establishing a new deliverable) during the iterative building of a data warehouse system. It is also possible to reapply only the last two phases iteratively in order to update the metadata on every iteration of the life cycle of the data warehouse system; in this way the metamodel stays the same, while the metadata change according to the latest version of the system.

2.1. Metadata sources identification

The goal of this phase is to identify all the metadata sources that will be handled by the metadata management system and to specify their metamodels. The metadata sources should be chosen among the different deliverables produced during the life cycle of the development process, from the analysis phase to the test phase. Valid examples of metadata sources are: the conceptual model, the logical model, ETL procedures, business rules, user specifications, etc. The most tractable sources are the deliverables produced by CASE and development tools, since they already have their own metamodels. To extend the metadata managed by the system to the information contained in other deliverables, new metamodels must be defined to specify the logical structure and the semantics of every piece of information the deliverables contain. At the end of this phase all the sources must have their own metamodel.
Every metamodel should be represented using a single modeling language; since the proposition of CWM [8], metamodels are commonly represented using UML class diagrams. In the following we assume that every metamodel is represented by a set of classes and a set of associations between classes; the two sets define a graph where the classes are the vertexes and the associations are the edges. Every vertex is labeled with a fully qualified name composed of the metamodel name and the class name, and has a set of attributes that are the attributes of the class. Every edge is adorned with the roles of the participating classes in the corresponding association and, possibly, a name; moreover, it should be classified as a Depends-on association, a Part-of association, etc. to enable impact analysis, data lineage and other kinds of analysis. Using UML, this classification can be represented by stereotypes. During an iteration of a data warehouse project, these metamodels are progressively instantiated as the deliverables are produced and released; these instances are the metadata that will be handled by the metadata management system.
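The graph representation just described can be sketched in code. This is a minimal sketch; the concrete class names, attributes and stereotypes below are illustrative examples, not prescribed by the methodology.

```python
# Sketch of a metamodel as a labeled graph: classes are vertexes
# identified by a fully qualified name ("<metamodel>.<class>"); edges
# are associations classified by a stereotype (Depends-on, Part-of, ...).
# All concrete names below are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class MetaClass:
    metamodel: str            # e.g. "ETL"
    name: str                 # e.g. "Table"
    attributes: tuple = ()    # the attributes of the class

    @property
    def fqn(self) -> str:
        # fully qualified name: metamodel name + class name
        return f"{self.metamodel}.{self.name}"

@dataclass(frozen=True)
class Association:
    source: str               # FQN of one participating class
    target: str               # FQN of the other participating class
    stereotype: str           # classification enabling impact analysis etc.
    roles: tuple = ()         # role names of the participating classes

table = MetaClass("ETL", "Table", ("name", "schema"))
mapping = MetaClass("ETL", "Mapping", ("name",))
uses = Association(mapping.fqn, table.fqn, "Depends-on", ("mapping", "target"))

print(table.fqn)          # ETL.Table
print(uses.stereotype)    # Depends-on
```

With this representation, the classification of edges is just data attached to the association, which is what makes analyses such as impact analysis expressible as plain graph traversals.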
2.2. Metamodel integration

The goal of this phase is to integrate the different metamodels identified in the previous phase. The integrated metamodel has to represent a homogeneous view of a data warehousing system as the merger of the metamodels of the deliverables that contribute to building the data warehouse system. The integration is performed by introducing new associations between the classes of different metamodels, so every source metamodel maintains its identity and integrity. No metamodel should remain isolated; the classes of the integrated metamodel must build up a connected graph. This is the integrated metamodel graph. In order to find out which associations can be added between the classes of the metamodels, it is better to follow a two-step strategy. First, consider the set of source metamodels only; from this set, couple all the interrelated metamodels according to the architecture of the data warehouse system and the methodologies applied in the analysis, design and development phases. You get a graph whose vertex set is the set of metamodels and where every interrelated couple of metamodels is an edge; this graph should be connected. Since every deliverable produced during the life cycle of the system takes part in the building of the system, if you have included enough deliverable metamodels during the first phase of the methodology, you should be able to get a connected graph. Then, as the second step, look into every couple of interrelated metamodels to identify the associations between the classes of the two metamodels that represent the relationship of the couple. If every source metamodel is connected, the integrated metamodel graph will be connected, too. One of the most common kinds of associations used to integrate metamodels is the equivalence relation; this kind of association is used when two classes of two different metamodels represent the same type of object.
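The connectedness requirement of the first step is easy to verify mechanically. A minimal sketch, with illustrative metamodel names and couples:

```python
# Sketch: verify that a set of metamodels, coupled by the relationships
# identified in the first step, forms a connected graph.
# Metamodel names and couples below are illustrative examples.
from collections import deque

def is_connected(vertices, edges):
    """BFS from an arbitrary vertex; the graph is connected iff every
    vertex is reached (associations are navigable in both directions)."""
    if not vertices:
        return True
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(vertices))
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for w in adjacency[v] - seen:
            seen.add(w)
            queue.append(w)
    return seen == set(vertices)

metamodels = {"Conceptual", "Logical", "ETL", "Reports"}
couples = {("Conceptual", "Logical"), ("Logical", "ETL")}
print(is_connected(metamodels, couples))                         # False: Reports is isolated
print(is_connected(metamodels, couples | {("ETL", "Reports")}))  # True
```

An isolated metamodel detected this way signals either a missing coupling or a deliverable that should not have been selected as a metadata source.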
For example, a table of a relational database is represented by a Table class in both the modeling metamodel and the ETL metamodel; these two metamodels can be integrated by introducing an association between the two Table classes that asserts their equivalence (but not their equality).

2.3. Metadata integration

The goal of this phase is to integrate the metadata of a specific project (or an iteration of a project) to build up a homogeneous view of the particular data warehouse system realized by the project itself. The integration is performed by instantiating the integrated metamodel defined in the previous phase. This process can be completed incrementally: every source metamodel is instantiated as the corresponding deliverable is consolidated; the associations between two different metamodels are instantiated when the two metamodels are instantiated. In this way the metadata can be regularly integrated during the life cycle of the project in order to support the subsequent activities of the life cycle. Most association instances that integrate metadata of different metamodels can be derived by applying integration rules. For example, the equivalence association between two classes can be automatically instantiated when two instances of these classes have the same fully qualified name; this rule can be applied to the instances of the Table classes of the modeling and ETL metamodels when the instances represent the same relational table, that is, when they have the same name and belong to the same database (or schema). Usually not all the association instances needed to integrate the metadata can be derived by integration rules; in these cases you have to provide the information needed to integrate the metadata yourself. There are many ways to do this and they depend on the technical architecture of the metadata management system.
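The name-matching rule can be sketched as a small function that derives the equivalence association instances automatically. The schema and table names below are illustrative, and case-insensitive comparison is an assumption of the sketch, not of the paper:

```python
# Sketch of an automatic integration rule: instantiate an equivalence
# association between Table instances of the modeling metamodel and the
# ETL metamodel whenever they denote the same relational table, i.e.
# the same table name within the same database/schema.
# Illustrative data; case-insensitive matching is an assumption.
def derive_equivalences(modeling_tables, etl_tables):
    """Each table is a dict with 'schema' and 'name' keys; returns the
    derived equivalence association instances as (left, right) pairs."""
    def key(t):
        return (t["schema"].lower(), t["name"].lower())
    etl_index = {key(t): t for t in etl_tables}
    return [(t, etl_index[key(t)]) for t in modeling_tables if key(t) in etl_index]

modeling = [{"schema": "dw", "name": "CUSTOMER"}, {"schema": "dw", "name": "ORDERS"}]
etl = [{"schema": "dw", "name": "customer"}]
links = derive_equivalences(modeling, etl)
print(len(links))           # 1: only CUSTOMER has an ETL counterpart
print(links[0][0]["name"])  # CUSTOMER
```

Pairs that the rule cannot derive (e.g. renamed objects) are exactly the ones that must be supplied by hand, as noted above.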
At the end of this phase you get a new graph, the metadata graph, whose vertexes are the instances of the classes of the integrated metamodel and whose edges are the instances of its associations (both the internal and the integrating ones). Indeed, this graph is an instance of the integrated metamodel graph defined in the previous phase and, if the metadata have been correctly integrated, it is a connected graph. From a technical/technological point of view, the integration can be carried out in different ways; in effect, none is excluded a priori. You could choose a centralized architecture, with a unique metadata repository where you store the metadata extracted from the different sources (the deliverables); the source metadata can be accessed directly or by means of interfaces provided by the tools used to create the deliverables. Otherwise you could opt for a distributed architecture where all the repositories of the different tools involved in the production process are federated; since the associations between different metamodels are not included in any of the source repositories, the metadata are merged using an additional repository that stores only the instances of these associations. In any case, the most important feature of the architecture you choose is the availability of an interface that enables you to access all the metadata in the same way; you should be able to navigate the metadata graph knowing only the integrated metamodel and the common interface. Often this interface is based on a sort of meta-meta-model that is used to represent all the metamodel instances handled by the metadata management system.
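A meta-meta-model-style interface of this kind can be sketched as a single generic object type that carries its class, attributes and links; the class names and role names below are illustrative assumptions, not the paper's actual interface:

```python
# Sketch of a uniform access interface: every metadata instance is a
# generic object carrying its class (a FQN in the integrated metamodel),
# its attribute values and its links, so a client can navigate the whole
# metadata graph without tool-specific code. All names are illustrative.
class MetaObject:
    def __init__(self, meta_class, **attributes):
        self.meta_class = meta_class   # e.g. "Conceptual.Entity"
        self.attributes = attributes   # attribute name -> value
        self.links = {}                # role name -> list of MetaObject

    def link(self, role, other, reverse_role=None):
        """Register an association instance, optionally bidirectional."""
        self.links.setdefault(role, []).append(other)
        if reverse_role is not None:
            other.links.setdefault(reverse_role, []).append(self)

    def navigate(self, role):
        return self.links.get(role, [])

entity = MetaObject("Conceptual.Entity", name="Customer")
table = MetaObject("Logical.Table", name="CUSTOMER")
entity.link("implemented-by", table, reverse_role="derived-from")

print(entity.navigate("implemented-by")[0].attributes["name"])  # CUSTOMER
print(table.navigate("derived-from")[0].attributes["name"])     # Customer
```

The point of the indirection is that a browsing or analysis client only needs `meta_class`, `attributes` and `navigate`, regardless of which source tool originally produced the metadata.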
Moreover, you could choose to build the metadata management system from scratch, to assemble it using commercial components or to buy a commercial tool. In the latter case, make sure the chosen metadata management system gives you the possibility to design your own metamodel. Commercial components and tools are generally useful to extract metadata from the source tools and to conduct the classic cross-impact analysis, but system integration activities are often required to fulfil all the requirements a metadata management system usually has.

2.4. Metadata analysis and publication

This is a crucial step in the methodology, since all the effort spent to handle and integrate the metadata during the first three steps is justified by the return earned from the analyses and the applications produced in this phase. The metadata graph makes it possible to perform the classical cross-impact and data lineage analyses; moreover, the same graph can be used to evaluate software quality metrics, e.g. how many relational tables have not been derived from an entity of the conceptual model? or what is the average number of mappings per target table?. These kinds of analyses are addressed to technical users such as workgroup members or data warehouse administrators; there is at least one other kind of user that may be supported in their daily activities using metadata: the end user. Indeed, there are many interesting services that may be provided to the end users by exploiting the information collected in the metadata graph. First of all, the metadata graph itself can be rendered to enable end users to move around the metadata. For example, every node of the graph can be rendered as an HTML page and every arc implemented as a hypertext link between two pages. In this way every user can browse the graph as a web site.
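Rendering a node of the metadata graph as an HTML page with its arcs as hyperlinks can be sketched as follows; the node identifiers, titles and the one-`.html`-file-per-node layout are illustrative assumptions:

```python
# Sketch: one node of the metadata graph rendered as an HTML page; each
# outgoing arc becomes a hypertext link to the page of the linked node.
# The one-page-per-node file layout is an assumption of the sketch.
import html

def render_node(node_id, title, neighbours):
    """neighbours: list of (node_id, label) pairs for the outgoing arcs."""
    items = "".join(
        f'<li><a href="{html.escape(nid, quote=True)}.html">{html.escape(label)}</a></li>'
        for nid, label in neighbours
    )
    return ("<html><head><title>{t}</title></head>"
            "<body><h1>{t}</h1><ul>{i}</ul></body></html>").format(
                t=html.escape(title), i=items)

page = render_node("logical.customer", "Table CUSTOMER",
                   [("conceptual.customer", "Entity Customer"),
                    ("etl.load_customer", "Mapping LOAD_CUSTOMER")])
print('href="conceptual.customer.html"' in page)  # True
```

Since every page is generated from the same graph structure, regenerating the whole "web site" on each project iteration keeps the published metadata aligned with the latest version of the system.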
This pool of pages represents a sort of encyclopaedia about the system; with a good search engine you can index these pages to give the end users the possibility to perform text-based searches. But the most effective search capabilities may be realized by building simple wizard applications that drive the end users in their search. These kinds of applications start the search from the most general context of analysis and reduce the search space step by step, steering toward the search target. Finally, the metadata graph can be used to produce technical documentation and manuals with very low effort. For example, you can get a list of the reports provided by the system with their descriptions and the sets of measures and dimensions they contain. The metadata-based services (wizards and a simple search engine), the metadata graph rendered as HTML pages, the manuals and so on can be published on the Data Warehouse Portal together with the reports, the OLAP environments and the other functionalities provided by the system. The integration between data and metadata can be carried even further; e.g. every report should be coupled with the corresponding metadata in order to associate with the report a meaningful description of the data it contains and to ensure that the users have every possibility to correctly understand the meaning of the numbers shown by the report. The publication of metadata and metadata-based services together with classical data and data-based services (e.g. OLAP, statistical analysis, etc.) enhances the usability of the system from the end users' perspective, and this is a crucial aspect that may significantly increase the success probability of a data warehouse system in the long run.

3. The case study

The case study describes our successful experience in metadata management accrued during a complex data warehouse project for e-government.
During this project we devised the methodology discussed in the previous paragraphs, built a metadata management system that supports this methodology, and iteratively applied this methodology according to the evolution of the whole data warehouse system.

3.1. The source metamodels

The data warehouse has the classical data flow shown in the left side of figure 1. To rebuild an integrated vision of a data warehouse, every component should be considered, especially the ones involved in the data flow architecture. So we considered which deliverables were released for every component during the development life cycle (see right side of figure 1):

- Databases component - the logical/physical models of the staging area, of the enterprise data warehouse (EDW) and of the data marts (DM);
- ETL component - the procedures used to load the data warehouse (EDW ETL) and the data marts (DM ETL);
- Front-end components - a semantic layer, which provides a business representation of the data stored in the data mart physical models, and a pool of predefined reports.
Figure 1. Metadata sources (left: the data warehouse data flow from the operational systems through the staging area, EDW and data marts to the OLAP engines and the Information Portal; right: the corresponding metadata sources, i.e. logical models, ETL procedures, semantic layers and reports).

All these deliverables were selected as metadata sources. To complete the picture we also included among the metadata sources the logical/physical data models of the source operational systems. The deliverables listed above were released by the activities of the Design and Development phases of the development life cycle. As shown in figure 2, during the Definition phase two very valuable deliverables are produced: the conceptual model, which represents all the information handled by the system independently of technical issues, and the dimensional fact model (DFM) [9], which was used to represent the end users' requirements in terms of multi-dimensional analysis capabilities.

Figure 2. Development life cycle (Definition: conceptual model and DM DFM; Design: logical models and semantic layers; Development: ETL procedures and DM reports).

These models represent the whole system from a business point of view: they represent the available information with its business meaning and the users' analysis capabilities. These deliverables, together with the description of the derivation process of the logical data models from the conceptual one (C2L), complete the metadata sources we considered for metadata management.

3.2. The integrated metamodel

Every deliverable has its own metamodel; it may be the metamodel of the CASE tool used to realize the deliverable or a metamodel specifically designed to represent the deliverable in a structured form (see table 1).

Table 1. Metamodels
- Conceptual and logical/physical models (sources: the operational systems' logical models, the conceptual model, the EDW logical model, the DMs' logical models) - commercial CASE tool with its own metamodel; it exports its metadata in XML (with a proprietary DTD).
- C2L - custom application; the metamodel is a subset of a classical E/R metamodel.
- Semantic layer (sources: the DMs' semantic layers) - commercial OLAP system with its own metamodel; it exports its metadata using public non-standardized APIs.
- DFM (sources: the DMs' DFMs) - non-commercial CASE tool with its own metamodel; it exports its metadata in XML.
- ETL procedures (sources: the EDW and DM ETL procedures) - commercial ETL tool with its own metamodel; it exports its metadata by means of public non-standard relational views provided to access its private repository.
- Reports (sources: the DMs' reports) - custom application; the metamodel has been designed ad hoc to represent both a report's content and its structure.

These metamodels have been integrated following the indications provided in the Metamodel integration paragraph. The integrated metamodel has been designed to represent, in a structured form, the whole data warehouse system, as illustrated in figure 3.

Figure 3. Integrated metamodel

The arcs between the nodes of this connected graph represent the associations used to integrate the different metamodels. For every couple of metamodels linked by at least one arc, one or more associations between classes of these metamodels have been defined in order to integrate the source metamodels. At the end of this process we obtained an integrated metamodel that can represent every data warehouse that shares the same development methodology and technical architecture (i.e. different iterations of the same data warehouse).

3.3. The metadata management system

The metadata management system has been realized starting from a commercial metadata management tool. Its functionalities have been extended and integrated with other custom applications in order to realize a system able to cope with the client's requirements. The first kind of software applications we had to realize are those needed to represent in a structured form the metadata of the deliverables produced without the support of a commercial CASE tool (see table 1). These applications help analysts and developers to document their work in a common, standardized way even when there is no specific tool for that work; in this way important notions about the system, which otherwise would be retained only in the mind of one person, can be shared among different people and used as metadata. The other software application is the core of the metadata management infrastructure we realized; it is in charge of representing the integrated metadata using a homogeneous meta-meta-model and of producing the metadata-based services that have been published on the Information Portal. The acquisition and integration of all the metadata produced by commercial tools, as well as the metadata analysis, were performed using the commercial metadata management tool.
3.4. The metadata-based services

The main metadata-based service published on the data warehouse Information Portal is the rendering of the integrated metadata graph as HTML pages. It is a large connected graph. A small subset of its nodes is proposed as entry points for browsing the metadata; some examples of entry-point nodes are the nodes representing reports, the conceptual model, the dimensional fact models, and so on. Starting from any of these nodes, every other node is reachable; so, for example, you can browse from an entity of the conceptual model to the reports that contain the data modeled by that entity. Since browsing the metamodel may be a bit too difficult for some kinds of non-technical users, another category of metadata-based services has been realized: search wizards that help the users find the information they need during their daily working activities. These wizards start from the contexts of analysis of a semantic layer and give the user the possibility to choose the one that interests him. Then they propose all the measures that belong to that context; at the last step the wizards show all the reports that contain the selected measure. The user can then directly open the report he was looking for. These wizards are very useful for new users who do not yet know all the information provided by the data warehousing system; moreover, they are useful when new information (or a new system) is provided, to help the users familiarize with it.

4. Conclusions

The distance between the requirements about the exploitation of metadata to help the end users get the best from the data warehousing system and the solutions proposed by the metadata industry led us to devise a new methodology for metadata management.
Many methodologies and tools focus on collecting and integrating all the metadata in a common repository or in a federated one; in our methodology we define the metadata integration strategy at a higher level of abstraction, working with a shared metamodel. The main goal is to exploit the deliverables released during the project life cycle in order to create a homogeneous, structured and integrated vision of the data warehouse system and to produce valuable metadata-based services for the end users. This methodology has driven the implementation of a flexible metadata management system, built by combining commercial products and custom applications. The data warehouse system, including the metadata management component, has recently been subjected to an external audit; the auditing results assert that one of the aspects that sticks out as being especially well done is the metadata surrounding the data warehouse and that the metadata infrastructure that has been created is truly remarkable. This paper has shown how we realized a technical infrastructure for metadata management founded on a sound methodology, but we did not discuss the organizational and cultural issues involved in metadata management. These aspects are as important as the technical ones for realizing a successful metadata management system, but they are more insidious to cope with. Metadata management may require some extra effort, e.g. to document exhaustively every attribute of every entity, every ETL mapping, every cell of every report and so on, but the returns in terms of quality, maintainability and usability of the data warehousing system largely cover any extra cost related to metadata management activities.

5. References

[1] W. H. Inmon, Building the Data Warehouse (2nd ed.), John Wiley & Sons Inc., 1996
[2] R. Kimball, M. Ross, The Data Warehouse Toolkit (2nd ed.), John Wiley & Sons Inc., 2002
[3] M. Staudt, A. Vaduva, T. Vetterli, The Role of Metadata for Data Warehousing, Technical Report 99.06, University of Zurich, Dept. of Information Technology, 1999
[4] A. Tannenbaum, Metadata Solutions, Addison-Wesley, 2002
[5] G. Auth, E. von Maur, M. Helfert, A Model-based Software Architecture for Metadata Management in Data Warehouse Systems, in W. Abramowicz (ed.), Business Information Systems - Proceedings of BIS 2002, Poland, 2002
[6] J. Lawyer, S. Chowdhury, Best Practices in Data Warehousing to Support Business Initiatives and Needs, Proceedings of the 37th Hawaii International Conference on System Sciences, IEEE, 2004
[7] Prism, Iterations: The Data Warehouse Development Methodology, Prism Solutions Inc., 1997
[8] OMG, Common Warehouse Metamodel Specification, Object Management Group, 2001
[9] M. Golfarelli, D. Maio, S. Rizzi, The Dimensional Fact Model: A Conceptual Model for Data Warehouses, International Journal of Cooperative Information Systems, 1998
D6 INFORMATION SYSTEMS DEVELOPMENT. SOLUTIONS & MARKING SCHEME. June 2013
D6 INFORMATION SYSTEMS DEVELOPMENT. SOLUTIONS & MARKING SCHEME. June 2013 The purpose of these questions is to establish that the students understand the basic ideas that underpin the course. The answers
Data Warehouses in the Path from Databases to Archives
Data Warehouses in the Path from Databases to Archives Gabriel David FEUP / INESC-Porto This position paper describes a research idea submitted for funding at the Portuguese Research Agency. Introduction
Data warehouse life-cycle and design
SYNONYMS Data Warehouse design methodology Data warehouse life-cycle and design Matteo Golfarelli DEIS University of Bologna Via Sacchi, 3 Cesena Italy [email protected] DEFINITION The term data
BIG DATA COURSE 1 DATA QUALITY STRATEGIES - CUSTOMIZED TRAINING OUTLINE. Prepared by:
BIG DATA COURSE 1 DATA QUALITY STRATEGIES - CUSTOMIZED TRAINING OUTLINE Cerulium Corporation has provided quality education and consulting expertise for over six years. We offer customized solutions to
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 [email protected]
IST722 Data Warehousing
IST722 Data Warehousing Components of the Data Warehouse Michael A. Fudge, Jr. Recall: Inmon s CIF The CIF is a reference architecture Understanding the Diagram The CIF is a reference architecture CIF
AN INTEGRATION APPROACH FOR THE STATISTICAL INFORMATION SYSTEM OF ISTAT USING SDMX STANDARDS
Distr. GENERAL Working Paper No.2 26 April 2007 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS EUROPEAN COMMISSION STATISTICAL
Data Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc.
Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc. Introduction Abstract warehousing has been around for over a decade. Therefore, when you read the articles
A Knowledge Management Framework Using Business Intelligence Solutions
www.ijcsi.org 102 A Knowledge Management Framework Using Business Intelligence Solutions Marwa Gadu 1 and Prof. Dr. Nashaat El-Khameesy 2 1 Computer and Information Systems Department, Sadat Academy For
Designing ETL Tools to Feed a Data Warehouse Based on Electronic Healthcare Record Infrastructure
Digital Healthcare Empowering Europeans R. Cornet et al. (Eds.) 2015 European Federation for Medical Informatics (EFMI). This article is published online with Open Access by IOS Press and distributed under
Advanced Data Management Technologies
ADMT 2015/16 Unit 2 J. Gamper 1/44 Advanced Data Management Technologies Unit 2 Basic Concepts of BI and Data Warehousing J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements:
Increasing Development Knowledge with EPFC
The Eclipse Process Framework Composer Increasing Development Knowledge with EPFC Are all your developers on the same page? Are they all using the best practices and the same best practices for agile,
Data Warehouse Design
Data Warehouse Design Modern Principles and Methodologies Matteo Golfarelli Stefano Rizzi Translated by Claudio Pagliarani Mc Grauu Hill New York Chicago San Francisco Lisbon London Madrid Mexico City
The Role of the BI Competency Center in Maximizing Organizational Performance
The Role of the BI Competency Center in Maximizing Organizational Performance Gloria J. Miller Dr. Andreas Eckert MaxMetrics GmbH October 16, 2008 Topics The Role of the BI Competency Center Responsibilites
A Design and implementation of a data warehouse for research administration universities
A Design and implementation of a data warehouse for research administration universities André Flory 1, Pierre Soupirot 2, and Anne Tchounikine 3 1 CRI : Centre de Ressources Informatiques INSA de Lyon
Adapting an Enterprise Architecture for Business Intelligence
Adapting an Enterprise Architecture for Business Intelligence Pascal von Bergen 1, Knut Hinkelmann 2, Hans Friedrich Witschel 2 1 IT-Logix, Schwarzenburgstr. 11, CH-3007 Bern 2 Fachhochschule Nordwestschweiz,
Enabling Better Business Intelligence and Information Architecture With SAP PowerDesigner Software
SAP Technology Enabling Better Business Intelligence and Information Architecture With SAP PowerDesigner Software Table of Contents 4 Seeing the Big Picture with a 360-Degree View Gaining Efficiencies
www.ijreat.org Published by: PIONEER RESEARCH & DEVELOPMENT GROUP (www.prdg.org) 28
Data Warehousing - Essential Element To Support Decision- Making Process In Industries Ashima Bhasin 1, Mr Manoj Kumar 2 1 Computer Science Engineering Department, 2 Associate Professor, CSE Abstract SGT
DEVELOPING REQUIREMENTS FOR DATA WAREHOUSE SYSTEMS WITH USE CASES
DEVELOPING REQUIREMENTS FOR DATA WAREHOUSE SYSTEMS WITH USE CASES Robert M. Bruckner Vienna University of Technology [email protected] Beate List Vienna University of Technology [email protected]
Business Intelligence Systems
12 Business Intelligence Systems Business Intelligence Systems Bogdan NEDELCU University of Economic Studies, Bucharest, Romania [email protected] The aim of this article is to show the importance
Business Intelligence Project Management 101
Business Intelligence Project Management 101 Managing BI Projects within the PMI Process Groups Too many times, Business Intelligence (BI) and Data Warehousing project managers are ill-equipped to handle
A Service-oriented Architecture for Business Intelligence
A Service-oriented Architecture for Business Intelligence Liya Wu 1, Gilad Barash 1, Claudio Bartolini 2 1 HP Software 2 HP Laboratories {[email protected]} Abstract Business intelligence is a business
Methodology Framework for Analysis and Design of Business Intelligence Systems
Applied Mathematical Sciences, Vol. 7, 2013, no. 31, 1523-1528 HIKARI Ltd, www.m-hikari.com Methodology Framework for Analysis and Design of Business Intelligence Systems Martin Závodný Department of Information
Trivadis White Paper. Comparison of Data Modeling Methods for a Core Data Warehouse. Dani Schnider Adriano Martino Maren Eschermann
Trivadis White Paper Comparison of Data Modeling Methods for a Core Data Warehouse Dani Schnider Adriano Martino Maren Eschermann June 2014 Table of Contents 1. Introduction... 3 2. Aspects of Data Warehouse
Data Virtualization A Potential Antidote for Big Data Growing Pains
perspective Data Virtualization A Potential Antidote for Big Data Growing Pains Atul Shrivastava Abstract Enterprises are already facing challenges around data consolidation, heterogeneity, quality, and
ETL-EXTRACT, TRANSFORM & LOAD TESTING
ETL-EXTRACT, TRANSFORM & LOAD TESTING Rajesh Popli Manager (Quality), Nagarro Software Pvt. Ltd., Gurgaon, INDIA [email protected] ABSTRACT Data is most important part in any organization. Data
Enabling Better Business Intelligence and Information Architecture With SAP Sybase PowerDesigner Software
SAP Technology Enabling Better Business Intelligence and Information Architecture With SAP Sybase PowerDesigner Software Table of Contents 4 Seeing the Big Picture with a 360-Degree View Gaining Efficiencies
Student Performance Analytics using Data Warehouse in E-Governance System
Performance Analytics using Data Warehouse in E-Governance System S S Suresh Asst. Professor, ASCT Department, International Institute of Information Technology, Pune, India ABSTRACT Data warehouse (DWH)
B.Sc (Computer Science) Database Management Systems UNIT-V
1 B.Sc (Computer Science) Database Management Systems UNIT-V Business Intelligence? Business intelligence is a term used to describe a comprehensive cohesive and integrated set of tools and process used
Business Process Modeling and Standardization
Business Modeling and Standardization Antoine Lonjon Chief Architect MEGA Content Introduction Business : One Word, Multiple Arenas of Application Criteria for a Business Modeling Standard State of the
Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
Lection 3-4 WAREHOUSING
Lection 3-4 DATA WAREHOUSING Learning Objectives Understand d the basic definitions iti and concepts of data warehouses Understand data warehousing architectures Describe the processes used in developing
THE QUALITY OF DATA AND METADATA IN A DATAWAREHOUSE
THE QUALITY OF DATA AND METADATA IN A DATAWAREHOUSE Carmen Răduţ 1 Summary: Data quality is an important concept for the economic applications used in the process of analysis. Databases were revolutionized
Information Management Metamodel
ISO/IEC JTC1/SC32/WG2 N1527 Information Management Metamodel Pete Rivett, CTO Adaptive OMG Architecture Board [email protected] 2011-05-11 1 The Information Management Conundrum We all have Data
DATA WAREHOUSE CONCEPTS DATA WAREHOUSE DEFINITIONS
DATA WAREHOUSE CONCEPTS A fundamental concept of a data warehouse is the distinction between data and information. Data is composed of observable and recordable facts that are often found in operational
DWEB: A Data Warehouse Engineering Benchmark
DWEB: A Data Warehouse Engineering Benchmark Jérôme Darmont, Fadila Bentayeb, and Omar Boussaïd ERIC, University of Lyon 2, 5 av. Pierre Mendès-France, 69676 Bron Cedex, France {jdarmont, boussaid, bentayeb}@eric.univ-lyon2.fr
Three Fundamental Techniques To Maximize the Value of Your Enterprise Data
Three Fundamental Techniques To Maximize the Value of Your Enterprise Data Prepared for Talend by: David Loshin Knowledge Integrity, Inc. October, 2010 2010 Knowledge Integrity, Inc. 1 Introduction Organizations
Business Intelligence Solutions. Cognos BI 8. by Adis Terzić
Business Intelligence Solutions Cognos BI 8 by Adis Terzić Fairfax, Virginia August, 2008 Table of Content Table of Content... 2 Introduction... 3 Cognos BI 8 Solutions... 3 Cognos 8 Components... 3 Cognos
Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach
2006 ISMA Conference 1 Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach Priya Lobo CFPS Satyam Computer Services Ltd. 69, Railway Parallel Road, Kumarapark West, Bangalore 560020,
OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA
OLAP and OLTP AMIT KUMAR BINDAL Associate Professor Databases Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age Information, which is created by data,
Reflections on Agile DW by a Business Analytics Practitioner. Werner Engelen Principal Business Analytics Architect
Reflections on Agile DW by a Business Analytics Practitioner Werner Engelen Principal Business Analytics Architect Introduction Werner Engelen Active in BI & DW since 1998 + 6 years at element61 Previously:
Chapter 5. Learning Objectives. DW Development and ETL
Chapter 5 DW Development and ETL Learning Objectives Explain data integration and the extraction, transformation, and load (ETL) processes Basic DW development methodologies Describe real-time (active)
Metadata Repositories in Health Care. Discussion Paper
Health Care and Informatics Review Online, 2008, 12(3), pp 37-44, Published online at www.hinz.org.nz ISSN 1174-3379 Metadata Repositories in Health Care Discussion Paper Dr Karolyn Kerr [email protected]
COURSE OUTLINE. Track 1 Advanced Data Modeling, Analysis and Design
COURSE OUTLINE Track 1 Advanced Data Modeling, Analysis and Design TDWI Advanced Data Modeling Techniques Module One Data Modeling Concepts Data Models in Context Zachman Framework Overview Levels of Data
Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES
LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES MUHAMMAD KHALEEL (0912125) SZABIST KARACHI CAMPUS Abstract. Data warehouse and online analytical processing (OLAP) both are core component for decision
B. 3 essay questions. Samples of potential questions are available in part IV. This list is not exhaustive it is just a sample.
IS482/682 Information for First Test I. What is the structure of the test? A. 20-25 multiple-choice questions. B. 3 essay questions. Samples of potential questions are available in part IV. This list is
A McKnight Associates, Inc. White Paper: Effective Data Warehouse Organizational Roles and Responsibilities
A McKnight Associates, Inc. White Paper: Effective Data Warehouse Organizational Roles and Responsibilities Numerous roles and responsibilities will need to be acceded to in order to make data warehouse
Databases in Organizations
The following is an excerpt from a draft chapter of a new enterprise architecture text book that is currently under development entitled Enterprise Architecture: Principles and Practice by Brian Cameron
Implementing Oracle BI Applications during an ERP Upgrade
1 Implementing Oracle BI Applications during an ERP Upgrade Jamal Syed Table of Contents TABLE OF CONTENTS... 2 Executive Summary... 3 Planning an ERP Upgrade?... 4 A Need for Speed... 6 Impact of data
IS YOUR DATA WAREHOUSE SUCCESSFUL? Developing a Data Warehouse Process that responds to the needs of the Enterprise.
IS YOUR DATA WAREHOUSE SUCCESSFUL? Developing a Data Warehouse Process that responds to the needs of the Enterprise. Peter R. Welbrock Smith-Hanley Consulting Group Philadelphia, PA ABSTRACT Developing
TopBraid Life Sciences Insight
TopBraid Life Sciences Insight In the Life Sciences industries, making critical business decisions depends on having relevant information. However, queries often have to span multiple sources of information.
Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1
Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics
Innovation. Simplifying BI. On-Demand. Mobility. Quality. Innovative
Innovation Simplifying BI On-Demand Mobility Quality Innovative BUSINESS INTELLIGENCE FACTORY Advantages of using our technologies and services: Huge cost saving for BI application development. Any small
Evaluating Data Warehousing Methodologies: Objectives and Criteria
Evaluating Data Warehousing Methodologies: Objectives and Criteria by Dr. James Thomann and David L. Wells With each new technical discipline, Information Technology (IT) practitioners seek guidance for
Data Vault and The Truth about the Enterprise Data Warehouse
Data Vault and The Truth about the Enterprise Data Warehouse Roelant Vos 04-05-2012 Brisbane, Australia Introduction More often than not, when discussion about data modeling and information architecture
Improving your Data Warehouse s IQ
Improving your Data Warehouse s IQ Derek Strauss Gavroshe USA, Inc. Outline Data quality for second generation data warehouses DQ tool functionality categories and the data quality process Data model types
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Chapter 5. Warehousing, Data Acquisition, Data. Visualization
Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives
Increase Agility and Reduce Costs with a Logical Data Warehouse. February 2014
Increase Agility and Reduce Costs with a Logical Data Warehouse February 2014 Table of Contents Summary... 3 Data Virtualization & the Logical Data Warehouse... 4 What is a Logical Data Warehouse?... 4
Automated Business Intelligence
Automated Business Intelligence Delivering real business value,quickly, easily, and affordably 2 Executive Summary For years now, the greatest weakness of the Business Intelligence (BI) industry has been
Meta-data and Data Mart solutions for better understanding for data and information in E-government Monitoring
www.ijcsi.org 78 Meta-data and Data Mart solutions for better understanding for data and information in E-government Monitoring Mohammed Mohammed 1 Mohammed Anad 2 Anwar Mzher 3 Ahmed Hasson 4 2 faculty
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT POINT-AND-SYNC MASTER DATA MANAGEMENT 04.2005 Hyperion s new master data management solution provides a centralized, transparent process for managing critical
Microsoft Business Intelligence
Microsoft Business Intelligence P L A T F O R M O V E R V I E W M A R C H 1 8 TH, 2 0 0 9 C H U C K R U S S E L L S E N I O R P A R T N E R C O L L E C T I V E I N T E L L I G E N C E I N C. C R U S S
Vendor briefing Business Intelligence and Analytics Platforms Gartner 15 capabilities
Vendor briefing Business Intelligence and Analytics Platforms Gartner 15 capabilities April, 2013 gaddsoftware.com Table of content 1. Introduction... 3 2. Vendor briefings questions and answers... 3 2.1.
Using Master Data in Business Intelligence
helping build the smart business Using Master Data in Business Intelligence Colin White BI Research March 2007 Sponsored by SAP TABLE OF CONTENTS THE IMPORTANCE OF MASTER DATA MANAGEMENT 1 What is Master
Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers
60 Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative
