Java Metadata Interface and Data Warehousing



Similar documents
Model-Driven Data Warehousing

Common Warehouse Metamodel (CWM): Extending UML for Data Warehousing and Business Intelligence

Model-Driven Architecture: Vision, Standards And Emerging Technologies

Information Management Metamodel

A Model-based Software Architecture for XML Data and Metadata Integration in Data Warehouse Systems

SAS 9.1. ETL Studio: User s Guide

Databases in Organizations

Data warehouse and Business Intelligence Collateral

Talend Metadata Manager. Reduce Risk and Friction in your Information Supply Chain

SAS Business Intelligence Online Training

Common Warehouse Metamodel (CWM): Extending UML for Data Warehousing and Business Intelligence

Oracle Warehouse Builder 10g

Practical meta data solutions for the large data warehouse

Course: SAS BI(business intelligence) and DI(Data integration)training - Training Duration: 30 + Days. Take Away:

IST722 Data Warehousing

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

Oracle Identity Analytics Architecture. An Oracle White Paper July 2010

Metadata Management for Data Warehouse Projects

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Meta Data Management for Business Intelligence Solutions. IBM s Strategy. Data Management Solutions White Paper

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

SSIS Training: Introduction to SQL Server Integration Services Duration: 3 days

Quick start. A project with SpagoBI 3.x

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Integrating data in the Information System An Open Source approach

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

SAS BI Course Content; Introduction to DWH / BI Concepts

European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project

Elixir Business Analytics Platform and Data API Server for Harnessing Data for Value Creation CFC Presented by:

BENEFITS OF AUTOMATING DATA WAREHOUSING

SQL Server 2012 Business Intelligence Boot Camp

Enabling Better Business Intelligence and Information Architecture With SAP PowerDesigner Software

Overview. Stakes. Context. Model-Based Development of Safety-Critical Systems

Integrating Ingres in the Information System: An Open Source Approach

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

Overview. DW Source Integration, Tools, and Architecture. End User Applications (EUA) EUA Concepts. DW Front End Tools. Source Integration

DSS based on Data Warehouse

Oracle Service Bus Examples and Tutorials

Data Warehouse Overview. Srini Rengarajan

IBM Configuring Rational Insight and later for Rational Asset Manager

CHAPTER 4: BUSINESS ANALYTICS

ODBIS: Towards a Platform for On-Demand Business Intelligence Services

Enabling Better Business Intelligence and Information Architecture With SAP Sybase PowerDesigner Software

Using UML to Construct a Model Driven Solution for Unified Access to Disparate Data

SAP Data Services 4.X. An Enterprise Information management Solution

TechTips. Connecting Xcelsius Dashboards to External Data Sources using: Web Services (Dynamic Web Query)

East Asia Network Sdn Bhd

SQL Server 2005 Features Comparison

Application. 1.1 About This Tutorial Tutorial Requirements Provided Files

Oracle Data Integrator 11g New Features & OBIEE Integration. Presented by: Arun K. Chaturvedi Business Intelligence Consultant/Architect

Performance Management Platform

Data-Warehouse-, Data-Mining- und OLAP-Technologien

IT Service Management with System Center Service Manager

Metamodels and Modeling Multiple Kinds of Information Systems

The basic data mining algorithms introduced may be enhanced in a number of ways.

LEARNING SOLUTIONS website milner.com/learning phone

NEW FEATURES ORACLE ESSBASE STUDIO

M Designing and Implementing OLAP Solutions Using Microsoft SQL Server Day Course

2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000

Qlik Sense Enabling the New Enterprise

Developing Microsoft SharePoint Server 2013 Advanced Solutions MOC 20489

Super-Charged Oracle Business Intelligence with Essbase and SmartView

HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT

XML Data Movement Components for Teradata

Fluency With Information Technology CSE100/IMT100

Turkish Journal of Engineering, Science and Technology

INTRODUCTION OVERVIEW OF THE ORACLE 9I AND BI BEANS ARCHITECTURE. Chris Claterbos, Vlamis Software Solutions, Inc.

Publishing, Consuming, Deploying and Testing Web Services

Data Integration and ETL with Oracle Warehouse Builder NEW

MIS636 AWS Data Warehousing and Business Intelligence Course Syllabus

Data Warehouse Design

Introduction. Document Conventions. Administration. In This Section

Jet Data Manager 2012 User Guide

MDA Overview OMG. Enterprise Architect UML 2 Case Tool by Sparx Systems by Sparx Systems

How To Build A Connector On A Website (For A Nonprogrammer)

Implementing Data Models and Reports with Microsoft SQL Server 20466C; 5 Days

Business Intelligence for SUPRA. WHITE PAPER Cincom In-depth Analysis and Review

Leveraging the Eclipse TPTP* Agent Infrastructure

EVALUATION ONLY. WA2088 WebSphere Application Server 8.5 Administration on Windows. Student Labs. Web Age Solutions Inc.

Creating Hybrid Relational-Multidimensional Data Models using OBIEE and Essbase by Mark Rittman and Venkatakrishnan J

WEB SERVICES. Revised 9/29/2015

Enterprise Data Warehouse (EDW) UC Berkeley Peter Cava Manager Data Warehouse Services October 5, 2006

Business Intelligence with AxCMS.net

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA

Luna Decision Overview. Copyright Hicare Research 2014 All Rights Reserved

Microsoft Implementing Data Models and Reports with Microsoft SQL Server

Concepts of Database Management Seventh Edition. Chapter 9 Database Management Approaches

Actuate Business Intelligence and Reporting Tools (BIRT)

CHAPTER 5: BUSINESS ANALYTICS

INTERACTIVE DECISION SUPPORT SYSTEM BASED ON ANALYSIS AND SYNTHESIS OF DATA - DATA WAREHOUSE

COURSE SYLLABUS COURSE TITLE:

Microsoft Services Exceed your business with Microsoft SharePoint Server 2010

Transcription:

Java Metadata Interface and Data Warehousing A JMI white paper by John D. Poole November 2002 Abstract. This paper describes a model-driven approach to data warehouse administration by presenting a detailed scenario illustrating how JMI-enabled tools might be used in the realization of a data warehouse schema. Java code fragments illustrating the use of JMI interfaces are provided throughout. We also show how a JMI resource might integrate with the J2EE Connector Architecture. Introduction This paper describes the management of a data warehouse using the Java Metadata Interface (JMI) by walking through a detailed scenario in which a data warehouse is modeled and automatically constructed using meta data-driven tools. JMI is the pervasive access mechanism for managed meta data, and CWM is the metamodel defining that meta data. This scenario consists of five distinct components, and we illustrate how JMI supports each component: 1. Meta data service initialization. A centralized meta data service supporting a common metamodel is generated and brought online. 2. Model construction. The meta data service is used to create shared meta data based on the common metamodel. This meta data is then published to the rest of the environment. 3. Tool initialization. Each tool constructs its own internal information structures and links to other tools, based on shared meta data published via the central meta data service. 4. Metamodel interoperability. A new tool that supports a different metamodel is introduced to the data warehouse. JMI Reflection is used to reconcile differences between the tool-specific metamodel and the common metamodel used by the rest of the data warehouse. 5. Data warehouse information flow. The overall flow of data and control through the data warehouse is illustrated. These flows are meta data-driven and facilitated by JMI programmatic interfaces (both metamodel-specific and reflective) and by the XMI bulk import/export mechanism supported by JMI. The following subsections address each of the five components of the JMI-enabled data warehouse management scenario.

2 Meta Data Service Initialization In the first step of the scenario, a data warehouse administrator initializes a JMI-enabled meta data service as the central meta data store for the data warehouse. By meta data service, we mean a general-purpose tool that is capable of providing a complete implementation of the JMI specification. An implementation of the JMI specification might, for example, consist of a server process that realizes the JMI interfaces corresponding to some MOF-compliant metamodel, the JMI reflective interfaces, and the XMI reader and writer interfaces. Precisely how a JMI-enabled meta data service is implemented is not prescribed by the JMI specification itself. As illustrated in Figure 1, the administrator's warehouse management tool connects to the JMI service via the J2EE Connector Architecture's Custom Client Interface (CCI). In this case, we assume that the Connection interface has several custom methods that facilitate access to the JMI service. These methods are implementation-specific (i.e., not prescribed by JMI). OMG Web server HTTP request Internet HTTP request Data Warehouse Administration Console Connection establishment and JMI XMIReader calls JMI-Enabled Metadata Service Figure 1: Meta Data Service Initialization The custom Connection interface is defined as follows:

3 public interface com.mdservice.connection { public void close() throws com.mdservice.resourceexception; public javax.resource.cci.connectionmetadata getmetadata() throws com.mdservice.resourceexception; public javax.jmi.reflect.refpackage gettoplevelpackage() throws com.mdservice.resourceexception; public javax.jmi.xmi.xmireader getxmireader() throws com.mdservice.resourceexception; } public javax.jmi.xmi.xmiwriter getxmiwriter() throws com.mdservice.resourceexception; The gettoplevelpackage() method returns a single instance of RefPackage, and the semantics of this method specify that this RefPackage object represents the outermost package extent of the MOF Model instances supported by the JMI-enabled service. In other words, in this particular implementation strategy, we have established, purely as a convention, that there is always a single, outermost package provided that contains all of the metamodel-specific package extents 1. The Connection interface also provides the custom factory methods getxmireader() and getxmiwriter(). These methods are used for instantiating the XmiReader and XmiWriter interfaces, respectively, within the context of the connected meta data service. A data warehouse administration client first connects to the JMI service and obtains the (empty) outermost package extent of the meta data service by calling the gettoplevelpackage() method. Next, the client creates an instance of XmiReader and invokes the read() method on the XmiReader interface, supplying both the RefPackage returned by the previous call to gettoplevelpackage(), and a URL pointing to an XMI document containing the MOF metamodel to be loaded. In this case, the XMI document contains the definition of the Object Management Group's Common Warehouse Metamodel (CWM), rendered as MOF Model instance. The semantics of this particular implementation of the XmiReader::read() operation are such that the imported CWM XMI element is interpreted as an instance of the MOF Model and is subsequently used in generating the internal structure of the outermost package extent. In other words, the JMI service is "hard-wired" to understand MOF. If it imports an instance of the MOF Model (i.e., a MOF-compliant metamodel), it subsequently builds the package structure 1 Note that one might use JNDI to select a particular ConnectionFactory from a local JNDI tree. In this case, it happens that the ConnectionFactory is bound to precisely one outermost RefPackage instance, since each connection has a gettoplevelpackage() method that returns a single instance. Other implementation strategies might provide some alternative means of locating any one of several top level packages.

4 prescribed by the MOF Model instance. If, on the other hand, it imports an instance of a metamodel (i.e., meta data) corresponding to some MOF-compliant metamodel whose structure had already been built within the server, then it will treat the XMI element content as meta data and use it populate the meta data server. Once the CWM metamodel is loaded, the JMI service generates a complete set of metamodelspecific (tailored) Java interfaces, according to the JMI mapping templates, as well as a corresponding library of Java implementation classes. These implementation classes constitute a realization of a CWM server. Note, once again, that this particular implementation strategy of the JMI service is not defined by the JMI specification. Figure 1 illustrates the JMI meta data service initialization event. Figure 2 illustrates the CWM metamodel. Management Warehouse Process Warehouse Operation Analysis Transformation OLAP Data Mining Information Visualization Business Nomenclature Resource Object Relational Record Multidimensio nal XML Foundation Business Information Data Types Expressions Keys and Indexes Software Deployment Type Mapping Object Model Core Behavioral Relationships Instance Figure 2: CWM Metamodel The sequence of connection and JMI calls used to initialize the JMI service might consist of the following: ConnectionFactory cxf = new ConnectionFactory(); Connection cx = cxf.getconnection( properties ); RefPackage mstlp = Connection.getTopLevelPackage(); XmiReader xmireader = Connection.getXmiReader(); xmireader.read( "http://cgi.omg.org/docs/ad/01-02-03.txt", mstlp );

5 Model Construction Now that the JMI service has been initialized with the CWM metamodel, we can begin to construct a model of our data warehouse based on CWM. This is illustrated in Figure 3 below: Metadata Visual Modeling Tool Connection establishment and JMI metamodel-specific (tailored) interface calls JMI-Enabled Metadata Service Figure 3: Model Construction Here, a CWM-aware visual modeling tool is used to construct model instances. The modeling tool connects to the JMI service and then acquires the outermost RefPackage extent from the connection. From this point forward, the modeling tool directs its activities against local (clientside) JMI interfaces that are specific to the CWM metamodel. The user might, for example, drilldown on the outermost package to discover the metamodel-specific packages clustered in this package. These would consist of packages such as Warehouse Process, Warehouse Operation, Transformation, OLAP, etc. Each is a RefPackage instance corresponding to one of CWM metamodel packages shown in the block diagram of Figure 2. The end user constructs a CWM model by selecting various classes from each of the CWM packages and creating instances of them. For example, the user might create an OLAP Dimension by selecting a Dimension class icon from a palette representing OLAP objects, and then dragging it onto a display area. Internally, the modeling tool performs local calls against the CWM metamodel-specific JMI interfaces. The following code fragment illustrates several JMI calls that would be performed in the construction of a CWM OLAP Dimension instance: // Get the DimensionClass proxy org.omg.java.cwm.analysis.olap.dimensionclass dc = olappkg.getdimension(); // Create a Time Dimension org.omg.java.cwm.analysis.olap.dimension timedim = dc.createdimension(); timedim.setname( "Time" ); timedim.settime( true );

6 Now, let's assume that the data warehouse model, once completed, consists of instances of modeling elements taken from each of the following CWM packages: Record package: Used to model the raw data feeds supplying information to the data warehouse. Relational package: Used to construct models of both the operational data store (ODS) and the dimensional star-schema analysis database of the data warehouse. Transformation package: Used to model the data transformations going from the raw data source to the ODS, as well as from the ODS to the dimensional star-schema database. These models serve as the primary descriptions of any extract, transform, and load (ETL) process that the data warehouse might implement. OLAP package: Used to model the OLAP (Online Analytical Processing) abstractions exposed by the data warehouse for analysis and reporting. This model includes a mapping to the relational star-schema model, as a source of OLAP data. Warehouse Process and Warehouse Operation packages: Used to model the data warehouse control processes that manage and track any ETL activities that might be performed, based on the Transformation models. The complete data warehouse model is stored in a single RefPackage that is an instance of the CWM metamodel, and is now available to the rest of the data warehousing environment via the JMI interfaces of the central JMI service. Tool Initialization Now that the data warehouse schema has been completely defined in terms of a centrally stored CWM model, the data warehouse can be physically generated. This is done through the meta data initialization of a number of JMI-enabled data warehousing tools. This is illustrated in Figure 4. A data warehouse administration tool (possibly a backend to the modeling tool described earlier) initializes each of the installed data warehousing tools. The administration tool opens a connection on each of the data warehouse tools, and subsequently launches an XmiReader on each Connection. The administration tool then launches an XmiWriter on its own Connection to the JMI service, supplying it with a temporary output stream. The administration tool then exports the entire CWM model to the output XMI stream. It invokes each XmiReader::read() method on each tool-wise Connection, supplying the temporary stream as a parameter, as well as each tool's outermost RefPackage extent. The precise behavior of the XmiReader::read() method is determined by each tool implementation, but in general, tools will consume only those parts of the CWM model that they require or understand. For example, the OLAP server's implementation of XmiReader::read() might scan the entire XMI stream, looking specifically for

7 instances of CWM OLAP, Relational, and Transformation packages, and use these to populate the content of its outermost RefPackage. XmiReader.read() XmiReader.read() XmiReader.read() XmiReader.read() ETL Tool Operational Data Store (RDBMS) Analysis Store (RDBMS) OLAP Server Connection and JMI XMIReader calls; tool-specific API calls XMI stream-based importation of CWM metadata Data Warehouse Administration Console Connection and JMI metamodel-specific interface calls <package>.get<metaclass>() XmiWriter.write() JMI-Enabled Metadata Service Figure 4: Tool Initialization An alternative approach would be for the administration tool to simply notify each of the data warehouse tools that a new CWM model has been created and is now available. Each tool would connect separately to the JMI service and could peruse the content of the outermost RefPackage extent, navigating the model structure through JMI interface calls. Each tool would then subsequently load the particular meta data that it requires (i.e., once each tool has discovered an appropriate RefPackage instance within the CWM model), via either the JMI programmatic interfaces or the XmiReader and XmiWriter interfaces. Metamodel Interoperability This portion of the data warehouse scenario describes advanced functionality that can be realized through MOF/JMI reflection. In this case, users of the OLAP server have acquired an advanced, multidimensional visualization/reporting software package. Like other tools comprising the data warehouse, the new visualization tool is also JMI-enabled. However, it uses a different MOF metamodel than CWM; specifically, some MOF-compliant metamodel that represents very

8 specific aspects of advanced, multidimensional, visual analysis. The new tool needs to be integrated with the rest of the data warehouse, and, in particular, since the tool supports meta data-driven data integration with OLAP servers, it must be integrated at the meta data level with the CWM OLAP model used by the OLAP server. This is illustrated in Figure 5 below: OLAP Server Multidimensional Visualization Tool Tool Setup Screen Connection and JMI Reflective API calls Metadata Visual Modeling Tool Connection and JMI metamodel-specific (tailored) API calls JMI-Enabled Metadata Service Figure 5: Metamodel Interoperability To effectively integrate the new visualization tool with the rest of the environment, the data warehouse modeler first connects to the JMI service and then creates an instance of the CWM Visualization metamodel to represent the visualization concepts, as they relate to the OLAP model. That is, visualization meta data is constructed and linked to the various OLAP model objects that are to be visualized (e.g., Cube, Dimension). Now, the new visualization tool needs to be integrated at the meta data-level with the CWM Visualization model just created. Although the visualization tool is JMI/MOF-compliant, it does not understand CWM. It is driven by its own MOF metamodel. The data warehouse modeler uses the visualization tool's set-up screen to first build the tool's meta data. The modeler then reconciles both types of meta data by constructing a programmatic script (written in Java) that connects to the JMI service and uses JMI Reflective programming to acquire equivalent objects from the CWM model. In this manner, the visualization tool is driven by its own meta data, but can dynamically map to an equivalent CWM Visualization model, and therefore indirectly to the OLAP model elements that are to be rendered. This level of indirection provided by JMI Reflection enables the advanced visualization/reporting tool to perform meta data-directed processing of OLAP data, even though its metamodel is different from CWM. Data Warehouse Information Flow The final diagram in Figure 6 shows the overall flow of information through the integrated data warehouse. The clear arrows represent the general flow or progression of data through the data warehouse, all the way from the ODS to the advanced visualization/reporting software. This data flow is inherently meta data-driven, and meta data in this environment has a single and

9 centralized representation in terms of JMI. Shared meta data is defined by a MOF-compliant metamodel (i.e., CWM), but the JMI-enabled meta data service is not tied to any particular metamodel, and is capable of loading the CWM metamodel and dynamically generating an internal implementation of CWM. Communication of shared meta data is achieved through JMI interfaces. XmiReader and XmiWriter interfaces are used to transfer complete models or specific packages of models in a bulk format for loading into tools. On the other hand, metamodel-specific (CWM, in this case) JMI interfaces are used by client tools for browsing and possibly creating or modifying existing meta data structures. Finally, JMI Reflection is used to facilitate meta data integration between tools whose metamodels differ, but are otherwise MOF-compliant. General direction of the flow of data Lineage tracing and drill-back JMI-Enabled Metadata Service Operational Data Store Analysis Store ETL OLAP Visualization (star schema) Figure 6: Data Warehouse Information Flow Summary This paper has provided an extensive example of model-driven data warehouse management, in which JMI provides a common access mechanism for meta data, and CWM provides the common metamodel. We've demonstrated that a complete data warehouse can be defined, initialized, and then placed into operation relatively easily through the use of centralized meta data and common meta data access mechanisms. The model-driven approach to integration

10 greatly enhances the return-on-investment (ROI) of any data warehouse or supply chain, since it reduces the costs of integrating best of breed implementation tools considerably.