Striving towards Near Real-Time Data Integration for Data Warehouses
Robert M. Bruckner 1, Beate List 1, and Josef Schiefer 2

1 Institute of Software Technology, Vienna University of Technology, Favoritenstr. 9/188, A-1040 Vienna, Austria, {bruckner, list}@ifs.tuwien.ac.at
2 IBM Watson Research Center, 19 Skyline Drive, Hawthorne, NY, [email protected]

Abstract. The amount of information available to large-scale enterprises is growing rapidly. While operational systems are designed to meet well-specified (short) response time requirements, the focus of data warehouses is generally the strategic analysis of business data integrated from heterogeneous source systems. The decision-making process in traditional data warehouse environments is often delayed because data cannot be propagated from the source systems to the data warehouse in time. A real-time data warehouse aims at decreasing the time it takes to make business decisions and tries to attain zero latency between the cause and effect of a business decision. In this paper we present the architecture of an ETL environment for real-time data warehouses, which supports continual near real-time data propagation. The architecture takes full advantage of existing J2EE (Java 2 Platform, Enterprise Edition) technology and enables the implementation of a distributed, scalable, near real-time ETL environment. Instead of using vendor-proprietary ETL (extraction, transformation, loading) solutions, which are often hard to scale and frequently do not support an optimization of the time frames allocated to data extracts, we propose ETLets (pronounced "et-lets") and Enterprise JavaBeans (EJB) for the ETL processing tasks.

1. Introduction

The amount of information available to large-scale enterprises is growing rapidly. New information is generated continuously by operational sources such as order processing, inventory control, and customer service systems.
Organizations without instant information delivery capabilities risk surrendering their competitive advantage. In order to support efficient analysis and mining of such diverse, distributed information, a data warehouse (DWH) collects data from multiple, heterogeneous sources and stores integrated information in a central repository.

Traditional DWHs are updated periodically to reflect source data updates. The observation of real-world events in operational source systems is characterized by a propagation delay, and the update patterns (daily, weekly, etc.) of traditional data warehouses and their data integration processes increase this delay further. While operational systems are, among other things, designed to meet well-specified (short) response time requirements, the focus of data warehouses is the strategic analysis of data integrated from heterogeneous systems [3]. Traditionally, there is no real-time connection between a DWH and its data sources, because the write-once read-many decision support characteristics would conflict with the continuous update workload of operational systems and result in poor response times.

Y. Kambayashi, W. Winiwarter, M. Arikawa (Eds.): DaWaK 2002, LNCS 2454, Springer-Verlag Berlin Heidelberg 2002

Separated from operational systems, data warehouse and business intelligence applications are used for strategic planning and decision-making. As these applications have matured, it has become apparent that the information and analyses they provide are vital to tactical day-to-day decision-making, and many organizations can no longer operate their businesses effectively without them. Consequently, there is a trend towards integrating decision processing into the overall business process. The advent of e-business is also propelling this integration, because organizations need to react much faster to changing business conditions in the e-business world [5].

The goal of a near real-time data warehouse (RTDWH, part of a so-called zero-latency DWH environment [2]) is to allow organizations to deliver relevant information as fast as possible to the knowledge workers or decision systems that rely on it to react in near real-time to new information captured by an organization's computer systems. It therefore supports an immediate discovery of abnormal data conditions in a manner not supported by an OLTP system. Until recently, timeliness requirements (the relative availability of data to support a given process within the time frame required to perform the process) were postponed to the mid-term or long-term. While a fully real-time enterprise DWH may remain out of reach, we present an approach that enables DWHs to integrate particular data just in time to meet changing customer needs, supply chain demands, and financial concerns. As e-business erodes switching costs and brand loyalty, customers will consider instant service fulfillment and information access the primary relationship differentiator.
The remainder of this paper is organized as follows. In section 2, we discuss the contribution of this paper and related work. In section 3, we give an overview of the requirements for real-time DWHs and discuss the differences to operational data stores. In section 4, we discuss the ETL process of real-time DWHs and introduce a J2EE architecture for an ETL environment which supports near real-time data propagation. Finally, we discuss our future work and give a conclusion in section 5.

2. Contribution and Related Work

The authors in [1] describe an approach which clearly separates the data warehouse refreshment process from its traditional handling as a view maintenance or bulk loading process. They provide a conceptual model of the process, which is treated as a composite workflow, but they do not describe how to efficiently propagate the data. Theodoratos and Bouzeghoub discuss in [11] data currency quality factors in data warehouses and propose a DWH design that takes these factors into account. Temporal data warehouses address the issue of supporting temporal information efficiently in data warehousing systems [14]. Keeping them up-to-date is complex, because temporal views may need to be updated not only when source data is updated, but also as time progresses, and these two dimensions of change interact in subtle ways. In [13], the authors present efficient techniques (e.g. temporal view self-maintenance) for maintaining DWHs without disturbing source operations. An important issue for near real-time data integration is the accommodation of delays, which has been investigated for (business) transactions in temporal active databases [8]. The conclusion is that temporal faithfulness for transactions has to be provided, which preserves the serialization order of a set of business transactions.
Although possibly lagging behind real-time, a system that behaves in a temporally faithful manner guarantees the expected serialization order. In [12], the authors describe the ETL (extract-transform-load) tool ARKTOS, which is capable of modeling and executing practical ETL scenarios by providing explicit primitives for capturing common tasks (such as data cleaning, scheduling, and data transformations) using a declarative language. ARKTOS offers graphical and declarative facilities for the definition of DWH transformations and tries to optimize the execution of complex sequences of transformation and cleansing tasks.

Our contribution in this paper is the characterization of RTDWHs, the identification of technologies that can support them, and the composition of these technologies into an overall architecture. We provide an in-depth discussion of using the J2EE platform for ETL environments of RTDWHs. The prerequisite for a RTDWH is a continual, near real-time data propagation. As a solution for this problem, we propose a J2EE architecture with an ETL container and ETLets, which provide an efficient way of performing, controlling, and monitoring the ETL processing tasks. To the best of our knowledge, there are no other approaches which use container-managed Java components for continual data propagation.

3. Real-Time Data Warehouse Requirements

A real-time data warehouse (RTDWH) aims at decreasing the time it takes to make business decisions; ideally, the latency between the cause and effect of a business decision is minimized. A real-time data warehouse enables analysis across corporate data sources and notifies the business of actionable recommendations, effectively closing the gap between business intelligence systems and the business processes.
Business requirements may appear to be different across the various industries, but the underlying information requirements are similar: integrated, current, detailed, and immediately accessible. Transforming a standard DWH that uses batch loading during update windows (in which analytical access is not allowed) into an analytical environment providing current data involves various issues that must be addressed in order to enable (near) real-time dissemination of new information across an organization. The business needs for this kind of analytical environment introduce a set of service level agreements that exceed what is typical of a traditional DWH. These service levels focus on three basic characteristics:

- Continuous data integration, which enables (near) real-time capturing and loading from different operational sources.
- Active decision engines that can make recommendations or even (rule-driven) tactical decisions for routine, analytical decision tasks [9, 10].
- Highly available analytical environments based on an analysis engine that is able to consistently generate and provide access to current business analyses at any time, not restricted by the loading windows typical of the common batch approach.

An in-depth discussion of these characteristics from the analytical viewpoint (in order to enable timely consistent analyses) is given in [2]. Near real-time integration for data warehouses does not minimize the capturing and propagation delays of the operational source systems of an organization, which are responsible for capturing real-world events. Data warehouses (in particular real-time DWHs) try to represent history as accurately as possible (to enable tactical decision support). Therefore, late-arriving records are welcome, because they make the information more complete.
However, those fact and dimension records are bothersome because they are difficult to integrate (e.g. when using surrogate keys for slowly changing dimensions). The RTDWH provides access to an accurate, integrated, consolidated view of the organization's information and helps to deliver near real-time information to its users. This requires efficient ETL (extract-transform-load) techniques enabling continuous data integration, which is the focus of this paper. Combining highly available systems with active decision engines allows near real-time information dissemination for DWHs. Cumulatively, this is the basis for zero-latency analytical environments [2].

3.1 Continuous Data Integration

In e-business environments, information velocity is a key determinant of overall business growth capacity, or scalability. In an environment of growing volume, velocity increases as well: greater volumes of data must be exchanged and moved in shorter time frames. To perform transactions faster and more efficiently, operational systems are supported by business intelligence tools, which provide valuable information for decision makers. Real-time data warehouses help the operational systems deliver this information in near real-time.

ETL environments of RTDWHs must support the same processing layers as found in traditional data warehouse systems. Critical for RTDWHs is an end-to-end automation of the ETL processes. The ETL environment must be able to complete the data extracts and transformations in the allocated time frames, and it must meet the service-level requirements of the data warehouse users. A continuous data integration environment therefore facilitates better and faster decision making, resulting in streamlined operations, increased velocity of the business processes, improved customer relationships, and enhanced e-business capabilities.

3.2 Real-Time DWH vs. Operational Data Stores

For handling real-time data propagation, companies often consider building an operational data store (ODS). An ODS bridges the information gap between operational systems and the traditional data warehouse. It contains data at the event detail level, coordinated across all relevant source systems and maintained in a current state. W.H. Inmon defines an ODS as an architectural construct which is subject oriented, integrated, volatile, current valued, and contains detailed data [4].

There are, however, noteworthy differences between ODSs and RTDWHs. An ODS serves the needs of an operational environment, while real-time data warehouses serve the needs of the informational community. The ODS and the RTDWH are identical when it comes to being subject oriented and integrated; there are no discernible differences between the two constructs with regard to those characteristics. However, when it comes to transaction support, volatility, currency of information, history, and detail, the ODS and the real-time data warehouse differ significantly.

History of Data. A data warehouse contains data that is rich in history. ODSs generally do not maintain a rich history of data, because they are used within operational environments and have strict requirements for query response times. Consequently, an ODS is highly volatile in order to reflect the current status of the operational sources.

Data Propagation. For an ODS it is common that data records are updated. An ODS must stay in sync with the operational environments to be able to consistently operate within its environment. An ODS has predictable arrival rates for data and includes sync points or checkpoints for the loading process. RTDWH systems are generally read-only and track data changes by creating data snapshots. The absence
of frequent record-level update processing makes the load-and-access processing more efficient. Furthermore, RTDWHs exhibit a higher variability in the size of the transactions and in the rate at which transactions arrive.

Transaction Support. Data in an ODS is subject to change every time one of its underlying details changes. An advantage of the ODS is that it is integrated and that it can support both decision support and operational transaction processing. An ODS often requires a physical organization which is optimal for the flexible update processing of data, while the RTDWH system requires a physical organization which is optimal for the flexible access and analysis of data. RTDWH systems are not interwoven with operational environments and do not support operational transaction processing.

Aggregated Data. The ODS contains data that is very detailed, while a RTDWH contains a lot of summary data. Data in the ODS can also be summarized, and a summarized value can be calculated. But because the summarized value is subject to immediate change (ODSs are highly volatile), it has a short effective life. Summary data of a RTDWH is less dynamic, because aggregated values derived from historical data are often not affected by new data.

A real-time DWH is not a replacement for an ODS. An ODS is an architectural construct for a decision support system where collective integrated operational data is stored. It can complement a RTDWH with the information needs of knowledge workers. The ODS contains very detailed current data and is designed to work in an operational environment.

4. ETL Environment for RTDWHs

We propose an architecture which facilitates streamlining and accelerating the ETL process by moving data between the different layers without any intermediate file storage.
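To illustrate the idea of moving data between layers without intermediate file storage, consider the following minimal Java sketch. It is our illustration, not code from the described system: extraction, XML conversion, and transformation are composed as in-memory stages, so each record flows directly from one layer to the next.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Sketch: each ETL tier is a function that transforms records in memory
// and forwards them to the next tier, with no intermediate files.
public class StreamingEtlPipeline {

    // extract/parse to XML, then transform to the target model, in one flow
    public static List<String> run(List<String> sourceRows) {
        Function<String, String> extractAndParse =
                row -> "<row>" + row.trim() + "</row>";   // stand-in for XML conversion
        Function<String, String> transform =
                xml -> xml.toUpperCase();                 // stand-in for model conversion
        return sourceRows.stream()
                .map(extractAndParse)
                .map(transform)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [<ROW>ORDER-1</ROW>, <ROW>ORDER-2</ROW>]
        System.out.println(run(List.of(" order-1 ", "order-2")));
    }
}
```

In the real architecture these stages would be ETLets and EJB components managed by containers rather than plain functions, but the data-flow principle is the same.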
We achieve this streamlining of the ETL process by using ETLets (explained in detail in section 4.1) and EJB components [17] for 1) extracting the data with high-speed J2EE connectors [15], 2) immediately parsing and converting the source data into an XML format, and 3) converting and assembling the data into the target format. This way, each layer of the ETL environment can process the source data and forward it to other layers for further processing. ETLets and EJB components are configurable Java components which perform the ETL processing tasks, such as data extraction, validation, transformation, or data assembly. They can be deployed on any Java application server and can be reused for several ETL processes.

Figure 1 shows the layers of a typical ETL process. Note that the processing steps of each layer may be executed in a distributed environment at different locations. In our approach, the data propagation from the source system and the data movement within the ETL environment are managed by containers. The containers ensure that the ETL components are instantiated for the processing and that the data is passed from one component to the next.

Because RTDWHs try to minimize the data latency and the time for propagating data from the source systems, the number of data extracts increases while the amount of data per extract normally decreases. One of the most critical ETL processing tasks is therefore an efficient data extraction. An ETL environment of a RTDWH must be able to establish high-performance connections to the source systems and optimize these connections. For instance, connections to databases or enterprise systems must be automatically pooled to optimize resources and throughput times. In our approach,
containers manage the connection pools to provide efficient access to the data sources. Containers also create, destroy, and swap ETL components in and out of memory to optimize the ETL processing in a structured and efficient way.

Figure 1 depicts four ETL processing layers, connected by data movement steps:

- Data Source Tier: source systems containing the data that users want to analyze (CRM & sales, DBMS, ERP, flat files, external data sources).
- Extraction / Assembling Tier: data extraction from the source systems and preparation of the data for the transformation process (extraction, messaging, parsing, decoding / preformatting, standardization, filtering, completion, matching / assembling).
- Transformation Tier: transformation of the collected data from the source system model to the target (= analysis) model (cleansing, transformation, calculating measures, consolidation, enhancement, analysis model conversion).
- Data Storage Tier: common storage for all data to be analyzed; the loaded data is optimized for user access (data loading, indexing, aggregating data, key generation, managing history).

Fig. 1. ETL processing layers of a (near) real-time data warehouse

4.1 J2EE Architecture for RTDWHs

In this section we introduce a J2EE (Java 2 Platform, Enterprise Edition) architecture for an ETL environment of RTDWHs. Our architecture includes the familiar layers of a traditional data warehouse system, but also fulfills the requirements and characteristics of a RTDWH. For real-time data warehouses, a robust, scalable, and high-performance data staging environment, which is able to handle a large number of near real-time data propagations from the source systems, is essential. We propose a J2EE environment for the extraction, transformation, and movement of the data, which fulfills these additional requirements. J2EE is a Java
platform designed for the mainframe-scale computing typical of large enterprises. Sun Microsystems (together with industry partners such as IBM) designed J2EE to simplify application development in a thin-client tiered environment and to decrease the need for programming and programmer training by creating standardized, reusable modular components and by enabling the tier to handle many aspects of programming automatically [16]. J2EE environments have a multi-tiered architecture, which provides natural access points for integration with existing and future systems and for the deployment of new source systems and interfaces as needs arise.

In J2EE environments, the container takes responsibility for system-level services (such as threading, resource management, transactions, security, persistence, and so on). This arrangement leaves the component developer with the simplified task of developing business functionality. It also allows the implementation details of the system services to be reconfigured without changing the component code, making components useful in a wide range of contexts. Instead of developing ETL scripts, which are hard to maintain, scale, and reuse, ETL developers are able to implement components for the critical parts of the ETL process. In our approach, we extend this concept by adding new container services which are useful for ETL developers. One such container service is responsible for monitoring the data extracts and ensures that resources, workload, and time constraints are optimized. ETL developers are able to specify data propagation parameters (e.g. schedule and time constraints) in a deployment descriptor, and the container will try to optimize these settings. Most important for an ETL environment is the integration tier (in J2EE environments also called the enterprise information tier).
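The time-constraint monitoring mentioned above could, for example, be realized with standard Java concurrency utilities. The following sketch is our assumption of how such a container service might enforce the time frame allocated to a data extract (the class and method names are hypothetical, not part of the described architecture):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch of a container service that runs a data extract and aborts it
// when it exceeds the time frame from the deployment descriptor.
public class TimeConstrainedExtract {

    public static String runWithDeadline(Callable<String> extract, long millis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // wait at most 'millis' for the extract to finish
            return pool.submit(extract).get(millis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "ABORTED";   // extract exceeded its allocated time frame
        } catch (Exception e) {
            return "FAILED";    // extract itself threw an error
        } finally {
            pool.shutdownNow(); // release the worker thread either way
        }
    }

    public static void main(String[] args) {
        // prints 42 rows extracted
        System.out.println(runWithDeadline(() -> "42 rows extracted", 500));
    }
}
```

A real container would additionally reschedule or re-prioritize aborted extracts; this sketch only shows the deadline enforcement itself.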
The J2EE integration tier contains data and services implemented by non-J2EE resources. Databases, legacy systems, ERP (Enterprise Resource Planning) and EAI (Enterprise Application Integration) systems, process schedulers, and other existing or purchased packages reside in the integration tier. The integration tier allows the designers of an ETL environment to choose the mechanisms and resources for data propagation that are best suited to their needs, and still interoperate seamlessly with J2EE.

Fig. 2. J2EE ETL environment

Figure 2 shows a J2EE ETL environment with resource adapters for the data extraction, which are available for the data propagation. Source systems for the ETL environment may require different types of access connectors. Note that J2EE environments include a standard API for high-performance connectors. Many vendors of ERP or CRM systems (e.g. SAP, PeopleSoft) offer a J2EE connector interface for
their systems. With our architecture, ETL developers can use existing high-performance connectors with a standard API without worrying about issues like physical data formats, performance, or concurrency. ETL developers can implement reusable components which run in a container that optimizes the processing, resources, and data access. The J2EE platform includes synchronous and asynchronous communication mechanisms via J2EE connectors. Furthermore, the J2EE platform includes a standard API for accessing databases (JDBC) and for messaging (JMS), which enables ETL developers to access queue-based source systems that can propagate data via a messaging system.

An ETL container is a part of a Java application server that provides services for the execution and monitoring of ETL tasks. It manages the lifecycle of ETLets and provides default implementations for some of the interfaces shown in Figure 3. There are three possibilities to implement an ETL container: 1) implementing a new ETL container for an existing Java application server, or extending the EJB container of a Java application server with ETL functionality, 2) extending a Java-enabled database management system that includes a Java virtual machine, or 3) writing a dedicated Java application server for the ETL container. By choosing option 1 or 2, the existing functionality of a Java application server or a database management system can be reused and extended. Furthermore, the ETL processing performance can significantly benefit from the architecture and infrastructure of the Java application server or the database management system.

Like other Java-based components, ETLets are platform-independent Java classes that are compiled to platform-neutral bytecode that can be loaded dynamically into and run by a Java application server. ETLets interact with schedulers or ETL clients via a predefined interface of the ETL container.
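A minimal sketch of this programming model is shown below. It assumes a container-driven lifecycle of initialization, one ETL run, and teardown; the interface shown here is our simplification of the ETLet interface described in the paper, and the session is reduced to a plain StringBuilder:

```java
import java.util.Map;

// Simplified sketch of the ETLet contract: the container calls
// init -> runETL -> destroy (names follow the paper's description).
interface ETLet {
    void init(Map<String, String> config);   // initialization parameters
    void runETL(StringBuilder session);      // one run of the ETL process
    void destroy();                          // release resources
}

public class OrderExtractionETLet implements ETLet {
    private String source;

    @Override public void init(Map<String, String> config) {
        source = config.getOrDefault("source", "unknown");
    }

    @Override public void runETL(StringBuilder session) {
        // stand-in for a real extraction via a J2EE connector
        session.append("extracted from ").append(source);
    }

    @Override public void destroy() {
        source = null;
    }

    // Simulate the container driving the lifecycle.
    public static void main(String[] args) {
        ETLet etlet = new OrderExtractionETLet();
        etlet.init(Map.of("source", "CRM"));
        StringBuilder session = new StringBuilder();
        etlet.runETL(session);
        etlet.destroy();
        System.out.println(session); // prints extracted from CRM
    }
}
```

In the real environment the container, not a main method, would instantiate the ETLet, pass it an ETLetConfig and ETLetSession, and pool it for reuse.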
The ETL container supports filters and validators, which allow on-the-fly transformations and validations of the data being processed. ETL filters can globally modify or adapt the data being processed. ETL validators are used to ensure data consistency and can be used to check business rules. Filters and validators are pluggable Java components that can be specified in the deployment descriptor of the ETL container. ETL controllers are Java classes that act as intermediaries between ETLets and the underlying ETL processing model. ETL controllers coordinate the ETL tasks by processing events with ETL actions, which use the EJB components (ExtractionBean, ConversionBean, AssemblyBean, etc.) to perform the processing.

Figure 3 shows the Java interfaces which are supported by the ETL container. ETLets of the same type share an ETLet context, which is accessible to all instances of an ETLet class and can be used to look up initialization parameters or to share attributes among all ETLet instances. For instance, the ETLet context can be used for implementing caching strategies. The ETLetSession interface is used by ETLets to maintain the business data for one run of the ETL process. The method getETLData() makes the extracted data accessible to all components used in the ETL process. The ETLetConfig interface provides information about the initialization parameters of the ETLet. The ETLValidator and ETLFilter interfaces must be implemented by validator and filter components, which can be used in several ETL processes by specifying them in the deployment descriptor. ETL developers can use the ETLController and ETLAction interfaces for implementing complex control logic for the ETL process. For simple ETL processes, developers can simply implement the runETL() method of the ETLet. For more complex scenarios, developers can implement ETLActions to 1) encapsulate processing steps, 2) instantiate EJB components, and 3) invoke methods on these EJB components.
The ETLActions are invoked by an ETLController, which contains the centrally managed control logic.
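The interplay of a controller with pluggable filters and validators can be sketched as follows. This is a hedged, self-contained illustration: the class name SimpleETLController and the use of standard functional interfaces in place of ETLFilter and ETLValidator are our assumptions, and the real components would run inside the ETL container rather than standalone:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Sketch: a controller applies registered filters (global modifications)
// and validators (business-rule checks) to each record on the fly.
public class SimpleETLController {
    private final List<UnaryOperator<String>> filters = new ArrayList<>();
    private final List<Predicate<String>> validators = new ArrayList<>();

    public void addFilter(UnaryOperator<String> filter)    { filters.add(filter); }
    public void addValidator(Predicate<String> validator)  { validators.add(validator); }

    // Process one batch: filter every record, keep only the valid ones.
    public List<String> process(List<String> records) {
        List<String> result = new ArrayList<>();
        for (String record : records) {
            for (UnaryOperator<String> f : filters) record = f.apply(record);
            boolean valid = true;
            for (Predicate<String> v : validators) valid = valid && v.test(record);
            if (valid) result.add(record);
        }
        return result;
    }

    public static void main(String[] args) {
        SimpleETLController controller = new SimpleETLController();
        controller.addFilter(String::trim);              // global modification
        controller.addValidator(r -> !r.isEmpty());      // business rule
        // prints [a, b] (blank record rejected)
        System.out.println(controller.process(List.of(" a ", "   ", "b")));
    }
}
```

In the described architecture, the filter and validator classes would be named in the deployment descriptor and instantiated by the container, and the per-record work would be delegated to ETLActions and EJB components instead of inline lambdas.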
Fig. 3. Java interfaces supported by the ETL container

The EJB container of the J2EE environment provides portable, scalable, available, and high-performance access to data and ETL processing. The Java application server manages the efficient access to instances of the EJB components, regardless of whether the components are used locally or remotely. The environment automatically manages the implementation details of multithreading, persistence, security, transactions, and concurrent data extracts. ETL developers can implement EJBs for any type of processing of the source data.

5. Conclusion and Future Work

In this paper we investigated and characterized real-time data warehouses (RTDWH). We further examined the technologies necessary to enable near real-time data integration, and described the composition of these technologies into an overall architecture. The ETL architecture uses an ETL container and ETLets, which provide an efficient way of performing, controlling, and monitoring the ETL processing tasks. The main advantages of ETLets are that they are lightweight Java components and that they allow a continuous ETL processing of the source data without using intermediate file storage. ETLets are therefore well suited for near real-time data propagation. To our knowledge, no other approach uses container-managed Java components for continual, near real-time data propagation.

Presently, we use ETL controllers for managing complex ETL processes. In our future work, we want to develop a model to formally describe and execute ETL processes. Besides managing the lifecycle of ETLets, we want to add a set of services including caching, concurrency, security, messaging, and transformation engines. We want to use ETLets for propagating workflow audit trail data and related business data from various source systems.
Workflows are typically executed on top of several business applications and are controlled by a workflow management system, which also tracks the execution. We want to give knowledge workers and decision makers an in-depth analysis of the executed business processes by providing valuable, process-oriented business metrics (cycle time, costs, quality, etc.) [6]. We
utilize ETLets to prepare and process the audit trail and business data. A near real-time propagation of the data is very critical, because it allows knowledge workers and decision makers an early discovery of weaknesses and problems in the process execution [7].

References

1. Bouzeghoub, M., Fabret, F., Matulovic, M.: Modeling Data Warehouse Refreshment Process as a Workflow Application. Intl. Workshop DMDW'99, Heidelberg, Germany, June 1999.
2. Bruckner, R.M., Tjoa, A M.: Capturing Delays and Valid Times in Data Warehouses: Towards Timely Consistent Analyses. To appear: Journal of Intelligent Information Systems (JIIS), forthcoming.
3. Inmon, W.H.: Building the Data Warehouse. 2nd ed., J. Wiley & Sons, New York.
4. Inmon, W.H.: Building the Operational Data Store. 2nd ed., J. Wiley & Sons, New York.
5. Inmon, W.H., Terdeman, R.H., Norris-Montanari, J., Meers, D.: Data Warehousing for E-Business. J. Wiley & Sons, New York.
6. List, B., Schiefer, J., Bruckner, R.M.: Measuring Knowledge with Workflow Management Systems. TAKMA Workshop, in Proc. of the 12th Intl. Workshop DEXA'01, IEEE CS Press, Munich, Germany, September 2001.
7. Kueng, P., Wettstein, T., List, B.: A Holistic Process Performance Analysis through a Performance Data Warehouse. Proc. AMCIS 2001, Boston, USA, August 2001.
8. Roddick, J.F., Schrefl, M.: Towards an Accommodation of Delay in Temporal Active Databases. Proc. of the 11th ADC 2000, IEEE CS Press, Canberra, Australia, 2000.
9. Schrefl, M., Thalhammer, T.: On Making Data Warehouses Active. Proc. of the 2nd Intl. Conf. DaWaK, Springer, LNCS 1874, London, UK.
10. Thalhammer, T., Schrefl, M., Mohania, M.: Active Data Warehouses: Complementing OLAP with Analysis Rules. Data & Knowledge Engineering, Vol. 39(3).
11. Theodoratos, D., Bouzeghoub, M.: Data Currency Quality Factors in Data Warehouse Design. Intl. Workshop DMDW'99, Heidelberg, Germany, June 1999.
12. Vassiliadis, P., Vagena, Z., Skiadopoulos, S., Karayannidis, N., Sellis, T.: ARKTOS: Towards the Modeling, Design, Control and Execution of ETL Processes. Information Systems, Vol. 26(8).
13. Yang, J., Widom, J.: Temporal View Self-Maintenance. Proc. of the 7th Intl. Conf. EDBT 2000, Springer, LNCS 1777, Konstanz, Germany, 2000.
14. Yang, J.: Temporal Data Warehousing. Ph.D. Thesis, Department of Computer Science, Stanford University.
15. Sun Microsystems: J2EE Connector Specification 1.0.
16. Sun Microsystems: Designing Enterprise Applications with the Java 2 Platform, Enterprise Edition, Second Edition.
17. Sun Microsystems: Enterprise JavaBeans Specification, Version 2.0, 2001.
PRODUCTS BUSINESSOBJECTS DATA INTEGRATOR IT Benefits Correlate and integrate data from any source Efficiently design a bulletproof data integration process Improve data quality Move data in real time and
ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS
ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS PRODUCT FACTS & FEATURES KEY FEATURES Comprehensive, best-of-breed capabilities 100 percent thin client interface Intelligence across multiple
A Service-oriented Architecture for Business Intelligence
A Service-oriented Architecture for Business Intelligence Liya Wu 1, Gilad Barash 1, Claudio Bartolini 2 1 HP Software 2 HP Laboratories {[email protected]} Abstract Business intelligence is a business
ORACLE BUSINESS INTELLIGENCE SUITE ENTERPRISE EDITION PLUS
Oracle Fusion editions of Oracle's Hyperion performance management products are currently available only on Microsoft Windows server platforms. The following is intended to outline our general product
An Architecture for Integrated Operational Business Intelligence
An Architecture for Integrated Operational Business Intelligence Dr. Ulrich Christ SAP AG Dietmar-Hopp-Allee 16 69190 Walldorf [email protected] Abstract: In recent years, Operational Business Intelligence
What Is the Java TM 2 Platform, Enterprise Edition?
Page 1 de 9 What Is the Java TM 2 Platform, Enterprise Edition? This document provides an introduction to the features and benefits of the Java 2 platform, Enterprise Edition. Overview Enterprises today
SAS Enterprise Data Integration Server - A Complete Solution Designed To Meet the Full Spectrum of Enterprise Data Integration Needs
Database Systems Journal vol. III, no. 1/2012 41 SAS Enterprise Data Integration Server - A Complete Solution Designed To Meet the Full Spectrum of Enterprise Data Integration Needs 1 Silvia BOLOHAN, 2
Enterprise Integration Architectures for the Financial Services and Insurance Industries
George Kosmides Dennis Pagano Noospherics Technologies, Inc. [email protected] Enterprise Integration Architectures for the Financial Services and Insurance Industries Overview Financial Services
Integrating Ingres in the Information System: An Open Source Approach
Integrating Ingres in the Information System: WHITE PAPER Table of Contents Ingres, a Business Open Source Database that needs Integration... 3 Scenario 1: Data Migration... 4 Scenario 2: e-business Application
Methods and Technologies for Business Process Monitoring
Methods and Technologies for Business Monitoring Josef Schiefer Vienna, June 2005 Agenda» Motivation/Introduction» Real-World Examples» Technology Perspective» Web-Service Based Business Monitoring» Adaptive
The IBM Cognos Platform for Enterprise Business Intelligence
The IBM Cognos Platform for Enterprise Business Intelligence Highlights Optimize performance with in-memory processing and architecture enhancements Maximize the benefits of deploying business analytics
BUSINESSOBJECTS DATA INTEGRATOR
PRODUCTS BUSINESSOBJECTS DATA INTEGRATOR IT Benefits Correlate and integrate data from any source Efficiently design a bulletproof data integration process Accelerate time to market Move data in real time
Patrick Firouzian, ebay
Informatica Data Integration Platform The Informatica Data Integration Platform is the industry s leading software for accessing, integrating, and delivering data from any source, to any source. The Informatica
Gradient An EII Solution From Infosys
Gradient An EII Solution From Infosys Keywords: Grid, Enterprise Integration, EII Introduction New arrays of business are emerging that require cross-functional data in near real-time. Examples of such
By Makesh Kannaiyan [email protected] 8/27/2011 1
Integration between SAP BusinessObjects and Netweaver By Makesh Kannaiyan [email protected] 8/27/2011 1 Agenda Evolution of BO Business Intelligence suite Integration Integration after 4.0 release
Jitterbit Technical Overview : Microsoft Dynamics CRM
Jitterbit allows you to easily integrate Microsoft Dynamics CRM with any cloud, mobile or on premise application. Jitterbit s intuitive Studio delivers the easiest way of designing and running modern integrations
Data Warehousing Systems: Foundations and Architectures
Data Warehousing Systems: Foundations and Architectures Il-Yeol Song Drexel University, http://www.ischool.drexel.edu/faculty/song/ SYNONYMS None DEFINITION A data warehouse (DW) is an integrated repository
ENTERPRISE EDITION ORACLE DATA SHEET KEY FEATURES AND BENEFITS ORACLE DATA INTEGRATOR
ORACLE DATA INTEGRATOR ENTERPRISE EDITION KEY FEATURES AND BENEFITS ORACLE DATA INTEGRATOR ENTERPRISE EDITION OFFERS LEADING PERFORMANCE, IMPROVED PRODUCTIVITY, FLEXIBILITY AND LOWEST TOTAL COST OF OWNERSHIP
Service Oriented Architecture and the DBA Kathy Komer Aetna Inc. New England DB2 Users Group. Tuesday June 12 1:00-2:15
Service Oriented Architecture and the DBA Kathy Komer Aetna Inc. New England DB2 Users Group Tuesday June 12 1:00-2:15 Service Oriented Architecture and the DBA What is Service Oriented Architecture (SOA)
Data Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc.
Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc. Introduction Abstract warehousing has been around for over a decade. Therefore, when you read the articles
A standards-based approach to application integration
A standards-based approach to application integration An introduction to IBM s WebSphere ESB product Jim MacNair Senior Consulting IT Specialist [email protected] Copyright IBM Corporation 2005. All rights
Reverse Engineering in Data Integration Software
Database Systems Journal vol. IV, no. 1/2013 11 Reverse Engineering in Data Integration Software Vlad DIACONITA The Bucharest Academy of Economic Studies [email protected] Integrated applications
Near Real-time Data Warehousing with Multi-stage Trickle & Flip
Near Real-time Data Warehousing with Multi-stage Trickle & Flip Janis Zuters University of Latvia, 19 Raina blvd., LV-1586 Riga, Latvia [email protected] Abstract. A data warehouse typically is a collection
Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence
Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence Appliances and DW Architectures John O Brien President and Executive Architect Zukeran Technologies 1 TDWI 1 Agenda What
Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2
Volume 6, Issue 3, March 2016 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Special Issue
SAS Information Delivery Portal
SAS Information Delivery Portal Table of Contents Introduction...1 The State of Enterprise Information...1 Information Supply Chain Technologies...2 Making Informed Business Decisions...3 Gathering Business
A Framework for Developing the Web-based Data Integration Tool for Web-Oriented Data Warehousing
A Framework for Developing the Web-based Integration Tool for Web-Oriented Warehousing PATRAVADEE VONGSUMEDH School of Science and Technology Bangkok University Rama IV road, Klong-Toey, BKK, 10110, THAILAND
www.sryas.com Analance Data Integration Technical Whitepaper
Analance Data Integration Technical Whitepaper Executive Summary Business Intelligence is a thriving discipline in the marvelous era of computing in which we live. It s the process of analyzing and exploring
The IBM Cognos Platform
The IBM Cognos Platform Deliver complete, consistent, timely information to all your users, with cost-effective scale Highlights Reach all your information reliably and quickly Deliver a complete, consistent
Virtual Operational Data Store (VODS) A Syncordant White Paper
Virtual Operational Data Store (VODS) A Syncordant White Paper Table of Contents Executive Summary... 3 What is an Operational Data Store?... 5 Differences between Operational Data Stores and Data Warehouses...
ORACLE DATA INTEGRATOR ENTERPRISE EDITION
ORACLE DATA INTEGRATOR ENTERPRISE EDITION ORACLE DATA INTEGRATOR ENTERPRISE EDITION KEY FEATURES Out-of-box integration with databases, ERPs, CRMs, B2B systems, flat files, XML data, LDAP, JDBC, ODBC Knowledge
Internet Engineering: Web Application Architecture. Ali Kamandi Sharif University of Technology [email protected] Fall 2007
Internet Engineering: Web Application Architecture Ali Kamandi Sharif University of Technology [email protected] Fall 2007 Centralized Architecture mainframe terminals terminals 2 Two Tier Application
SAP Data Services 4.X. An Enterprise Information management Solution
SAP Data Services 4.X An Enterprise Information management Solution Table of Contents I. SAP Data Services 4.X... 3 Highlights Training Objectives Audience Pre Requisites Keys to Success Certification
ORACLE DATA INTEGRATOR ENTERPRISE EDITION
ORACLE DATA INTEGRATOR ENTERPRISE EDITION Oracle Data Integrator Enterprise Edition 12c delivers high-performance data movement and transformation among enterprise platforms with its open and integrated
Enterprise Application Integration
Enterprise Integration By William Tse MSc Computer Science Enterprise Integration By the end of this lecturer you will learn What is Enterprise Integration (EAI)? Benefits of Enterprise Integration Barrier
EII - ETL - EAI What, Why, and How!
IBM Software Group EII - ETL - EAI What, Why, and How! Tom Wu 巫 介 唐, [email protected] Information Integrator Advocate Software Group IBM Taiwan 2005 IBM Corporation Agenda Data Integration Challenges and
Distributed Objects and Components
Distributed Objects and Components Introduction This essay will identify the differences between objects and components and what it means for a component to be distributed. It will also examine the Java
A Knowledge Management Framework Using Business Intelligence Solutions
www.ijcsi.org 102 A Knowledge Management Framework Using Business Intelligence Solutions Marwa Gadu 1 and Prof. Dr. Nashaat El-Khameesy 2 1 Computer and Information Systems Department, Sadat Academy For
A Near Real-Time Personalization for ecommerce Platform Amit Rustagi [email protected]
A Near Real-Time Personalization for ecommerce Platform Amit Rustagi [email protected] Abstract. In today's competitive environment, you only have a few seconds to help site visitors understand that you
White Paper: 1) Architecture Objectives: The primary objective of this architecture is to meet the. 2) Architecture Explanation
White Paper: 1) Architecture Objectives: The primary objective of this architecture is to meet the following requirements (SLAs). Scalability and High Availability Modularity and Maintainability Extensibility
Driving workload automation across the enterprise
IBM Software Thought Leadership White Paper October 2011 Driving workload automation across the enterprise Simplifying workload management in heterogeneous environments 2 Driving workload automation across
The TransactionVision Solution
The TransactionVision Solution Bristol's TransactionVision is transaction tracking and analysis software that provides a real-time view of business transactions flowing through a distributed enterprise
Data Masking: A baseline data security measure
Imperva Camouflage Data Masking Reduce the risk of non-compliance and sensitive data theft Sensitive data is embedded deep within many business processes; it is the foundational element in Human Relations,
zen Platform technical white paper
zen Platform technical white paper The zen Platform as Strategic Business Platform The increasing use of application servers as standard paradigm for the development of business critical applications meant
Implement a unified approach to service quality management.
Service quality management solutions To support your business objectives Implement a unified approach to service quality management. Highlights Deliver high-quality software applications that meet functional
Jagir Singh, Greeshma, P Singh University of Northern Virginia. Abstract
224 Business Intelligence Journal July DATA WAREHOUSING Ofori Boateng, PhD Professor, University of Northern Virginia BMGT531 1900- SU 2011 Business Intelligence Project Jagir Singh, Greeshma, P Singh
BIRT Document Transform
BIRT Document Transform BIRT Document Transform is the industry leader in enterprise-class, high-volume document transformation. It transforms and repurposes high-volume documents and print streams such
www.ducenit.com Analance Data Integration Technical Whitepaper
Analance Data Integration Technical Whitepaper Executive Summary Business Intelligence is a thriving discipline in the marvelous era of computing in which we live. It s the process of analyzing and exploring
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 [email protected]
Integrating data in the Information System An Open Source approach
WHITE PAPER Integrating data in the Information System An Open Source approach Table of Contents Most IT Deployments Require Integration... 3 Scenario 1: Data Migration... 4 Scenario 2: e-business Application
Integrating SAP and non-sap data for comprehensive Business Intelligence
WHITE PAPER Integrating SAP and non-sap data for comprehensive Business Intelligence www.barc.de/en Business Application Research Center 2 Integrating SAP and non-sap data Authors Timm Grosser Senior Analyst
Client-Server Architecture & J2EE Platform Technologies Overview Ahmed K. Ezzat
Client-Server Architecture & J2EE Platform Technologies Overview Ahmed K. Ezzat Page 1 of 14 Roadmap Client-Server Architecture Introduction Two-tier Architecture Three-tier Architecture The MVC Architecture
Cisco Tidal Enterprise Scheduler
Cisco Tidal Enterprise Scheduler Introduction to Automated Enterprise Job Scheduling Automated job scheduling is essential to complex data centers, because it helps them operate more efficiently and reliably.
Oracle Real Time Decisions
A Product Review James Taylor CEO CONTENTS Introducing Decision Management Systems Oracle Real Time Decisions Product Architecture Key Features Availability Conclusion Oracle Real Time Decisions (RTD)
BI and ETL Process Management Pain Points
BI and ETL Process Management Pain Points Understanding frequently encountered data workflow processing pain points and new strategies for addressing them What You Will Learn Business Intelligence (BI)
Attunity Integration Suite
Attunity Integration Suite A White Paper February 2009 1 of 17 Attunity Integration Suite Attunity Ltd. follows a policy of continuous development and reserves the right to alter, without prior notice,
Semarchy Convergence for Data Integration The Data Integration Platform for Evolutionary MDM
Semarchy Convergence for Data Integration The Data Integration Platform for Evolutionary MDM PRODUCT DATASHEET BENEFITS Deliver Successfully on Time and Budget Provide the Right Data at the Right Time
Enterprise Architecture For Next Generation Telecommunication Service Providers CONTACT INFORMATION:
Enterprise Architecture For Next Generation Telecommunication Service Providers CONTACT INFORMATION: phone: +1.301.527.1629 fax: +1.301.527.1690 email: [email protected] web: www.hsc.com PROPRIETARY NOTICE
WebSphere Business Modeler
Discovering the Value of SOA WebSphere Process Integration WebSphere Business Modeler Workshop SOA on your terms and our expertise Soudabeh Javadi Consulting Technical Sales Support WebSphere Process Integration
Informatica PowerCenter Data Virtualization Edition
Data Sheet Informatica PowerCenter Data Virtualization Edition Benefits Rapidly deliver new critical data and reports across applications and warehouses Access, merge, profile, transform, cleanse data
BENEFITS OF AUTOMATING DATA WAREHOUSING
BENEFITS OF AUTOMATING DATA WAREHOUSING Introduction...2 The Process...2 The Problem...2 The Solution...2 Benefits...2 Background...3 Automating the Data Warehouse with UC4 Workload Automation Suite...3
Pervasive Software + NetSuite = Seamless Cloud Business Processes
Pervasive Software + NetSuite = Seamless Cloud Business Processes Successful integration solution between cloudbased ERP and on-premise applications leveraging Pervasive integration software. Prepared
A Guide Through the BPM Maze
A Guide Through the BPM Maze WHAT TO LOOK FOR IN A COMPLETE BPM SOLUTION With multiple vendors, evolving standards, and ever-changing requirements, it becomes difficult to recognize what meets your BPM
Technology-Driven Demand and e- Customer Relationship Management e-crm
E-Banking and Payment System Technology-Driven Demand and e- Customer Relationship Management e-crm Sittikorn Direksoonthorn Assumption University 1/2004 E-Banking and Payment System Quick Win Agenda Data
Contents. Client-server and multi-tier architectures. The Java 2 Enterprise Edition (J2EE) platform
Part III: Component Architectures Natividad Martínez Madrid y Simon Pickin Departamento de Ingeniería Telemática Universidad Carlos III de Madrid {nati, spickin}@it.uc3m.es Introduction Contents Client-server
Increasing IT flexibility with IBM WebSphere ESB software.
ESB solutions White paper Increasing IT flexibility with IBM WebSphere ESB software. By Beth Hutchison, Katie Johnson and Marc-Thomas Schmidt, IBM Software Group December 2005 Page 2 Contents 2 Introduction
CA Workload Automation Agents for Mainframe-Hosted Implementations
PRODUCT SHEET CA Workload Automation Agents CA Workload Automation Agents for Mainframe-Hosted Operating Systems, ERP, Database, Application Services and Web Services CA Workload Automation Agents are
What You Need to Know About Transitioning to SOA
What You Need to Know About Transitioning to SOA written by: David A. Kelly, ebizq Analyst What You Need to Know About Transitioning to SOA Organizations are increasingly turning to service-oriented architectures
DATA MINING AND WAREHOUSING CONCEPTS
CHAPTER 1 DATA MINING AND WAREHOUSING CONCEPTS 1.1 INTRODUCTION The past couple of decades have seen a dramatic increase in the amount of information or data being stored in electronic format. This accumulation
Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram
Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram Cognizant Technology Solutions, Newbury Park, CA Clinical Data Repository (CDR) Drug development lifecycle consumes a lot of time, money
SERVICE ORIENTED ARCHITECTURE
SERVICE ORIENTED ARCHITECTURE Introduction SOA provides an enterprise architecture that supports building connected enterprise applications to provide solutions to business problems. SOA facilitates the
Service Oriented Architectures
8 Service Oriented Architectures Gustavo Alonso Computer Science Department Swiss Federal Institute of Technology (ETHZ) [email protected] http://www.iks.inf.ethz.ch/ The context for SOA A bit of history
JSLEE and SIP-Servlets Interoperability with Mobicents Communication Platform
JSLEE and SIP-Servlets Interoperability with Mobicents Communication Platform Jean Deruelle Jboss R&D, a division of Red Hat [email protected] Abstract JSLEE is a more complex specification than SIP
A Grid Architecture for Manufacturing Database System
Database Systems Journal vol. II, no. 2/2011 23 A Grid Architecture for Manufacturing Database System Laurentiu CIOVICĂ, Constantin Daniel AVRAM Economic Informatics Department, Academy of Economic Studies
TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS
9 8 TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS Assist. Prof. Latinka Todoranova Econ Lit C 810 Information technology is a highly dynamic field of research. As part of it, business intelligence
CONSUMER DEMAND MONITORING AND SALES FORECASTING (CDMFS) SYSTEM
CONSUMER DEMAND MONITORING AND SALES FORECASTING (CDMFS) SYSTEM Rahul Goela School of Electrical and Electronics Engineering (3 rd Year) Nanyang Technological University (NTU) Matriculation Number: 001105a03
Lection 3-4 WAREHOUSING
Lection 3-4 DATA WAREHOUSING Learning Objectives Understand d the basic definitions iti and concepts of data warehouses Understand data warehousing architectures Describe the processes used in developing
SQL Server 2012 Business Intelligence Boot Camp
SQL Server 2012 Business Intelligence Boot Camp Length: 5 Days Technology: Microsoft SQL Server 2012 Delivery Method: Instructor-led (classroom) About this Course Data warehousing is a solution organizations
IT Automation: Evaluate Job Scheduling and Run Book Automation Solutions
IT Automation: Evaluate Job Scheduling and Run Book Automation Solutions What You Will Learn Businesses increasingly use IT automation tools to meet the challenge of managing their data centers. This document
Dimensional Modeling for Data Warehouse
Modeling for Data Warehouse Umashanker Sharma, Anjana Gosain GGS, Indraprastha University, Delhi Abstract Many surveys indicate that a significant percentage of DWs fail to meet business objectives or
Task definition PROJECT SCENARIOS. The comprehensive approach to data integration
Data Integration Suite Your Advantages Seamless interplay of data quality functions and data transformation functions Linking of various data sources through an extensive set of connectors Quick and easy
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT POINT-AND-SYNC MASTER DATA MANAGEMENT 04.2005 Hyperion s new master data management solution provides a centralized, transparent process for managing critical
