Enterprise Information Integration (EII)
A Technical Ally of EAI and ETL

Author: Bipin Chandra Joshi
Integration Architect
Infosys Technologies Ltd

TABLE OF CONTENTS

1 ENTERPRISE INFORMATION INTEGRATION (EII)
1.1 BACKGROUND
1.2 WHAT IS EII?
1.3 HOW IT WORKS?
1.4 EII VS EAI VS ETL
1.5 LEADING PRODUCTS OF EII
1.6 BENEFITS OF EII
1.7 LIMITATIONS OF EII
1.8 AUTHOR'S NOTE
2 REFERENCES

1 ENTERPRISE INFORMATION INTEGRATION (EII)

1.1 BACKGROUND

Emerging technologies such as Enterprise Information Integration (EII) provide new capabilities for solving business intelligence and reporting challenges. EII is generally complementary to integration technologies such as ETL and EAI. Companies are still exploring the features of this technology and evaluating it against the established ones. Some common questions are: What is EII technology designed to do? Why is another technology needed when others already exist? When is it better to use EII rather than the alternatives? The following sections attempt to answer these questions.

1.2 WHAT IS EII?

In most enterprises, information is stored in separate databases, data warehouses and applications. EII products make it possible to combine information from these different data sources on demand. They do this by establishing an intermediate data services layer that provides access to the data in a standardized way, so that applications do not have to interact directly with each separate back-end data source.

EII solves a set of business problems that share several common characteristics. Integration architects should consider EII if they need to accomplish any of the following tasks:

- Generate reports from information stored in a variety of formats in widely distributed data warehouses
- Implement a path to a service-oriented architecture with minimal impact on the existing IT infrastructure
- Access data distributed across multiple sources (relational databases, enterprise applications, data warehouses, documents, XML)
- Combine data in different formats (relational databases, flat files, Word or Excel documents, XML)
- Merge static data with messages, web services, or other data streams
- Perform queries that combine archived data with live information

1.3 HOW IT WORKS?

Applications address queries to the EII layer.
EII acts as a pull engine: it waits for requests, splits each query across multiple heterogeneous source systems, gathers the transactional data sets, merges them together and returns the merged result to the requesting application. The requesting application can be a web service, Excel or some other front end.

A typical example is an employee exercising stock options. The Finance Department needs to check:

- Eligibility details, such as date of joining and designation
- Employee records, such as current address and the employee's individual limit for stocks
- Company accounts, such as the current year's allocation for stocks and budgets

However, this involves querying separate, disparate systems that do not talk to each other. EII is a good fit in this case. The department just needs to send a simple query: "Can Employee <EMP No.> exercise <no. of stocks> Stocks?" EII performs the rest of the checks listed above in the background using a federated query, transparently to the user.

Fig 1: EII

1.4 EII VS EAI VS ETL

A key distinction of EII compared to other integration technologies is that data is not permanently moved or replicated to a new location or server; rather, the source data remains where it is, and results persist on the EII server only as needed for caching. In other words, EII is for combining information assets, not for scheduling data flows between applications.

Most EAI systems are push driven. A transaction happens in one of the enterprise applications, and an EAI listener "sees" it and pushes it out over the bus or to a centralized queue for distribution to other applications. Most EAI engines are workflow and process-flow driven rather than on-demand.

ETL tools, on the other hand, feed massive amounts of data from one data source to another in a timely fashion. They are responsible for performing that task on a consistent and repeatable basis. They handle massive transformations (sometimes in the database, sometimes in stream). ETL is mostly implemented alongside OLAP or data-mining systems.

The following table compares EII with EAI and ETL.

| Parameter | EII (Enterprise Information Integration) | EAI (Enterprise Application Integration) | ETL (Extraction Transformation and Loading) |
|---|---|---|---|
| Primary Focus | Reporting and Analytics | Business Process Automation | Data Synchronization |
| Technology | PULL | PUSH | PUSH |
| Trigger | Query driven (data flow at query time) | Event driven (data flow at transaction time) | Time driven (data flow at scheduled time) |
| Output | Result sets | Messages | Set of Messages |
| Mode of Data Flow | Real time | Real time/Batch | Batch |
| Data Support | Access to structured (Oracle, SQL Server), semi-structured (spreadsheets, emails) and unstructured (Word, PDF) formats | Message brokers can convert between several different data formats understood by applications. | Data integration often addresses only structured information. Integrated information is stored in a data warehouse. |

1.5 LEADING PRODUCTS OF EII

EII products fall into two categories: those that grew from an RDBMS background and those that emerged from the XML world. Some major EII products are:

1. Composite Information Server (CIS 2.5) from Composite Software
2. XML Intelligence Product (XIP 3.5) from Ipedo
3. DB2 Information Integrator (DB2II 8.1) from IBM
4. Totally Integrated Enterprise (Tiger) from Cincom Systems
5. FDX Information Server (FIS 3.5) from SnapBridge Software
6. XA-iServer 3.52 from XAware
7. MetaMatrix Server
8. Aracne (ARN) from Denodo Technologies

This difference in origin affects each product's features and capabilities. On the server side, these include caching, data management, metadata and modeling, and query optimization; on the client side, they can include ODBC or JDBC connectivity.
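
The stock-option check from section 1.3 can be sketched in Python as follows. This is only an illustration under stated assumptions, not how any of the products listed above works: three in-memory SQLite databases stand in for the separate back-end systems, and the table names, sample rows, one-year tenure rule and the `can_exercise` function are all hypothetical names and values introduced for the example.

```python
import sqlite3
from datetime import date


def make_source(ddl, insert_sql, rows):
    """Create an independent in-memory database standing in for one back-end system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Three separate systems that know nothing about each other (made-up sample data).
hr = make_source(
    "CREATE TABLE eligibility (emp_no INTEGER PRIMARY KEY, joined TEXT, designation TEXT)",
    "INSERT INTO eligibility VALUES (?, ?, ?)",
    [(101, "2001-04-01", "Manager"), (102, "2005-09-15", "Analyst")],
)
records = make_source(
    "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, address TEXT, stock_limit INTEGER)",
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Pune", 500), (102, "Bangalore", 200)],
)
accounts = make_source(
    "CREATE TABLE allocation (year INTEGER PRIMARY KEY, shares_remaining INTEGER)",
    "INSERT INTO allocation VALUES (?, ?)",
    [(2006, 300)],
)

MIN_TENURE_DAYS = 365  # hypothetical eligibility rule, not taken from the text


def can_exercise(emp_no, shares, today):
    """Answer one request by pulling a fragment from each source and merging the results."""
    joined = hr.execute(
        "SELECT joined FROM eligibility WHERE emp_no = ?", (emp_no,)
    ).fetchone()
    limit = records.execute(
        "SELECT stock_limit FROM employee WHERE emp_no = ?", (emp_no,)
    ).fetchone()
    pool = accounts.execute(
        "SELECT shares_remaining FROM allocation WHERE year = ?", (today.year,)
    ).fetchone()
    if joined is None or limit is None or pool is None:
        return False  # unknown employee, or no allocation recorded for the year
    tenure_days = (today - date.fromisoformat(joined[0])).days
    return (
        tenure_days >= MIN_TENURE_DAYS
        and shares <= limit[0]
        and shares <= pool[0]
    )


today = date(2006, 6, 1)
print(can_exercise(101, 250, today))  # True: tenure, personal limit and pool all allow it
print(can_exercise(101, 400, today))  # False: exceeds the 300 shares left in the pool
```

Nothing is copied out of the three sources in this sketch: each call reads them at request time, so a change made in one system is visible to the very next query. A real EII server would add the caching, metadata and query optimization features mentioned above, which are omitted here.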

1.6 BENEFITS OF EII

Enterprise information integration's claim to fame is its ability to federate data. It provides a single point of access to disparate information sources. This reduces the complexity inherent in client applications attempting to join various data sources, while offering another way to access that information. Some major benefits of using this technology are:

- EII shields applications from details such as the location and format of the information, the protocols and query languages supported by the information sources, and the programming interfaces supported by the database servers. It also allows applications to process data independently of changes to the underlying data management infrastructure.
- EII can act as a virtual database that insulates the data warehouses from the impact of unmanaged raw queries.
- EII automates operations and data extraction from any type of data system: Web sources, relational databases, XML, Web Services, files (flat, PDF, Word, Excel, logs, etc.), data warehouses, applications, etc.
- EII supports service-oriented architectures, as data can be accessed through Web Services and predefined exchange formats (e.g. XML) can be generated.

1.7 LIMITATIONS OF EII

The EII model has a number of limitations:

- Querying data sources for reporting without degrading their performance is still an issue. It is an especially important issue when the source systems are older legacy systems.
- EII also tends to suffer from data quality problems, to an even greater extent than the data warehouse model. Data maintained in a data warehouse is extracted from operational sources and then standardized and cleansed. EII tools, by contrast, generate a virtual view of the data they assemble from various operational sources, and that view generally contains mismatched data.

1.8 AUTHOR'S NOTE

There is no doubt that EII is an important new tool for data architects. It fits in with and complements existing environments, filling the gap in delivering data from disparate sources to end users. Using EII and an XML data model, organizations can take the next step to on-demand intelligence and deliver a valuable business advantage to their users. Given the need for on-demand information and the advantages of EII, more companies will explore their options with this technology. They won't just want it, they'll demand it.

2 REFERENCES

http://www.ipedo.com - The EII zone
http://www.denodo.com - Discover the EII Integration
http://www.cutter.com - Cutter Consortium
http://java.sys-con.com - The World's Leading Java Resource
http://www.networkcomputing.com - Network Computing for IT by IT