
Data Virtualization Platform Maturity Model
Composite Software
September 2010

TABLE OF CONTENTS

INTRODUCTION
EVOLVING NEEDS, EVOLVING SOLUTIONS
HOW TO MEASURE DATA VIRTUALIZATION PLATFORM MATURITY
MATURITY DIMENSION
FUNCTIONALITY DIMENSION
COMBINING DIMENSIONS
    QUERY PROCESSING
    CACHING
    DATA ACCESS (FROM SOURCES)
    TRANSFORMATION (INCLUDES DATA QUALITY)
    DATA DELIVERY (TO CONSUMERS)
    SECURITY
    MODELING AND METADATA MANAGEMENT
    ENTERPRISE-SCALE OPERATION
CONCLUSION

INTRODUCTION

At an ever-accelerating pace, enterprises and government agencies are discovering innovative ways to leverage information to meet progressively more challenging financial and service-level objectives. Yet, to fulfill this explosive demand, data professionals face increasingly difficult data integration challenges, including:

- Constant business change necessitating immediate and evolving IT response;
- Growing data volumes and complexity that increase business risk and reduce agility; and
- Operational and financial constraints necessitating easy-to-adopt, cost-effective IT solutions that leverage prior investments.

Traditional approaches such as data consolidation and replication alone have not kept pace. As a result, data virtualization, an integration method that leverages modern virtualization principles, has evolved to complement these earlier investments and fill the business and IT gap.

In an environment of ever-evolving needs and solutions, enterprises and government agencies must select the right data virtualization offering to meet their needs. To provide a systematic assessment approach, Composite Software has developed a Data Virtualization Platform Maturity Model. This Data Virtualization Leadership Series white paper describes the model and how it can be used for both initial evaluation and ongoing optimization of data virtualization platforms.

EVOLVING NEEDS, EVOLVING SOLUTIONS

Originally deployed to meet light data federation requirements in BI environments, today's data virtualization use cases span a range of consuming applications, including customer experience management, risk management and compliance, supply chain management, mergers and acquisitions support, and more. Further, the range of data supported has grown beyond relational to include semi-structured XML, dimensional MDX, and the new NoSQL data types. Along the way, adoption has evolved from initial project-level deployments to enterprise-scale data virtualization layers that share data from multiple sources across multiple applications and uses.

At the same time, the data virtualization offerings themselves have evolved. From a vendor point of view, many of the early Enterprise Information Integration (EII) companies that entered the market in the early 2000s have been acquired or have exited the market, leaving a short list of suppliers able to meet today's more advanced data virtualization requirements. To fill this gap between supply and demand, new entrants from adjacent markets such as BI and Extract-Transform-Load (ETL) have recently announced data virtualization products that leverage these vendors' existing offerings.

Finally, within this vendor landscape, the functionality of the offerings has also evolved dramatically, across a range of functional categories with various levels of capability from entry-level to mature.

HOW TO MEASURE DATA VIRTUALIZATION PLATFORM MATURITY

In an environment of ever-evolving needs and solutions, enterprises and government agencies find determining the right data virtualization offering to meet their needs a significant challenge. To provide a systematic assessment approach, we developed a Data Virtualization Platform Maturity Model. This model has two critical dimensions. The first uses a five-stage maturity timeline to provide a common framework for measuring the phases typical in software innovation. The second looks at key functionality categories that, when successfully combined, create viable data virtualization platforms.

Once a data virtualization offering has been selected, the comprehensive detail within the Data Virtualization Platform Maturity Model continues to provide value to IT strategists and enterprise architects during ongoing deployment. It can be applied:

- When developing a data virtualization capabilities adoption roadmap;
- When aligning staff development, release deployment, and related plans with the adoption roadmap; and
- When measuring the viability of the selected data virtualization offering over time.

MATURITY DIMENSION

The first dimension in the Data Virtualization Maturity Model measures the five stages of product maturity as follows:

- Entry Level: First product release that implements a minimal set of functionality to credibly enter the market.
- Limited: Follow-on product release(s) aimed at satisfying initial customer demands within narrow (often vertical-market) use cases.
- Intermediate: Product releases where the functionality expands rapidly based on traction in the marketplace. Feature rich, these releases address a growing market and an expanding set of use cases.
- Advanced: Product releases addressing more complex use cases as well as supporting large-scale, enterprise-wide infrastructure requirements.
- Mature: Product releases that increase functional depth and expand market penetration, often incorporating functionality from adjacent functionality areas.

The relation between data virtualization platform maturity and time can be seen in Figure 1.

[Figure 1. Product Maturity over Time: the five stages, Entry Level through Mature, plotted against time in years.]

FUNCTIONALITY DIMENSION

The second dimension in the Data Virtualization Maturity Model is functionality. Derived from Composite Software's millions of hours of operational deployment at Global 2000 enterprises and government agencies and hundreds of man-years of R&D, the following eight functional categories combine to form a viable enterprise-level data virtualization platform:

- Query Processing
- Caching
- Data Access (from Sources)
- Transformation (includes Data Quality)
- Data Delivery (to Consumers)
- Security
- Modeling and Metadata Management
- Enterprise-scale Operation

COMBINING DIMENSIONS

By overlaying the five stages of the maturity dimension across the eight categories of the functionality dimension, enterprises and government agencies can use the Data Virtualization Platform Maturity Model to gain a comprehensive understanding of key capabilities by stage.

QUERY PROCESSING

At its core, data virtualization's primary purpose is on-demand query of widely dispersed enterprise data. Consequently, data virtualization platforms must ensure these queries are efficient and responsive. If the high-performance query processing engine is immature or poorly architected, the rest of the functionality is of little consequence. Maturity is typically measured by the breadth and efficiency of optimization algorithms.

- Entry Level: Process mainstream queries correctly; limited join algorithms; projection pruning.
- Limited: Full implementation of relational algebra semantics; multi-threaded and parallel query processing; complete set of relational join algorithms; automatic rule-based optimizations; push predicates down to underlying data sources; data-source-specific optimizations; dynamic memory management for large data set support; single-source updates.
- Intermediate: Complete standard SQL support; complete federated query plan support, including plan caching; support for multiple data shapes (i.e., scalar, tabular, hierarchical), including limited XML support; advanced join techniques (e.g., distributed semi-join); user-provided query hints; limited cost-based optimizations.
- Advanced: Transformations between data shapes; complete standard XML support, including XSLT and XQuery manipulation; advanced cost-based optimizations based on statistics (including query plan rearrangement); exotic use-case-specific optimizations (e.g., automatic UNION-JOIN inversion); multi-source updates, including support for transactions; scripting environment for procedural logic; platform-specific query pass-through.
- Mature: Star- and snowflake-schema optimizations; adaptive optimizations driven by learned patterns; user-defined functions.

CACHING

Traditional data integration solutions periodically consolidate data in physical stores. In contrast, data virtualization platforms dynamically combine data in-memory, on demand. Caching addresses the middle ground between these two approaches by enabling optional pre-materialization of queries' result sets. This flexibility can improve query performance, work around unavailable sources, reduce source system loads, and more. Maturity is measured by the breadth of caching options across factors such as triggering, storage, distribution, and update.
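The optional pre-materialization described above can be sketched in a few lines of Python. This is purely an illustrative model, not Composite's implementation: the `federated_query` function, the SQL-text cache key, and the basic periodic (TTL-based) refresh policy are all hypothetical stand-ins.

```python
import time

class QueryResultCache:
    """Illustrative periodic-refresh cache for federated query results."""

    def __init__(self, source_fn, ttl_seconds):
        self.source_fn = source_fn  # executes the live federated query
        self.ttl = ttl_seconds      # "periodic" refresh policy, the entry-level case
        self._store = {}            # query text -> (result, fetched_at)

    def query(self, sql):
        hit = self._store.get(sql)
        if hit is not None and time.time() - hit[1] < self.ttl:
            return hit[0]           # serve materialized result, no source load
        result = self.source_fn(sql)  # cache miss or stale: query the sources
        self._store[sql] = (result, time.time())
        return result

# Usage: wrap a (hypothetical) live query function and observe the reduced load.
calls = []
def federated_query(sql):
    calls.append(sql)
    return [("row1",), ("row2",)]

cache = QueryResultCache(federated_query, ttl_seconds=300)
cache.query("SELECT * FROM orders")
cache.query("SELECT * FROM orders")  # second call served from the cache
```

The more mature options in the table below (incremental refresh, cluster-shared storage, adaptive policies) replace the fixed TTL and local dictionary shown here.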

- Entry Level: Materialization of tabular data sets; local storage; basic cache refresh policies (i.e., periodic).
- Limited: Consider caches in optimization decisions; multiple cache refresh policies (periodic, aging, external events); relational database cache storage, including DDL support.
- Intermediate: Procedure result caching, including web service result caching; incremental cache refresh (leveraging change data capture).
- Advanced: Cluster-shared caches; multi-cluster edge caching.
- Mature: Adaptive dynamic caching based on learned patterns; distributed cache storage; in-memory cache storage.

DATA ACCESS (FROM SOURCES)

There are a wide variety of structured and semi-structured data sources in a typical large enterprise. Data virtualization platforms must reach and extract data efficiently from all of them. Further, they must include methods to programmatically extend data source access to handle unique, non-standard data sources. Maturity is measured by the breadth of data source formats and protocols supported.

- Entry Level: Limited set of relational databases; tabular files.
- Limited: Expanded set of relational databases; basic web services over HTTP; Excel spreadsheets; XML files.
- Intermediate: Data-source-specific query pass-through; packaged application data access (e.g., SAP, Siebel); data warehouse support; LDAP data source support; stored procedure support in relational databases; message-based web service access (e.g., JMS).
- Advanced: Multi-dimensional data source access (including MDX code generation); NoSQL data source integration (e.g., Hadoop/HBase); cloud-based data source integration; industry-specific data structure support (e.g., geospatial, molecular); pass-through authentication to underlying data sources; complete web service support, including REST; legacy mainframe support (e.g., VSAM).
- Mature: Custom-developed data source drivers leveraging native APIs.
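The programmatic extension point that this category culminates in, custom drivers for unique sources, can be illustrated with a small sketch. The adapter interface below is hypothetical, not Composite's actual driver API; it shows only the general idea of normalizing any source into named columns and tabular rows behind one contract.

```python
import csv
import io

class DataSourceAdapter:
    """Hypothetical extension point: every adapter exposes a source as rows."""

    def columns(self):
        raise NotImplementedError

    def fetch_rows(self):
        raise NotImplementedError

class CsvAdapter(DataSourceAdapter):
    """Wraps a tabular file, an entry-level source, behind the same interface
    a custom mainframe or NoSQL driver would implement."""

    def __init__(self, text):
        self._rows = list(csv.reader(io.StringIO(text)))

    def columns(self):
        return self._rows[0]       # header row becomes the column list

    def fetch_rows(self):
        return self._rows[1:]      # remaining rows become the data

# Usage: the virtualization layer would treat this like any other source.
adapter = CsvAdapter("id,name\n1,alice\n2,bob")
```

Because the query engine only sees the adapter contract, adding a new source type never changes the consuming views.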

TRANSFORMATION (INCLUDES DATA QUALITY)

Because source data is rarely a 100 percent match with data consumer needs, data virtualization platforms must transform and improve data, typically abstracting disparate source data into standardized canonical models for easier sharing by multiple consumers. Maturity is measured by the ease of use, breadth, flexibility, and extensibility of transformation functions.

- Entry Level: Basic SQL functions; aliasing.
- Limited: All standard SQL functions; derived values; value standardization*; value enrichment*.
- Intermediate: Support for multiple data shapes (scalar, tabular, and hierarchical); complete tabular and hierarchical data type representation.
- Advanced: Transformations between data shapes; scripting environment for procedural logic; SQL/XML 2007 support; third-party validation*; third-party enrichment*.
- Mature: User-defined functions; universal data type conversion.

* Denotes functionality traditionally associated with Data Quality.

DATA DELIVERY (TO CONSUMERS)

Enterprise end users consume data using a wide variety of applications, visualization tools, and analytics. Data virtualization platforms must deliver data to these consumers using the standards-based data access mechanisms they require. Further, they must enable delivery of common data to different consumers via different methods, for example as an XML document via SOAP and as a relational view via ODBC. Maturity is measured by the breadth of data consumer formats and protocols supported.
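The multi-method delivery idea, the same canonical data offered as a relational row set to one consumer and as an XML document to another, can be sketched as follows. Both renderings are illustrative only; the canonical view, its fields, and the function names are invented for this example rather than taken from any product.

```python
from xml.etree import ElementTree as ET

# Hypothetical canonical view, the single logical definition both consumers share.
ORDERS = [{"id": 1, "total": 9.5}, {"id": 2, "total": 20.0}]

def as_rows(view):
    """Tabular delivery: the shape an ODBC/JDBC consumer would receive."""
    return [(r["id"], r["total"]) for r in view]

def as_xml(view):
    """Document delivery: the shape a SOAP or REST consumer would receive."""
    root = ET.Element("orders")
    for r in view:
        ET.SubElement(root, "order", id=str(r["id"]), total=str(r["total"]))
    return ET.tostring(root, encoding="unicode")
```

The point of the sketch is that neither delivery function duplicates business logic: both are thin projections of one canonical view, which is what keeps multiple delivery protocols cheap to add.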
- Entry Level: Basic ODBC or JDBC connectivity; basic web services support.
- Limited: Full ODBC and JDBC standard support; ADO.NET support; prepared statement support.
- Intermediate: Full web services support, including REST; analytical functions.
- Advanced: Contract-first web service implementation; scheduled queries with e-mailed results.
- Mature: Message-based data delivery; lightweight solutions embedded in the client; result set pagination.

SECURITY

Data virtualization platforms must secure the data that passes through them. Deploying data virtualization should not force reinvention of existing, well-developed security policies; it should leverage the standards and security frameworks already implemented in the enterprise. Maturity is

measured by the breadth of authentication, authorization, and encryption standards supported, as well as by the degree of transparency.

- Entry Level: Built-in user authentication; basic access privileges.
- Limited: Support for groups and/or roles; support for standard CRUD privileges for groups and individuals.
- Intermediate: Leverage external LDAP authentication systems (e.g., Active Directory); support for the GRANT privilege model; pass-through of user credentials to underlying data sources; support for web service security standards (e.g., SSL, WS-Security).
- Advanced: Token-based authentication, including SSO, Kerberos, and NTLM; data encryption in wire protocols.
- Mature: Policy-based authentication; column-level security; row-level security (i.e., redaction and masking).

MODELING AND METADATA MANAGEMENT

Modeling and development productivity, with its concomitant faster time to solution, is one of data virtualization's biggest benefits. To ensure data modeler and developer adoption, the tools must be intuitive to use and standards-based. Further, they must automate key work steps including data discovery, code generation, in-line testing, and more, and they must provide tight links to source control systems, metadata repositories, and the like. Maturity is measured by the degree to which the data virtualization platform makes easy things easy and hard things possible.
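Much of this category rests on machine-readable metadata. As one illustration, data lineage, tracing a consumer view back to its ultimate sources, reduces to a walk over derivation metadata. The dictionary below is a toy stand-in with invented view and source names, not any product's repository format.

```python
# Hypothetical metadata: each view records the objects it is derived from.
DERIVATIONS = {
    "customer_360": ["crm.accounts", "billing.invoices"],
    "billing.invoices": ["erp.invoice_table"],
}

def lineage(obj, derivations):
    """Walk derivation edges depth-first to list every upstream dependency."""
    upstream = []
    for parent in derivations.get(obj, []):
        upstream.append(parent)
        upstream.extend(lineage(parent, derivations))  # recurse past views
    return upstream
```

Graphical lineage tools in this category present essentially this traversal visually, so that a change to `erp.invoice_table` can be traced forward to every affected consumer view before it breaks anything.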
- Entry Level: Basic drag-and-drop query editor; interactive testing and debugging; model import and export.
- Limited: Graphical query editor with support for all major SQL constructs; textual editors for hierarchical (XML) data transformations; graphical tools to examine data lineage; interactive data source metadata introspection; metadata export/import; metadata migration utilities; metadata search and query; third-party metadata repositories.
- Intermediate: Query plan visualizer with live monitoring; hierarchical (XML) to tabular graphical transformation editor; tabular to hierarchical (XML) graphical transformation editor; graphical editor to combine data of multiple shapes; integration with source code control systems; rule-based triggers; multi-user resource management; metadata management API.
- Advanced: Graphical transformation editor for multidimensional data; graphical editors for complex data types (e.g., XML schemas); data profiling tools with multi-source relationship discovery (i.e., inter-silo schema discovery); integration with adjacent data manipulation tools.
- Mature: Any-to-any graphical transformation editor; scripting debugger.

ENTERPRISE-SCALE OPERATION

Because data virtualization serves critical business needs 7x24x365, operational support is a core requirement in enterprise data virtualization deployments. Data virtualization platforms must be highly deployable, reliable, available, scalable, manageable, and maintainable. Maturity is measured by the breadth and depth of operational support capabilities.

- Entry Level: Basic logging; basic management consoles.
- Limited: Support for all major enterprise operating system platforms (Windows, Solaris, Linux, AIX, HP-UX); logging of all major activity; management consoles for all major functionality; Unicode and internationalization.
- Intermediate: Clustering for horizontal scaling, failover, and disaster recovery; automatic and dynamic synchronization of metadata across a cluster; complete management consoles; integration with SNMP monitoring systems; support for 64-bit processor architectures.
- Advanced: Cluster-wide visibility and monitoring consoles; cluster-shared caching; real-time monitoring of running queries; integration with third-party NOC tools and infrastructure (including DAM solutions).
- Mature: Geographically distributed cooperating clusters; integration with adjacent data management infrastructures (e.g., data governance solutions); management and administration API.

CONCLUSION

Data virtualization platform functionality has evolved to meet changing IT demands. The Data Virtualization Platform Maturity Model described in this paper provides a detailed, systematic approach that supports the initial evaluation of a data virtualization platform. The model may also be applied during the development of a data virtualization adoption roadmap; during the alignment work of executing that roadmap; and to measure, over time, the viability of the selected data virtualization offering.

ABOUT COMPOSITE SOFTWARE

Composite Software, Inc. is the only company that focuses solely on data virtualization. Global organizations faced with disparate, complex data environments, including ten of the top 20 banks, six of the top ten pharmaceutical companies, four of the top five energy firms, major media and technology organizations, and government agencies, have chosen Composite's proven data virtualization platform to fulfill critical information needs faster and with fewer resources. Scaling from project to enterprise, Composite's middleware enables data federation, data warehouse extension, enterprise data sharing, and real-time and cloud computing data integration. Founded in 2002, Composite Software is a privately held, venture-funded corporation based in Silicon Valley. For more information, please visit www.compositesw.com.