INTRODUCTION TO K2VIEW FABRIC

A WHITE PAPER BY K2VIEW

ABSTRACT

Across industries, the amount of data to be managed is growing exponentially, and with it the need for modern, fully distributed and scalable data management systems, often referred to as big data architectures. As this new market opens, many solutions are emerging that provide fully distributed and scalable systems to manage big data. These solutions address the volume and administration problems but still present some caveats:

- They often require a lot of effort to integrate into an existing, mature environment.
- They still rely on the same outdated data representation, the relational database model (see details in the next section).

K2View Fabric is designed to solve today's big data problem while alleviating these caveats, as this white paper will introduce.

K2VIEW FABRIC

K2View Fabric is an innovative and revolutionary database management system residing on top of Apache Cassandra. It provides an easy, secure and reliable way to consolidate your data and distribute it over your network for high availability.

AT THE HEART OF K2VIEW FABRIC: THE LOGICAL UNIT

K2View Fabric uses a game-changing data model to retrieve and store data: the Logical Unit. Most database management systems store data based on the type of data being stored (e.g. customer data, financial data, address data, device data). This model translates into very large tables that must be queried using complex joins every time one wants to access business-relevant data (e.g. how many payments has this customer made within the past three months?).

K2View's solutions look at data in a different way: storing and retrieving it based on business logic, hence the name Logical Unit. This allows the business to easily design K2View Fabric's base schema around its needs, as opposed to trying to fit those needs into a predefined structure.

Indeed, in K2View Fabric, every business-related object (e.g. Customer, Merchant) is represented by a Logical Unit Type. Each Logical Unit Type is then associated with a schema. This schema defines the relevant input objects associated with one Logical Unit Type. This process is either automated using the K2View Fabric Auto-Discovery module or performed manually using K2View's drag-and-drop graphical configuration dashboard, K2View Fabric Studio. The result is a business-oriented structure containing tables and objects from as many systems as needed (e.g. for a Customer Logical Unit Type, 3 tables from the CRM system running on MySQL and 5 tables from the billing system residing on Oracle).

This schema is used every time data is accessed in K2View Fabric: using embedded migration (ETL) capabilities, the data is processed, stored and distributed as Logical Unit Instances. Managing data as these logical, compressed and encrypted mini-databases enables incredible performance, enhanced security, high availability and customizable data synchronization. As such, the Logical Unit concept is a bridge between scattered, hard-to-maintain data and highly available, business-oriented data.
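To make the Logical Unit idea more concrete, the sketch below models a Logical Unit Type as a small schema description and a Logical Unit Instance as a per-customer mini-database. It is only an illustration under stated assumptions: the class and function names (LogicalUnitType, build_instance, payments_since), the table and source names, and the use of SQLite as the mini-database are all hypothetical and are not taken from K2View Fabric itself.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class LogicalUnitType:
    """Hypothetical description of a Logical Unit Type schema."""
    name: str
    # Maps a source system name to the tables pulled from that system.
    sources: dict = field(default_factory=dict)


def build_instance(payments):
    """Build one Logical Unit Instance as a tiny in-memory database.

    SQLite is only a stand-in for the mini-database of one instance.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE payments (paid_on TEXT, amount REAL)")
    db.executemany("INSERT INTO payments VALUES (?, ?)", payments)
    return db


def payments_since(instance, start_date):
    """Count payments on or after start_date (ISO dates compare as text)."""
    row = instance.execute(
        "SELECT COUNT(*) FROM payments WHERE paid_on >= ?", (start_date,)
    ).fetchone()
    return row[0]


# Hypothetical schema: tables gathered from two source systems.
customer_type = LogicalUnitType(
    name="Customer",
    sources={
        "crm_mysql": ["customer", "address", "contact"],
        "billing_oracle": ["payments"],
    },
)

# One instance holds the data of a single customer only, so the question
# "how many payments since a given date?" needs no join on shared tables.
customer_42 = build_instance([
    ("2015-01-10", 30.0),
    ("2015-03-02", 45.5),
    ("2015-04-18", 12.0),
])
print(payments_since(customer_42, "2015-02-01"))  # 2
```

Because each instance contains one business object's data, the query stays a simple single-table statement regardless of how many customers exist overall.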

ARCHITECTURE

The diagram below illustrates an overview of K2View Fabric's architecture:

INSIDE K2VIEW FABRIC

- CONFIGURATION: This layer contains the versioned configuration of every Logical Unit Type. It is accessed through our administration tools (K2 Admin Manager, K2View Fabric Studio and Web Admin interfaces).
- WEB/DATABASE SERVICES: This layer is used to communicate with user applications, either via direct queries (database services) or via web services.
- AUTHENTICATION ENGINE: This layer manages user access control and restrictions.
- MASKING LAYER: This optional layer allows real-time masking of sensitive data.
- PROCESSING ENGINE: This layer is where every data computation is managed. It uses the principles of massive parallel processing and map-reduce in order to execute operations.
- SMART DATA CONTROLLER: This layer drives the real-time synchronization of data to K2View Fabric.
- ETL LAYER: This layer is K2View Fabric's embedded migration layer, allowing for automated ETL on retrieval.
- ENCRYPTION ENGINE: This layer manages the granular encryption of each data set.
- LU STORAGE MANAGER: This layer compresses the data and sends it to the distributed database for storage.

K2View Fabric leverages Cassandra as the distributed database. Communication with the distributed database is very straightforward, making K2View Fabric a flexible solution that can be adapted to any other distributed database.
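As an illustration of what a masking layer does when data is read, the following minimal sketch masks sensitive fields in a row before it is returned to the caller. The function names (mask_value, mask_row), the field names and the masking rule (keep the last four characters) are assumptions made for this example; the white paper does not describe how the MASKING LAYER is actually implemented.

```python
def mask_value(value, visible=4, mask_char="*"):
    """Hide all but the last `visible` characters of a sensitive value."""
    if len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


def mask_row(row, sensitive_fields):
    """Return a copy of the row with the sensitive fields masked on read."""
    return {
        key: mask_value(str(val)) if key in sensitive_fields else val
        for key, val in row.items()
    }


# Hypothetical row; the stored data itself is left untouched.
row = {"name": "Alice", "card_number": "4111111111111111"}
print(mask_row(row, {"card_number"}))
# {'name': 'Alice', 'card_number': '************1111'}
```

Masking at read time, as sketched here, leaves the stored value intact, so the same record can be served masked or unmasked depending on the caller.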

BIG DATA FEATURES

As presented above, K2View Fabric's architecture is built to address the challenges of big data. It therefore features state-of-the-art capabilities such as:

- In-memory distributed performance
- Linear scalability on commodity hardware
- Consistency, durability and high availability
- Full SQL support and standard database connectors

This section gives a brief overview of how K2View Fabric provides these features. For more details about K2View Fabric features, please refer to our Technical White Paper.

PERFORMANCE

K2View Fabric's principal performance feature is its inherent Logical Unit representation, which runs every query on a small amount of data: this feature makes K2View Fabric the fastest database on the market. On top of this inherent design, K2View Fabric ensures performance using the following two major principles:

- Every query is executed in memory. For analytics queries running across several Logical Unit Instances, K2View Fabric implements a proprietary map-reduce algorithm that breaks the analytic query down into small jobs distributed across K2View Fabric's nodes.
- Every computation is driven by the K2View Fabric processing engine, which allows it to be executed and distributed across any node, thus offering Massive Parallel Processing (MPP).

LINEAR SCALABILITY/LOW TCO

As opposed to many big data solutions offering high-end in-memory performance, K2View Fabric does not require storing all data in memory, nor expensive hardware for scaling up performance. Thus K2View Fabric offers a very low Total Cost of Ownership (TCO). It relies on three very simple cornerstones:

- In-memory performance on commodity hardware: only the computations are done in memory; the data is compressed and stored on disk.
- Complete linear scalability: driven by the distributed database.
- Risk-free integration: see details in the next section.
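The map-reduce principle described under PERFORMANCE can be sketched in a few lines. This is a generic illustration, not K2View Fabric's proprietary algorithm: the data, the function names (map_job, reduce_job, run_analytic_query) and the use of a local thread pool as a stand-in for cluster nodes are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical data: each Logical Unit Instance holds one customer's payments.
instances = {
    "customer_1": [30, 45],
    "customer_2": [12],
    "customer_3": [],
    "customer_4": [99, 5, 20],
}


def map_job(payments):
    """Map step: runs against a single instance, returns (count, total)."""
    return len(payments), sum(payments)


def reduce_job(left, right):
    """Reduce step: merges two partial results into one."""
    return left[0] + right[0], left[1] + right[1]


def run_analytic_query(data, workers=2):
    """Break an analytic query into small per-instance jobs and merge them."""
    # The thread pool stands in for the nodes of a distributed deployment.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_job, data.values()))
    return reduce(reduce_job, partials, (0, 0))


count, total = run_analytic_query(instances)
print(count, total)  # 6 211
```

Each map job touches only one instance's data, which is why such jobs can be placed on any node and run in parallel before the partial results are combined.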
CONSISTENCY, DURABILITY, AVAILABILITY

K2View Fabric ensures full consistency, guaranteed durability and high availability of the data it contains. Consistency is ensured by the Processing Engine of K2View Fabric, which uses an internal, distributed transaction table to determine whether a concurrent transaction is occurring and whether the write should be put on hold. Durability and high availability are inherent features of the distributed database layer (Cassandra).

FULL SQL/STANDARD CONNECTORS

The K2View Fabric Processing Engine uses two query methods, depending on the type of data on which the query is executed:

- Query on a single Logical Unit Instance (around 95% of overall queries): simple ANSI SQL query.
- Query across Logical Unit Instances (analytics): map-reduce engine reproducing the SQL protocol.

Both methods support everything that is supported in ANSI SQL. K2View Fabric also provides proprietary indexing functionality that not only allows indexing for faster performance but also regulates user access. Finally, K2View Fabric provides full JDBC support and features connectors to the most common databases on the market (e.g. Oracle, MySQL, PostgreSQL, Netezza, SQL Server).
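The role of a transaction table in holding back concurrent writes can be illustrated with the sketch below. It is a single-process simplification under stated assumptions: the class name TransactionTable and its methods are hypothetical, and a local lock replaces the distributed coordination that the white paper attributes to the Processing Engine.

```python
import threading


class TransactionTable:
    """Hypothetical table of open write transactions per LU Instance."""

    def __init__(self):
        self._lock = threading.Lock()
        self._open = {}  # instance_id -> id of the transaction writing to it

    def try_begin(self, instance_id, txn_id):
        """Register a write; return False if the write must be put on hold."""
        with self._lock:
            if instance_id in self._open:
                return False
            self._open[instance_id] = txn_id
            return True

    def commit(self, instance_id, txn_id):
        """Release the instance so that a held write can proceed."""
        with self._lock:
            if self._open.get(instance_id) == txn_id:
                del self._open[instance_id]


table = TransactionTable()
print(table.try_begin("customer_42", "txn-A"))  # True
print(table.try_begin("customer_42", "txn-B"))  # False: txn-B is put on hold
print(table.try_begin("customer_7", "txn-C"))   # True: a different instance
table.commit("customer_42", "txn-A")
print(table.try_begin("customer_42", "txn-B"))  # True: txn-B can now write
```

Note that writes to different instances never block one another in this sketch, which matches the idea of keeping contention local to a single Logical Unit Instance.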

KEY DIFFERENTIATORS

While K2View Fabric offers the best features of big data architectures, it also provides unique functionalities that differentiate it from any other solution on the market, including:

- Embedded ETL/data masking
- Embedded web services
- Flexible synchronization
- Row-level security

EMBEDDED ETL/DATA MASKING

K2View's industry-proven ETL capabilities are embedded into K2View Fabric. The principles of the ETL are based on the Logical Unit data representation: once a Logical Unit's schema is defined, K2View Fabric automatically creates a migration path from all sources into that Logical Unit. Any type of enrichment (adding fields, masking fields, etc.) can be applied during this definition. The ETL layer is triggered automatically, when needed, by the Smart Data Controller, removing any need for external ETL tools or costly migration projects.

EMBEDDED WEB SERVICES

K2View Fabric offers an out-of-the-box graphical configuration interface to define web services: any function (which can be as simple as a query) can be created and registered as a web service. Once the function is defined, K2View Fabric automatically handles user access, distribution, updates due to schema changes, etc. The gain in time and effort is tremendous compared to traditional database management systems, which require developing, distributing and maintaining a communication layer between them and your applications.

The figure above illustrates the conceptual difference between the integration of a traditional solution (regardless of its architecture) and that of K2View Fabric. In a traditional solution, multiple complex custom elements must be developed in order to retrieve data from pre-existing systems. K2View Fabric, on the other hand, removes any need for custom upstream or downstream development.
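The idea of registering an ordinary function as a web service, with access control attached at registration time, can be sketched as follows. This is not K2View Fabric's interface (which the white paper describes as graphical): the decorator web_service, the registry SERVICES, the dispatcher call_service, the role names and the sample data are all hypothetical.

```python
SERVICES = {}  # service name -> (function, roles allowed to call it)


def web_service(name, allowed_roles):
    """Register a plain function as a named service with an access list."""
    def decorator(func):
        SERVICES[name] = (func, frozenset(allowed_roles))
        return func
    return decorator


@web_service("customer/payment_count", allowed_roles={"support", "billing"})
def payment_count(customer_id):
    # Hypothetical data; a real function would query a Logical Unit Instance.
    payments = {"42": 3, "7": 1}
    return payments.get(customer_id, 0)


def call_service(name, role, *args):
    """Dispatch a call, enforcing the access list declared at registration."""
    func, allowed = SERVICES[name]
    if role not in allowed:
        raise PermissionError(f"role {role!r} may not call {name!r}")
    return func(*args)


print(call_service("customer/payment_count", "support", "42"))  # 3
```

Keeping the access list next to the function definition means the dispatcher, not each application, is responsible for enforcing who may call what.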

FLEXIBLE SYNCHRONIZATION

K2View Fabric's flexible data synchronization features are driven by its Smart Data Controller: any time data is accessed in K2View Fabric, the Smart Data Controller compares the current state of the data in K2View Fabric against the synchronization parameters and updates the data if needed, whether the update is due to a change in the K2View Fabric schema or is triggered by one of the synchronization modes described below.

ON-DEMAND SYNC

K2View Fabric allows data synchronization to be triggered by on-demand calls. These calls can be triggered by web services, by batch scripts or by directly querying K2View Fabric (administrative mode).

EVENT-BASED SYNC

Alternatively, synchronization can be triggered using the principles of Change Data Capture (CDC). Using this mode, K2View Fabric automatically captures changes in the source systems that are part of its schema.

ALWAYSYNC

K2View Fabric features an intelligent and flexible way to synchronize data: AlwaySync. This mode allows complete granularity over the data that needs to be synchronized with source systems. Using AlwaySync, K2View Fabric allows you to configure what data needs to be refreshed automatically, and how frequently. For each element of the K2View Fabric schema, an AlwaySync timer is set that drives its synchronization (e.g. if the usage information from the Customer table needs to be updated every 5 minutes, a timer of 5 minutes is set).

ROW-LEVEL SECURITY

K2View Fabric features a proprietary algorithm, the Hierarchical Encryption-Key Schema (HEKS), that allows complete control over your data encryption. It relies on three sets of keys:

- Master Key: Generated during K2View Fabric installation, this is the main key allowing access to every resource of K2View Fabric.
- Type Keys: These keys restrict access at the Logical Unit Type level and are a hash of the Master Key.
- Instance Keys: These keys restrict access at the Logical Unit Instance level and are a hash of their corresponding Type Key.

In the figure above, you can see how HEKS is implemented for two LU Types. Indeed, you can see the following keys:

- 1 Master Key allowing full access
- 2 Type Keys restricting access to 2 different LU Types
- 6 Instance Keys, 3 for each LU Type, restricting access at the LU Instance level

Using this hierarchical encryption, K2View Fabric allows complete control over the stored data and significantly reduces the risk of data leaks: even if one Instance Key were to be hacked, only the data of one instance would be leaked; all other instances' data would still be safely encrypted. Therefore, this design makes K2View Fabric the most secure database on the market, essentially rendering massive data breaches impossible.
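A hierarchical key scheme of the kind described for HEKS can be sketched as below. The white paper only states that Type Keys are a hash of the Master Key and Instance Keys a hash of their Type Key; the choice of HMAC-SHA256, the label format and the function name derive_key are assumptions made for this illustration, not the actual HEKS algorithm.

```python
import hashlib
import hmac
import secrets


def derive_key(parent_key, label):
    """Derive a child key from its parent (HMAC-SHA256 is an assumption)."""
    return hmac.new(parent_key, label.encode("utf-8"), hashlib.sha256).digest()


# Master Key: generated once, at installation time.
master_key = secrets.token_bytes(32)

# Type Keys: one per Logical Unit Type, derived from the Master Key.
customer_type_key = derive_key(master_key, "type:Customer")
merchant_type_key = derive_key(master_key, "type:Merchant")

# Instance Keys: one per Logical Unit Instance, derived from its Type Key.
customer_keys = {
    cid: derive_key(customer_type_key, f"instance:{cid}")
    for cid in ("1", "2", "3")
}

# A holder of the parent key can recompute any child key on demand...
assert derive_key(customer_type_key, "instance:2") == customer_keys["2"]
# ...while every instance ends up with its own distinct key.
assert len(set(customer_keys.values())) == 3
assert customer_type_key != merchant_type_key
```

Because the derivation is one-way, holding a single Instance Key in this sketch gives no practical way to recover the Type Key or the keys of sibling instances, which is the containment property the section describes.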

SUPPORTED FEATURES

Solutions compared: Traditional, Big-Data, Fabric

Features compared:

- No single point of failure (SPoF)
- Consistency, durability and high availability
- Low TCO and in-memory performance
- Embedded ETL
- Embedded data masking
- Embedded web-service layer
- Row-level security

FREQUENTLY ASKED QUESTIONS

What is the main difference between security in K2View Fabric and in a traditional RDBMS?

A traditional RDBMS can't restrict and encrypt access at an instance level. You either have access to the full table containing customer information or you don't. Using K2View Fabric, you can define row-level security.

With such rich synchronization features, how do you ensure performance?

K2View Fabric provides high-end performance by first processing only the data related to one Logical Unit Instance, hence reducing the amount of data. Moreover, the processing layer only executes actions in memory and maintains a data cache for frequent use. Finally, for processing across Logical Unit Instances, K2View Fabric uses map-reduce to implement fast queries.

What is the difference between fully migrating to K2View Fabric and migrating to a traditional RDBMS?

Migration is a feature of K2View Fabric. Migrations to a traditional RDBMS require the development, testing and deployment of a specific migration tool.

How many processing/sync/data storage layers are there in K2View Fabric?

There are as many layers as there are Cassandra nodes in your deployment. This allows for full parallel execution between nodes.

CONFIDENTIALITY

This document contains copyrighted work and proprietary information belonging to K2View. This document and information contained herein are delivered to you as is, and K2View makes no warranty whatsoever as to its accuracy, completeness, fitness for a particular purpose, or use. Any use of the documentation and/or the information contained herein is at the user's risk, and K2View is not responsible for any direct, indirect, special, incidental, or consequential damages arising out of such use of the documentation. Technical or other inaccuracies, as well as typographical errors, may occur in this Guide.

This document and the information contained herein and any part thereof are confidential and proprietary to K2View. All intellectual property rights (including, without limitation, copyrights, trade secrets, trademarks, etc.) evidenced by or embodied in and/or attached, connected, or related to this Guide, as well as any information contained herein, are and shall be owned solely by K2View. K2View does not convey to you an interest in or to this Guide, to information contained herein, or to its intellectual property rights, but only a personal, limited, fully revocable right to use the Guide solely for reviewing purposes. Unless explicitly set forth otherwise, you may not reproduce by any means any document and/or copyright contained herein. Information in this Guide is subject to change without notice. Corporate and individual names and data used in examples herein are fictitious unless otherwise noted.

Copyright 2015 K2View Ltd./K2VIEW LLC. All rights reserved. The following are trademarks of K2View: K2View logo, K2View's platform. K2View reserves the right to update this list from time to time. Other company and brand products and service names in this Guide are trademarks or registered trademarks of their respective holders.

CONTACT INFORMATION

www.k2view.com
info@k2view.com
+1-844-438-2443