INTRODUCTION TO K2VIEW FABRIC
A WHITE PAPER BY K2VIEW

ABSTRACT

Across industries, the amount of data to be managed is growing exponentially, and with it the need for modern, fully distributed and scalable data management systems, often referred to as big data architectures. As this new market opens, many solutions have emerged that provide fully distributed and scalable systems to manage big data. These solutions do answer the volume and administration problems but still present some caveats:

- They often require a lot of effort to integrate into an existing mature environment.
- They still use the same outdated data representation, the relational database model (see details in the next section).

K2View Fabric is designed to solve today's big data problem while alleviating these caveats, as this white paper will introduce.

K2VIEW FABRIC

K2View Fabric is an innovative and revolutionary database management system residing on top of Apache Cassandra. It provides an easy, secure and reliable way to consolidate your data and distribute it over your network for high availability.
AT THE HEART OF K2VIEW FABRIC: THE LOGICAL UNIT

K2View Fabric uses a game-changing data model to retrieve and store data: the Logical Unit. Most database management systems store data based on the type of data being stored (e.g. customer data, financial data, address data, device data). This model translates into very large tables that must be queried using complex joins every time one wants to access business-relevant data (e.g. how many payments has this customer made within the past three months?).

K2View's solutions look at data in a different way, storing and retrieving it based on business logic, hence the name Logical Unit. This allows the business to design K2View Fabric's base schema around its needs, as opposed to trying to fit those needs into a predefined structure.

In K2View Fabric, every business-related object (e.g. Customer, Merchant) is represented by a Logical Unit Type. Each Logical Unit Type is then associated with a schema. This schema defines the relevant input objects associated with one Logical Unit Type. The schema definition is either automated using the K2View Fabric Auto-Discovery module or performed manually using K2View's drag-and-drop graphical configuration dashboard, K2View Fabric Studio. The result is a business-oriented structure containing tables and objects from as many systems as needed (e.g. for a Customer Logical Unit Type, 3 tables from the CRM system running on MySQL and 5 tables from the billing system residing on Oracle).

This schema is used every time data is accessed in K2View Fabric: using embedded migration (ETL) capabilities, the data is processed, stored and distributed as Logical Unit Instances. Managing data as these logical, compressed and encrypted mini-databases enables high performance, enhanced security, high availability and customizable data synchronization. As such, the Logical Unit concept is a bridge between scattered, hard-to-maintain data and highly available, business-oriented data.
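As a rough illustration of the Logical Unit idea, the following Python sketch models a Logical Unit Type schema that names tables from two source systems and materializes one Logical Unit Instance as a small, self-contained database. Everything in it is an assumption made for illustration: the class and function names (SourceTable, LogicalUnitType, build_instance), the sample tables, and the use of in-memory SQLite as the mini-database. It is not K2View Fabric's actual API or storage format.

```python
import sqlite3
from dataclasses import dataclass, field

# Hypothetical stand-ins for two source systems (e.g. a CRM and a billing system).
CRM = {"customer": [(1, "Alice"), (2, "Bob")]}
BILLING = {"payment": [(1, 40.0), (1, 25.5), (2, 10.0)]}


@dataclass
class SourceTable:
    system: dict        # source system the table lives in
    name: str           # table name in the source system
    columns: str        # column definitions used in the mini-database
    key_index: int = 0  # position of the business key (customer id) in each row


@dataclass
class LogicalUnitType:
    name: str
    tables: list = field(default_factory=list)

    def build_instance(self, instance_id):
        """Collect every row belonging to one business entity into its own database."""
        db = sqlite3.connect(":memory:")
        for t in self.tables:
            db.execute(f"CREATE TABLE {t.name} ({t.columns})")
            rows = [r for r in t.system[t.name] if r[t.key_index] == instance_id]
            if rows:
                placeholders = ",".join("?" * len(rows[0]))
                db.executemany(f"INSERT INTO {t.name} VALUES ({placeholders})", rows)
        return db


customer_lu = LogicalUnitType("Customer", [
    SourceTable(CRM, "customer", "customer_id INTEGER, name TEXT"),
    SourceTable(BILLING, "payment", "customer_id INTEGER, amount REAL"),
])

# The business question becomes a simple query against one small database.
alice = customer_lu.build_instance(1)
payment_count = alice.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
print(payment_count)
```

Because the instance already contains only one customer's rows, the query needs neither a join across systems nor a filter on the customer identifier.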
ARCHITECTURE

The diagram below illustrates an overview of K2View Fabric's architecture:

INSIDE K2VIEW FABRIC

- CONFIGURATION: This layer contains the versioned configuration of every Logical Unit Type. It is accessed through our administration tools (K2 Admin Manager, K2View Fabric Studio and Web Admin interfaces).
- WEB/DATABASE SERVICES: This layer is used to communicate with user applications, either via direct queries (database services) or via web services.
- AUTHENTICATION ENGINE: This layer manages user access control and restrictions.
- MASKING LAYER: This optional layer allows real-time masking of sensitive data.
- PROCESSING ENGINE: This layer is where every data computation is managed. It uses the principles of massively parallel processing and map-reduce to execute operations.
- SMART DATA CONTROLLER: This layer drives the real-time synchronization of data to K2View Fabric.
- ETL LAYER: This layer is K2View Fabric's embedded migration layer, allowing for automated ETL on retrieval.
- ENCRYPTION ENGINE: This layer manages the granular encryption of each data set.
- LU STORAGE MANAGER: This layer compresses the data and sends it to the distributed database for storage.

K2View Fabric leverages Cassandra as the distributed database. The communication with the distributed database is very straightforward, making K2View Fabric a flexible solution that can be adapted to any other distributed database.
BIG DATA FEATURES

As presented above, K2View Fabric's architecture is built to address the challenges of big data. It therefore features state-of-the-art capabilities such as:

- In-memory distributed performance
- Linear scalability on commodity hardware
- Consistency, durability and high availability
- Full SQL support and standard DB connectors

This section gives a brief overview of how K2View Fabric provides these features. For more details about K2View Fabric features, please refer to our Technical White Paper.

PERFORMANCE

K2View Fabric's principal performance feature is its inherent Logical Unit representation, which runs every query on a small amount of data: this feature makes K2View Fabric the fastest database on the market. On top of this inherent design, K2View Fabric ensures performance using the following two major principles:

- Every query is executed in memory.
- For analytics queries running across several Logical Unit Instances, K2View Fabric implements a proprietary map-reduce algorithm that breaks the analytic query down into small jobs distributed across K2View Fabric's nodes.

Every computation is driven by the K2View Fabric processing engine, which allows it to be executed and distributed across any node, thus offering Massively Parallel Processing (MPP).

LINEAR SCALABILITY/LOW TCO

As opposed to many big data solutions offering high-end in-memory performance, K2View Fabric does not require storing all data in memory or buying expensive hardware to scale up performance. K2View Fabric thus offers a very low Total Cost of Ownership (TCO). It relies on three simple cornerstones:

- In-memory performance on commodity hardware: only the computations are done in memory; the data is compressed and stored on disk.
- Complete linear scalability: driven by the distributed database.
- Risk-free integration: see details in the next section.
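The map-reduce principle described under Performance can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the proprietary algorithm is not documented in this paper, worker threads stand in for cluster nodes, in-memory SQLite stands in for a Logical Unit Instance, and the names map_instance, reduce_results and PAYMENTS are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical payments per customer; each customer is one Logical Unit Instance.
PAYMENTS = {1: [40.0, 25.5], 2: [10.0], 3: [5.0, 5.0, 7.5]}


def map_instance(customer_id):
    """Map step: load one instance in memory and run the query on its small data set."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE payment (amount REAL)")
    db.executemany(
        "INSERT INTO payment VALUES (?)",
        [(amount,) for amount in PAYMENTS[customer_id]],
    )
    count, total = db.execute(
        "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM payment"
    ).fetchone()
    db.close()
    return count, total


def reduce_results(partials):
    """Reduce step: combine the per-instance partial results into one answer."""
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return count, total


# Worker threads stand in for cluster nodes; each job touches a single instance.
with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(map_instance, PAYMENTS))

payment_count, payment_total = reduce_results(partials)
print(payment_count, payment_total)
```

Each map job only ever sees the data of one instance, so the per-job working set stays small regardless of how many instances the analytic query spans.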
CONSISTENCY, DURABILITY, AVAILABILITY

K2View Fabric ensures full consistency, guaranteed durability and high availability of the data it contains. Consistency is ensured by the Processing Engine of K2View Fabric, which uses an internal, distributed transaction table to determine whether a concurrent transaction is occurring and whether the write should be put on hold. Durability and high availability are inherent features of the distributed database layer (Cassandra).

FULL SQL/STANDARD CONNECTORS

The K2View Fabric Processing Engine uses two query methods, depending on the type of data on which the query is executed:

- Query on a single Logical Unit Instance (around 95% of overall queries): simple ANSI SQL query.
- Query across Logical Unit Instances (analytics): map-reduce engine reproducing the SQL protocol.

Both methods support everything that is supported in ANSI SQL. K2View Fabric also provides proprietary indexing functionality that not only allows indexing for faster performance but also regulates user access. Finally, K2View Fabric provides full JDBC support and features connectors to the most common databases on the market (e.g. Oracle, MySQL, PostgreSQL, Netezza, SQL Server).
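The role of the transaction table described under Consistency, Durability, Availability can be pictured with the following minimal sketch. It is a single-process stand-in for what the paper describes as a distributed table; the class name TransactionTable, its methods, and the per-instance granularity of the check are assumptions made for illustration, not the product's documented behavior.

```python
import threading


class TransactionTable:
    """Tracks which Logical Unit Instances currently have a write in progress."""

    def __init__(self):
        self._active = {}               # instance id -> id of the transaction holding it
        self._guard = threading.Lock()  # protects the table itself

    def try_begin(self, instance_id, txn_id):
        """Register a write; return False if the write must be put on hold."""
        with self._guard:
            if instance_id in self._active:
                return False
            self._active[instance_id] = txn_id
            return True

    def end(self, instance_id, txn_id):
        """Release the instance once the owning transaction commits or aborts."""
        with self._guard:
            if self._active.get(instance_id) == txn_id:
                del self._active[instance_id]


table = TransactionTable()
first = table.try_begin("customer-1", "txn-A")   # no concurrent write: proceeds
second = table.try_begin("customer-1", "txn-B")  # concurrent write: put on hold
other = table.try_begin("customer-2", "txn-C")   # different instance: proceeds
table.end("customer-1", "txn-A")
retry = table.try_begin("customer-1", "txn-B")   # hold is lifted once txn-A ends
print(first, second, other, retry)
```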
KEY DIFFERENTIATORS

While K2View Fabric offers the best features of big data architectures, it also provides unique functionality that differentiates it from any other solution on the market, including:

- Embedded ETL/Data Masking
- Embedded Web Services
- Flexible Synchronization
- Row-level security

EMBEDDED ETL/DATA MASKING

K2View's industry-proven ETL capabilities are embedded into K2View Fabric. The principles of the ETL are based on the Logical Unit data representation: once a Logical Unit's schema is defined, K2View Fabric automatically creates a migration path from all sources into that Logical Unit. Any type of enrichment (adding fields, masking fields, etc.) can be applied during this definition. The ETL layer is triggered automatically, when needed, by the Smart Data Controller, removing any need for external ETL tools or costly migration projects.

EMBEDDED WEB SERVICES

K2View Fabric offers an out-of-the-box graphical configuration interface to define web services: any function (which can be as simple as a query) can be created and registered as a web service. Once the function is defined, K2View Fabric automatically handles user access, distribution, updates due to schema changes, etc. The gain in time and effort is substantial compared to traditional database management systems, which require developing, distributing and maintaining a communication layer between them and your applications.

The figure above illustrates the conceptual difference between an integration of a traditional solution (regardless of its architecture) and K2View Fabric. In a traditional solution, multiple complex custom elements must be developed in order to retrieve data from pre-existing systems. K2View Fabric, on the other hand, removes the need for custom upstream or downstream development.
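To make the enrichment step of the Embedded ETL/Data Masking section more concrete, here is a small sketch of field-level rules applied while a source row is loaded into a Logical Unit. The rule table ENRICHMENT, the functions mask_card and load_row, and the sample fields are hypothetical; in the product such rules are attached to the schema definition, and the paper does not describe their actual form.

```python
def mask_card(number: str) -> str:
    """Keep the last four digits and replace the rest with asterisks."""
    return "*" * (len(number) - 4) + number[-4:]


# Hypothetical per-field enrichment rules attached to a schema definition.
ENRICHMENT = {
    "card_number": mask_card,  # masking a sensitive field
    "name": str.title,         # normalizing a field
}


def load_row(source_row: dict) -> dict:
    """Apply the enrichment rules to each field on its way into the Logical Unit."""
    return {
        key: ENRICHMENT.get(key, lambda value: value)(value)
        for key, value in source_row.items()
    }


row = load_row({
    "customer_id": 7,
    "name": "alice smith",
    "card_number": "4111111111111111",
})
print(row)
```

Fields without a rule, such as customer_id here, pass through unchanged.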
FLEXIBLE SYNCHRONIZATION

K2View Fabric's flexible data synchronization features are driven by its Smart Data Controller: any time data is accessed in K2View Fabric, the Smart Data Controller compares the current state of the data in K2View Fabric against the synchronization parameters and updates the data if needed, whether the update is due to a change in the K2View Fabric schema or is triggered by one of the synchronization modes described below.

ON-DEMAND SYNC

K2View Fabric allows data synchronization to be triggered by on-demand calls. These calls can be triggered by web services, batch scripts or by directly querying K2View Fabric (administrative mode).

EVENT-BASED SYNC

Alternatively, synchronization can be triggered using the principles of Change Data Capture (CDC). Using this mode, K2View Fabric automatically captures changes in the source systems that are part of its schema.

ALWAYSYNC

K2View Fabric features an intelligent and flexible way to synchronize data: AlwaySync. This mode allows complete granularity over the data that needs to be synchronized with source systems. Using AlwaySync, K2View Fabric allows you to configure what data needs to be refreshed automatically, and how frequently. For each element of the K2View Fabric schema, an AlwaySync timer is set that drives the K2View Fabric synchronization (e.g. if the usage information from the Customer table needs to be updated every 5 minutes, a timer of 5 minutes is set).

ROW-LEVEL SECURITY

K2View Fabric features a proprietary algorithm, the Hierarchical Encryption-Key Schema (HEKS), that allows complete control over your data encryption. It relies on three sets of keys:

- Master Key: generated during K2View Fabric installation, this is the main key allowing access to every resource of K2View Fabric.
- Type Keys: these keys restrict access at the Logical Unit Type level and are a hash of the Master Key.
- Instance Keys: these keys restrict access at the Logical Unit Instance level and are a hash of their corresponding Type Key.

In the figure above, you can see how HEKS is implemented for two LU Types. The figure shows the following keys:

- 1 Master Key allowing full access
- 2 Type Keys restricting access to 2 different LU Types
- 6 Instance Keys, 3 for each LU Type, restricting access at the LU Instance level

Using this hierarchical encryption, K2View Fabric allows complete control over the stored data and significantly reduces the risk of data leaks: even if one Instance Key were to be compromised, only the data of one instance would be leaked; the data of all other instances would remain safely encrypted. This design therefore makes K2View Fabric the most secure database on the market, essentially rendering massive data breaches impossible.
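The key hierarchy can be sketched as follows for the same layout as in the figure (1 Master Key, 2 Type Keys, 6 Instance Keys). The paper only states that each key is "a hash of" its parent; the use of HMAC-SHA256 keyed by the parent, with the type name or instance identifier as the message, is an assumption chosen for illustration, as are the names derive_key, type_keys and instance_keys and the example LU Types.

```python
import hashlib
import hmac
import secrets


def derive_key(parent_key: bytes, label: str) -> bytes:
    """Derive a child key from its parent with HMAC-SHA256 (an assumed construction)."""
    return hmac.new(parent_key, label.encode("utf-8"), hashlib.sha256).digest()


# Master Key: generated once, at installation time.
master_key = secrets.token_bytes(32)

# Type Keys: one per Logical Unit Type, derived from the Master Key.
type_keys = {t: derive_key(master_key, t) for t in ("Customer", "Merchant")}

# Instance Keys: one per Logical Unit Instance, derived from the matching Type Key.
instance_keys = {
    (t, i): derive_key(type_keys[t], str(i))
    for t in type_keys
    for i in (1, 2, 3)
}

print(len(type_keys), len(instance_keys))
```

With a one-way derivation of this kind, the holder of a parent key can recompute every key beneath it, while an Instance Key reveals neither its Type Key nor the keys of sibling instances.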
SUPPORTED FEATURES

Feature comparison table (columns: Traditional, Big-Data, Fabric) covering the following features:

- No SPoF
- Consistency, Durability and High-Availability
- Low TCO and in-memory performance
- Embedded ETL
- Embedded Data Masking
- Embedded Web-Service layer
- Row Level security

FREQUENTLY ASKED QUESTIONS

What is the main difference between security in K2View Fabric versus a traditional RDBMS?

A traditional RDBMS can't restrict and encrypt access at an instance level. You either have access to the full table containing customer information or you don't. Using K2View Fabric, you can define row-level security.

With such rich synchronization features, how do you ensure performance?

K2View Fabric provides high-end performance by first processing only the data related to one Logical Unit Instance, hence reducing the amount of data. Moreover, the processing layer executes actions only in memory and maintains a data cache for frequent use. Finally, for processing across Logical Unit Instances, K2View Fabric uses map-reduce to implement fast queries.

What is the difference between fully migrating to K2View Fabric and migrating to a traditional RDBMS?

Migration is a feature of K2View Fabric. Migrations to a traditional RDBMS require the development, testing and deployment of a specific migration tool.

How many processing/sync/data storage layers are there in K2View Fabric?

There are as many layers as there are Cassandra nodes in your deployment. This allows for full parallel execution between nodes.
CONFIDENTIALITY

This document contains copyrighted work and proprietary information belonging to K2View. This document and information contained herein are delivered to you as is, and K2View makes no warranty whatsoever as to its accuracy, completeness, fitness for a particular purpose, or use. Any use of the documentation and/or the information contained herein is at the user's risk, and K2View is not responsible for any direct, indirect, special, incidental, or consequential damages arising out of such use of the documentation. Technical or other inaccuracies, as well as typographical errors, may occur in this Guide.

This document and the information contained herein and any part thereof are confidential and proprietary to K2View. All intellectual property rights (including, without limitation, copyrights, trade secrets, trademarks, etc.) evidenced by or embodied in and/or attached, connected, or related to this Guide, as well as any information contained herein, are and shall be owned solely by K2View. K2View does not convey to you an interest in or to this Guide, to information contained herein, or to its intellectual property rights, but only a personal, limited, fully revocable right to use the Guide solely for reviewing purposes. Unless explicitly set forth otherwise, you may not reproduce by any means any document and/or copyright contained herein. Information in this Guide is subject to change without notice. Corporate and individual names and data used in examples herein are fictitious unless otherwise noted.

Copyright 2015 K2View Ltd./K2VIEW LLC. All rights reserved. The following are trademarks of K2View: K2View logo, K2View's platform. K2View reserves the right to update this list from time to time. Other company and brand products and service names in this Guide are trademarks or registered trademarks of their respective holders.

CONTACT INFORMATION

www.k2view.com
info@k2view.com
+1-844-438-2443