REAL-TIME BIG DATA ANALYTICS




Blending Transactional and Analytical Processing Delivers Real-Time Big Data Analytics

www.leanxcale.com info@leanxcale.com

ULTRA-SCALABLE FULL ACID FULL SQL DATABASE

LeanXcale is an integral ultra-scalable big data management platform

What is LeanXcale

LeanXcale provides a full ACID database that can scale in any dimension an enterprise might need (the three Vs of Big Data): data Volume (to terabytes), data Velocity (to a million transactions per second) and data Variety (SQL, key-value, graphs, documents, Hadoop files, event streams).

The main distinctive feature of LeanXcale with respect to existing products is that it provides an ultra-scalable, highly available, full SQL and full ACID database. It does so thanks to a radically new approach to transaction processing (patent filed) that makes it possible to reach very large update transaction rates (e.g. a million update transactions per second). LeanXcale also adds value by integrating its ultra-scalable transactional technology with other data management technologies such as key-value data stores, graph databases and document-oriented databases.

Business Analytics

The cost of ETLs

Today companies require agile business analytics, yet they lose a lot of agility because they use two different kinds of database systems for their data management. They use transactional SQL databases (Online Transaction Processing, OLTP) for production, relying on full ACID transactions to guarantee that data stays consistent despite failures and concurrent updates. At the same time, they use Online Analytical Processing (OLAP) technology (i.e., data warehouses) to run their analytical queries. These queries are heavy and traverse large amounts of data.

LeanXcale provides an ultra-scalable full ACID and full SQL database

Because these two databases are different systems, data must be copied from the OLTP database to the OLAP database in a complex process called Extract-Transform-Load (ETL). This process is very expensive: ETLs are estimated to account for 80% of the cost of doing business analytics, which makes little sense.

What is even worse, this process requires careful planning so that it does not interfere with the OLTP database. It also requires contingency plans for the cases in which the ETL process fails to complete within its time window (e.g. overnight). Moreover, the data used for the analytics is not the latest but, typically, a copy from the previous day. That is, ETLs are not only expensive, they also create many organizational headaches. Most importantly, they often reduce business agility because analytics cannot be performed over real-time data.

Thanks to its ultra-scalable transactional processing system, LeanXcale blends the two worlds into a single database that delivers Real-Time Big Data Analytics: it answers analytical queries over the company's current operational data in real time, saving the cost and overhead of today's ETL-based infrastructure.
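The staleness problem described above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in; it is not LeanXcale's API, and the table names (orders, orders_dw) and the sample figures are hypothetical. The sketch contrasts an analytical query run on a copied "warehouse" table with the same query run directly on the operational table.

```python
import sqlite3

# In-memory database standing in for the operational (OLTP) store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# Daytime business: two orders committed in one transaction.
with conn:
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("north", 100.0), ("south", 250.0)],
    )

# Stand-in for the nightly ETL run: copy the operational data
# into a separate "warehouse" table (hypothetical name orders_dw).
with conn:
    conn.execute("CREATE TABLE orders_dw AS SELECT * FROM orders")

# Business continues after the ETL window has closed.
with conn:
    conn.execute(
        "INSERT INTO orders (region, amount) VALUES (?, ?)", ("north", 400.0)
    )


def revenue_by_region(table):
    """Analytical query: total revenue per region.

    The table name is interpolated only because it comes from the two
    fixed names used in this sketch, never from user input.
    """
    rows = conn.execute(
        f"SELECT region, SUM(amount) FROM {table} GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())


stale = revenue_by_region("orders_dw")  # yesterday's copy
fresh = revenue_by_region("orders")     # current operational data

print("warehouse copy :", stale)
print("operational    :", fresh)
```

The warehouse copy misses the order placed after the copy was taken, while the query on the operational table includes it. A real system also has to keep such heavy queries from slowing down transactions, which this single-process sketch does not attempt to show.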

FULL ACID NOSQL

Two separate databases

The polyglot persistence

Many companies resort to different data management technologies such as NoSQL (e.g. key-value data stores, document-oriented data stores and graph databases), traditional SQL and/or event processing. The reason for this trend, so-called polyglot persistence, is that some applications require new data models (e.g. documents or graphs) and specialized query languages/APIs for them (e.g. a graph or document query language).

However, this approach comes at a price. By keeping data in different data stores, two important properties are lost. The first is data consistency: if a business transaction has to update multiple data stores, consistency is no longer guaranteed because there are no ACID transactions across data stores. The second is that querying data across data stores becomes expensive. It requires either 1) moving all the data from all data stores to a data warehouse or 2) writing a program to perform the query (as opposed to writing the query declaratively, in SQL for instance). Companies, therefore, find themselves in an uncomfortable position.

LeanXcale provides ultra-scalable full ACID holistic transactions across NoSQL data stores and SQL tables

LeanXcale combines all in one

LeanXcale removes the two major pains introduced by polyglot persistence. On the one hand, LeanXcale is integrating its ultra-scalable transactional management with different NoSQL data stores. LeanXcale provides a full ACID HBase, combining its scalability with full ACID transactional consistency. LeanXcale is also currently integrating MongoDB (the leading document-oriented data store) and Neo4J (the leading graph database). The LeanXcale platform will therefore provide consistent polyglot persistence across the major NoSQL data stores.
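The consistency loss described above can be made concrete with a toy sketch. Two plain Python objects stand in for a key-value store and a document store; all names (place_order, place_order_atomically, DocumentStoreDown) are hypothetical. The snapshot-and-restore wrapper only illustrates the all-or-nothing outcome a cross-store transaction is meant to guarantee. It is not how LeanXcale implements transactions, and it offers no isolation or durability.

```python
import copy


class DocumentStoreDown(Exception):
    """Raised when the (simulated) document store rejects a write."""


def place_order(stock, orders, item, qty, fail_second_write=False):
    """Business transaction that touches two independent stores."""
    stock[item] -= qty                          # write 1: key-value store
    if fail_second_write:                       # simulated outage between writes
        raise DocumentStoreDown("document store unavailable")
    orders.append({"item": item, "qty": qty})   # write 2: document store


def place_order_atomically(stock, orders, item, qty, fail_second_write=False):
    """Naive all-or-nothing wrapper: restore both stores if any write fails."""
    stock_before = copy.deepcopy(stock)
    orders_before = copy.deepcopy(orders)
    try:
        place_order(stock, orders, item, qty, fail_second_write)
    except Exception:
        stock.clear()
        stock.update(stock_before)
        orders[:] = orders_before
        raise


# Without a cross-store transaction: stock is decremented, but no order exists.
stock_a, orders_a = {"widget": 10}, []
try:
    place_order(stock_a, orders_a, "widget", 3, fail_second_write=True)
except DocumentStoreDown:
    pass

# With the all-or-nothing wrapper: both stores keep their previous state.
stock_b, orders_b = {"widget": 10}, []
try:
    place_order_atomically(stock_b, orders_b, "widget", 3, fail_second_write=True)
except DocumentStoreDown:
    pass

print("no transaction :", stock_a, orders_a)
print("all-or-nothing :", stock_b, orders_b)
```

In the first run the two stores disagree (three widgets are gone but no order was recorded), which is the inconsistency that arises when no ACID transaction spans the data stores.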

On the other hand, LeanXcale is integrating its SQL engine to enable the execution of queries across multiple NoSQL data stores and LeanXcale SQL tables. This integration makes it possible to write subqueries in the native language/API of each data store and expose their results as SQL tables that a SQL query can then use. In this way, LeanXcale blends the best of two worlds, NoSQL and SQL, in a single unified data management framework that substantially reduces development and maintenance costs.

ULTRA-SCALABLE CORRELATION OF EVENTS

LeanXcale correlates large volumes of events with data stored in its ultra-scalable database

Complex event processing (CEP) has become the leading technology in areas such as Machine-to-Machine (M2M) and Internet of Things (IoT), where applications have to deal with streaming events or data. Today, CEP technology is able to scale event processing. However, most applications need more than scalable event processing: they also have to correlate events with stored data at high rates and/or update stored data with information from the events.

LeanXcale combines all in one

LeanXcale has integrated its technology with scalable CEP technology, such as Storm, enabling the correlation of massive streams of events (e.g. millions of events per second) with its ultra-scalable SQL database by means of arbitrary SQL statements that query and/or update the stored data. This makes LeanXcale an ideal data management platform for developing M2M and IoT applications, with the guarantee that data management scalability is no longer a problem to work around but a feature to exploit.

Companies are also now embracing the idea of Hadoop data lakes to store all their files at an affordable price. LeanXcale has integrated its OLAP query engine with Hadoop files: by defining the metadata and parsing of a Hadoop file the same way one would to use it with Hive, the file becomes a LeanXcale SQL table that can be queried and correlated with any other LeanXcale SQL table. In this way, LeanXcale also integrates and simplifies access to Hadoop files. LeanXcale is also significantly more efficient than MapReduce at querying data, which reduces the footprint of your data management system.
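The correlation of events with stored data described earlier in this section can be sketched as follows. This is a minimal single-process illustration that uses Python's built-in sqlite3 module as a stand-in for the SQL database and a plain list as a stand-in for the event stream; it does not use Storm or any LeanXcale API, and the sensors table, the handle_event function and the sample readings are hypothetical. Each event is correlated with stored data through a SQL query and, when a threshold is exceeded, the stored data is updated through a SQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensors ("
    " sensor_id TEXT PRIMARY KEY,"
    " threshold REAL NOT NULL,"
    " alert_count INTEGER NOT NULL DEFAULT 0)"
)
with conn:
    conn.executemany(
        "INSERT INTO sensors (sensor_id, threshold) VALUES (?, ?)",
        [("s1", 50.0), ("s2", 80.0)],
    )


def handle_event(sensor_id, reading):
    """Correlate one event with stored data; return True if an alert was recorded."""
    with conn:  # commits on success, rolls back if an exception is raised
        row = conn.execute(
            "SELECT threshold FROM sensors WHERE sensor_id = ?", (sensor_id,)
        ).fetchone()
        if row is None:          # event from a sensor the database does not know
            return False
        if reading > row[0]:     # correlate the event with the stored threshold
            conn.execute(
                "UPDATE sensors SET alert_count = alert_count + 1"
                " WHERE sensor_id = ?",
                (sensor_id,),
            )
            return True
    return False


# A short list standing in for a stream of (sensor_id, reading) events.
events = [
    ("s1", 42.0), ("s1", 61.5), ("s2", 79.9),
    ("s2", 95.0), ("s3", 10.0), ("s1", 70.0),
]
alerts = [handle_event(sensor_id, reading) for sensor_id, reading in events]

counts = dict(
    conn.execute("SELECT sensor_id, alert_count FROM sensors ORDER BY sensor_id")
)
print(alerts)
print(counts)
```

In a deployment of the kind the text describes, the loop over events would be replaced by a CEP topology and the statements would run against a database that scales with the event rate; the sketch only shows the query-then-update pattern per event.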

LeanXcale www.leanxcale.com info@leanxcale.com