SQream Technologies Ltd - Confiden7al



Similar documents
Big Data and Its Impact on the Data Warehousing Architecture

Big Data. Value, use cases and architectures. Petar Torre Lead Architect Service Provider Group. Dubrovnik, Croatia, South East Europe May, 2013

Performance Management in Big Data Applica6ons. Michael Kopp, Technology

Hadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc All Rights Reserved

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc.

Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

Scaling Your Data to the Cloud

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Cost-Effective Business Intelligence with Red Hat and Open Source

Big Data Analytics Nokia

EMC/Greenplum Driving the Future of Data Warehousing and Analytics

Turning Data Into Answers With HP Vertica

Big Data Analytics: Today's Gold Rush November 20, 2013

BIG DATA What it is and how to use?

Performance and Scalability Overview

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

NextGen Infrastructure for Big DATA Analytics.

BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014

Real Time Big Data Processing

Using RDBMS, NoSQL or Hadoop?

The Rise of Industrial Big Data. Brian Courtney General Manager Industrial Data Intelligence

Main Memory Data Warehouses

Introducing Oracle Exalytics In-Memory Machine

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

Capacity Management for Oracle Database Machine Exadata v2

So What s the Big Deal?

Big Data Technologies Compared June 2014

Performance and Scalability Overview

Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Big Data: Are You Ready? Kevin Lancaster

Oracle Database In-Memory The Next Big Thing

DNS Big Data

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

Il mondo dei DB Cambia : Tecnologie e opportunita`

Large scale processing using Hadoop. Ján Vaňo

The Future of Data Management

SAP HANA. SAP HANA Performance Efficient Speed and Scale-Out for Real-Time Business Intelligence

SAP HANA In-Memory Database Sizing Guideline

Testing Big data is one of the biggest

Database Scalability and Oracle 12c

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

SQL Server PDW. Artur Vieira Premier Field Engineer

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

Big Data Use Case. How Rackspace is using Private Cloud for Big Data. Bryan Thompson. May 8th, 2013

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Hadoop implementation of MapReduce computational model. Ján Vaňo

Cisco and Splunk: Under the Hood of Cisco IT

Big Data Analytics - Accelerated. stream-horizon.com

The Flash Transformed Data Center & the Unlimited Future of Flash John Scaramuzzo Sr. Vice President & General Manager, Enterprise Storage Solutions

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

BIG DATA AND INVESTIGATIVE ANALYTICS

Oracle Big Data SQL Technical Update

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise

2009 Oracle Corporation 1

The Internet of Things and Big Data: Intro

Oracle MulBtenant Customer Success Stories

How To Handle Big Data With A Data Scientist

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

White Paper - GPU-Based SQL Database. SQream Technologies. SQream DB GPU-Based SQL Database Technical Overview White Paper

Real World Big Data Architecture - Splunk, Hadoop, RDBMS

Big Data Are You Ready? Thomas Kyte

An Oracle White Paper October Oracle: Big Data for the Enterprise

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Integrating Apache Spark with an Enterprise Data Warehouse

System Architecture. In-Memory Database

TUT NoSQL Seminar (Oracle) Big Data

SQLstream Blaze and Apache Storm A BENCHMARK COMPARISON

Unstructured Data Accelerator (UDA) Author: Motti Beck, Mellanox Technologies Date: March 27, 2012

The Flash-Transformed Financial Data Center. Jean S. Bozman Enterprise Solutions Manager, Enterprise Storage Solutions Corporation August 6, 2014

Building your Big Data Architecture on Amazon Web Services

Maximizing Hadoop Performance with Hardware Compression

Architecting for the Internet of Things & Big Data

ANALYTICS IN BIG DATA ERA

I/O Considerations in Big Data Analytics

Laurence Liew General Manager, APAC. Economics Is Driving Big Data Analytics to the Cloud

Analyzing Big Data with AWS

SQL Server 2012 Parallel Data Warehouse. Solution Brief

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

The Enterprise Data Hub and The Modern Information Architecture

An Oracle White Paper June Oracle: Big Data for the Enterprise

The 3 questions to ask yourself about BIG DATA

Dell Microsoft Business Intelligence and Data Warehousing Reference Configuration Performance Results Phase III

ANALYTICS BUILT FOR INTERNET OF THINGS

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Transcription:

SQream Technologies Ltd - Confiden7al 1

Ge#ng Big Data Done On a GPU- Based Database Ori Netzer VP Product 26- Mar- 14

Analy7cs Performance - 3 TB, 18 Billion records SQream Database 400x More Cost Efficient!

Big Data 3V s - Volume 4

Big Data 3V s Vareity 5

Big Data 3V s - velocity 6

Structured Data 7

Un- Structured Data 8

Big Data Challenges Volume Growth rate, history Work Data Modeling, ETL Big Data Performance Batch, In- rest, In- Mo7on Variety Data Types, logs, video Velocity Throughput Transac7on/Seconds Time Work, storage, ETL, computa7on 9

Big Data Challenges Hardware Disk Space Processing Power Network Skill Set Large clusters New querying language New methods 10

GPU advantages in a nutshell Massive parallel compu7ng power Aggressive data crunching Extreme scalability Low power consump7on

GPU Challenges in a nutshell PCI- E Limited GRAM New Algorithms SIMD

SQream Top Level Architecture

The Product - under the hood 14

SQream Data Storage 100T Data in rows + Indices + Views 1 2 3 4 2 a 1 c 3 d 1 C 4 d 1 c 2 4 a d d c d j 50T Columnar Data a 1 c d 1 C d 1 c 9T Crunched Data + Smart Meta Data a d 1 c

Storage Foot Print Before 100T AIer 9T

DBMS - Query Run Time No Op7miza7ons Select T, sum(a) from Example group by T having (A) > 1000 Read ALL data Process ALL data Query output A B C D E F 1010 N 1 a 10 25 66 CPU 900 N 2 b 54 55 55 2 x = 16 cores Long execu7on 7me 5000 N 3 c 754 54 58 500 N 4 f 748 554 85 800 N 5 a 47 55 88 600 N 6 r 58 555 478

SQream - Query Run Time Select T, sum(a) from Example group by T having (A) > 1000 Read Pin Point data Process Relevant data Query output A B C D E F 1010 N 1 a 10 25 66 900 N 2 b 54 55 55 5000 N 3 c 754 54 58 500 N 4 f 748 554 85 800 N 5 a 47 55 88 2 x GPU = ~5600 cores GPUs Shortest execu7on 7me CPUs 600 N 6 r 58 555 478

SQream as Data Warehouse

CEP Cluster Data stream: web logs Real time stream processing Ad hoc query processing Direct GPU SQREAM Web App Direct GPU Data stream: CDR InfiniBand Direct GPU SQREAM SQREAM PBX Apply new query on Data in Motion Apply Real time Ad Hoc query on all data in rest

High Availability Cluster

Analytics as a Services (AaaS)

USE CASES & BENCHMARKS 23 23

Lab Results 100GB TPC H 450 400 390 350 Time in Seconds 300 250 200 150 100 106 50 0 17 Sqream Infobright MSSQL 24

SQream vs. Redshii vs. Hadoop* Query Run 7me based on Hapyrus benchmark SQream vs. Redshii vs. Hadoop 1600 1491 1400 1200 Time in Sec 1000 800 600 Sqream Redshii Hadoop 400 200 0 29 155 Amount of Data hkp://www.hapyrus.com/blog/posts/behind- amazon- redshii- is- 10x- faster- and- cheaper- than- hadoop- hive- slides 25

SQream vs. Redshii vs. Hadoop* Data Load Time based on Hapyrus benchmark 18 SQream vs. Redshii vs. Hadoop 16 14 12 Time in Hours 10 8 6 Sqream Redshii 4 2 0 Amount of Data hkp://www.hapyrus.com/blog/posts/behind- amazon- redshii- is- 10x- faster- and- cheaper- than- hadoop- hive- slides 26

Cyber Analysis Pakern Matching Use Case load and analyze massive network traffic The Test Load 6000 IP CDR per second Run queries under 10 seconds

Cyber Use Case Exis7ng Status Hardware HP Superdome 32 Cores EMC Storage Total Hardware cost: ~$150,000 SoIware Oracle 11g

Cyber Use Case SQream Solu7on Hardware Dell T7600 Total Hardware cost: ~$7,000 SoIware SQream DB Dell T7600

Results for 8.5B Records ~17TB Run7me query results for IP Traffic Analysis by query number 12 11 10 10 9 Time in Seconds 8 6 4 8 2 0 0.8 1 0.6 0.4 0.1 1 2 3 4 5 2 Sqream Oracle

SQream POC on Databases with Orange Silicon Valley Soumik Sinharoy Sr. Product Manager at Orange

SQream s DB Saves ~$6,000,000 on Big Data insights $250,000 Price list $6,300,000 Price list X40 cost performance

Data Is Growing World Wide IP Traffic Growth

Orange Group

Our Data Is Growing Orange France IP Traffic Growth Monthly CDR Volume Today : 60 Billion Q4 2014 : 150 Billion

What We Plan To Do With Big Data Airborne Datacenters - SuperCompu_ng Real_me processing of sensor data Massive database consolida_on : 400 X savings in CapEX 37

Current Challenges Problem Statement - Massive Data Store - Mul7 Dimensional Queries - 4-9 Billion CDR - Reduce execu7on 7me - Reduce TCO IP CDR IPTV VoD Fixed Mul_ Media

Revenue Assurance POC vs. Leading RDBMS Scan & Join 35 Billion CDR and 30M Subscribers Challenges: Require massive investment in Hardware DW system Require have manual process for op7mizing the soiware Doesn t scale Time in minutes 70 60 50 40 30 Loading Time 300M CDRs 60 20 10 9 0 39

Query Run Time 300 Million CDRs Q2 Map load on the network by hour, total call, minutes Q3 Map by hour and call type, break down for the load on the network by call type. Sms, data, voice Q4 Map load on the network in specific 7me Q5 - Map load on the network in specific 7me and specific service: voice, sms, data SQream vs. leading RDBMS run7me 300 million leading RDBMS

Mul_ Media Repor_ng IP Traffic Analysis on anonymized IP CDR from Orange France Data coming from MMT are monthly received on MMDM server Data are processed for a dynamic repor7ng applica7on Challenges: Require massive investment in Hardware DW system Require have manual process for op7mizing the soiware Time in minutes 70 60 50 40 30 20 10 Loading Time 300M CDRs 9 60 0 41

~4,300,000,000 Call records ~30,000,000 Subscribers 4 Months 1 Operator - Orange 14 Servers 168 CPU cores 1,344GB RAM 67,200GB SSD 403,200GB HDD 875 Seconds

~4,300,000,000 Call records ~30,000,000 Subscribers 4 Months 1 Operator - Orange 1x K40 GPU 1 Server 8 CPU cores 128GB RAM 3,200GB SSD 10,000GB HDD $250,000 Price list 636 Seconds 18 month of data was also tested Execu7on Time = 2484 Seconds

Less is More SQream vs. Leading EDW Q1 Aggrega7on query for IP u7liza7on by users Q2 Group By query for 1 dimension 800 700 600 500 Time(Sec) 400 300 200 100 0 Q1 Q2 Q1 Q2 1 Month 4 Months

SQream Yes, We Scale Linearly! 3000 Grouping for all costumer segments - linear scaling 1 month to 18 months 2500 2000 Seconds 1500 1000 500 0 1 Month 4 Months 9 Months 13 Months 18 Months Months

QUESTIONS??? 46

Why SQream? Technology Data Crunching: Faster compression 7me x20 Faster decompression 7me x50-70 Higher compression ra7o x5-15 Compute: Faster MPP in a node x20 Higher scalability x1 node x3000 cores Lower hardware cost $7,000,000 > $15K

Why SQream? Hassle free Solu7on TCO Lower power consump7on 1 node vs. 10 nodes Lower storage and license cost x10 x50 Lower footprint 2U vs. 20U Deployment advantages No indexes, no cubicles, no projec7ons. Built- in full scan ad- hoc queries support. No code changes. Simple SQL.

SQream the fastest analy_cs ever Thank you