From Spark to Ignition:



Similar documents
Databricks. A Primer

Databricks. A Primer

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

SQLstream Blaze and Apache Storm A BENCHMARK COMPARISON

How To Make Data Streaming A Real Time Intelligence

CitusDB Architecture for Real-Time Big Data

Real Time Big Data Processing

Streaming Big Data Performance Benchmark for Real-time Log Analytics in an Industry Environment

Streaming Big Data Performance Benchmark. for

An Industrial Perspective on the Hadoop Ecosystem. Eldar Khalilov Pavel Valov

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

In-memory computing with SAP HANA

Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015

How Companies are! Using Spark

SEIZE THE DATA SEIZE THE DATA. 2015

Dell* In-Memory Appliance for Cloudera* Enterprise

Big Data Analytics - Accelerated. stream-horizon.com

Big Data Use Case: Business Analytics

Where is... How do I get to...

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

The Modern Online Application for the Internet Economy: 5 Key Requirements that Ensure Success

INTRODUCTION TO CASSANDRA

Zynga Analytics Leveraging Big Data to Make Games More Fun and Social

Modern IT Operations Management. Why a New Approach is Required, and How Boundary Delivers

Putting Apache Kafka to Use!

Hadoop & Spark Using Amazon EMR

HDP Hadoop From concept to deployment.

Big Data Analytics - Accelerated. stream-horizon.com

Cisco Data Preparation

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Building Real-Time Data Pipelines

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

BASHO DATA PLATFORM SIMPLIFIES BIG DATA, IOT, AND HYBRID CLOUD APPS

Ali Ghodsi Head of PM and Engineering Databricks

Making big data simple with Databricks

Internet of Things. Opportunity Challenges Solutions

SQLstream 4 Product Brief. CHANGING THE ECONOMICS OF BIG DATA SQLstream 4.0 product brief

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

Dell In-Memory Appliance for Cloudera Enterprise

The Future of Data Management

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges

BIG DATA ANALYTICS For REAL TIME SYSTEM

Embedded inside the database. No need for Hadoop or customcode. True real-time analytics done per transaction and in aggregate. On-the-fly linking IP

The Inside Scoop on Hadoop

How To Use Hp Vertica Ondemand

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

Dashboard Engine for Hadoop

So What s the Big Deal?

Unified Batch & Stream Processing Platform

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Big Data Processing: Past, Present and Future

The Internet of Things and Big Data: Intro

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

STREAM ANALYTIX. Industry s only Multi-Engine Streaming Analytics Platform

Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control

Interactive data analytics drive insights

Modernizing Your Data Warehouse for Hadoop

In-Memory Analytics for Big Data

Understanding the Value of In-Memory in the IT Landscape

Native Connectivity to Big Data Sources in MSTR 10

Shark Installation Guide Week 3 Report. Ankush Arora

Powerful analytics. and enterprise security. in a single platform. microstrategy.com 1

How To Handle Big Data With A Data Scientist

Il mondo dei DB Cambia : Tecnologie e opportunita`

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Performance and Scalability Overview

Tableau Visual Intelligence Platform Rapid Fire Analytics for Everyone Everywhere

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

Hadoop Ecosystem B Y R A H I M A.

Luncheon Webinar Series May 13, 2013

The Sierra Clustered Database Engine, the technology at the heart of

Cloudera Enterprise Data Hub in Telecom:

Big Data Web Analytics Platform on AWS for Yottaa

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

BIG DATA TECHNOLOGY. Hadoop Ecosystem

Processing and Analyzing Streams. CDRs in Real Time

CAPTURING & PROCESSING REAL-TIME DATA ON AWS

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

Big Data With Hadoop

3 Reasons Enterprises Struggle with Storm & Spark Streaming and Adopt DataTorrent RTS

Scalable Architecture on Amazon AWS Cloud

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

INTRODUCING APACHE IGNITE An Apache Incubator Project

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Cloud Big Data Architectures

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Transcription:

From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA

What s in Store For This Presentation? 1. MemSQL: A real-time database for transactions and analytics 2. Spark Use Cases 3. Example: Geospatial Enhancements

MemSQL Story The real-time database for transactions and analytics

MemSQL at a Glance! Experienced leadership from Facebook, SQL Server, Oracle, Fusion-io! In-Memory, distributed, relational database! Solving the Enterprise Architecture Gap! Horizontal scale-out with modern database innovation! $50 million in funding

Four Ways Your DBMS is Holding You Back! ETL (Extract, Transform, Load)! Analytic Latency! Synchronization! Copies of data Source: Gartner Hybrid/Transactional/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation

The Real-Time Database for Transactions and Analytics In-Memory Distributed Relational Software Data Center Cloud 6

The Real-Time Database for Transactions and Analytics Data Loading and Queries Aggregator Nodes Availability Group 1 Availability Group 2 Transactions Highest Value Hot Data High Value Cold Data Cluster Analytics

Gartner Identifies Emerging Category: HTAP (Hybrid Transactional/Analytical Processing) HTAP will enable business leaders to perform much more advanced and sophisticated real-time analysis of their business data than with traditional architectures. Download at: memsql.com/gartner

Simple Standard SQL! Transactions and analytics in one database! Behind the firewall or on the cloud! Flexible integrations (Hadoop, Spark, SQL)!

Fast Extremely low-latency queries! Massive parallel transaction capacity! Lock-free, shared-nothing architecture!

Scalable Scales out on cloud and commodity hardware! Deploys to thousands of machines! True linear scaling!

MemSQL Product Ecosystem Dashboards In-Memory Applications Transactions and Analytics Advanced Analytics Streaming Wire-protocol compatibility ODBC, JDBC,.NET Connectors MemSQL Loader Databases and Data Warehouses Amazon S3 Hadoop

Spark Use Cases

Spark Data Processing Framework Intuitive, concise, and expressive operations needed for analytics Spark SQL Spark Streaming Mllib (machine learning) GraphX (graph) Apache Spark

Understanding MemSQL and Spark Cluster-wide Parallelization Bi-Directional

Spark with MemSQL MemSQL Spark Connector enables the real-time trinity Message Queue Transformation Data Serving Programming libraries Persistence Application platform End-to-End Data Pipeline Under One Second

MemSQL and Spark Use Cases! Operationalize models built in Spark! Stream and event processing! Live dashboards and automated reports! Extend MemSQL analytics

Operationalize Models Built in Spark! Process in Spark, persist to MemSQL! Go to production and iterate faster Data into Spark Enterprise Consumption Model Creation Model Persistence

Stream and Event Processing! Structure event data on the fly! Pass to MemSQL for persistent, queryable format Real-time Streaming Data Enterprise Consumption Data Transformation Persistent, Queryable Format

Real-Time Analytics at Pinterest! Higher performance event logging! Reliable log transport and storage! Faster query execution on real-time data App Kafka Spark, MemSQL, Application Message Queue Enterprise Consumption 50,000 pins/sec Singer Secor Transform RT Analytics, Data Serve

Live Dashboards and Automated Reports! Serve live dashboards from MemSQL! Run custom reports on live data with Spark Custom Reporting Live Dashboards Access to Live Production Data SQL Transactions and Analytics

Extend MemSQL Analytics! The freshest data for analysis in Spark! Load from MemSQL to Spark and write results on return Interactive Analytics, Machine Learning Applications, Data Streams Access to Live Production Data Real-time Replica

MemCity! Capturing energy consumption data from 1.4 million households! 8 devices per household! 186,000 events per minute! AWS hardware costs at $2.35 per hour

Geospatial Enhancements

Geospatial Challenge! Commercial applications now geo-enabled! Location is everywhere! Lots of insight possible! Traditionally geo is processed separately! Real need for integrated geospatial at scale

MemSQL Geospatial! Points, Lines, and Polygons! Topological filters! Measurement functions

MemSQL Geospatial! BILLIONS of objects! Sub-second latency! Geo data is first-class citizen! Geo + Simplicity + Speed + Scale

Real-Time Geospatial Location Intelligence! Sample from 170 million taxi trips! Real-time ingest! Concurrent queries in fractions of a second! Unlimited number of geographic views! Simple queries while simultaneously ingesting data

MemSQL 4 Community Edition A database so scalable that everyone can use it. UNLIMITED scale and capacity Free FOREVER

Thank You! Visit the MemSQL Booth #4 * MemCity Showcase Games Giveaways