Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect




TDWI Vancouver Agenda
- What really is Big Data?
- How do we separate hype from reality?
- How does that relate to actually finding useful business information?
- Why is Qlik unique in leading the industry in delivering Big Data solutions?
- Demo

What really is Big Data? Most people think of Hadoop.

A Brief History of Hadoop
- Google releases a paper on GFS; its ideas are adopted by a distributed search platform called Nutch
- Cutting joins Yahoo, estimates a billion-page index will cost $500k and $30k/month to support
- Hadoop promoted to top-level Apache project; predictive search index creation time reduced from 12 days to 8 hours
- A 1,400-node Yahoo cluster sorts 500 GB in 59 s
- Cloudera launches
- Yahoo spins remaining Hadoop folks out into Hortonworks
- Cloudera adds real-time search, based on Lucene, also created by Cutting
- 3rd Hadoop World conference attracts 2,300 developers, up from 275 in 2010
Timeline axis: 2005, 2008, 2011, 2013

Example Apache Hadoop or Next-Gen Components
- HDFS: Hadoop Distributed File System
- MapReduce: Processing framework for writing scalable data applications
- Pig: Procedural language that abstracts lower-level MapReduce
- Zookeeper: Highly reliable distributed coordination
- Hive: System for querying data on top of HDFS (SQL-like query)
- HBase: Database for random, real-time read/write access
- Mahout: Scalable machine learning libraries
- Spark: In-memory large-scale data processing, claimed to be up to 100x faster than Hadoop MapReduce
- Shark: SQL engine on top of Spark
- Cassandra: Scalable multi-master database with no single points of failure
And on, and on
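
To make the MapReduce entry in the component list more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using a word count. It is plain Python rather than the Hadoop API, and all function names are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for each word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is big", "fast data is fast"])
print(counts)
```

In a real cluster the map and reduce calls run on different nodes and the shuffle moves data across the network; the sketch only shows the shape of the computation.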

Big Data: Expanding on 3 fronts
- Data Velocity: Batch, Periodic, Near Real Time, Real Time
- Data Volume: MB, GB, TB, PB
- Data Variety: Table, Database, Web, XML, Audio, Video, Social

What is Big Data?
- Big Data is: Nebulous
- Big Data is: Really Big or Not
- Big Data is: Mostly Useless Noise
- Big Data is: Slow
- Big Data is: Difficult

Big Data Ecosystem: Much More Than Just Hadoop
- Big Insights & Streams
- Big Data Appliance
- HANA
- Data Visualization, Statistical & In-memory Analytics
- Big Data Analytic Appliances
- Splunk
- Packaged MapReduce platforms
- Massively Parallel Processing Platforms
- Open source Distributed Processing Frameworks
- Big Data Integration

Some uses of Big Data today
Who | What | Why
Telecom | Usage and Location Analysis; Call Detail Records (CDRs); Next Product to Buy (NPTB); Real-time Bandwidth Allocation | Operational Excellence; Customer Retention; Profitability
Financial Services | New Account Risk Screens; Fraud Detection; Trading Risk; Real-Time P&L; Portfolio Analysis | Improve Profit; Minimize Risk
Utilities | Smart Metering Analysis | Operational Excellence
Retail | 360° Customer View; Brand Sentiment Analysis; Up Sell/Cross Sell; Clickstream Analysis | Increase Revenues; Customer Loyalty; Brand Awareness
Manufacturing | Supply Chain & Logistics; Assembly Line QA; Proactive Maintenance | Operational Excellence; Profitability
Source: Gartner 50 Real World Examples of Big Data and Analytics, 2013


Popular Big Data Myths
- Myth: You need to have Ga-zinga-bytes of data to deploy a Big Data solution
  - Typical Cloudera cluster is 15-20 nodes, < 10 TB of data
  - Hadoop storage is 3-400% cheaper than an EDW
- Myth: Hadoop is all you need
  - Hadoop is an enabling technology that provides the foundation for Big Data solutions
  - Focus today is on data management
- Myth: The RDBMS is dead
  - RDBMS is still critical, but not for high-volume, low-quality analytics
- Myth: QlikView can't handle Big Data
  - Reality is a human can't handle Big Data
  - It's all about the use case

Big Data vs. Fast Data vs. Right Data
Big Data is rapidly shifting from how much data you can handle to how quickly you can deliver value.
- Volume of data is just one factor, and a less and less critical one
- Context is key and difficult to pinpoint
- Big Data: Hadoop is designed to support petabytes and beyond
- Fast Data: Teradata, SAP HANA, Netezza, HBase, MongoDB, ParStream, etc.
- Big Data is slow & cheap; Fast Data is neither
- A Big Data solution requires components that address both
- Hadoop is the data system that combines the Fast and Big platforms
- QlikView is the platform that supports both scenarios simultaneously

Where Big Data fits today: The new BI architecture
[Diagram labels: Data Accelerator, ???, Big Data Repository, Data Warehouse, ???]
- Unstructured/semi-structured data: Web data, Docs & text data, Audio/Video data, Machine data
- Structured data: Operational systems

Big Data comes with big challenges: The Big Data bottleneck
[Diagram labels: Big Data, Data Scientists, Reports, Business Users]
- "many organizations lack the skills required to exploit big data"
- "most of these skills are in short supply and rare in the market at large"
- "data science encompasses hard skills"
Source: Gartner Big Data Hype Cycle Report 2013

Big Data comes with big challenges: Obstacles to Big Data Analytics
Organizations are challenged in staffing and training.
- Staffing: 79%
- Training: 77%
- Real-Time: 67%
- License Cost: 64%
- Integration: 64%
Organizations have trouble finding qualified professionals to manage big data and providing training to those already on board.
Source: Ventana Research, The Challenge of Big Data Benchmark Research, November 2013


Insight Comes from Data, in Context
- Data warehouse
- Machine data, web data, cloud data
- Hadoop cluster
- Google BigQuery
- Operational systems

Big Data Business Needs
DATA: Clinical, Claims, Monitoring, others
- Descriptive Analytics: How are we doing? Example: How many claims did we pay today?
- Predictive Analytics: What might happen in the future? Example: Which of tomorrow's claims might request an Emergency Room (ER) admission?
- Prescriptive Analytics: Best course of action given objectives, requirements & constraints. Example: What would be effective steps to reduce the probability of ER admission?


Who are we - QlikView
What Is QlikView?
- QlikView is a Business Discovery platform: user-driven BI supporting the creation and consumption of dynamic apps for analyzing information
- QlikView apps allow non-technical users to explore visual views of information and ask streams of questions, through simple interactions such as clicks and taps
- QlikView's patented software engine dynamically calculates new views of information, instantly, based on user selections

QlikView - A New Kind of Software Company
- Leader in Business Discovery (user-driven BI)
- Broad base of 28,000+ customers in 100 countries
- 1,500 global partners
- 1,500 employees across 28 offices in 23 countries
- No. 1 fastest-growing enterprise technology company (ZDNet)
- Gartner Magic Quadrant Leader for 3 consecutive years

These are Tools. And this is how BI has been done.

This is a Platform

The Evolution of Business Intelligence
[Chart: Analytical Quotient plotted against Usefulness]
Stages: Managed Reporting, Ad-Hoc Reporting, Dashboards / Visualization, OLAP / Analysis, Associative / Statistical Exploration, Predictive
QlikView's Sweet Spot is marked on the chart.

What Makes QlikView Unique?
1) Associative Query Language + Full Search (not another query tool)
2) Core Technology: true in-memory, columnar database with built-in visualization, analytics, and ELT in a single product
3) Designed for Heterogeneous & Complex Data (again, not just another query tool)
4) Application / Mobile Design First (mobile, desktop, tablet: design once, consume anywhere)

QlikView's Natural Analytics makes data analysis a natural part of every business process for everyone
How traditional BI and visualization tools work:
- Limited view and access to data
- Forced down linear drill paths
- Need to involve IT to modify
- What-if and on-the-fly analysis is limited
QlikView Natural Analytics:
- Freedom to explore data from any point in analysis in a dynamic, interactive interface
- Answer any question on the fly, in real time
- Easily see connections and disconnects in data

The Green, The White and The Gray

The Visualization Bottleneck
[Chart: Query Size plotted against Response Time; labels: Tableau, Spotfire, Big Data, Datameer, MSTR Analytics Desktop]

Connectivity to every Big Data Source
- Source types: MPP Warehouse, NoSQL Databases, Hadoop, Advanced Analytics
- Access modes: Batch, Real-time
- Named sources: SAP HANA, BigQuery

The Big Data Value Chain
Medium | Speed (t/TB) | Price ($/TB)
Hard Disk Drives (HDD) | 3300 s | $50
Solid State Storage (SSD) | 1000-300 s | $500
Random Access Memory (RAM) | 1 s | $4500
- Keep data in memory when the value obtained from processing it is high
- Leave data on disk when it is inactive or the value from processing it is low
[Chart axes: Value, Size]
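
As a rough worked example of the trade-off in these figures, the sketch below computes scan time and storage cost for a dataset on each tier. The per-TB numbers come from the slide; the function name, the 10 TB dataset size, and the choice of the slower bound (1000 s) for SSD are assumptions made for illustration.

```python
# Seconds to scan 1 TB and price per TB, as quoted on the slide.
# The slide gives SSD speed as a range (1000-300 s); the slower bound is assumed here.
TIERS = {
    "HDD": {"scan_s_per_tb": 3300, "usd_per_tb": 50},
    "SSD": {"scan_s_per_tb": 1000, "usd_per_tb": 500},
    "RAM": {"scan_s_per_tb": 1, "usd_per_tb": 4500},
}


def tier_profile(size_tb):
    """Return {tier: (scan_seconds, storage_cost_usd)} for a dataset of size_tb."""
    return {
        name: (tier["scan_s_per_tb"] * size_tb, tier["usd_per_tb"] * size_tb)
        for name, tier in TIERS.items()
    }


profile = tier_profile(10)  # hypothetical 10 TB dataset
for name, (seconds, dollars) in profile.items():
    print(f"{name}: scan {seconds:,.0f} s, cost ${dollars:,.0f}")
```

On these numbers RAM costs 90 times as much per TB as HDD but scans 3,300 times faster, which is the argument for keeping only high-value data in memory.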

Flexible Big Data deployment models
- In-memory: hundreds of millions of rows loaded into memory (aggregates / detail)
- Direct Discovery: billions of rows via Direct Discovery

Combine Big Data and traditional data sources
Combine data sources using pure in-memory.
[Diagram labels: Aggregates / Detail; EDW Data; Data Warehouse]

QlikView as a catalyst for implementing Big Data
Today's challenge: What to do with Big Data? Who should do it?
- IT: What to do with this?
- Business: How to define requirements?

QlikView as a catalyst for implementing Big Data
QlikView gives business users, not just data scientists, the ability to discover with Big Data.
IT & Business: More Access > More Questions > More Use > Higher ROI of Big Data

QlikView In-Memory approach
- Loads compressed data into memory
- Enables associative search and analysis
- Supports hundreds of millions to billions of rows of data
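
The two ideas named here, compressed in-memory storage and associative selection, can be illustrated with a toy sketch. This is not QlikView's engine, which is proprietary; dictionary encoding is assumed as one common form of columnar compression, and the field names and data are hypothetical. The three states correspond to the green (selected), white (possible), and gray (excluded) values of the earlier slide.

```python
def dictionary_encode(values):
    """Store each distinct value once; the column becomes small integer codes."""
    symbols = sorted(set(values))
    index = {value: code for code, value in enumerate(symbols)}
    return symbols, [index[value] for value in values]


def associative_states(rows, field, selected):
    """Classify every value of every field as selected, possible, or excluded."""
    matching = [row for row in rows if row[field] == selected]
    states = {}
    for name in rows[0]:
        for value in {row[name] for row in rows}:
            if name == field and value == selected:
                state = "selected"   # green
            elif any(row[name] == value for row in matching):
                state = "possible"   # white
            else:
                state = "excluded"   # gray
            states[(name, value)] = state
    return states


rows = [
    {"region": "West", "product": "Phone"},
    {"region": "West", "product": "Tablet"},
    {"region": "East", "product": "Laptop"},
]
symbols, codes = dictionary_encode([row["region"] for row in rows])
states = associative_states(rows, "region", "West")
print(symbols, codes)
print(states)
```

Selecting one region marks the products sold there as possible and everything else as excluded, so the user sees both what is related to the selection and what is not.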

QlikView Direct Discovery Approach
Combines the associative capabilities of the QlikView in-memory dataset with a query model where:
- The aggregated query result is passed back to a QlikView object without being loaded into the QlikView data model
- The result set is still part of the associative experience
- Capability to drill to detail records
[Diagram labels: QlikView In-Memory Data Model (Batch Load), QlikView Application, Direct Discovery]
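
The query model described above can be sketched in a few lines. This is not QlikView script syntax or its API; an in-memory SQLite database stands in for the external big data source, and the table, columns, and function names are hypothetical. The point is only the division of labour: small dimensions are loaded up front, while aggregates and detail rows are fetched from the source on demand.

```python
import sqlite3

# An in-memory SQLite database stands in for the external big data source.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, order_id INTEGER, amount REAL)")
source.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("West", 1, 100.0), ("West", 2, 50.0), ("East", 3, 75.0)],
)

# Only the small dimension is batch-loaded into the in-memory model.
in_memory_regions = [
    row[0]
    for row in source.execute("SELECT DISTINCT region FROM sales ORDER BY region")
]


def aggregate_for(selection):
    """Push the aggregation down to the source; only the result comes back."""
    row = source.execute(
        "SELECT SUM(amount) FROM sales WHERE region = ?", (selection,)
    ).fetchone()
    return row[0]


def drill_to_detail(selection):
    """Fetch record-level rows on demand for the current selection."""
    return source.execute(
        "SELECT order_id, amount FROM sales WHERE region = ? ORDER BY order_id",
        (selection,),
    ).fetchall()


west_total = aggregate_for("West")
west_detail = drill_to_detail("West")
print(in_memory_regions, west_total, west_detail)
```

The fact table never leaves the source; the application holds only the dimension values and whatever the current selection returns.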

A Hybrid Approach for Tackling Big Data
100% in-memory for:
- All the necessary (i.e. relevant and contextual) data can fit in memory
- Users require only aggregated or summary data (e.g. hourly or daily averages), or record-level detail over a limited time period
- Query performance of the external source is not satisfactory
Direct Discovery for:
- Data cannot fit in memory and document chaining is not sufficient
- Users require access to record-level detail stored in a large fact table that will not fit in memory
- Network bandwidth limits the ability to copy data to the QlikView server
The design of Direct Discovery lets you alternate between these approaches with absolutely no change to the application itself.
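
The criteria on this slide amount to a simple decision rule, sketched below. The `Workload` fields, the `choose_mode` function, and the precedence given to the Direct Discovery conditions are assumptions made for illustration; the slide lists the criteria but does not say how to weigh them against each other.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    fits_in_memory: bool            # all relevant, contextual data fits in RAM
    needs_large_fact_detail: bool   # record-level access to a fact table too big for RAM
    source_query_fast_enough: bool  # the external source answers queries acceptably
    can_copy_over_network: bool     # bandwidth allows copying data to the server


def choose_mode(w):
    """Return (mode, reasons) following the slide's criteria."""
    direct_reasons = []
    if not w.fits_in_memory:
        direct_reasons.append("data cannot fit in memory")
    if w.needs_large_fact_detail:
        direct_reasons.append("record-level detail in a large fact table")
    if not w.can_copy_over_network:
        direct_reasons.append("network bandwidth limits copying")
    if direct_reasons:
        return "direct-discovery", direct_reasons

    reasons = ["relevant data fits in memory"]
    if not w.source_query_fast_enough:
        reasons.append("external source is too slow to query live")
    return "in-memory", reasons


summary_dashboard = Workload(True, False, False, True)
clickstream_detail = Workload(False, True, True, False)
print(choose_mode(summary_dashboard))
print(choose_mode(clickstream_detail))
```

A real deployment would likely mix both modes in one application, which is what the slide's closing sentence about alternating between approaches suggests.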

DEMO