BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata





Evolution in the Use of Information
Stages, from insights to actions:
> REPORTING: WHAT happened?
> ANALYZING: WHY did it happen?
> PREDICTING: WHAT WILL happen?
> OPERATIONALIZING: WHAT IS happening now?
> ACTIVATING: MAKE it happen!
Capabilities: Batch Reports; Ad Hoc, BI Tools; Predictive Models; Link to Operational Systems; Automated Linkages
Does this process change with Big Data? No. Do the tools you need change? Yes.

Big Data is an Evolution, not a Revolution
Flow: DATA -> INSIGHTS -> ACTIONS
Flow: BIG DATA -> INSIGHTS -> ACTIONS
Insights: Predictions, Events, Patterns, Hypothesis Testing
Actions: Strategic Actions, Operational Actions
Is the Ultimate USE of Big Data Different? No.

Evolution: What's Not New?
#1 ECONOMICS
> You still need to determine what data has value, how long to store it, and where to store it. And all your systems still need to prove ROI
#2 TOOLS
> Exploit parallelism, even on new data types
> Need for tools to interoperate easily
> Ease of use, ease of access
#3 ARCHITECTURE
> Need flexibility, so you can add more tools
> Need to leverage technologies you already have
> Overall system must be production-ready, tested, reliable

Evolution: What's Different? Disruptive?
#1 New ECONOMICS
> Increased amounts of data you can afford to capture
#2 New TOOLS
> INSIGHTS FROM NEW DATA TYPES: quickly find the signal in the noise, integrate
> DISCOVERY PROCESS: Fast! Capture and analyze data without the usual rigor, because much of it will not go into the EDW
#3 New ARCHITECTURE FRAMEWORK
> A hybrid ecosystem that makes it easy to use both old and new tools, on old and new data

Gartner Recommends: Shift from a Single Platform to an Ecosystem
"Logical" Data Warehouse
We will abandon the old models based on the desire to implement for high-value analytic applications.

Big Data Analytics: The Problem
Environments: Data Warehouse/Business Intelligence; Advanced Analytics
Proliferation of Big Data analytics environments has resulted in fragmented data, higher costs, expensive skills, and longer time to insight.

Discovery (Advanced Analytics)
The Problem
> Environments: Data Warehouse/Business Intelligence; Advanced Analytics
> Proliferation of Big Data analytics environments has resulted in fragmented data, higher costs, expensive skills, and longer time to insight
The Solution
> Integrated Discovery Platform (IDP): SQL Framework Access Layer, Pre-Built Analytics Functions
> Integrated discovery analytics provides deeper insight, integrated access, ease of use, and lower costs

UNIFIED DATA ARCHITECTURE: System Conceptual View
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
LAYERS: MOVE, MANAGE, ACCESS
PLATFORMS: INTEGRATED DATA WAREHOUSE, DATA PLATFORM, DISCOVERY PLATFORM
ANALYTIC TOOLS & APPS: Marketing, Applications, Business Intelligence, Data Mining, Math and Stats, Languages
USERS: Marketing Executives, Operational Systems, Customers, Partners, Frontline Workers, Analysts, Data Scientists, Engineers

PROCESS FLOW: What's Changed: Architecture Framework
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
DATA: Fast Loading; Filtering and Processing; Online Archival
INSIGHTS: Discovery; Pattern Detection: Path, Graph, Time-series analysis; New Models and Model Factors
ACTION: Reports; Dashboards; Real-time Recommendations; Operational Insights; Rules Engines
ANALYTIC TOOLS: Marketing, Applications, Business Intelligence, Data Mining, Math and Stats, Languages
USERS: Marketing Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Data Scientists, Analysts

Integrated Analytics: Operationalizing Insights in the Enterprise
INTEGRATED DATA WAREHOUSE
> Single view of your business
> Cross-functional analysis
> Shared source of relevant, consistent, integrated data
> Load once, use many times
> Lowest cost of ownership
> Fast new applications time-to-market
APPLICATIONS: Marketing, Applications, Business Intelligence, Data Mining, Math and Stats
USERS: Marketing Executives, Operational Systems, Frontline Workers, Customers, Partners, Executives, Analysts

Capture, Store, Refine: Capturing Data for Storage and Processing
DATA PLATFORM
> Raw data capture
> History or long term storage
> Low cost archival
> Transformations: structured, semi-structured
> Sessionize, remove XML tags, extract key words
> Simple math at scale
> Batch processing
APPLICATIONS: Data Mining, Machine Learning, Languages
USERS: Programmer, Data Scientists

Hadoop: Requirements for Staging, Preprocessing, Simple Analytics
Land/source operational data
> Only one extract from source system
History or long term storage
> Low cost storage
Preprocess data
> Sessionize data, remove XML tags
Transformations
> Structured and semi-structured
Exploration
> Investigate value of new data sources
Batch scoring
Single subject reporting
Characteristics:
> Cost: Low cost/value equation for data size
> Depth: More data/raw data for small user community
> Multi-Structure: Raw data (typically web logs) stored for later parsing
> Non-SQL Analytics: Workload requires procedural programming or MapReduce
> Flexibility: Access to raw data, no prod constraints, no IT governance
> Parallel App: Applications that require MPP Application Environment
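The preprocessing step named on this slide (sessionize data, remove XML tags) can be illustrated with a minimal Python sketch. It does not show how Hadoop or any Teradata product implements the step. The record layout `(user_id, timestamp_seconds, payload)`, the 30-minute inactivity gap, and the function names `strip_tags` and `sessionize` are all assumptions made for illustration.

```python
import re
from itertools import groupby
from operator import itemgetter

# Assumption: a new session starts after 30 minutes of inactivity.
SESSION_GAP_SECONDS = 30 * 60

# Matches anything that looks like an XML/HTML tag.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(text):
    """Remove XML-style tags and collapse the remaining whitespace."""
    return " ".join(TAG_RE.sub(" ", text).split())


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Assign a per-user session number to each event.

    events: iterable of (user_id, timestamp_seconds, payload) tuples
            (hypothetical layout, not taken from the slides).
    Returns a list of (user_id, session_id, timestamp, cleaned_payload).
    """
    out = []
    # Sort by user, then by time, so each user's events are contiguous.
    ordered = sorted(events, key=itemgetter(0, 1))
    for user, rows in groupby(ordered, key=itemgetter(0)):
        session_id = 0
        prev_ts = None
        for _, ts, payload in rows:
            # A gap larger than the threshold opens a new session.
            if prev_ts is not None and ts - prev_ts > gap:
                session_id += 1
            out.append((user, session_id, ts, strip_tags(payload)))
            prev_ts = ts
    return out
```

On a cluster the same logic would typically run per partition (one user's events per worker), which is what makes this kind of preprocessing a natural batch workload.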

SQL-MapReduce Analytics: Unlocking Hidden Value in (Any) Data
ASTER DISCOVERY PLATFORM
Interactive data discovery
> Web clickstream, social
> Set-top box analysis
> CDRs, sensor logs, JSON
Flexible evolving schema
MapReduce, Graph, SQL, statistics, text
Structured and multistructured data
Patented SQL-MapReduce
100+ packaged functions
APPLICATIONS: Languages, Business Intelligence, Data Mining, Math and Stats
USERS: Marketing Executives, Operational Systems, Data Scientists, Customers, Partners, Analysts

DATA -> INSIGHTS: Discovery -> ACTION
DISCOVERY
> Raw data acquisition, transformation
> Data nuggets visualization
> Combine new with old data
> New insight generation
> Generate hypotheses
HYPOTHESIS TESTING
> Use of new data, insights to augment predictive models
> Try process and action changes
> Experiment design, testing
> Results analysis
> Fast fail, or move into production
SOURCES: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
LAYERS: MOVE, MANAGE, ACCESS, GOVERNANCE & INTEGRATION TOOLS
PLATFORMS: INTEGRATED DATA WAREHOUSE, DATA PLATFORM, DISCOVERY PLATFORM
ANALYTIC TOOLS: Marketing, Applications, Business Intelligence, Data Mining, Math and Stats, Languages
USERS: Marketing Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Data Scientists, Analysts

Teradata Aster's SNAP Framework
Analytic engines: TEXT, STATS, PATH, SQL, MAP REDUCE, GRAPH
SNAP FRAMEWORK: INTEGRATED OPTIMIZER, INTEGRATED EXECUTER, UNIFIED SQL INTERFACE
STORAGE SYSTEM AND SERVICES: ROW STORE, COLUMN STORE, FILE STORE

A New Analytical Approach: A Single SQL Statement to Acquire, Prepare, Analyze & Visualize
Sources: Social Media, ERP, Text, CRM, Hadoop, EDW
Steps: Acquisition -> Preparation -> Analysis -> Visualization (Teradata Aster Discovery Platform) -> Users
Single SQL statement:
    SELECT * FROM npathviz(
      ON SELECT * FROM npath (
        ON (SELECT * FROM SESSIONIZE (
          ON SELECT * FROM LOAD_FROM_TD_HADOOP)
        PARTITION BY sba_id
        SYMBOLS (
          event LIKE '%EXTERIOR LIGHTING%' AS START_EVENT,
          event NOT LIKE '%BRAKE SYSTEM%' AS NEXT_EVENT)
        RESULT ( ) ) n;
Benefits:
> Single solution & workflow, single skill set
> Shared metadata, data, insights
> Fastest time to value, easy iterations & speed of analysis
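The nPath query on this slide partitions rows by sba_id and labels events with two symbols, START_EVENT and NEXT_EVENT. The slide elides the pattern and result clauses, so the plain-Python sketch below assumes the simplest reading: a START_EVENT row immediately followed by a NEXT_EVENT row within the same partition. The timestamp column, the case-sensitive substring matching, and the function names are likewise assumptions. The sketch illustrates the idea of path matching and is not the Aster nPath implementation.

```python
from collections import defaultdict


def is_start_event(event):
    """Mirrors: event LIKE '%EXTERIOR LIGHTING%' (case-sensitive here)."""
    return "EXTERIOR LIGHTING" in event


def is_next_event(event):
    """Mirrors: event NOT LIKE '%BRAKE SYSTEM%' (case-sensitive here)."""
    return "BRAKE SYSTEM" not in event


def find_paths(rows):
    """Find START_EVENT -> NEXT_EVENT pairs within each sba_id partition.

    rows: iterable of (sba_id, timestamp, event) tuples; the timestamp
          column is an assumption, since the slide does not show the
          ordering clause.
    Returns a list of (sba_id, start_event, next_event) tuples.
    """
    # PARTITION BY sba_id
    partitions = defaultdict(list)
    for sba_id, ts, event in rows:
        partitions[sba_id].append((ts, event))

    matches = []
    for sba_id in sorted(partitions):
        # Order each partition by time before matching.
        ordered = [event for _, event in sorted(partitions[sba_id])]
        # Assumed pattern: START_EVENT immediately followed by NEXT_EVENT.
        for first, second in zip(ordered, ordered[1:]):
            if is_start_event(first) and is_next_event(second):
                matches.append((sba_id, first, second))
    return matches
```

Calling `find_paths` on a list of `(sba_id, timestamp, event)` tuples returns one tuple per matched pair. A visualization step such as the slide's npathviz would then consume those pairs.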

Organizations Face Several Obstacles Building Big Data Systems on Their Own
> Difficulty deploying and integrating new systems
> Difficulty managing multiple systems, new types of data
> Difficulty providing accessibility to fast insights on big data
> Difficulty finding skilled analysts, e.g., data scientists
Source: Big Analytics 2012 Survey, Teradata


Teradata SQL-H: Give business users on-the-fly access to data in Hadoop
> Trusted: Use existing tools/skills and enable self-service BI with granular security
> Standard: 100% ANSI SQL access to Hadoop data
> Fast: Queries run on Teradata or Aster, data accessed from Hadoop
> Efficient: Intelligent data access leveraging the Hadoop HCatalog
Diagram labels: Filtering; Teradata SQL-H; Aster SQL-H; Hadoop MR; HCatalog; Hive; Pig; Hadoop Layer: HDFS

Fabric Based Computing: Optimized for BI
The backbone of UDA
> High performance infrastructure
> Aggregate Teradata IDW, Aster Discovery and Hadoop
> Industry approach optimized for Big Analytics use
Key Teradata Elements
> BYNET V5 software protocol on InfiniBand interconnect
> Teradata Managed Servers
> System management across all of the FBC

Teradata Viewpoint: Single Operational View (SOV) for Teradata, Aster, & Hadoop
Creation of new portlets:
> Node Monitor (Aster & Hadoop)
> Aster Completed Processes
> Hadoop Services
Integration into existing:
> Monitoring: System Health, Metrics Analysis, Metrics Graph, Capacity Heatmap, Space Usage, Query Monitor (TDB & Aster)
> Admin: Alert Viewer, Alert Setup, Teradata Systems, Role Manager

Teradata Aster Big Analytics Appliance: First Deeply Integrated SQL, MapReduce and Hadoop Appliance
UNIQUE FEATURES
1. Modular Aster and 100% open-source Hortonworks nodes
2. First ANSI SQL & HCatalog integration via SQL-H
3. Only ANSI SQL & MapReduce integration: SQL-MapReduce
4. Most manageable Hadoop: Teradata Viewpoint & TVI
5. Comprehensive Discovery Portfolio: 100+ pre-built functions
6. Fully-engineered and supported by Teradata, backed by Hortonworks world-class Hadoop team
7. Cascading InfiniBand switches, hot node/cluster expansion
Benefits
> Leverage existing investments in standard BI, ETL tools & people with SQL skills
> Industry's highest performance platform for Big Analytics
> Lowest TCO (technology + people), highest ROI, and fastest time to value

Thank You! Questions and Answers