Bringing Intergalactic Data Speak (a.k.a.: SQL) to Hadoop. Martin Willcox [@willcoxmnk], Director, Big Data Centre of Excellence (Teradata International)

Transcription

1 Bringing Intergalactic Data Speak (a.k.a.: SQL) to Hadoop. Martin Willcox, Director, Big Data Centre of Excellence (Teradata International). 4th June 2015

2 Agenda: A (very!) short history of Teradata; The new Big Data and the emergence of the Logical Data Warehouse; Hadoop and the Data Lake; Intergalactic Data Speak to the rescue; Conclusions and final thoughts

3 A (very!) short history of Teradata: Big Data before there was Big Data. In 1979, four academics and software engineers quit their day jobs, maxed out their credit cards and built the world's first MPP scale-out Relational Database Computer in a garage in California.

4 A (very!) short history of Teradata. 1986: Teradata ships the first commercial 100-node MPP system.

5 The new Big Data: from transactions and events to interactions and observations. Simple computing devices are now so inexpensive that increasingly everything is instrumented. Instead of capturing transactions and events in the Data Warehouse and inferring behaviour, we can increasingly measure it directly. Organisations making the transactions-to-interactions journey need to address five key challenges.

6 The new Big Data: the big 5 challenges of making the transactions-to-interactions journey. #1: The requirement to manage multi-structured data and data whose structure changes continuously means that there is no single Information Management strategy that works equally well across the entire Big Data space. #2: Understanding Interactions requires path / graph / time-series Analytics in addition to traditional set-based Analytics, so there isn't a single parallel processing framework or technology that works equally well across the entire Big Data space. #3: The economic challenge of capturing, storing, managing and exploiting Big Data sets that may be large; getting larger quickly; noisy; of (as yet) unproven value; and infrequently accessed. #4: "There might be a needle in one of these haystacks, but if it takes 6-12 months and $1M just to go look, I'll never know." #5: Getting past "so what?" to drive real business value (because old business process + expensive new technology = expensive, old business process).

7 The Logical Data Warehouse is the industry's adaptation to Big Data. How will you deploy? How many / which platforms will you need? How will you integrate them? And which data need to be centralised and integrated? [Diagram: from the Enterprise Data Warehouse Era to the Logical Data Warehouse (a.k.a.: Unified Data Architecture) Era. Labels: 1 Multi-structured data; 2 Interaction / observation Analytics; Flat / falling IT budgets, exploding data volumes; Agile Exploration & Discovery; 5 Operationalisation.] "Give me integrated, high quality data." "Centralise and integrate the data that are widely reused and shared, but integrate all of the analytics."

8 Hadoop and the Data Lake. Big Idea #1: store all data (whatever "all" means). Big Idea #2: un-washed, raw data (NoETL / late-binding). Big Idea #3: leverage multiple technologies to support processing flexibility. Big Idea #4: resolve the nagging problem of accessibility and data integration. (Data Warehouse professionals can be excused a certain sense of déjà vu where #4 is concerned!)
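
Big Idea #2 (late-binding) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not anything shown on the slides: the event records, the field names and the `read_with_schema` helper are all hypothetical. The raw records are stored untouched, and each consumer binds its own schema when it reads them.

```python
import json

# Hypothetical raw events, stored exactly as they arrived (no up-front ETL).
RAW_EVENTS = [
    '{"user": "u1", "page": "/home", "ms": 120}',
    '{"user": "u2", "page": "/cart", "ms": "340", "coupon": "SPRING"}',
    '{"user": "u1", "page": "/checkout"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: pick fields, cast types, default missing ones."""
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, (cast, default) in schema.items():
            value = record.get(field, default)
            row[field] = cast(value) if value is not None else None
        yield row


# Two consumers bind different schemas to the same raw data.
latency_schema = {"user": (str, None), "ms": (int, 0)}
marketing_schema = {"user": (str, None), "coupon": (str, None)}

latency_rows = list(read_with_schema(RAW_EVENTS, latency_schema))
marketing_rows = list(read_with_schema(RAW_EVENTS, marketing_schema))
```

The trade-off is that cleansing and type decisions move from load time to every reader, which is one reason accessibility and integration (Big Idea #4) remain a problem.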

9 The Data Lake will be ubiquitous, but: "Working in the Hadoop ecosystem is the province of uniquely trained engineers, people Maguire calls unicorns. Companies may have talented data teams, he says, but they should expect to supplement and rebuild their teams to make Hadoop successful. The talent gap is huge, says Maguire. What you need is somebody who knows 15 different technologies. That drives up TCO." Walter Maguire, Chief Technologist, HP Big Data, quoted in a blog post on

10 Intergalactic Data Speak* to the rescue! It's messy and imperfect; there are (already) many different dialects; most implementations are a superset of a subset† of the standard. But it's also the Data Lingua Franca, and declarative rather than imperative / procedural. (*, † with apologies to Rick van der Lans and Chris Date, respectively.)
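
The "declarative, rather than imperative" point can be made concrete with a small Python sketch. It assumes a hypothetical `transactions` table and uses the standard-library `sqlite3` module purely as a stand-in SQL engine; neither comes from the slides. Both functions answer the same question, but only the first spells out how to compute it.

```python
import sqlite3

# Hypothetical transactions: (trans_id, customer, amount).
ROWS = [(1, "alice", 7000.0), (2, "bob", 1200.0), (3, "alice", 9000.0), (4, "carol", 5500.0)]


def big_spend_imperative(rows, threshold):
    """Imperative: spell out how to scan, filter and accumulate."""
    totals = {}
    for _trans_id, customer, amount in rows:
        if amount > threshold:
            totals[customer] = totals.get(customer, 0.0) + amount
    return sorted(totals.items())


def big_spend_declarative(rows, threshold):
    """Declarative: state what is wanted and let the engine decide how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (trans_id INTEGER, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT customer, SUM(amount) FROM transactions "
        "WHERE amount > ? GROUP BY customer ORDER BY customer",
        (threshold,),
    ).fetchall()
    conn.close()
    return result


imperative_result = big_spend_imperative(ROWS, 5000)
declarative_result = big_spend_declarative(ROWS, 5000)
```

Because the SQL text says nothing about scan order or parallelism, the same statement can in principle be handed to a different engine, which is what makes SQL attractive as a common interface to Hadoop.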

11 SQL-based Query Processing on Hadoop. Four approaches: (1) RDBMS On Top Of Hadoop [diagram: RDBMS over HDFS]; (2) Query Engine Using HDFS Files [diagram: query engine over HDFS]; (3) RDBMS Orchestrating Queries With Remote Access to Hadoop/Hive [diagram: RDBMS alongside Hadoop]; (4) Virtualization Layer Over All Data Sources [diagram: data virtualization over Hadoop and RDBMS].

12 Query Processing on Hadoop: RDBMS On Top of Hadoop. RDBMS on Hadoop cluster; proprietary data dictionary / metadata; proprietary data format within HDFS files; data types may be limited. SQL query engine: SQL language, but standards compatibility varies; query engine maturity varies. Data not portable: cannot be read by other systems / engines. Example: Pivotal HAWQ.

13 Query Processing on Hadoop: Query Engine Using HDFS Files. SQL query engine on Hadoop cluster; standard data dictionary / metadata (e.g., Hive); standard data format within HDFS files (e.g., ORC files); data types may be limited. SQL query engine: SQL language, but standards compatibility varies; query engine maturity varies. Data portable and can be read by other systems / engines. Examples: IBM Big SQL, Cloudera Impala.

14 Query Processing on Hadoop: RDBMS Orchestrating Queries With Remote Access to Hadoop/Hive. External RDBMS sends (part of) queries to engine on Hadoop; standard data dictionary / metadata within Hadoop cluster (e.g., Hive); standard data format within HDFS files (e.g., ORC); data types may be limited by engine on Hadoop and external RDBMS. SQL query engine: capabilities are a combination of external and internal Hadoop engines; combines data and analytics in two systems; SQL language, standards compatibility generally high; query engine generally mature. Data in Hadoop portable and can be read by other systems / engines. Example: Teradata QueryGrid.

15 Query Processing on Hadoop: Virtualization Layer Over All Data Sources. External virtualization software sends (part of) queries to engine on Hadoop; standard data dictionary / metadata within Hadoop cluster (e.g., Hive); standard data format within HDFS files (e.g., ORC); data types may be limited by engine on Hadoop and external virtualization software. SQL query engine: capabilities are a combination of external and Hadoop engines and virtualization layer limitations; combines data and analytics in two systems; extra layer and/or data movement; SQL language, standards compatibility generally high; query engine maturity and utilization of engines varies. Data in Hadoop portable, can be read by other engines. Example: Cisco Data Virtualization Platform (formerly Composite Software).

16 Teradata QueryGrid: optimize, simplify, and orchestrate processing across and beyond the Teradata UDA. Run the right analytic on the right platform: take advantage of specialized processing engines while operating as a cohesive analytic environment. Integrated processing, within and outside the UDA. Easy access to data and analytics through existing SQL skills and tools. Automated and optimized work distribution through push-down processing across platforms. Minimize data movement (process data where it resides) and minimize data duplication. Transparently automate analytic processing and data movement between systems. Bi-directional data movement.

17 QueryGrid Teradata Use Case: Deep History. [Diagram: Years 1-5 in the TERADATA DATABASE; Years 5-10 in HADOOP.]
SELECT Trans.Trans_ID, Trans.Trans_Amount
FROM TD_Transactions Trans
WHERE Trans_Amount > 5000
UNION
SELECT * FROM FOREIGN TABLE (
    SELECT Trans_ID, Trans_Amount
    FROM Transaction_Hist
    WHERE Trans_Amount > 5000
)@Hadoop Hist;
Push the "Foreign Table" SELECT to Hive to execute the query. This imports just the required columns to Teradata, allows predicate processing of conditions on non-partitioned columns, and uses the Hadoop cluster resources for data qualification.
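
The push-down idea in this use case can be sketched in Python. This is a toy model only, not QueryGrid itself: two in-memory `sqlite3` databases stand in for the Teradata and Hadoop/Hive sides, and `make_db`, `foreign_select`, `union_query` and the `Notes` column are hypothetical names introduced for illustration. The point is that the filter and the column list are evaluated on the remote side, so only qualifying rows and required columns are shipped back.

```python
import sqlite3


def make_db(table, rows, extra_column=False):
    """Build an in-memory SQLite database standing in for one platform."""
    conn = sqlite3.connect(":memory:")
    cols = "Trans_ID INTEGER, Trans_Amount REAL" + (", Notes TEXT" if extra_column else "")
    conn.execute(f"CREATE TABLE {table} ({cols})")
    placeholders = "?, ?, ?" if extra_column else "?, ?"
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    return conn


# Stand-in for the Teradata database (recent years).
local = make_db("TD_Transactions", [(101, 7500.0), (102, 300.0)])
# Stand-in for the Hadoop/Hive side (deep history), with a wide column we never need.
remote = make_db(
    "Transaction_Hist",
    [(1, 9000.0, "x" * 50), (2, 100.0, "y" * 50), (3, 5001.0, "z" * 50)],
    extra_column=True,
)


def foreign_select(conn, threshold):
    """Run the filter and projection remotely; ship back only qualifying rows."""
    return conn.execute(
        "SELECT Trans_ID, Trans_Amount FROM Transaction_Hist WHERE Trans_Amount > ?",
        (threshold,),
    ).fetchall()


def union_query(threshold):
    """Combine local and remote results with UNION (duplicate-free) semantics."""
    local_rows = local.execute(
        "SELECT Trans_ID, Trans_Amount FROM TD_Transactions WHERE Trans_Amount > ?",
        (threshold,),
    ).fetchall()
    shipped = foreign_select(remote, threshold)
    return sorted(set(local_rows) | set(shipped)), len(shipped)


result, rows_shipped = union_query(5000)
```

In this toy data set only two of the three history rows, and none of the `Notes` text, cross between the two systems.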

18 Adaptive Optimizer: incremental planning & execution of smaller query fragments. Better Query Plan: Foreign and Sub-Queries. Most efficient overall query plan derived from reliable statistics; statistics dynamically collected from foreign data; incremental query plans generated for single and multi-system queries; consistent Optimizer approach for queries within and between systems; Teradata systems transfer query plans between systems; a fully automatic optimizer feature, so users don't have to change anything. Why? Unreliable statistics can result in less-than-optimal query plans; some analytic systems, like Hadoop, don't keep data statistics; statistics are not designed for compatibility between databases. How? Pulls out remote server requests and single-row and scalar non-correlated subqueries from a main query; plans and executes them; plugs the results into the main query; plans and executes the main query.
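
The "How?" steps can be mimicked in a few lines of Python. This is a hedged sketch of the general idea, not Teradata's optimizer: `sqlite3` stands in for the database, and the `sales` / `sales_hist` tables and both function names are hypothetical. The scalar non-correlated subquery is executed first, and its result is plugged into the main query as a concrete value before the main query runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (id INTEGER, amount REAL);
    CREATE TABLE sales_hist (id INTEGER, amount REAL);
    INSERT INTO sales VALUES (1, 50.0), (2, 150.0), (3, 250.0);
    INSERT INTO sales_hist VALUES (10, 100.0), (11, 200.0);
    """
)

SCALAR_SUBQUERY = "SELECT AVG(amount) FROM sales_hist"
MAIN_TEMPLATE = "SELECT id FROM sales WHERE amount > ? ORDER BY id"


def run_incrementally(connection):
    """Execute the scalar subquery first, then the main query with its result."""
    # Step 1: pull out the non-correlated scalar subquery and execute it.
    (threshold,) = connection.execute(SCALAR_SUBQUERY).fetchone()
    # Steps 2-3: plug the now-known value into the main query and execute it.
    rows = connection.execute(MAIN_TEMPLATE, (threshold,)).fetchall()
    return threshold, [row[0] for row in rows]


def run_monolithic(connection):
    """Baseline: hand the engine the whole query, subquery included."""
    rows = connection.execute(
        f"SELECT id FROM sales WHERE amount > ({SCALAR_SUBQUERY}) ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]


threshold, incremental_ids = run_incrementally(conn)
monolithic_ids = run_monolithic(conn)
```

Both paths return the same rows; the difference is that in the incremental path the main query is planned with an actual value in hand rather than an estimate, which matters most when the subquery runs on a remote system that keeps no statistics.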

19 Summary & conclusions

20 Analysts agree that the Logical Data Warehouse is the future of Enterprise Analytical Architecture, even if they can't agree what to call it. Gartner: Logical Data Warehouse. Forrester: Enterprise Data Hub. Quoted on the slide: "We will abandon the old models based on the desire to implement for high-value analytic applications."; "Raw data in an affordable distributed data hub"; "Firms that get this concept realise all data does not need first-class seating."

21 There are (already) 12+ different SQL interfaces for Hadoop: Apache Drill; Apache Phoenix; Apache Tajo; IBM Big SQL; Pivotal HAWQ; Splice Machine; Teradata QueryGrid; Apache Hive; Apache Spark SQL; Cloudera Impala; Oracle Big Data SQL; Presto; SQLstream. (Source: Gartner Market Guide for Hadoop Distributions, 6th January 2015.) There is broad industry consensus that SQL is a key enabler in making the Hadoop Ecosystem accessible to mere mortals. The different technologies have very different strengths and weaknesses and you may struggle to standardise on only one of them, but...

22 ...at least right now, the sweet-spot is in the middle of the spectrum. RDBMS On Top Of Hadoop: not open enough. Query Engine Using HDFS Files and RDBMS Orchestrating Queries With Remote Access to Hadoop/Hive: both sound architectural choices, depending on use-case. Virtualization Layer Over All Data Sources: not fast / scalable enough.

23 Final thoughts. What makes Hadoop special is all the things that it can do that parallel RDBMS technologies cannot. The industry focus on SQL interfaces is a rational way of addressing accessibility / TCO issues, but the risk is that we re-invent (lowest-common-denominator) parallel RDBMS technologies. Your goal should not be to try and recreate your IDW on Hadoop (you will likely fail), but to build a Data Lake to capture new data and support new processing...

24 ...so start with a business goal, not with a technology. Web / clickstream: Who navigates to the website, what do they do in each session and then afterwards within other channels? Voice / text: Who is complaining to the call center, and about what? Teradata / Graph: Which brokers are colluding to rig markets, and with whom? Sentiment: What are customers saying about the company / products / services on social media sites? Process / Path Analytics: What's the optimal process for claims or collections activity?

25 Teradata

What is a Data Lake, anyway? Alec Gardner, GM Advanced Analytics, Teradata ANZ Wednesday 10 th June 2015

What is a Data Lake, anyway? Alec Gardner, GM Advanced Analytics, Teradata ANZ Wednesday 10 th June 2015 What is a Data Lake, anyway? Alec Gardner, GM Advanced Analytics, Teradata ANZ Wednesday 10 th June 2015 A large objectbased repository that holds data in its native format store all data present and future

More information

Key Trends in Big Data and Analytics

Key Trends in Big Data and Analytics Key Trends in Big Data and Analytics Martin Willcox, Director Big Data Centre of Excellence (Teradata International) October 2015 2015 Teradata Agenda Motivating examples from an old industry From transactions

More information

Big Data, Start Small! Dr. Frank Säuberlich, Director Advanced Analytics (Teradata International) 26 th May 2015

Big Data, Start Small! Dr. Frank Säuberlich, Director Advanced Analytics (Teradata International) 26 th May 2015 Big Data, Start Small! Dr. Frank Säuberlich, Director Advanced Analytics (Teradata International) 26 th May 2015 Agenda Introduction Big Data And The Emergence Of The Logical Data Warehouse Architecture

More information

Teradata s Big Data Technology Strategy & Roadmap

Teradata s Big Data Technology Strategy & Roadmap Teradata s Big Data Technology Strategy & Roadmap Artur Borycki, Director International Solutions Marketing 18 March 2014 Agenda > Introduction and level-set > Enabling the Logical Data Warehouse > Any

More information

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata

BIG DATA: FROM HYPE TO REALITY. Leandro Ruiz Presales Partner for C&LA Teradata BIG DATA: FROM HYPE TO REALITY Leandro Ruiz Presales Partner for C&LA Teradata Evolution in The Use of Information Action s ACTIVATING MAKE it happen! Insights OPERATIONALIZING WHAT IS happening now? PREDICTING

More information

TERADATA QUERY GRID. Teradata User Group September 2014

TERADATA QUERY GRID. Teradata User Group September 2014 TERADATA QUERY GRID Teradata User Group September 2014 2 9/15/2014 Teradata Confidential Teradata s View Big Data and Data in General DATA enables INSIGHTS which drive ACTIONS to provide BUSINESS ADVANTAGE

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MSTR 10 Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

TECHNOLOGY TRANSFER PRESENTS OCTOBER 16 2012 OCTOBER 17 2012 RESIDENZA DI RIPETTA - VIA DI RIPETTA, 231 ROME (ITALY)

TECHNOLOGY TRANSFER PRESENTS OCTOBER 16 2012 OCTOBER 17 2012 RESIDENZA DI RIPETTA - VIA DI RIPETTA, 231 ROME (ITALY) TECHNOLOGY TRANSFER PRESENTS RICK VAN DER LANS Data Virtualization for Agile Business Intelligence Systems New Database Technology for Data Warehousing OCTOBER 16 2012 OCTOBER 17 2012 RESIDENZA DI RIPETTA

More information

I/O Considerations in Big Data Analytics

I/O Considerations in Big Data Analytics Library of Congress I/O Considerations in Big Data Analytics 26 September 2011 Marshall Presser Federal Field CTO EMC, Data Computing Division 1 Paradigms in Big Data Structured (relational) data Very

More information

Are You Big Data Ready?

Are You Big Data Ready? ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain

More information

6.0, 6.5 and Beyond. The Future of Spotfire. Tobias Lehtipalo Sr. Director of Product Management

6.0, 6.5 and Beyond. The Future of Spotfire. Tobias Lehtipalo Sr. Director of Product Management 6.0, 6.5 and Beyond The Future of Spotfire Tobias Lehtipalo Sr. Director of Product Management Key peformance indicators Hundreds of Records Visual Data Discovery Millions of Records Data Mining or Data

More information

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved. EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics

More information

Big Data SQL and Query Franchising

Big Data SQL and Query Franchising Big Data SQL and Query Franchising An Architecture for Query Beyond Hadoop Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor

More information

Bringing the Power of SAS to Hadoop. White Paper

Bringing the Power of SAS to Hadoop. White Paper White Paper Bringing the Power of SAS to Hadoop Combine SAS World-Class Analytic Strength with Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities Contents Introduction... 1 What

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

Trafodion Operational SQL-on-Hadoop

Trafodion Operational SQL-on-Hadoop Trafodion Operational SQL-on-Hadoop SophiaConf 2015 Pierre Baudelle, HP EMEA TSC July 6 th, 2015 Hadoop workload profiles Operational Interactive Non-interactive Batch Real-time analytics Operational SQL

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case

More information

Integrate Master Data with Big Data using Oracle Table Access for Hadoop

Integrate Master Data with Big Data using Oracle Table Access for Hadoop Integrate Master Data with Big Data using Oracle Table Access for Hadoop Kuassi Mensah Oracle Corporation Redwood Shores, CA, USA Keywords: Hadoop, BigData, Hive SQL, Spark SQL, HCatalog, StorageHandler

More information

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015 Data Governance in the Hadoop Data Lake Kiran Kamreddy May 2015 One Data Lake: Many Definitions A centralized repository of raw data into which many data-producing streams flow and from which downstream

More information

UNIFY YOUR (BIG) DATA

UNIFY YOUR (BIG) DATA UNIFY YOUR (BIG) DATA ANALYTIC STRATEGY GIVE ANY USER ANY ANALYTIC ON ANY DATA Scott Gnau President, Teradata Labs scott.gnau@teradata.com t Unify Your (Big) Data Analytic Strategy Technology excitement:

More information

The Logical Data Warehouse

The Logical Data Warehouse TECHNOLOGY TRANSFER PRESENTS RICK VAN DER LANS The Logical Data Warehouse Design, Architecture, and Technology Incorporating Big Data, Hadoop and NoSQL in Data Warehouse and Business Intelligence Systems

More information

Data Governance in the Hadoop Data Lake. Michael Lang May 2015

Data Governance in the Hadoop Data Lake. Michael Lang May 2015 Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales

More information

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All

More information

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?

More information

Big Data must become a first class citizen in the enterprise

Big Data must become a first class citizen in the enterprise Big Data must become a first class citizen in the enterprise An Ovum white paper for Cloudera Publication Date: 14 January 2014 Author: Tony Baer SUMMARY Catalyst Ovum view Big Data analytics have caught

More information

Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization

Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization A Technical Whitepaper Rick F. van der Lans Independent Business Intelligence Analyst R20/Consultancy February 2015 Sponsored

More information

Data Warehouse Hadoop. Shimpei Kodama 2015/9/29

Data Warehouse Hadoop. Shimpei Kodama 2015/9/29 Data Warehouse Hadoop Shimpei Kodama 2015/9/29 of DWH 1979 Founded 77+ Counties 2,600+ Customers 11,000+ Employees GNo1 L 95% Top 20 Communications 90% Top 20 Finance 75% Top 20 Retail 70% Top 20 Travel

More information

Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com

Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com Apache Hadoop in the Enterprise Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com Cloudera The Leader in Big Data Management Powered by Apache Hadoop The Leading Open Source Distribution of Apache

More information

Exploring the Synergistic Relationships Between BPC, BW and HANA

Exploring the Synergistic Relationships Between BPC, BW and HANA September 9 11, 2013 Anaheim, California Exploring the Synergistic Relationships Between, BW and HANA Sheldon Edelstein SAP Database and Solution Management Learning Points SAP Business Planning and Consolidation

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

Practical Approaches to Big Data & Analytics: From Infrastructure to

Practical Approaches to Big Data & Analytics: From Infrastructure to 2014 Cisco and/or its affiliates. All rights reserved. Practical Approaches to Big Data & Analytics: From Infrastructure to Applications Kapil Bakshi Distinguished Architect, Cisco System Digital Government

More information

Emerging Requirements and DBMS Technologies:

Emerging Requirements and DBMS Technologies: Emerging Requirements and DBMS Technologies: When Is Relational the Right Choice? Carl Olofson Research Vice President, IDC April 1, 2014 Agenda 2 Why Relational in the First Place? Evolution of Databases

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Big Data Too Big To Ignore

Big Data Too Big To Ignore Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction

More information

Using Tableau Software with Hortonworks Data Platform

Using Tableau Software with Hortonworks Data Platform Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

Actian SQL in Hadoop Buyer s Guide

Actian SQL in Hadoop Buyer s Guide Actian SQL in Hadoop Buyer s Guide Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

More information

Cloudera Enterprise Data Hub in Telecom:

Cloudera Enterprise Data Hub in Telecom: Cloudera Enterprise Data Hub in Telecom: Three Customer Case Studies Version: 103 Table of Contents Introduction 3 Cloudera Enterprise Data Hub for Telcos 4 Cloudera Enterprise Data Hub in Telecom: Customer

More information

Discovering Business Insights in Big Data Using SQL-MapReduce

Discovering Business Insights in Big Data Using SQL-MapReduce Discovering Business Insights in Big Data Using SQL-MapReduce A Technical Whitepaper Rick F. van der Lans Independent Business Intelligence Analyst R20/Consultancy July 2013 Sponsored by Copyright 2013

More information

Data Warehouse Optimization

Data Warehouse Optimization Data Warehouse Optimization Embedding Hadoop in Data Warehouse Environments A Whitepaper Rick F. van der Lans Independent Business Intelligence Analyst R20/Consultancy September 2013 Sponsored by Copyright

More information

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop 1 Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop 2 Pivotal s Full Approach It s More Than Just Hadoop Pivotal Data Labs 3 Why Pivotal Exists First Movers Solve the Big Data Utility Gap

More information

The Celebrus v8 Big Data Engine. Powering real-time personalisation, one-to-one data-driven marketing & advanced customer analytics.

The Celebrus v8 Big Data Engine. Powering real-time personalisation, one-to-one data-driven marketing & advanced customer analytics. The Celebrus v8 Big Data Engine Powering real-time personalisation, one-to-one data-driven marketing & advanced customer analytics. Celebrus v8 Big Data Engine The Celebrus v8 Big Data Engine The Celebrus

More information

Navigating Big Data business analytics

Navigating Big Data business analytics mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what

More information

WHAT S NEW IN SAS 9.4

WHAT S NEW IN SAS 9.4 WHAT S NEW IN SAS 9.4 PLATFORM, HPA & SAS GRID COMPUTING MICHAEL GODDARD CHIEF ARCHITECT SAS INSTITUTE, NEW ZEALAND SAS 9.4 WHAT S NEW IN THE PLATFORM Platform update SAS Grid Computing update Hadoop support

More information

Big Data Management and Security

Big Data Management and Security Big Data Management and Security Audit Concerns and Business Risks Tami Frankenfield Sr. Director, Analytics and Enterprise Data Mercury Insurance What is Big Data? Velocity + Volume + Variety = Value

More information

THE JOURNEY TO A DATA LAKE

THE JOURNEY TO A DATA LAKE THE JOURNEY TO A DATA LAKE 1 THE JOURNEY TO A DATA LAKE 85% OF DATA GROWTH BY 2020 WILL COME FROM NEW TYPES OF DATA ACCORDING TO IDC, AS MUCH AS 85% OF DATA GROWTH BY 2020 WILL COME FROM NEW TYPES OF DATA,

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

Big Data Multi-Platform Analytics (Hadoop, NoSQL, Graph, Analytical Database)

Big Data Multi-Platform Analytics (Hadoop, NoSQL, Graph, Analytical Database) Multi-Platform Analytics (Hadoop, NoSQL, Graph, Analytical Database) Presented By: Mike Ferguson Intelligent Business Strategies Limited 2 Day Workshop : 25-26 September 2014 : 29-30 September 2014 www.unicom.co.uk/bigdata

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

TOP 8 TRENDS FOR 2016 BIG DATA

TOP 8 TRENDS FOR 2016 BIG DATA The year 2015 was an important one in the world of big data. What used to be hype became the norm as more businesses realized that data, in all forms and sizes, is critical to making the best possible

More information

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

MAKING BIG DATA COME ALIVE. Steve Gonzales, Principal Manager, steve.gonzales@thinkbiganalytics.com

Cisco IT Hadoop Journey

Srini Desikan, Program Manager IT. 2015 MapR Technologies. Agenda: Hadoop Platform Timeline, Key Decisions / Lessons Learnt, Data Lake, Hadoop's place in IT Data Platforms, Use Cases

Big Data Integration: A Buyer's Guide

September 2013. Buyer's Guide to Big Data Integration. Contents: Introduction; Challenges of Big Data Integration: New and Old; What You Need for Big Data Integration; Preferred Technology

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Agenda: Introduce Hadoop projects to prepare you for your group work. Intimate detail will be provided in future

G-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions

Table of contents: Service Overview, Business Need, Our Approach, Service Management, Vendor Accreditations/Awards...

Big Data Open Source Stack vs. Traditional Stack for BI and Analytics

Part I. By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014. You may contact Sam Poozhikala at spoozhikala@stratapps.com.

The Future of Data Management with Hadoop and the Enterprise Data Hub

Amr Awadallah, Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah. Cloudera Snapshot: Founded 2008, by former employees of Employees

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Agenda: 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

WHITE PAPER Business Process Management: The Super Glue for Social Media, Mobile, Analytics and Cloud (SMAC) enabled enterprises?

Business managers and technology leaders are being challenged to make faster

How To Turn Big Data Into An Insight

MWD Advisors. Turning Big Data into Big Insights. Helena Schwenk. A special report prepared for Actuate, May 2013. This report is the fourth in a series and focuses principally on explaining what's needed

The Enterprise Data Hub and The Modern Information Architecture

Dr. Amr Awadallah, CTO & Co-Founder, Cloudera. Twitter: @awadallah. 2013 Cloudera, Inc. All rights reserved. Cloudera Overview: The Leader

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda: Design rationale, Implementation, Installation

Constructing a Data Lake: Hadoop and Oracle Database United!

Sharon Sophia Stephen, Big Data PreSales Consultant. February 21, 2015. Safe Harbor: The following is intended to outline our general product direction.

Next-Generation Cloud Analytics with Amazon Redshift

What's inside: Introduction; Why Amazon Redshift is Great for Analytics; Cloud Data Warehousing Strategies for Relational Databases; Analyzing Fast, Transactional

Integrating SQL and Hadoop

Jean-Pierre Dijcks and Martin Gubar. Abstract: There is little doubt that big data is here to stay; over the last couple of years, it has grown in importance as a critical element

Internet of Things. Opportunity Challenges Solutions

Copyright 2014 Boeing. All rights reserved. GPDIS_2015.ppt. ANALYZING INTERNET OF THINGS USING BIG DATA ECOSYSTEM. Internet of Things matter for... Industrial

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

BIG DATA IS BIG NEWS. The value of big data lies in the business analytics that can be generated

Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata

Today's Speakers: James Taylor, CEO, Decision Management Solutions; Bill Franks, Chief Analytics Officer, Teradata. 7/28/14. Polling

QUICK FACTS. Delivering a Unified Data Architecture for Sony Computer Entertainment America TEKSYSTEMS GLOBAL SERVICES CUSTOMER SUCCESS STORIES

[ Consumer goods, Data Services ] Objectives: Develop a unified data architecture for capturing Sony Computer Entertainment America's (SCEA)

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Anshul Mittal. Topics: The goal of this presentation is to give

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

Integrating Apache Spark with an Enterprise Data Warehouse

Dr. Michael Wurst, IBM Corporation, Architect Spark/R/Python Database Integration, In-Database Analytics. Dr. Toni Bollinger, IBM Corporation, Senior Software

Unifying the Enterprise Data Hub and the Integrated Data Warehouse

Contents: Encompassing All of the Big Data Universe; The Ideal Structure; The Enterprise Data Hub: Refining Raw Data; The Integrated

Driving Value From Big Data

Big Data Executive Forum: Data Discovery, Modern Architecture & Visualization. Bill Franks, Chief Analytics Officer, Teradata. It's not so much Big Data as it is different data.

Beyond Web Application Log Analysis using Apache™ Hadoop. A Whitepaper by Orzota, Inc.

Web Applications: As more and more software moves to a Software as a Service (SaaS) model, the web application has

Advanced In-Database Analytics

Tallinn, Sept. 25th, 2012. Mikko-Pekka Bertling, BDM Greenplum EMEA. That sounds complicated? Who can tell me how best to solve this? What are the main mathematical functions?

OPEN MODERN DATA ARCHITECTURE FOR FINANCIAL SERVICES RISK MANAGEMENT

Whitepaper. A top-tier global bank's end-of-day risk analysis jobs didn't complete in time for the next start of trading day. To solve

Dashboard Engine for Hadoop

Matt McDevitt, Sr. Project Manager; Pavan Challa, Sr. Data Engineer. June 2015. Think Big. Start Smart. Scale Fast. Agenda: Think Big Overview, Engagement Model, Solution Offerings, Dashboard

From Spark to Ignition:

Fueling Your Business on Real-Time Analytics. Eric Frenkiel, MemSQL CEO. June 29, 2015, San Francisco, CA. What's in Store For This Presentation? 1. MemSQL: A real-time database for

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM

David Chappell. A Perspective for Systems Integrators. Sponsored by Microsoft Corporation. Copyright 2014 Chappell & Associates. Contents: Business

Data Virtualization A Potential Antidote for Big Data Growing Pains

Perspective. Atul Shrivastava. Abstract: Enterprises are already facing challenges around data consolidation, heterogeneity, quality, and

Cisco Data Preparation

Data Sheet. Unleash your business analysts to develop the insights that drive better business outcomes, sooner, from all your data. As self-service business intelligence (BI) and

6 Steps to Faster Data Blending Using Your Data Warehouse

Self-Service Data Blending and Analytics. Dynamic market conditions require companies to be agile and decision making to be quick, meaning the days

The Internet of Things and Big Data: Intro

John Berns, Solutions Architect, APAC - MapR Technologies. April 22nd, 2014. What This Is; What This Is Not: It's not specific to IoT. It's not about any specific

Artur Borycki. Director International Solutions Marketing

Agenda: Evolution of Teradata's Unified Architecture Analytical and Workloads; Teradata's Reference Information Architecture. Evolution of Teradata's Unified

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

Table of Contents: Abstract; Introduction; Definition; The Expanding Digitization

How to Navigate Big Data with Ad Hoc Visual Data Discovery Data technologies are rapidly changing, but principles of 30 years ago still apply today

Introduction: Data is the heart of TIBCO Spotfire. It

IBM BigInsights for Apache Hadoop

Efficiently manage and mine big data for valuable insights. Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics. Advanced

A Whole New World. Big Data Technologies Big Discovery Big Insights Endless Possibilities

Dr. Phil Shelley. Why Big Data Technology? [Figure: query execution time, on a scale from days down to milliseconds, for EDW, Hadoop and Presto]

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

Sneha D. Borkar¹, Prof. Chaitali S. Surtakar². ¹Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com. ²Assistant Professor, Information
