GO BIG WITH DATA PLATFORMS: HADOOP AND TERADATA 1700

Cesar Rojas, Director of Product Marketing, Data Science & Hadoop
cesar.rojas@teradata.com
Spring Teradata User Group Meetings: Los Angeles

AGENDA
> What is a Data Platform?
> Teradata and Hadoop
> Teradata Portfolio for Hadoop
> Integrated Big Data Platform 1700
> When to use which?
> Q&A
Copyright Teradata

TERADATA UNIFIED DATA ARCHITECTURE: System Conceptual View
[Diagram; stages: MOVE, MANAGE, ACCESS]
> Sources: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
> Platform family: Integrated Data Warehouse, Data Platform, Integrated Big Data Platform, Appliance for Hadoop, Integrated Discovery Platform, Big Analytics Appliance
> Analytic tools & apps: Applications, Business Intelligence, Data Mining, Math and Stats, Languages
> Users: Marketing, Executives, Operational Systems, Customers, Partners, Frontline Workers, Business Analysts, Data Scientists, Engineers

Key Trend: Data Platform
> Pressure on IT budgets requires companies to optimize workloads and drive down costs
> Data-driven insights and analytics are integral to daily operations, driving mission-critical requirements
> Long-term trending, year-over-year comparisons, and regulatory compliance are driving companies to retain more raw data for longer periods
> Companies have a "Hadoop first" mentality and want to see Hadoop succeed or fail before they consider other technology

Requirements of a Data Platform
Data Platform
> History and long-term storage
> Transformations
> Batch processing
> Raw data capture
> Low $/Terabyte
Integrated Data Platform
> Multiple uses on one platform
> Current, archive and raw data
> Ad hoc deep analytics on new data
> Resource flexibility
[Diagram labels: analytic tools (Data Mining, Math and Stats, Languages); users (Engineers, Data Scientists)]

Teradata and Hadoop Positioning
Teradata characteristics
> High performance analytics and complex joins
> High concurrency
> SQL (ANSI and ACID compliant)
> Advanced workload management
> High availability
> Data governance
> Emerging late binding
> One-stop support
> Security
Use cases
> Low $/TB
> Long-term raw data storage
> ETL offload
> Reporting
> Deep analytics
Hadoop characteristics
> Fast data landing and staging
> MapReduce, Hive, Pig
> Emerging SQL/SQL-like interfaces
> Batch-oriented processing
> Low workload concurrency
> Multi-structured and file-based data
> Late binding
> Open source community

TERADATA PORTFOLIO FOR HADOOP

Why are customers adopting Hadoop?
[Bar chart, axis from 0 to 1; bar values not recoverable]
> Cost containment on growing data
> No-ETL data loading
> Open source
> Flexibility
> Development community
> CEO read about it on plane

Hadoop coming to enterprises
> 202 customers surveyed
> Hadoop is increasingly being adopted by customers to handle non-traditional data and use cases
* Source: IDC's Red Hat Hadoop Usage Survey, August 2013

Challenges with today's Hadoop deployments
> Hadoop technologies lack enterprise features for availability, manageability, and supportability
> Hadoop resources (know-how, talent) are scarce; people are hard to find and retain; the learning curve is steep
> Tools for integrating existing and new data sources inside enterprises are unfamiliar or missing
> The Hadoop world is constantly evolving, with multiple players; the software lacks stability
Challenges are slowing down full production

Teradata Strategy for Hadoop
Market objectives
- Become #1 advisor to customers in design & implementation of Hadoop
- Provide complete Hadoop solutions (hardware, software, services & support) which leverage core Teradata IP and skills
Product: close Hadoop enterprise gaps & invest in strengths
- Provide enterprise-ready Hadoop offerings
  > Easier to deploy, manage, secure, monitor, and service as part of ecosystem
  > Business-friendly interfaces to our analytical platforms (SQL-H, Teradata Studio)
- Develop IP and partnerships in key Hadoop use cases (e.g., "data lake")
- Interoperability & connectors with popular distros (e.g., Sqoop w/ Cloudera)
GTM and ecosystem strategies
- Ride the hype of Hadoop; steer conversation to business value
- Promote Teradata Portfolio for Hadoop (products & services) but support other Hadoop distributions: provide customer choice
- Message Hadoop strengths in staging/ETL of Unified Data Architecture
- Engage community via Hortonworks: credibility & influence on product direction

TERADATA PORTFOLIO FOR HADOOP
Taking Hadoop from Silicon Valley to Main Street
Most trusted and flexible Hadoop platforms for your next-generation Unified Data Architecture
1. Teradata Aster Big Analytics Appliance
2. Teradata Appliance for Hadoop
3. Teradata Commodity Offering with Dell
4. Hortonworks Data Platform software-only support resell
Complete consulting and training capability
> Big Analytics Services across the UDA
> Data Integration Optimization: ETL, ELT across the UDA
> Hadoop deployment and mentoring
> Teradata delivering Hortonworks training
> Hadoop Managed Services: operations and administration
Customer Support for Hadoop
> World-class Teradata customer support, backed by Hortonworks

Introducing the Appliance for Hadoop
Teradata Appliance for Hadoop is enterprise class
> Landing area and data lake for raw files of any type
> Data refining engine: some transformations and simple math at scale
> Archival system for histories of data with low or unknown value
Teradata Enterprise Access for Hadoop
> Enables business users to easily access Hadoop data with standard SQL from within the Teradata Database and BI tools
> SQL-H provides on-the-fly access to data, leveraging HCatalog
> Teradata Studio w/ Smart Loader for Hadoop: ad hoc data movement
Best-of-breed technology partner value add
> Hortonworks engineering relationship: SQL-H, Viewpoint integration with Ambari, and high performance Hadoop nodes
> Protegrity, Informatica, Revelytix

Teradata Appliance for Hadoop Highlights
[Stack diagram; components as labeled]
> Teradata Vital Infrastructure
> Aster and Teradata SQL-H
> Teradata Studio with Smart Loader
> Value-added software from partners
> Teradata Viewpoint
> Teradata Connector for Hadoop (TDCH)
> Intelligent Start and Stop
> NameNode Failover
> Teradata Distribution for Hadoop (based on Hortonworks HDP)
> Optimized hardware for Hadoop
> BYNET V5 40GB/s InfiniBand interconnect

InfiniBand BYNET V5 Value-Add
Performance
> Automated network load balancing
> High speed interconnect
> Intra- and inter-system communication
High Availability
> Automated network failover
> Redundancy across two active fabrics
> Multiple level network isolation
Server Management (SM) Serviceability
> Delivers automated core addressing and naming services
> Gives Services org a holistic view of systems on the fabric
> Provides automated hardware monitoring
> Proactive phone home alerting
It's not just for performance!

Why Teradata Appliance for Hadoop?
Building a Hadoop cluster
> Multiple vendor relationships
> Procure, set up, install
> Updates: hardware, software
> Integration test, deploy
> Multiple consoles
Teradata
> One vendor, easier acquisition
> Quick to set up, plug 'n' play
> Elimination of integration complexity
> Predictable performance
> Single pane of glass management

Teradata Appliance for Hadoop: Teradata Open Distribution for Hadoop (TDH)
Core based on Hortonworks
> Leading Hadoop distribution
> Highest number of committers for Apache Hadoop
> Influence Hadoop roadmap via projects
Teradata enterprise-level components
> Value-added enterprise components
> Simplify Hadoop operations
> Intelligent Hadoop Builder

Teradata Appliance for Hadoop: Optimized Hadoop Infrastructure
Hardened, fine-tuned system
> Hardware and software tuned for high performance
> Preconfigured nodes in a ready-to-go box
BYNET connectivity
> High speed network for data transfer
> Automatic load balancing
> Network machine failover
Teradata Vital Infrastructure
> 24x7 proactive monitoring of components
> Automatic alert creation and notification
> Reduced incidents

Teradata Appliance for Hadoop: Enterprise Access for Hadoop
SQL access to Hadoop data
> On-the-fly SQL access from Teradata/Aster
> Give business analysts ANSI SQL
> 90+ prepackaged analytics
High speed connectors
> Teradata connectors for Hadoop (TDCH)
> Smart Loader functionality via Teradata Studio
> Drag/drop data across systems

Teradata Appliance for Hadoop: Enterprise Readiness
Availability
> Critical NameNode availability
> Automatic failover via BYNET
> Redundant network access via dual networks
Manageability
> Centralized management for multiple systems
> Integrated monitoring & reporting
> Configurable UI, metrics analysis
World-class TD support
> Industry-leading support from Teradata
> Connected to Teradata Vital Infrastructure
> Single vendor support for all platforms
TD Services for Hadoop
> Leadership in data architectures
> Applying best practices & process
> Years of expertise in serving large customer data environments

Teradata Hadoop for Active Data Archive
Active data archive for better data management
Situation
> High performance storage is expensive.
> A large integrated pharmacy HC provider deals with a variety of data of differing business value.
> Not all data can be stored on the same system, and ever-expanding data only adds to this challenge.
Problem
> Data in long-term storage cannot be queried and takes a long time to retrieve.
> No analysis can be performed on the archived data.
> The business value of this data is being lost.
Solution
> Used Teradata Hadoop nodes to store all incoming data: weblogs, medical data, JSON files.
> Hadoop also serves as an enrichment layer that enhances data for high-end analytics consumption.
> The complete solution provides easy movement of data between Hadoop, Aster and Teradata.
Impact
> Reduced storage costs for data variety
> Perform ad hoc analytics on multiple versions of data
> Retrieve data in minutes (vs. days with tape archives)
> Reduced load and improved performance of DW/databases

Active Data Archive
Different kinds of data exist in the customer's architecture
> Enterprise data, web logs, medical records
All of this data needs to be retained and queried
Current limitations:
> Storing all data in the enterprise databases is prohibitively expensive; not all data is enterprise data or has the same value
> Overloaded production and backup systems: only business and mandated data should be kept in enterprise DBs
> Current archival systems have very long restore times
> No ability to query or run ad hoc analytics on archived data

New Data Architecture with Teradata Hadoop Platform
[Architecture diagram]
> Enterprise data platforms (Oracle, Teradata): enterprise data < 1 year old; SQL operational queries from business analysts
> Sqoop connectors link the enterprise data platforms and the Hadoop platform
> Teradata Hadoop Platform: HDFS/MapReduce cluster (Node 1, Node 2, Node 3, Node N), HCatalog metadata layer, Hive query layer; Hive queries on data > 1 year old
> Unstructured data: weblogs, voice call recordings
> Visualization tools: Tableau, MicroStrategy, Excel, ODBC/JDBC
Enterprise data < 1 year old goes to high-end databases
Data > 1 year old and unstructured data go to Hadoop
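
The placement rule on this slide (enterprise data under one year old stays in the high-end databases; older data and unstructured data go to Hadoop) can be written as a small routing function. This is only an illustrative sketch: the function name `route`, the tier labels, the 365-day cutoff, and the choice to send data that is exactly one year old to Hadoop are assumptions, not part of the presentation.

```python
from datetime import date, timedelta

# Assumed cutoff: the slide says only "1 year"; 365 days is a simplification.
ONE_YEAR = timedelta(days=365)


def route(record_date, structured, today):
    """Pick a storage tier for a record using the slide's age rule.

    Hypothetical helper: unstructured data always goes to Hadoop;
    structured enterprise data goes to Hadoop once it is a year old.
    """
    if not structured:
        return "hadoop"
    age = today - record_date
    return "enterprise_db" if age < ONE_YEAR else "hadoop"


if __name__ == "__main__":
    today = date(2014, 6, 1)
    print(route(date(2014, 1, 15), True, today))   # recent enterprise data
    print(route(date(2012, 3, 1), True, today))    # old enterprise data
    print(route(date(2014, 5, 1), False, today))   # weblog or call recording
```

In practice the cutoff would be a policy setting, and the move itself would run through the connectors named on the slide (for example Sqoop).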

New Data Journey Realization: Business Value
Efficient use of higher-end systems
> Reduce network traffic, IO, CPU consumption
> Reduce load of traditional systems
> Ease of backup of enterprise data
> Integrated architecture for data management
Unlocking business value from historical data
> HiveQL queries to query Hadoop data
> SQL-H queries to query combined data
> Archived data access in minutes (vs. days)
Reduced spend on enterprise DBs
> Store and analyze the data where it belongs

TERADATA INTEGRATED BIG DATA PLATFORM 1700

INTEGRATED BIG DATA PLATFORM 1700

Paradigm Shift: One Platform for Many Uses
Extreme Data Appliance 1700: single-use platform
> Analytical archiving
> Low $/TB
> Large capacity
> Low / slow performance
> Low concurrency
> Configured for data capacity only
Integrated Big Data Platform 1700: multiple usages
> Contextual analytics
> Resource flexibility
> Always on
> Corporate memory
> Lower $/TB, comparable to Hadoop
> Configure based on nodes for higher performance across multiple workloads

One Platform, Many Uses
Uses: Contextual Analytics, Resource Flexibility, Always On, Corporate Memory
[Diagram labels: unrefined multi-structured data; current data; archival data; raw data; IDW data years 1-5; IDW data years 5-10; unrefined structured data]

Integrated Big Data Platform
Contextual Analytics
> Deep analytics
> Data Labs
> Data refinery
> Hadoop integration
Resource Flexibility
> Ad hoc projects
> Peak workload assist
Always On
> Disaster recovery
> High availability
Corporate Memory
> Archive reporting & retrieval
> Audit and compliance

Integrated Big Data Platform: Differentiation
Contextual Analytics (deep analytics, Data Labs, data refinery, Hadoop integration)
> Join big data to context in IDW
> Best big data SQL and performance
> Self-service sandboxes and Hadoop queries
> Push-down transformers
Resource Flexibility (ad hoc projects, peak workload assist)
> Easy query balancing across systems
> Workload management
Always On (disaster recovery, high availability)
> Automated failover
> Sync two systems with robust tools
> Query rerouting
Corporate Memory (archive reporting and retrieval, audit and compliance)
> Full security, trustworthy data
> Easy, self-service queries
> No spinning tapes, no programming

Integrated Big Data Platform: Value Added
Contextual Analytics (deep analytics, Data Labs, data refinery, Hadoop integration)
> QueryGrid and Smart Loader for Hadoop
> Data Labs self-service exploration
> Workload management for SLAs
Resource Flexibility (ad hoc projects, peak workload assist)
> Unity Director select query routing of apps and users
> Workload management surge controls
Always On (disaster recovery, high availability)
> Unity Loader for dual loading
> Unity Director for automatic failover
> Workload management for outage SLAs
Corporate Memory (archive reporting and retrieval, audit and compliance)
> Unity Director for query routing based on data depth

CONTEXTUAL ANALYTICS

Contextual Analytics: Deep Analytics
xDR analytics
> Analyze xDR and smart phone logs
> Calling patterns, fraud, usage patterns
Consumer sentiment analytics
> Brand and product likes/dislikes
Clickstream analytics
> Optimize website, digital spend, web site design
Sensor/machine analytics
> Proactive maintenance, provisioning
> Healthcare, telematics
> Utilities (water, electricity, etc.)
Location-based analytics
> Manage operations where they occur

Contextual Analytics: Data Refinery
Consider the 1700 when offloading ELT
Benefits
> Lower cost system
> Little to no ETL rewrite
> Continue using favorite transformation tools and scripts
> Reference data available for transformations
> Preserve security and access rights
> Teradata Unity automates data sync
Considerations
> SLAs for data availability on IDW
> System-to-system dependencies
> Available CPU resources on IDW
[Diagram: ELT offload; labels: Integrated Big Data Platform, Hadoop]

Handling Multi-structured Data with SQL
Store data objects in database
> Weblogs, JSON, XML, CSV, etc.
> VarChar, CLOB, or BLOB
Built-in functions
> Name-value pair functions
> String handlers, REGEX
> JSONpath operators
> XML and XQuery
Table operators
> Dynamic input schema, output schema
> Use C++/Java to unravel complex objects into columns
Late-binding flexibility
[Diagram: XML, JSON and weblogs stored in the Teradata Data Warehouse; sample weblog record as extracted: 41521390 2013-01- 0100:25:4 22.111.94. 18Mozilla/5.0(Macintos h; U; Intel]
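
To make "late binding" concrete: the raw object is stored as text, and structure is applied only when it is read. The sketch below imitates the three kinds of built-in functions the slide lists (name-value pairs, regular expressions, JSON path access) in plain Python. It does not show Teradata syntax; the helper names and the sample values are hypothetical and chosen only for illustration.

```python
import json
import re
from urllib.parse import parse_qsl

# Simple dotted-quad pattern; good enough for an illustration.
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def name_value_pairs(query):
    """Split 'a=1&b=2' into a dict, as a name-value pair function would."""
    return dict(parse_qsl(query))


def json_path(doc, path):
    """Follow a dotted path such as 'user.id' through a stored JSON string."""
    node = json.loads(doc)
    for key in path.split("."):
        node = node[key]
    return node


def first_ip(line):
    """Return the first IPv4-looking token in a raw log line, or None."""
    match = IP_RE.search(line)
    return match.group(0) if match else None


if __name__ == "__main__":
    # Hypothetical raw objects, stored as plain text until query time.
    raw_log = "1001 2013-01-01 00:25:42 10.0.0.18 Mozilla/5.0"
    raw_json = '{"user": {"id": 7, "plan": "basic"}}'
    print(first_ip(raw_log))
    print(json_path(raw_json, "user.plan"))
    print(name_value_pairs("sku=123&color=red"))
```

The same raw text can be re-read later with a different path or pattern, which is the flexibility the slide attributes to late binding.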

RESOURCE FLEXIBILITY

Resource Flexibility: Ad Hoc Projects
The executive request
> New inventory supplier
> Urgent marketing campaign
> Sales manager challenges numbers
> Marketing buys sample social media data
> "What if" projects
Fast reaction
> Fire disrupts supply chain
> Hurricane relief plan
> Major competitor action
Mergers and acquisitions

Resource Flexibility: Peak Workload Assist
Load balance prime time user activity
> Support subset of users
> Common during month end, quarter end, retail Mondays
Help meet batch SLAs
> Daily batch reports
> Month end, quarter end, CFO and sales summaries
Value-add enabling software
> Unity Director, Loader, Data Mover, Ecosystem Manager
> Workload management

ALWAYS ON

Always On: Disaster Recovery
Maintain all or a portion of the production IDW for use in a true disaster
> Unity Director, Unity Loader, Unity Data Mover, Unity Ecosystem Manager
Minimum necessary users and applications
> Keep the core business running
[Diagram label: Teradata Unity]

Always On: High Availability
Data warehouses are operational, mission-critical systems
> Continuous data access to end users
Planned maintenance of production warehouse
> Software updates
> Hardware upgrades
Unplanned outages
> Hardware or software failures hidden from users
> Reduces pressure on IT for system recovery

CORPORATE MEMORY

Corporate Memory: Archival Reporting and Retrieval
Archival reporting
> Marketing: revisit lost customers
> CFO: track fraud back further
> Manufacturing: compare parts cost trends
> Call center: find old warranties, call logs
Self-service query and reporting
> Long-term trends
> Ad hoc historical questions
> Small tactical look-ups
Reduced dependency on tape files
Keep 5-10+ years of history
Audit and government demands
> Financial security and trust
> Equal opportunity employment
> Fair lending practices
> Tax audit (ugh)
Common requirements
> 5-10 years of data storage
> Fast report turnaround
> Trusted data
> Secure environment
> Self-service queries

Performance Comparison
Benchmark to sort 1TB of data in 1 minute
> This is a very basic sort benchmark; under typical usage (concurrency, joins, mixed workloads) Teradata will do even better
> Hadoop requires ~8x the number of servers to sort 1TB of data in 1 minute: 206 Hadoop nodes vs. 27 nodes on the 1700
Another customer test shows even more impressive results
> Query took 1 second on the 1700 vs. 20 minutes on Hadoop
> 90 Hadoop nodes vs. 6 nodes on the 1700 (~15x servers)
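
The rounded multipliers on this slide can be checked from the node counts it quotes. The short calculation below assumes the reading that Hadoop used 206 nodes and the 1700 used 27 in the sort benchmark, which matches the "~8x servers" label; the function names are illustrative only.

```python
def node_ratio(hadoop_nodes, td_nodes):
    """How many Hadoop nodes were used per 1700 node."""
    return hadoop_nodes / td_nodes


def speedup(slow_seconds, fast_seconds):
    """Ratio of elapsed times for the same query."""
    return slow_seconds / fast_seconds


if __name__ == "__main__":
    # Sort benchmark: 206 Hadoop nodes vs. 27 nodes on the 1700 (assumed reading).
    print(round(node_ratio(206, 27), 2))   # a little under 8, rounded up on the slide
    # Customer test: 90 Hadoop nodes vs. 6 nodes on the 1700.
    print(node_ratio(90, 6))
    # Same customer test: 20 minutes vs. 1 second.
    print(speedup(20 * 60, 1))
```

Note that the customer test combines both effects: the 1700 used one fifteenth of the servers and also returned the query about 1,200 times faster, according to the figures quoted.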

CUSTOMER SUCCESSES

[Customer architecture diagram]
> Use cases: A/B testing; contextual analytics (join behavior to IDW data); digital investment optimization; Hadoop integration; archive reporting and retrieval; dual load; peak workload assist
> IDW: 10PB, structured analytics
> Singularity: 36PB, weblogs, IDW copy
> Hadoop: 50PB, bot detection, images
> Other labels: Load refine data; Join for image; Analyze & Report; Discover & Explore

Large US Credit Card Company
> Deep history queries
> Compliance queries
> BAR / DR
> Future plan for data load
[Diagram: Unity Director and Unity Loader; ~350TB / 10 nodes at each site; regulatory queries, deep history, IDW copy (BAR)]

When to Use Which? You Have Many Choices

                          6700    Hadoop         Aster      1700
Interactive queries       X       Evolving       X          X
Predictive analytics      In-DB   Programmatic   SQL-MR     In-DB
Interactive performance   high    low-med        med-high   med
Data governance           high    Evolving       med-low    high
Interactive tools         All     Few            All        All

Rows whose column alignment was lost in extraction (marks as extracted):
> Structured data: X X X
> Multi-structured: X X; JSON, XML, weblog
> ETL, Statistics: X X X X
> MapReduce: X X
> Graph: X X
> N-Path: X

THANK YOU TO OUR TUG SPONSOR
> Trusted supplier to major OEMs for 30 years
> Joint engineering with Teradata
> Fully integrated with Teradata nodes and Database
New technology
> Chromium FX RAID controllers which support 5.2 Gb/s SAS 2.0
> Inde EcoStor technology eliminates the need for cache batteries

BACKUP SLIDES

When to Use Which?
Workload
> Teradata Appliance for Hadoop: Batch processing of data at scale. Improving capabilities to support a small number of interactive users
> Integrated Big Data Platform 1700: Hundreds of concurrent users performing interactive analytics. Batch processing
Schema
> Teradata Appliance for Hadoop: Typically schema can be defined after data is stored
> Integrated Big Data Platform 1700: Typically schema is defined before data is stored (native with JSON, XML and weblog)
Scale
> Teradata Appliance for Hadoop: Can scale to large data volumes at low cost
> Integrated Big Data Platform 1700: Can scale to large data volumes at moderate or significant cost
Access methods
> Teradata Appliance for Hadoop: Data accessed through programs created by developers, SQL-like systems, and other methods
> Integrated Big Data Platform 1700: Data accessed through SQL and BI tools
SQL
> Teradata Appliance for Hadoop: Flexible programming, evolving SQL
> Integrated Big Data Platform 1700: ANSI SQL
Data
> Teradata Appliance for Hadoop: Raw
> Integrated Big Data Platform 1700: Cleansed (ETL) (native with JSON, XML and weblog)
Access
> Teradata Appliance for Hadoop: Scans
> Integrated Big Data Platform 1700: Seeks
Complexity
> Teradata Appliance for Hadoop: Complex processing
> Integrated Big Data Platform 1700: Complex joins
Cost/Efficiency
> Teradata Appliance for Hadoop: Low cost of storage and processing. Executes on tens to thousands of servers
> Integrated Big Data Platform 1700: Efficient use of CPU/IO. Very fast response times
Benefits
> Teradata Appliance for Hadoop: Parallelization of traditional programming languages (Java, C++, Python, Perl, etc.). Supports higher-level programming frameworks such as Pig and HiveQL. Radically changes the economic model for storing high volumes of data
> Integrated Big Data Platform 1700: Easy to consume data. Rationalization of data from multiple sources into a single enterprise view. Clean, safe, secure data. Cross-functional analysis. Transform once, use many

Comparing Data Platform Configurations
Nodes (full rack)
> Teradata Appliance for Hadoop: 18 MPP nodes/cabinet
> Integrated Big Data Platform 1700: 1+1, 2+1, 3+0 MPP nodes/cabinet
Node CPU
> Teradata Appliance for Hadoop: Master (Qty. 2): dual 8-core Intel Xeon @2.60GHz; Data (Qty. 16): dual 6-core Intel Xeon @2.0GHz
> Integrated Big Data Platform 1700: Dual 8-core Intel Xeon @2.60GHz
Storage
> Teradata Appliance for Hadoop: 192 3TB HDDs/cabinet
> Integrated Big Data Platform 1700: 168 3TB HDDs/cabinet (+6 global hot spares)
Total user data capacity
> Teradata Appliance for Hadoop: 152TB/cabinet (9.5 TB/data node uncompressed)
> Integrated Big Data Platform 1700: 229TB/cabinet (114 TB/node uncompressed)
Memory
> Teradata Appliance for Hadoop: 256GB per master node, 128GB per data node
> Integrated Big Data Platform 1700: Up to 512GB per node
Management, troubleshooting and support
> Both: Teradata Vital Infrastructure, Teradata Viewpoint, single source software and hardware support
Availability
> Teradata Appliance for Hadoop: Software data replication
> Integrated Big Data Platform 1700: Hot standby node available, global hot spare drives
Interconnect
> Both: 40GB InfiniBand
OS
> Both: SUSE Linux 11
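
The capacity figures in this comparison can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted on the slide; the helper names are illustrative. The slide does not itemize how raw disk becomes user capacity, so the "usable fraction" here is just a ratio of the quoted figures and not a statement about replication or RAID settings.

```python
def per_node_tb(user_tb, nodes):
    """User data capacity per node, in TB."""
    return user_tb / nodes


def usable_fraction(user_tb, drives, drive_tb):
    """Quoted user capacity divided by raw disk capacity in the cabinet."""
    return user_tb / (drives * drive_tb)


if __name__ == "__main__":
    # Teradata Appliance for Hadoop: 152 TB over 16 data nodes, 192 x 3 TB drives.
    print(per_node_tb(152, 16))
    print(round(usable_fraction(152, 192, 3), 3))
    # Integrated Big Data Platform 1700: 229 TB, 168 x 3 TB drives.
    print(round(usable_fraction(229, 168, 3), 3))
    # 229 TB per cabinet at 114 TB per node implies about two active nodes.
    print(round(229 / 114, 2))
```

On these figures the Hadoop cabinet matches its quoted 9.5 TB per data node, and the 1700 exposes a larger share of its raw disk as user capacity than the Hadoop cabinet does.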