HIGH PERFORMANCE ANALYTICS FOR TERADATA




BORN AND BRED IN FINANCIAL SERVICES AND HEALTHCARE.
DECADES OF EXPERIENCE IN PARALLEL PROGRAMMING AND ANALYTICS.
FOCUSED ON MAKING DATA SCIENCE HIGHLY PERFORMING AND ACCESSIBLE.

AUGMENT BUSINESS INTELLIGENCE AND ANALYSIS
ACCELERATE ANALYTIC PROCESSES AND DATA SCIENCE
ADVANCE BIG MATH AND BIG DATA

SQL, SPSS, R, RDBMS, SAS, Python, UNIX, MPP, MATLAB, Clustering, Excel, GPU, Regression, Decision Trees, Data Mining, Machine Learning, EDW, Hadoop

Break the Bonds of Traditional Analytics
Big Math meets Big Data to solve your analytics problems
Analyze your entire data set: no more data sampling required
Exceed your Service Level Agreements: unmatched, parallel, in-database performance
Bring predictive power to the masses: on-demand analytics with no user licenses
Accelerate existing analytic procedures: SAS, R, SPSS, MATLAB, etc.
Integrate with any existing interface: SAS, R, Excel, MicroStrategy, Tableau, Business Objects, Cognos, mobile applications, etc.


Business Solutions with In-Database Analytics
Industries: Investment & Commercial Banking; Retail Banking; Media/Telecom; Retail; Manufacturing; Health & Life Sciences; Insurance
Use cases (in slide order): Portfolio Management; Market Risk Management; Credit Risk Management (Credit Card, Mortgage); Wallet Share Analysis; Customer Churn; Customer Lifetime Value; Demand Forecasting; Inventory Optimization; Demand Forecasting; Inventory Optimization; Predictive Modeling of Chronic Illness; Adverse Reaction Analysis; Property & Casualty Loss Estimation; Risk Management; Credit Risk Management; Campaign Management; Packaging of Programming Channels; Market Basket Analysis; Root Cause Analysis of Defects; Provider Scoring; Pricing & Risk Models; Pricing; Sales & Marketing; Revenue Optimization of Pay Per View Movies; Customer Segmentation; Yield Optimization; Pharmaceutical Benefits Analysis; Marketing Analytics; Equity Analysis; Tick Data Analysis; Compliance; Movie Recommendation Engine; Product Promotion; Product Recommendation Engine; Drug Trial Simulation; Catastrophe Modeling

Full Platform Support
All 679 in-database functions are certified on:
Teradata (1700) Extreme Data Appliance
Teradata (2700) Data Warehouse Appliance
Teradata (6700) Active Enterprise Data Warehouse
Teradata Aster Big Analytics Appliance
Teradata Software Versions 13.10, 14.0, and 14.10+
Aster Software Version 6.1+

Fuzzy Logix Teradata Integration
[Diagram labels: Disk Array, VPROCs (AMP & PE), BYNET]
Fuzzy Logix functions are integrated at the lowest possible level in order to complement and exploit the efficiencies in the Teradata architecture by:
> Reducing data movement between the AMPs and between the Teradata server and clients
> Running functions in the database process, avoiding any interprocess communication and memory space duplication
The Fuzzy Logix implementation spans the following types of functions:
> C++ External Stored Procedures
> C++ User Defined Functions: Scalar Functions, Aggregate Functions, Table Functions
The implementation choice is tailored for each function.
Functions are accessible via SQL, making them pervasive and non-intrusive.
Any client supporting the SQL interface (even via ODBC/JDBC) can access the functions.
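The difference between scalar and aggregate user-defined functions, and the point that any SQL client can call them once they are registered, can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for the database; this is an assumption made purely for illustration and is not Teradata or the Fuzzy Logix library. The function names log_return and geo_mean and the prices table are hypothetical.

```python
import math
import sqlite3


def log_return(price_now, price_prev):
    """Scalar function: evaluated once per row."""
    return math.log(price_now / price_prev)


class GeometricMean:
    """Aggregate function: folds many rows into a single value."""

    def __init__(self):
        self.log_sum = 0.0
        self.count = 0

    def step(self, value):
        # Called once for every row in the group.
        self.log_sum += math.log(value)
        self.count += 1

    def finalize(self):
        # Called once per group after the last row.
        return math.exp(self.log_sum / self.count) if self.count else None


# SQLite stands in for the warehouse here (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.create_function("log_return", 2, log_return)
conn.create_aggregate("geo_mean", 1, GeometricMean)

conn.execute("CREATE TABLE prices (symbol TEXT, price_prev REAL, price_now REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [("AAA", 100.0, 110.0), ("AAA", 110.0, 99.0), ("BBB", 50.0, 50.0)],
)

# Once registered, the functions are called like any built-in SQL function.
scalar_rows = conn.execute(
    "SELECT symbol, log_return(price_now, price_prev) FROM prices ORDER BY rowid"
).fetchall()
aggregate_rows = conn.execute(
    "SELECT symbol, geo_mean(price_now) FROM prices GROUP BY symbol ORDER BY symbol"
).fetchall()
```

The computation runs inside the database engine's process and only the results cross to the client, which is the property the slide emphasizes.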

TERADATA UNIFIED DATA ARCHITECTURE
[Diagram labels]
Sources: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
Platforms: Integrated Data Warehouse, Data Platform, Discovery Platform; Teradata Database (1700), Teradata Database (2700, 6700), Hadoop (Hortonworks), Teradata Aster Database
Governance & integration: Viewpoint, TVI, MDM, Connectors, Unity, SQL-H, Studio
Analytic tools: Applications, Business Intelligence, Data Mining, Math and Stats, Languages
Users: Marketing, Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Data Scientists, Business Analysts

In-Database Analytics - Example


HOW IT WORKS TODAY
[Diagram labels: Analytic Tools (SAS), Analysis Server, Database, Data Warehouse; Lists, Data, Intermediate Model Scores, Metadata, Predictions. Steps: Extract, Select, Synthesize, Classify, Cluster, Load, Trend, Chart, Visualize, Validate, Act, Decide.]

LOADING AND UNLOADING DATA TAKES A LONG TIME.
ANALYTIC RESULTS SEGREGATED BY PLATFORM.
PREDICTIONS ARE RETURNED IN BATCH-TIME.
THE ANALYSIS SERVER IS HEAVILY RESOURCE-CONSTRAINED.

Eliminates data movement. Results are universally accessible. Predictions available in real-time. Maximizes use of resources.
[Diagram: same labels as the "How it works today" diagram]

DATA MOVEMENT IS ELIMINATED.
THE RESULTS OF ANALYTICS ARE UNIVERSALLY ACCESSIBLE.
PREDICTIONS ARE AVAILABLE IN REAL-TIME TO A BROAD AUDIENCE.
RESOURCES ARE MAXIMIZED ACROSS THE ORGANIZATION.

Analytics Growth Options
SAS multi-tier environment pulling data from Oracle: slow and expensive
Implement only Teradata (replacing Oracle): 10x faster
Modify the SAS code to run SQL in the database (aggregation, summation, data manipulation): 20x faster
Modify (replace or augment) the SAS code with in-database analytics: 100x to 1000x faster

Financial Services POC Results for 100,000 Linear Regressions
Legacy environment: 20 hours
Revolution R (in database): 50 minutes
Fuzzy Logix DB Lytix: 33 seconds
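One reason many small regressions suit in-database execution is that a simple linear regression reduces to a handful of per-group sums, which a database can compute in one set-based pass. The sketch below shows that idea with plain SQL aggregates. It is only an illustration under stated assumptions: SQLite (via Python's sqlite3 module) stands in for the warehouse, the obs table and its columns are hypothetical, and this is not a description of how DB Lytix implements regression.

```python
import sqlite3

# SQLite stands in for the warehouse (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (grp INTEGER, x REAL, y REAL)")

# Two hypothetical groups with known relationships:
# group 1 follows y = 2x + 1, group 2 follows y = -0.5x + 4.
rows = [(1, x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
rows += [(2, x, -0.5 * x + 4.0) for x in (0.0, 2.0, 4.0, 6.0)]
conn.executemany("INSERT INTO obs VALUES (?, ?, ?)", rows)

# One query fits every group at once using ordinary least squares:
#   slope     = (n*Sxy - Sx*Sy) / (n*Sxx - Sx^2)
#   intercept = (Sy - slope*Sx) / n
query = """
SELECT grp,
       (n * sxy - sx * sy) / (n * sxx - sx * sx) AS slope,
       (sy - ((n * sxy - sx * sy) / (n * sxx - sx * sx)) * sx) / n AS intercept
FROM (
    SELECT grp,
           COUNT(*)   AS n,
           SUM(x)     AS sx,
           SUM(y)     AS sy,
           SUM(x * x) AS sxx,
           SUM(x * y) AS sxy
    FROM obs
    GROUP BY grp
)
ORDER BY grp
"""
fits = conn.execute(query).fetchall()  # one (grp, slope, intercept) row per group
```

Only one row of coefficients per group leaves the database, regardless of how many observations each group contains.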

Benchmarks
Pharma: Drug simulation (MatchIt Poisson simulation), 200,000 observations. Before: R, 5 hours. After: Fuzzy Logix, 3 minutes.
Pharma: Drug simulation (MatchIt Poisson simulation), 1,200,000 observations. Before: R, not possible. After: Fuzzy Logix, 5 minutes.
Retail: Market basket analysis for the largest retailer in America, 486 billion rows. Before: SAS, 20 hours. After: SAS + Fuzzy Logix, 2 hours.
Retail: Marketing co-movement and scoring models. Before: SAS, 4 hours. After: SAS + Fuzzy Logix, 17 minutes.
Retail: Demand forecast for 300 stores and 3000 product categories. Before: MATLAB, 5 days. After: Fuzzy Logix, 46 minutes.
Healthcare: Provider scoring for one of the largest insurers in America. Before: SAS/Oracle, 25 jobs and 6 weeks. After: Fuzzy Logix, 1 job in 4 minutes.
Healthcare: Preventative medicine, 500 variables, 25+ million rows (large regression, sparse matrix). Before: not possible. After: Fuzzy Logix, 3 minutes.
Media: Large cable and internet provider customer analytics (regressions). Before: SAS, 10 hours. After: SAS + Fuzzy Logix, 10 minutes.
Banking: Value at risk for equity options, 2.5 billion simulations. Before: N/A. After: Fuzzy Logix, 3 minutes.
Manufacturing: Warranty analysis for 15,000 cars and 1,200 variables. Before: SIMCA, 24 hours. After: Fuzzy Logix, 6 minutes.
Manufacturing: Warranty analysis for 250,000 cars and 1,200 variables. Before: SIMCA, not possible. After: Fuzzy Logix, 54 minutes.

Gilead: Performance Benchmarks
Pharmaceutical Research
Scientific computation used for drug research
Identify hypotheses, create cohorts, test hypotheses on cohorts with statistical analysis
Computations include matching recipients between two treatment groups, Poisson regression and Monte Carlo simulations
Critical for FDA approval
[Figure: Performance Benchmark]

Disease Prediction & Translational Medicine
Predictive Healthcare
Predict future health episodes based on existing conditions
Statistical analysis with sparse matrices
Not possible with traditional approach
Built predictive models in minutes
Analyze 25 million lives and 500 disease code variables in less than 2 minutes
Functions Used: Hypothesis Testing, Logistic Regression, Weighted Logistic Regression, Stepwise Logistic Regression

Retail Inventory Optimization
Major Retailer: Forecasting model for 300 stores and 3000 product categories
Current Situation: Takes 3-5 days with conventional analytics
Teradata + DB Lytix:
Data preparation takes 10-15 minutes
Stepwise regression for 300 stores and 3000 product categories takes 30 minutes
Scoring for 300 stores and 3000 product categories is performed in less than 1 minute

Warranty & Repair Analytics
Warranty Data Analysis for Automobile Manufacturer
Current Situation: Takes 10-12 hours for data preparation, another 10-12 hours for analysis
Teradata + DB Lytix: Orthogonal PLS Benchmarks

Credit Risk Management
Customer Default & Payment Prediction
Identify credit card customers who may default
Predict the payment amount of customers who underpay
Identify customers who make significantly high payments, to target them for acquiring other products
54 billion rows processed
Functions Used: Backward Logistic Regression, Decision Tree

Compliance
Internal Rate of Return (IRR) Calculation
A wealth management company wants to calculate the IRR for each customer's portfolio
Using a traditional analytic platform, the process takes one day
Today's solution with Fuzzy Logix: 10 billion rows, 10 million portfolios
Entire process takes 7 minutes
Functions Used: Fin Lytix Fixed Income Mathematics, NPV algorithms
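For readers unfamiliar with the calculation: the IRR of a portfolio is the discount rate at which the net present value (NPV) of its cash flows is zero. The sketch below finds that rate by bisection for a few hypothetical portfolios. The bisection method, the function names, the rate bracket and the sample cash flows are all assumptions for illustration; the slide does not say which algorithm Fin Lytix uses.

```python
def npv(rate, cash_flows):
    """Net present value of cash flows at periods 0, 1, 2, ... for a given rate."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tol=1e-10, max_iter=200):
    """Rate at which NPV is zero, found by bisection on [low, high].

    The bracket is an assumed default; it must contain a sign change of NPV.
    """
    f_low, f_high = npv(low, cash_flows), npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign on the given bracket")
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol or (high - low) / 2.0 < tol:
            return mid
        if f_low * f_mid < 0:
            high = mid                 # root lies in the lower half
        else:
            low, f_low = mid, f_mid    # root lies in the upper half
    return (low + high) / 2.0


# Hypothetical portfolios: an initial outlay followed by later inflows.
portfolios = {
    "P1": [-1000.0, 1100.0],
    "P2": [-1000.0, 0.0, 1210.0],
    "P3": [-500.0, 300.0, 300.0],
}
irr_by_portfolio = {name: irr(flows) for name, flows in portfolios.items()}
```

Each portfolio is solved independently of the others, so the work divides naturally across portfolios.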

VWAP: 23 Million Trades to a Wireless iPad
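VWAP (volume-weighted average price) is the total traded value divided by the total traded volume. A minimal sketch of the calculation per symbol follows; the trade tuples and the function name are hypothetical and are not taken from the demo on this slide.

```python
from collections import defaultdict


def vwap_by_symbol(trades):
    """Return {symbol: VWAP} for an iterable of (symbol, price, quantity) trades."""
    notional = defaultdict(float)  # running sum of price * quantity
    volume = defaultdict(float)    # running sum of quantity
    for symbol, price, quantity in trades:
        notional[symbol] += price * quantity
        volume[symbol] += quantity
    # Skip symbols with zero total volume to avoid dividing by zero.
    return {s: notional[s] / volume[s] for s in notional if volume[s] > 0}


# Hypothetical trades: (symbol, price, quantity)
trades = [
    ("XYZ", 10.0, 100),
    ("XYZ", 11.0, 300),
    ("ABC", 20.0, 50),
]
vwap = vwap_by_symbol(trades)
```

Because VWAP is built from two running sums, it can be computed as a grouped aggregate over the trade table, with only one value per symbol sent to the client device.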
