HIGH PERFORMANCE ANALYTICS FOR TERADATA
BORN AND BRED IN FINANCIAL SERVICES AND HEALTHCARE. DECADES OF EXPERIENCE IN PARALLEL PROGRAMMING AND ANALYTICS. FOCUSED ON MAKING DATA SCIENCE HIGHLY PERFORMING AND ACCESSIBLE.
AUGMENT BUSINESS INTELLIGENCE AND ANALYSIS. ACCELERATE ANALYTIC PROCESSES AND DATA SCIENCE. ADVANCE BIG MATH AND BIG DATA.
SQL, SPSS, R, RDBMS, SAS, Python, UNIX, MPP, Matlab, Clustering, Excel, GPU, Regression, Decision Trees, Data Mining, Machine Learning, EDW, Hadoop
Break the Bonds of Traditional Analytics
Big Math meets Big Data to solve your analytics problems
> Analyze your entire data set: no more data sampling required
> Exceed your Service Level Agreements: unmatched, parallel, in-database performance
> Bring predictive power to the masses: on-demand analytics with no user licenses
> Accelerate existing analytic procedures: SAS, R, SPSS, MatLab, etc.
> Integrate with any existing interface: SAS, R, Excel, MicroStrategy, Tableau, Business Objects, Cognos, mobile applications, etc.
Business Solutions with In-Database Analytics
Industries: Investment & Commercial Banking; Retail Banking; Media/Telecom; Retail; Manufacturing; Health & Life Sciences; Insurance
Use cases (in slide order): Portfolio Management; Market Risk Management; Credit Risk Management (Credit Card, Mortgage); Wallet Share Analysis; Customer Churn; Customer Lifetime Value; Demand Forecasting; Inventory Optimization; Demand Forecasting; Inventory Optimization; Predictive Modeling of Chronic Illness; Adverse Reaction Analysis; Property & Casualty Loss Estimation; Risk Management; Credit Risk Management; Campaign Management; Packaging of Programming Channels; Market Basket Analysis; Root Cause Analysis of Defects; Provider Scoring; Pricing & Risk Models; Pricing; Sales & Marketing; Revenue Optimization of Pay Per View Movies; Customer Segmentation; Yield Optimization; Pharmaceutical Benefits Analysis; Marketing Analytics; Equity Analysis; Tick Data Analysis; Compliance; Movie Recommendation Engine; Product Promotion; Product Recommendation Engine; Drug Trial Simulation; Catastrophe Modeling
Full Platform Support
All 679 in-database functions are certified on:
> Teradata (1700) Extreme Data Appliance
> Teradata (2700) Data Warehouse Appliance
> Teradata (6700) Active Enterprise Data Warehouse
> Teradata Aster Big Analytics Appliance
> Teradata Software Versions 13.10, 14.0, & 14.10+
> Aster Software Version 6.1+
Fuzzy Logix Teradata Integration
[Diagram: Teradata architecture showing BYNET, VPROCs (AMP & PE), and disk arrays]
Fuzzy Logix functions are integrated at the lowest possible level in order to complement and exploit the efficiencies in the Teradata architecture by:
> Reducing data movement between the AMPs and between the Teradata Server and clients
> Running functions in the database process, avoiding any interprocess communication and memory space duplication
The Fuzzy Logix implementation spans the following types of functions:
> C++ External Stored Procedures
> C++ User Defined Functions: Scalar Functions, Aggregate Functions, Table Functions
The implementation choice is tailored for each function.
Functions are accessible via the SQL language, making them pervasive and non-intrusive.
Any client supporting the SQL interface (even via ODBC/JDBC) can access the functions.
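The idea of an aggregate function that runs inside the database process and is called through plain SQL can be sketched with a small stand-in. The example below is only an analogy: it uses Python's built-in sqlite3 module rather than Teradata, and the function name ols_slope, the table sales, and its columns are hypothetical, not Fuzzy Logix or Teradata identifiers. It registers an aggregate that computes a least-squares slope from running sums, then calls it per group from SQL so that only one result row per group leaves the engine.

```python
import sqlite3


class SlopeAgg:
    """Hypothetical aggregate UDF: least-squares slope of y on x."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def step(self, x, y):
        # Called once per row by the database engine.
        if x is None or y is None:
            return
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def finalize(self):
        # Called once per group; returns NULL when the slope is undefined.
        denom = self.n * self.sxx - self.sx * self.sx
        if self.n < 2 or denom == 0:
            return None
        return (self.n * self.sxy - self.sx * self.sy) / denom


conn = sqlite3.connect(":memory:")
conn.create_aggregate("ols_slope", 2, SlopeAgg)  # name is an assumption
conn.execute("CREATE TABLE sales (store TEXT, week REAL, units REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", 1, 10), ("A", 2, 12), ("A", 3, 14),
     ("B", 1, 30), ("B", 2, 27), ("B", 3, 24)],
)

# Any SQL client can now call the function; raw rows never leave the engine.
rows = conn.execute(
    "SELECT store, ols_slope(week, units) FROM sales "
    "GROUP BY store ORDER BY store"
).fetchall()
slopes = dict(rows)
print(slopes)
```

In this sketch the per-row work happens in step and the per-group result in finalize, which mirrors the general shape of an aggregate UDF; how the actual C++ functions are structured is not described in the slides.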
TERADATA UNIFIED DATA ARCHITECTURE
[Diagram: Teradata Unified Data Architecture, with the following labels]
Sources: ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social
Platforms: Integrated Data Warehouse, Data Platform, Discovery Platform; Teradata Database (1700), Teradata Database (2700, 6700), Teradata Aster Database, Hadoop (Hortonworks)
Governance & integration: Viewpoint, TVI, MDM, connectors, Unity, SQL-H, Studio
Analytic tools: Marketing, Applications, Business Intelligence, Data Mining, Math and Stats, Languages
Users: Marketing, Executives, Operational Systems, Frontline Workers, Customers, Partners, Engineers, Data Scientists, Business Analysts
In-Database Analytics - Example
HOW IT WORKS TODAY
[Diagram: analytic tools (SAS), an analysis server, and the data warehouse. Data labels: lists, data, intermediate, model, scores, database, metadata, predictions. Process steps: extract, select, synthesize, classify, cluster, load, trend, chart, visualize, validate, act, decide.]
LOADING AND UNLOADING DATA TAKES A LONG TIME. ANALYTIC RESULTS SEGREGATED BY PLATFORM. PREDICTIONS ARE RETURNED IN BATCH-TIME. THE ANALYSIS SERVER IS HEAVILY RESOURCE-CONSTRAINED.
Eliminates data movement. Results are universally accessible. Predictions available in real-time. Maximizes use of resources.
[Diagram: the same components, data labels, and process steps as on the previous slide, shown for the in-database approach]
DATA MOVEMENT IS ELIMINATED. THE RESULTS OF ANALYTICS ARE UNIVERSALLY ACCESSIBLE. PREDICTIONS ARE AVAILABLE IN REAL-TIME TO A BROAD AUDIENCE. RESOURCES ARE MAXIMIZED ACROSS THE ORGANIZATION.
Analytics Growth Options
> SAS multi-tier environment pulling data from Oracle: slow & expensive
> Implement only Teradata (replacing Oracle): 10x faster
> Modify the SAS code to run SQL in the database (aggregation, summation, data manipulation): 20x faster
> Modify (replace or augment) the SAS code with In-Database Analytics: 100x to 1000x faster
Financial Services POC Results for 100,000 Linear Regressions
> Legacy Environment: 20 hours
> Revolution R (in database): 50 minutes
> Fuzzy Logix DB Lytix: 33 seconds
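The POC timings can be turned into speedup factors with simple arithmetic. The short sketch below uses only the three figures quoted for the 100,000 linear regressions; the variable and function names are illustrative. It works out to roughly 2,200x against the legacy environment, about 91x against Revolution R, and about 0.33 milliseconds per regression.

```python
# Timings quoted for the 100,000-regression POC, converted to seconds.
timings_s = {
    "Legacy Environment": 20 * 3600,
    "Revolution R (in database)": 50 * 60,
    "Fuzzy Logix DB Lytix": 33,
}


def speedup(baseline: str, candidate: str) -> float:
    """How many times faster the candidate is than the baseline."""
    return timings_s[baseline] / timings_s[candidate]


legacy_vs_dblytix = speedup("Legacy Environment", "Fuzzy Logix DB Lytix")
r_vs_dblytix = speedup("Revolution R (in database)", "Fuzzy Logix DB Lytix")

# Average wall-clock time per regression, in milliseconds.
per_regression_ms = timings_s["Fuzzy Logix DB Lytix"] / 100_000 * 1000

print(f"vs legacy: {legacy_vs_dblytix:.0f}x")
print(f"vs Revolution R: {r_vs_dblytix:.0f}x")
print(f"per regression: {per_regression_ms:.2f} ms")
```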
Benchmarks
Use case | Legacy approach | With Fuzzy Logix
Pharma: Drug Simulation (matchit poisson simulation), 200,000 observations | R: 5 hours | Fuzzy Logix: 3 minutes
Pharma: Drug Simulation (matchit poisson simulation), 1,200,000 observations | R: not possible | Fuzzy Logix: 5 minutes
Retail: Market basket analysis for the largest retailer in America, 486 billion rows | SAS: 20 hours | SAS + Fuzzy Logix: 2 hours
Retail: Marketing co-movement and scoring models | SAS: 4 hours | SAS + Fuzzy Logix: 17 minutes
Retail: Demand forecast for 300 stores and 3000 product categories | MatLab: 5 days | Fuzzy Logix: 46 minutes
Healthcare: Provider scoring for one of the largest insurers in America | SAS/Oracle: 25 jobs and 6 weeks | Fuzzy Logix: 1 job in 4 minutes
Healthcare: Preventative medicine, 500 variables, 25+ million rows (large regression, sparse matrix) | Not possible | Fuzzy Logix: 3 minutes
Media: Large cable and internet provider customer analytics (regressions) | SAS: 10 hours | SAS + Fuzzy Logix: 10 minutes
Banking: Value at risk for equity options, 2.5 billion simulations | N/A | Fuzzy Logix: 3 minutes
Manufacturing: Warranty analysis for 15,000 cars and 1,200 variables | SIMCA: 24 hours | Fuzzy Logix: 6 minutes
Manufacturing: Warranty analysis for 250,000 cars and 1,200 variables | SIMCA: not possible | Fuzzy Logix: 54 minutes
Gilead: Performance Benchmarks
Pharmaceutical Research
> Scientific computation used for drug research
> Identify hypotheses, create cohorts, test hypotheses on cohorts with statistical analysis
> Computations include matching recipients between two treatment groups, Poisson Regression and Monte Carlo Simulations
> Critical for FDA approval
[Chart: Performance Benchmark]
Disease Prediction & Translational Medicine
Predictive Healthcare
> Predict future health episodes based on existing conditions
> Statistical analysis with sparse matrices
> Not possible with traditional approach
> Built predictive models in minutes
> Analyze 25 million lives & 500 disease code variables in less than 2 minutes
Functions Used: Hypothesis Testing, Logistic Regression, Weighted Logistic Regression, Stepwise Logistic Regression
Retail Inventory Optimization
Major Retailer: Forecasting model for 300 stores and 3000 product categories
Current Situation: Takes 3-5 days with conventional analytics
Teradata + DB Lytix:
> Data preparation takes 10-15 minutes
> Stepwise Regression for 300 stores and 3000 product categories takes 30 minutes
> Scoring for 300 stores and 3000 product categories is performed in less than 1 minute
Warranty & Repair Analytics
Warranty Data Analysis for Automobile Manufacturer
Current Situation: Takes 10-12 hours for data preparation, another 10-12 hours for analysis
Teradata + DB Lytix: Orthogonal PLS Benchmarks
Credit Risk Management
Customer Default & Payment Prediction
> Identify credit card customers who may default
> Predict the payment amount of customers who underpay
> Identify customers who make significantly high payments, to target them for other products
> 54 billion rows processed
Functions Used: Backward Logistic Regression, Decision Tree
Compliance
Internal Rate of Return (IRR) Calculation
> A wealth management company wants to calculate the IRR for each customer's portfolio
> Using a traditional analytic platform, the process takes one day
Today's Solution with Fuzzy Logix:
> 10 billion rows
> 10 million portfolios
> Entire process takes 7 minutes
Functions Used: Fin Lytix Fixed Income Mathematics, NPV algorithms
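For readers unfamiliar with the calculation, IRR is the discount rate at which the net present value (NPV) of a portfolio's cash flows equals zero. The sketch below finds it by bisection for a couple of made-up portfolios. The slides do not say which root-finding method Fin Lytix uses, so bisection is an assumption chosen for simplicity, and the function names, search bounds, and sample cash flows are all hypothetical.

```python
from typing import Dict, Sequence


def npv(rate: float, cash_flows: Sequence[float]) -> float:
    """Net present value of cash flows at periods 0, 1, 2, ..."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: Sequence[float], lo: float = -0.99, hi: float = 10.0,
        tol: float = 1e-10, max_iter: int = 200) -> float:
    """Rate where NPV is zero, found by bisection on [lo, hi]."""
    f_lo, f_hi = npv(lo, cash_flows), npv(hi, cash_flows)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid    # root lies in the upper half
    return (lo + hi) / 2


# Hypothetical portfolios: an initial outflow followed by inflows.
portfolios: Dict[str, Sequence[float]] = {
    "P1": [-1000.0, 1100.0],
    "P2": [-1000.0, 0.0, 1210.0],
}
rates = {pid: irr(cfs) for pid, cfs in portfolios.items()}
print(rates)
```

Each portfolio is solved independently of the others, which is what makes the problem a natural fit for per-group parallel execution across millions of portfolios.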
VWAP: 23 Million Trades to a wireless iPad
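VWAP (volume-weighted average price) is the total traded value divided by the total traded volume. A minimal sketch of the per-symbol calculation follows; the function name, the trade tuple layout, and the sample trades are assumptions for illustration and are not taken from the demo.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Trade = Tuple[str, float, float]  # (symbol, price, quantity)


def vwap_by_symbol(trades: Iterable[Trade]) -> Dict[str, float]:
    """VWAP per symbol: sum(price * quantity) / sum(quantity)."""
    notional = defaultdict(float)
    volume = defaultdict(float)
    for symbol, price, qty in trades:
        notional[symbol] += price * qty
        volume[symbol] += qty
    # Skip symbols with zero total volume to avoid division by zero.
    return {s: notional[s] / volume[s] for s in volume if volume[s] > 0}


trades = [
    ("ABC", 10.0, 100),
    ("ABC", 11.0, 300),
    ("XYZ", 50.0, 20),
]
result = vwap_by_symbol(trades)
print(result)
```

Because the calculation needs only two running sums per symbol, it reduces millions of trades to one small row per symbol, which is the kind of result that is practical to send to a mobile client.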