Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features

Charlie Berger, MS Eng, MBA
Sr. Director Product Management, Data Mining and Advanced Analytics
charlie.berger@oracle.com
www.twitter.com/charliedatamine

Oracle Advanced Analytics
Fastest Way to Deliver Scalable Enterprise-wide Predictive Analytics
Key Features
  In-database data mining algorithms and open source R algorithms
  SQL, PL/SQL, R languages
  Scalable, parallel in-database execution
  Workflow GUI and IDEs
  Integrated component of Database
  Enables enterprise analytical applications

Oracle Advanced Analytics Evolution
Analytical SQL in the Database
Timeline (years shown on the axis: 1998, 1999, 2002, 2004, 2005, 2008, 2011, 2015):
  7 Data Mining Partners
  Oracle acquires Thinking Machine Corp's dev. team + Darwin data mining software
  Oracle Data Mining 9.2i launched: 2 algorithms (NB and AR) via Java API
  Oracle Data Mining 10g & 10gR2 introduces SQL dm functions, 7 new SQL dm algorithms and new Oracle Data Miner Classic wizards-driven GUI
  ODM 11g & 11gR2 adds AutoDataPrep (ADP), text mining, perf. improvements
  SQLDEV/Oracle Data Miner 3.2 work flow GUI launched
  Integration with R and introduction/addition of Oracle R Enterprise
  Product renamed Oracle Advanced Analytics (ODM + ORE)
  New algorithms (EM, PCA, SVD)
  SQLDEV/Oracle Data Miner 4.0 work flow GUI launched with SQL script generation and SQL Query node (R integration)
  OAA/ORE 1.3 + 1.4 launched, adding several new scalable R algorithms
  Oracle Adv. Analytics for Hadoop Connector launched with scalable BDA algorithms

Oracle Advanced Analytics
Performance and Scalability with Low Total Cost of Ownership
Traditional Analytics (Hours, Days or Weeks):
  Data Import
  Data Mining Model Scoring
  Data Prep. & Transformation
  Data Mining Model Building
  Data Prep & Transformation
  Data Extraction
Oracle Advanced Analytics (Secs, Mins or Hours; the difference is labeled Savings):
  Model Scoring
  Embedded Data Prep
  Model Building
  Data Preparation
Data remains in the Database
Scalable, parallel Data Mining algorithms in SQL kernel
Fast parallelized native SQL data mining functions, SQL data preparation and efficient execution of R open-source packages
High-performance parallel scoring of SQL data mining functions and R open-source models
Fastest way to deliver enterprise-wide predictive analytics
  Integrated GUI for Predictive Analytics
  Database scoring engine
Lowest TCO
  Eliminate data duplication
  Eliminate separate analytical servers
  Leverage investment in Oracle IT

More Data Variety, Better Predictive Models
Increasing sources of relevant data can boost model accuracy
Model with Big Data and hundreds to thousands of input variables, including:
  Demographic data
  Purchase POS transactional data
  Unstructured data, text & comments
  Spatial location data
  Long term vs. recent historical behavior
  Web visits
  Sensor data
  etc.
[Chart: Responders (0% to 100%) vs. Population Size (0% to 100%); curves: Naïve Guess or Random, Model with 20 variables, Model with 75 variables, Model with 250 variables]
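The chart on this slide plots the share of responders captured against the share of the population contacted, which reads like a cumulative gains chart. A minimal sketch of how one such curve could be computed is shown here. The function name `cumulative_gains`, the scores, and the labels are all made up for illustration and do not come from the presentation.

```python
def cumulative_gains(scores, labels, fractions=(0.1, 0.2, 0.5, 1.0)):
    """Return {fraction of population contacted: fraction of responders captured}."""
    # Rank customers from highest to lowest model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_responders = sum(label for _, label in ranked)
    gains = {}
    for frac in fractions:
        cutoff = int(round(frac * len(ranked)))
        captured = sum(label for _, label in ranked[:cutoff])
        gains[frac] = captured / total_responders
    return gains


# Hypothetical model scores for 10 customers; 1 = responder, 0 = non-responder.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
gains = cumulative_gains(scores, labels)
print(gains)
```

A random guess would capture responders in proportion to the population contacted (the diagonal on the chart), so a model whose curve sits higher above that diagonal is ranking responders better.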

Oracle Advanced Analytics Architecture
Clients: SQL Developer, R Client, OBIEE, Applications
Oracle Database Enterprise Edition
Oracle Advanced Analytics: Native SQL Data Mining/Analytic Functions + High-performance R Integration for Scalable, Distributed, Parallel Execution

Fusion HCM Predictive Workforce
Predictive Analytics Applications
Fusion Human Capital Management, Powered by OAA
  Oracle Advanced Analytics factory-installed predictive analytics
  Employees likely to leave and predicted performance
  Top reasons, expected behavior
  Real-time "What if?" analysis

Oracle Data Miner
SQL Developer 4.0 Extension, Free OTN Download
Easy to Use
  Oracle Data Miner GUI for data analysts
  Work flow paradigm
Powerful
  Multiple algorithms & data transformations
  Runs 100% in-DB
  Build, evaluate and apply models
Automate and Deploy
  Save and share analytical workflows
  Generate SQL scripts for deployment

Oracle Advanced Analytics (Database) Option
Oracle Data Miner 4.0 Summary of New Features
Oracle Data Miner/SQLDEV 4.0 (for Oracle Database 11g and 12c)
  New Graph node (box, scatter, bar, histograms)
  SQL Query node + integration of R scripts
  Automatic SQL script generation for deployment
Oracle Advanced Analytics 12c features exposed in Oracle Data Miner
  New SQL data mining algorithms/enhancements
    Expectation Maximization clustering algorithm
    PCA & Singular Value Decomposition algorithms
    Improved/automated Text Mining, Prediction Details and other algorithm improvements
  Predictive SQL Queries: automatic build and apply within a SQL query

SQL Developer/Oracle Data Miner 4.0 New Features
Graph node
  Scatter, line, bar, box plots, histograms
  Group_by supported

SQL Developer/Oracle Data Miner 4.0 New Features
SQL Query node
  Allows any form of query/transformation/statistics within an ODMr work flow
  Use SQL anywhere to handle special/unique data manipulation use cases
    Recency, Frequency, Monetary (RFM)
    SQL Window functions, e.g. moving average of $$ checks written in the past 3 months vs. the past 3 days
  Allows integration of R Scripts
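The kind of SQL that might go into a SQL Query node for the RFM and moving-average use cases can be sketched as follows. This is only an illustration: it runs the queries in SQLite through Python's `sqlite3` module rather than in Oracle Database, the `checks` table and its rows are hypothetical, and the window function requires SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of checks written by customers.
conn.execute("CREATE TABLE checks (cust_id INTEGER, check_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO checks VALUES (?, ?, ?)",
    [
        (1, "2014-01-01", 100.0),
        (1, "2014-01-02", 200.0),
        (1, "2014-01-03", 300.0),
        (1, "2014-01-04", 400.0),
        (2, "2014-01-01", 50.0),
        (2, "2014-01-03", 150.0),
    ],
)

# Recency, Frequency, Monetary: one summary row per customer.
rfm = conn.execute("""
    SELECT cust_id,
           MAX(check_date) AS recency,
           COUNT(*)        AS frequency,
           SUM(amount)     AS monetary
    FROM checks
    GROUP BY cust_id
    ORDER BY cust_id
""").fetchall()

# Moving average over the current check and the two before it, per customer.
moving_avg = conn.execute("""
    SELECT cust_id, check_date,
           AVG(amount) OVER (
               PARTITION BY cust_id
               ORDER BY check_date
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS avg_last_3
    FROM checks
    ORDER BY cust_id, check_date
""").fetchall()

print(rfm)
print(moving_avg)
```

Comparing a short window (a few rows or days) with a long one (months) gives the "recent vs. long-term behavior" attributes the slide mentions; the window frame clause is the only part that changes.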

SQL Developer/Oracle Data Miner 4.0 New Features
SQL Script Generation
  Deploy entire methodology as a SQL script
  Immediate deployment of data analyst's methodologies

SQL Developer/Oracle Data Miner 4.0 New Features
Database/Data Mining Parallelism On/Off Control
  Allows users to take full advantage of Oracle parallelism/scalability on an Oracle Data Miner node-by-node basis
  Default is Off
  Important for large Oracle Database & Oracle Exadata shops
  Setting shown: Parallel Query On (All)

12c New Features: New Server Functionality
3 New Oracle Data Mining SQL function algorithms
Expectation Maximization (EM) Clustering: New Clustering Technique
  Probabilistic clustering algorithm that creates a density model of the data
  Improved approach for data originating in different domains (for example, sales transactions and customer demographics, or structured data and text or other unstructured data)
  Automatically determines the optimal number of clusters needed to model the data
Principal Components Analysis (PCA): Data Reduction & improved modeling capability
  Based on SVD; a powerful feature extraction method that uses orthogonal linear projections to capture the underlying variance of the data
Singular Value Decomposition (SVD): Big data workhorse technique for matrix operations
  Scales well to very large data sizes (both rows and attributes) for very large numerical data sets (e.g. sensor data, text, etc.)
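To make "orthogonal linear projections that capture the underlying variance" concrete, here is a small pure-Python sketch that finds the first principal component of a toy data set. It is not how Oracle implements PCA: the slide says the database algorithm is based on SVD, whereas this sketch swaps in power iteration on the covariance matrix to stay dependency-free. The function name and the data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction, variance captured) for the leading component."""
    n, d = len(rows), len(rows[0])
    # Center each attribute on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the dominant
    # eigenvector, provided the start vector is not orthogonal to it.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance of the data projected onto v (the Rayleigh quotient v' C v).
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance


# Toy data lying exactly on the line y = 2x, so one component explains everything.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, variance = first_principal_component(rows)
print(direction, variance)
```

For this data the direction is proportional to (1, 2) and the captured variance equals the total variance of both attributes, which is the "data reduction" idea: two correlated columns collapse into one projected feature.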

12c Preview: New Server Functionality
Text Mining Support Enhancements
This enhancement greatly simplifies the data mining process (model build, deployment and scoring) when text data is present in the input:
  Manual pre-processing of text data is no longer needed
  No text index needs to be created
  Additional data types are supported: CLOB, BLOB, BFILE
  Character data can be specified as either categorical values or text

12c New Features: New Server Functionality
Predictive Queries
  Immediate build/apply of ODM models in SQL query
    Classification & regression
      Multi-target (nested) problems
    Clustering query
    Anomaly query
    Feature extraction query
  OAA automatically creates multiple anomaly detection models Grouped_By and scores by partition via powerful SQL query

    -- Top 5% most anomalous customers within each income level
    SELECT cust_income_level, cust_id,
           ROUND(probanom, 2) probanom,
           ROUND(pctrank, 3) * 100 pctrank
    FROM (
      SELECT cust_id, cust_income_level, probanom,
             PERCENT_RANK() OVER (PARTITION BY cust_income_level
                                  ORDER BY probanom DESC) pctrank
      FROM (
        -- Predictive query: build and apply an anomaly model per partition
        SELECT cust_id, cust_income_level,
               PREDICTION_PROBABILITY(OF ANOMALY, 0 USING *)
                 OVER (PARTITION BY cust_income_level) probanom
        FROM customers
      )
    )
    WHERE pctrank <= .05
    ORDER BY cust_income_level, probanom DESC;
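The outer layers of the query on this slide use a general pattern: rank rows within each partition with `PERCENT_RANK()` and keep the top 5%. The sketch below reproduces only that ranking step in SQLite via Python's `sqlite3` module (SQLite 3.25 or later). `PREDICTION_PROBABILITY(OF ANOMALY ...)` is specific to Oracle and is not available here, so the sketch assumes a hypothetical `scored_customers` table in which a `probanom` column has already been filled with made-up values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-scored table; in Oracle, probanom would come from the
# predictive query rather than from stored values.
conn.execute(
    "CREATE TABLE scored_customers "
    "(cust_id INTEGER, cust_income_level TEXT, probanom REAL)"
)
rows = []
for i in range(1, 31):                      # 30 customers per income level
    rows.append((i, "LOW", i / 100))        # made-up anomaly probabilities
    rows.append((30 + i, "HIGH", i / 50))
conn.executemany("INSERT INTO scored_customers VALUES (?, ?, ?)", rows)

top_anomalies = conn.execute("""
    SELECT cust_income_level, cust_id, probanom, pctrank
    FROM (
        SELECT cust_id, cust_income_level, probanom,
               PERCENT_RANK() OVER (
                   PARTITION BY cust_income_level
                   ORDER BY probanom DESC
               ) AS pctrank
        FROM scored_customers
    )
    WHERE pctrank <= 0.05
    ORDER BY cust_income_level, probanom DESC
""").fetchall()

for row in top_anomalies:
    print(row)
```

With 30 customers per income level, `PERCENT_RANK()` takes the values 0, 1/29, 2/29 and so on, so the filter keeps the two highest-scoring customers in each level. Because ranking is done per partition, each income level gets its own top 5% rather than sharing one global cutoff.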

Oracle Advanced Analytics Previews
Oracle Data Miner 4.1 & OAA 12.2 Preview
Oracle Data Miner/SQLDEV 4.0
  New JSON Query node to work with Data Source node
Oracle Advanced Analytics 12.2c
  Re-engineering all OAA/ODM algorithms for maximum scalability & parallelism
  Analytical components available for faster/tighter R integration for selected R algorithms
  Enables direct uptake into Oracle Data Miner GUI
    Model viewers, advanced options settings, etc.
  Partition-By modeling

Oracle Advanced Analytics Option: Links and Resources
Oracle Advanced Analytics Overview:
  Link to presentation: Big Data Analytics using Oracle Advanced Analytics In-Database Option
  OAA data sheet on OTN
  Oracle Internal OAA Product Management Wiki and Workspace
YouTube recorded OAA Presentations and Demos:
  Oracle Advanced Analytics and Data Mining at the YouTube Movies (6+ OAA Demos on Retail, Fraud, Loyalty, Overview, etc.)
Getting Started:
  Link to Getting Started w/ ODM blog entry
  Link to New OAA/Oracle Data Mining 2-Day Instructor Led Oracle University course
  Link to OAA/Oracle Data Mining 4.0 Oracle by Examples (free) Tutorials on OTN
  Take a Free Test Drive of Oracle Advanced Analytics (Oracle Data Miner GUI) on the Amazon Cloud (Vlamis Partner)
  Link to SQL Developer Days Virtual Event w/ downloadable Virtual Machine (VM) images of Oracle Database + ODM/ODMr and e-training for Hands on Labs
  Link to OAA/Oracle R Enterprise (free) Tutorial Series on OTN
Additional Resources:
  Oracle Advanced Analytics Option on OTN page
  OAA/Oracle Data Mining on OTN page, ODM Documentation & ODM Blog
  OAA/Oracle R Enterprise page on OTN page, ORE Documentation & ORE Blog
  Oracle SQL based Basic Statistical functions on OTN
  Business Intelligence, Warehousing & Analytics BIWA Summit 2014, Jan 14-16 at Oracle HQ Conference Center
