What Are They Thinking? With Oracle Application Express and Oracle Data Miner
Roel Hartman, Brendan Tierney
Agenda
- Who are we
- The Scenario
- Graphs & Charts in APEX - Live Demo
- Oracle Data Miner & DBA tasks
- Including Oracle Data Mining in APEX
- Building a Workflow based on Oracle Data Mining
Brendan Tierney
Currently: Lecturer DBA Data Mining Consultant BI & Data Architect Trainer
- Working with Oracle products since 1992/1993
- Oracle version 5 up to 11g
- Oracle Reports (RPT), ReportWriter I, RPT, Forms 2.3
- Oracle Data Miner since 2005
- Data Warehousing since 1997
- Data Mining since 1998
- Analytics since 1993
Available in ebook & print formats
The Scenario: But? Is there an alternative?
The Scenario
- We have a number of products
- We got the opinions from Amazon (star rating)
- Can we use Data Mining to predict opinions?
- Can we build interactive dashboards in the DB?
Data Mining & Interactive Dashboards with APEX, all inside the Database
APEX - POOR MAN'S BI TOOL
+ any JavaScript charting engine you like
DEMO
ORACLE ADVANCED ANALYTICS
Advanced Analytics Option
Technique / Algorithms / Applicability

Classification
  Algorithms: Logistic Regression (GLM), Decision Trees, Naïve Bayes, Support Vector Machine
  Applicability: Classical Statistical Technique; Popular / Rules / Transparency; Embedded; Wide / Narrow Data / Text
Regression
  Algorithms: Multiple Regression, Support Vector Machine
  Applicability: Classical Statistical Technique; Wide / Narrow Data / Text
Anomaly Detection
  Algorithms: One Class SVM
  Applicability: Lack Examples
Attribute Importance
  Algorithms: Minimum Description Length
  Applicability: Attribute Reduction; Identify Useful Data; Reduce Data Noise
Association Rules
  Algorithms: Apriori
  Applicability: Market Basket Analysis; Link Analysis
Clustering
  Algorithms: Enhanced K-Means, O-Cluster, Expectation Maximization
  Applicability: Product Grouping; Text Mining; Gene and Protein Analysis
Feature Extraction
  Algorithms: Non-Negative Matrix Factorization, Principal Components Analysis, Singular Value Decomposition
  Applicability: Text Analysis; Feature Reduction
Oracle Data Mining
PL/SQL Packages
- DBMS_DATA_MINING
- DBMS_DATA_MINING_TRANSFORM
- DBMS_PREDICTIVE_ANALYTICS
12c Predictive Queries
- aka dynamic queries
- Transitive dynamic Data Mining models
- Can scale to many 100+ models all in one statement
- OTN Technical Article
SQL Functions
- PREDICTION, PREDICTION_PROBABILITY, PREDICTION_BOUNDS, PREDICTION_COST, PREDICTION_DETAILS, PREDICTION_SET
- CLUSTER_ID, CLUSTER_DETAILS, CLUSTER_DISTANCE, CLUSTER_PROBABILITY, CLUSTER_SET
- FEATURE_ID, FEATURE_DETAILS, FEATURE_SET, FEATURE_VALUE
Statistical Functions in Oracle
- All of these are FREE with the database
- These are often forgotten about
Text Mining in Oracle
Natural language processing: Oracle Text
- It deals with the actual text element.
- It transforms it into a format that the machine can use.
Artificial intelligence / Machine Learning: Oracle Data Mining
- It uses the information given by the NLP and uses a lot of maths to determine whether something is negative or positive.
All done in Oracle Data Miner (using Oracle Text)
- Allows Data Analysts to do this
- Isolated from the underlying complexity
How is it done with Oracle Text & Oracle Advanced Analytics
Build: Product Review -> Human Labelling -> Tokenization -> Stop Word -> Punctuation -> Text Ready for DM -> Machine Learning Algorithms -> Evaluation -> Model
Apply: New Product Reviews -> Model -> Sentiment Score -> Visualisation / Presentation -> Actionable Insights
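The flow above can be sketched end to end in a few lines of Python. This is only an illustration of the idea, not how Oracle Text or Oracle Data Mining implement it: the stop-word list, the four labelled reviews, and the function names `prepare`, `train` and `score` are all invented for the example, and the classifier is a hand-written Naive Bayes with add-one smoothing. The evaluation step is left out for brevity.

```python
import math
import re
from collections import Counter

# Assumption: a tiny, made-up stop-word list for illustration only.
STOP_WORDS = {"the", "a", "is", "it", "this", "and", "i"}

def prepare(text):
    """Tokenize, lower-case, drop punctuation (via the regex) and stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]

def train(labelled_reviews):
    """Count tokens and reviews per label for a multinomial Naive Bayes model."""
    token_counts = {}        # label -> Counter of tokens
    doc_counts = Counter()   # label -> number of reviews
    for text, label in labelled_reviews:
        token_counts.setdefault(label, Counter()).update(prepare(text))
        doc_counts[label] += 1
    vocabulary = {t for counts in token_counts.values() for t in counts}
    return token_counts, doc_counts, vocabulary

def score(model, text):
    """Return (predicted label, probability) for a new review."""
    token_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label, counts in token_counts.items():
        total_tokens = sum(counts.values())
        log_p = math.log(doc_counts[label] / total_docs)
        for token in prepare(text):
            if token in vocabulary:
                # Add-one smoothing avoids zero probabilities for unseen pairs.
                log_p += math.log((counts[token] + 1) / (total_tokens + len(vocabulary)))
        log_scores[label] = log_p
    # Normalise the log scores into probabilities.
    max_log = max(log_scores.values())
    exp_scores = {label: math.exp(s - max_log) for label, s in log_scores.items()}
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / sum(exp_scores.values())

# Hypothetical human-labelled reviews (the "Human Labelling" step).
training = [
    ("Great product, works perfectly", "positive"),
    ("I love it, great value", "positive"),
    ("Terrible, broke after a week", "negative"),
    ("Poor quality and terrible support", "negative"),
]
model = train(training)
label, probability = score(model, "great value, love this product")
print(label, round(probability, 3))
```

The returned label and probability play the role of the sentiment score that feeds the visualisation step.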
Let us have a closer look at what Oracle Text does
Tokenization
Tokenization is the process of breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens. The list of tokens becomes input for further processing such as parsing or text mining.
Tokens are separated by whitespace characters, such as a space or line break, or by punctuation characters. Punctuation and whitespace may or may not be included in the resulting list of tokens.
Example: Today 28 Sept we are at OUF Sunday.
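As a rough illustration of the definition above, the following Python sketch splits the example sentence on whitespace and punctuation and drops both. It is not Oracle Text's lexer; the function name `tokenize` and the choice of regular expression are assumptions made for this example.

```python
import re

def tokenize(text):
    """Split text into tokens, treating whitespace and punctuation as separators."""
    # \w+ matches runs of letters, digits and underscores; everything else separates tokens.
    return re.findall(r"\w+", text)

tokens = tokenize("Today 28 Sept we are at OUF Sunday.")
print(tokens)
# ['Today', '28', 'Sept', 'we', 'are', 'at', 'OUF', 'Sunday']
```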
Stop Words
For analyzing Twitter we can include hash tags, e.g. #OOW14
Example: Today 28 Sept we are at OUF Sunday.
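A minimal Python sketch of stop-word removal on the example sentence follows. The stop-word list here ("we", "are", "at") is an assumption chosen to match the example, not Oracle Text's default stoplist, and `remove_stop_words` is a hypothetical helper name. Hash tags such as #OOW14 are kept because they are not in the list.

```python
# Assumption: a tiny illustrative stop-word list, not Oracle Text's default stoplist.
STOP_WORDS = {"we", "are", "at"}

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens whose lower-cased form is in the stop-word list."""
    return [t for t in tokens if t.lower() not in stop_words]

tokens = "Today 28 Sept we are at OUF Sunday.".split()
print(remove_stop_words(tokens))
# ['Today', '28', 'Sept', 'OUF', 'Sunday.']
```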
Punctuation
Characters that are defined as punctuation are removed from a token before text indexing.
, : ; @ ~ # { } [ ] + = - _ ( ) * & ^ % $ ! ` \ / ?
Example: Today 28 Sept OUF Sunday.
Product Review -> Human Labelling -> Tokenization -> Stop Word -> Punctuation -> Text Ready for DM
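The punctuation step can be sketched in Python as below. The character set is taken from the list on the slide, with the full stop added as an assumption; `strip_punctuation` and `clean_tokens` are hypothetical helper names, and this is not how Oracle Text itself is configured.

```python
# Characters from the slide's list; including "." is an assumption.
PUNCTUATION = ".,:;@~#{}[]+=-_()*&^%$!`\\/?"
_REMOVE_TABLE = str.maketrans("", "", PUNCTUATION)

def strip_punctuation(token):
    """Remove every punctuation character from a single token."""
    return token.translate(_REMOVE_TABLE)

def clean_tokens(tokens):
    """Strip punctuation from each token and drop tokens that end up empty."""
    return [cleaned for cleaned in (strip_punctuation(t) for t in tokens) if cleaned]

print(clean_tokens(["Today", "28", "Sept", "OUF", "Sunday."]))
# ['Today', '28', 'Sept', 'OUF', 'Sunday']
```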
Using your Sentiment Analyzer
- Add the view to the Physical layer of the BI Repository
- Then add to the Business Model layer
[Diagram: CUSTOMER_SENTIMENT_V view (CASE_ID, PRED, PROBABILITY) shown with CUSTOMER, TRANSACTIONS and CUSTOMER_SENTIMENT; attributes shown: TITLE, NAME, STATUS, SEX, AGE, RATING, LOCATION, DEFAULTS, REGION, PRED, PROBABILITY]
33 lines of SQL code to build and implement a Sentiment Classifier
- The models are first class objects in the DB
- Just like calling any other function
They are fast in Oracle
- Built a model on 550,000 records in 2 minutes
- Scored 1.2M records in 52 seconds (on a mid-spec development server)
- >80M records per hour without using the Parallel Option
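The hourly figure follows directly from the scoring run quoted above. A quick Python check of the arithmetic, using only the numbers on the slide (the variable names are just for this example):

```python
# Figures quoted on the slide.
records_scored = 1_200_000
scoring_seconds = 52

# Extrapolate the 52-second run to one hour (3600 seconds).
records_per_second = records_scored / scoring_seconds
records_per_hour = records_per_second * 3600

print(f"{records_per_hour / 1e6:.1f}M records per hour")
# 83.1M records per hour
```

This assumes the scoring rate stays constant over the hour, which is the same assumption behind the ">80M" claim.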
DEMO - ADDING ADVANCED ANALYTICS TO APEX GRAPHS
And then there are Interactive Reports
- Create a visualisation of your model
- Dashboard
- Use your model for workflow decisions
DEMO
All inside the Database
APEX - SMART MAN'S BI TOOL
Roel Hartman: roel@apexconsulting.nl, @roelh
Brendan Tierney: brendan.tierney@oralytics.com, @brendantierney