Benefits of analytics using Microsoft Azure Machine Learning (ML) Tomaž Kaštrun @tomaz_tsql

Special Thanks
Special thanks to the SQL Saturday Bratislava organizers! Making the SQL Server community stronger, bigger and better!

Speaker info
-> BI Developer (MSSQL Server, C#, SAS, R, SAP, Py)
-> 10+ years of experience with MSSQL Server
-> 15+ years of experience in data analysis and DM; Data Scientist (NO!)
-> Working: Spar ICS Österreich, Spar Slovenija
-> MCPT, MCT SQL Server
-> tomaz.kastrun@gmail.com, @tomaz_tsql
-> http://tsqljokes.tumblr.com/
-> https://tomaztsql.wordpress.com
-> Publishing articles, speaking at SQL events
-> Coffee lover, fixie bikes junkie

Microsoft and Machine Learning

2015
-> New SQL Server 2016
-> R integration in SQL Server (mid to end of 2016)
-> CTP2 of SQL Server 2016 available: https://www.microsoft.com/en-us/evalcenter/evaluate-sql-server-2016
-> April 2015: Microsoft acquires Revolution Analytics
-> What to expect (not confirmed)
   - multi-threaded R analytics within SQL Server
   - in-memory R analytics (RRO, MKL from Revolution Analytics)
   - Azure extensions
   - R language systematization
   - R libraries systematization

Intro to R and ML
R is an implementation of the S statistical programming language.
1. S was originally developed at Bell Labs (then part of AT&T) in 1976
2. R's first release dates to ~1993 (more at http://r-project.org); version 1.0.0, considered stable for production use, followed in 2000
3. Latest stable release: 3.2.1 (June 18th, 2015)
4. Open source; functional and imperative programming with support for OOP
5. Extremely powerful graphics capabilities
6. Cross-platform, multi-paradigm
7. CRAN is a huge R library repository (6679 libraries as of June 19th, 2015) (http://cran.r-project.org/web/packages/)
8. Large and growing ML/R/data science community

How + where to get R
-> R on CRAN (Comprehensive R Archive Network): http://cran.r-project.org
-> RStudio: http://www.rstudio.com
In April 2015 Microsoft completed its acquisition of Revolution Analytics (http://revolutionanalytics.com) and has already announced the integration of R in SQL Server 2016.

DEMO #1 Language R

Machine Learning (ML)
-> Machine learning is predicting the future based on past data
-> Characteristics of past data are constantly tested to improve the model

Machine Learning (ML) - Benefits

Supervised vs. Unsupervised
-> Supervised learning: linear regression

Supervised vs. Unsupervised
-> Unsupervised learning: cluster analysis
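
The difference between the two slides can be shown in a few lines of Python. This is a minimal sketch using only the standard library, not the code from the demo: `fit_line` is a hypothetical helper that learns from labelled (x, y) pairs (supervised), while `kmeans_1d` is a hypothetical helper that groups unlabelled points (unsupervised). The toy data is made up for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised: least-squares fit of y = a + b*x from labelled pairs."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b


def kmeans_1d(points, centers, iterations=10):
    """Unsupervised: k-means on 1-D points; no labels are given."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Supervised: the known y values are the "labels" the model learns from.
a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Unsupervised: only the points themselves are available.
centers, clusters = kmeans_1d([1.0, 1.2, 0.8, 5.0, 5.2, 4.8], centers=[0.0, 6.0])
```

In the first call the algorithm is told the right answers and recovers the relationship between x and y; in the second it has to discover the two groups on its own.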

Common Machine Learning Algorithms

Azure ML
-> Fully managed and scalable cloud service
-> Focus on the ability to develop and deploy
-> For data scientists, statisticians and emerging data scientists
-> Friendly user interface for the data science workflow
-> Wide range of ML algorithms
-> R and Python integration
-> Support for R libraries

Basic ML Workflow (modules)

Azure ML Modules
-> Machine learning libraries are encapsulated in modules
-> Each module performs one task in a machine learning scenario
-> A workflow is a set of connected modules, from reading the data and applying an ML algorithm to generating the result
-> Categories:
   - Data format conversions
   - Data input and output
   - Data transformation
   - Machine learning modules
   - Statistical functions
   - OpenCV library, R execution, Python execution

Azure ML Modules
-> Each module has additional attributes and features for fine-tuning the generated output
-> Modules have ports for establishing connections
-> Modules can also visualize, download and save the output
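
The idea of a workflow as a chain of connected modules can be sketched in plain Python. This is only a conceptual illustration and not the Azure ML API: the `Module` class, the `run_workflow` function and the four module names are hypothetical, and each module's output is simply handed to the next module's input.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Module:
    """One workflow step: takes data on its input port, returns its output."""
    name: str
    run: Callable[[Any], Any]


def run_workflow(modules, data):
    """Run modules in order, keeping every intermediate output for inspection."""
    outputs = {}
    for module in modules:
        data = module.run(data)
        outputs[module.name] = data
    return data, outputs


workflow = [
    Module("Read data",
           lambda text: [line.split(",") for line in text.strip().splitlines()]),
    Module("Convert types",
           lambda rows: [(float(x), float(y)) for x, y in rows]),
    Module("Remove outliers",
           lambda rows: [(x, y) for x, y in rows if y < 100]),
    Module("Summarise",
           lambda rows: sum(y for _, y in rows) / len(rows)),
]

result, outputs = run_workflow(workflow, "1,10\n2,20\n3,999\n")
```

Keeping the intermediate `outputs` mirrors the ability to visualize or save what each module produced, which helps when one step in a longer chain misbehaves.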

Azure ML Modules (Data transformation)

Azure ML Modules (Learning Models)

Selecting Classification Algorithm
-> How large is your training data?
   - With a small training set, use high-bias/low-variance classifiers such as Naive Bayes to avoid over-fitting.
-> Do you need to train incrementally or in batch mode?
   - If you need to update your classifier with new data frequently (or you have a lot of data), you probably want Bayesian algorithms, which update well. Neural nets and SVMs typically need to work on the training data in batch mode.
-> Is your data exclusively categorical, exclusively numeric, or a mixture of both?
   - Bayesian classifiers work best with categorical/binomial data. Classification trees predict discrete classes rather than numerical values.
-> Do you or your audience need to understand how the classifier works?
   - Use Bayesian classifiers or decision trees, since these can be easily explained to most people. Neural networks and SVMs are "black boxes" in the sense that you can't really see how they are classifying data.
-> How fast does your classification need to be generated?
   - SVMs are fast at classifying, since they only need to determine which side of the "line" your data is on. Decision trees can be slow, especially when they are complex (e.g. lots of branches).
-> How much complexity does the problem present or require?
   - Neural nets and SVMs can handle complex non-linear classification.
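
The point about Bayesian classifiers updating well can be made concrete. The sketch below is a hypothetical, standard-library-only categorical Naive Bayes (with Laplace smoothing, an assumption added here) whose "model" is nothing more than counts, so learning from a new example means incrementing a few counters rather than retraining on the full data set. The class name, the weather-style features and the labels are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Categorical Naive Bayes that learns one example at a time."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                            # Laplace smoothing strength
        self.class_counts = Counter()                 # label -> number of examples
        self.feature_counts = defaultdict(Counter)    # (label, position) -> value counts
        self.feature_values = defaultdict(set)        # position -> distinct values seen

    def update(self, features, label):
        """Incremental training step: only counters change."""
        self.class_counts[label] += 1
        for i, value in enumerate(features):
            self.feature_counts[(label, i)][value] += 1
            self.feature_values[i].add(value)

    def predict(self, features):
        """Return the label with the highest smoothed log-probability."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for i, value in enumerate(features):
                count = self.feature_counts[(label, i)][value]
                k = len(self.feature_values[i]) or 1
                score += math.log((count + self.alpha) / (n + self.alpha * k))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


model = IncrementalNaiveBayes()
for features, label in [(("sunny", "hot"), "no"), (("sunny", "mild"), "no"),
                        (("rainy", "cool"), "yes"), (("overcast", "cool"), "yes")]:
    model.update(features, label)

before = model.predict(("sunny", "hot"))

# New data arrives later; the existing model is updated in place.
for _ in range(3):
    model.update(("sunny", "hot"), "yes")

after = model.predict(("sunny", "hot"))
```

After the three extra examples the prediction for the same input flips, without the earlier examples ever being revisited.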

Selecting Regression Algorithm
-> Bayesian Linear Regression
-> Boosted Decision Tree Regression
-> Decision Forest Regression
-> Linear Regression
-> Neural Network Regression
-> Ordinal Regression
-> Poisson Regression

Analysis Services (SSAS)
Task / problem and the algorithms to use:
-> Predicting a discrete attribute
   - Examples: Flag the customers in a prospective buyers list as good or poor prospects. Calculate the probability that a server will fail within the next 6 months. Categorize patient outcomes and explore related factors.
   - Algorithms: Microsoft Decision Trees Algorithm, Microsoft Naive Bayes Algorithm, Microsoft Clustering Algorithm, Microsoft Neural Network Algorithm
-> Predicting a continuous attribute
   - Examples: Forecast next year's sales. Predict site visitors given past historical and seasonal trends. Generate a risk score given demographics.
   - Algorithms: Microsoft Decision Trees Algorithm, Microsoft Time Series Algorithm, Microsoft Linear Regression Algorithm
-> Predicting a sequence
   - Examples: Perform clickstream analysis of a company's Web site. Analyze the factors leading to server failure. Capture and analyze sequences of activities during outpatient visits, to formulate best practices around common activities.
   - Algorithms: Microsoft Sequence Clustering Algorithm
-> Finding groups of common items in transactions
   - Examples: Use market basket analysis to determine product placement. Suggest additional products to a customer for purchase. Analyze survey data from visitors to an event, to find which activities or booths were correlated, to plan future activities.
   - Algorithms: Microsoft Association Algorithm, Microsoft Decision Trees Algorithm
-> Finding groups of similar items
   - Examples: Create patient risk profile groups based on attributes such as demographics and behaviors. Analyze users by browsing and buying patterns. Identify servers that have similar usage characteristics.
   - Algorithms: Microsoft Clustering Algorithm, Microsoft Sequence Clustering Algorithm

Analysis Services vs. Azure ML
-> On-premises vs. cloud
-> Pricing
-> Administration / corporate environment
-> Algorithms and statistics
-> Data visualization (profit and lift charts for DM, classification matrix, neural networks, ...)
-> Integration of the ML service in the schema of Azure services vs. SQL Server edition

DEMO #2 Working with modules

Azure ML Modules R Extended

DEMO #3 R Script in Azure

Azure ML API
-> Already included as part of the Azure subscription
-> Provides the connection between an ML workflow and an external application
-> Prepared for users to predict with or score the model
-> Supports two modes of operation:
   - Request-Response Service (a low-latency, high-scale web service for synchronous single predictions)
   - Batch Execution Service (an asynchronous web service for bulk predictions)

Azure ML API
-> Advantages:
   - Launch your model in minutes for real-time predictions
   - Publish to the Azure data market to sell predictions to your customers
   - Integrate your client with the cloud ML API in minutes by leveraging ready-to-execute code
   - Make the most of your existing R and Python code by embedding it in an Execute R Script or Execute Python Script module
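
A call to the Request-Response Service is an HTTPS POST with a JSON body and an API key. The following is a minimal sketch of assembling such a request with the Python standard library. The endpoint URL, the API key, the column names and the `Inputs` / `GlobalParameters` payload layout are all assumptions, modelled on the kind of sample code shown for a published web service; the real schema depends on the experiment you publish. `build_scoring_request` is a hypothetical helper, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build (but do not send) a JSON scoring request for a single prediction."""
    payload = {
        # Assumed payload layout; check the API help page of your own service.
        "Inputs": {
            "input1": {"ColumnNames": column_names, "Values": rows},
        },
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers)


request = build_scoring_request(
    # Placeholder endpoint and key, not real values.
    "https://example.invalid/workspaces/WORKSPACE_ID/services/SERVICE_ID/execute",
    "YOUR_API_KEY",
    ["age", "income"],
    [["42", "51000"]],
)

# To call a real service, send the request and parse the JSON response:
# response = json.loads(urllib.request.urlopen(request).read())
```

Because the request carries a body, `urllib` sends it as a POST; the synchronous response would contain the scored result for the single row supplied.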

DEMO #4 Azure ML API

Azure Pricing
Machine Learning is offered in two tiers: Free and Standard.
-> Free: Experience the Machine Learning Studio for free using up to 10 GB of your own data.
-> Standard: Adds the ability to work over larger data sets from a broader range of data sources and to deploy machine learning algorithms into production as web services in the ML API Service.

Azure Pricing
-> ML seat subscription (monthly fee): 7.43 per seat per month
-> ML Studio usage (hourly): 0.74 per experiment hour
-> ML API usage (hourly): 1.48 per production API compute hour
-> ML API usage (transactions): 0.37 per 1,000 production API transactions
Valid on: June 11th, 2015
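
To see how these rates combine into a monthly bill, here is a small worked calculation. It is a sketch under stated assumptions: the four rates are taken from the slide as of June 11th, 2015, the currency is not stated in the slide and is left unspecified, and `monthly_cost` and the usage figures are hypothetical.

```python
from decimal import Decimal

# Rates from the slide (valid on June 11th, 2015); currency not stated.
SEAT_PER_MONTH = Decimal("7.43")
STUDIO_PER_EXPERIMENT_HOUR = Decimal("0.74")
API_PER_COMPUTE_HOUR = Decimal("1.48")
API_PER_1000_TRANSACTIONS = Decimal("0.37")


def monthly_cost(seats, experiment_hours, api_compute_hours, api_transactions):
    """Estimate one month's Standard-tier cost from the four usage figures."""
    total = (
        Decimal(str(seats)) * SEAT_PER_MONTH
        + Decimal(str(experiment_hours)) * STUDIO_PER_EXPERIMENT_HOUR
        + Decimal(str(api_compute_hours)) * API_PER_COMPUTE_HOUR
        + Decimal(str(api_transactions)) / 1000 * API_PER_1000_TRANSACTIONS
    )
    return total.quantize(Decimal("0.01"))


# Example: 2 seats, 10 experiment hours, 5 API compute hours, 20,000 transactions
# = 14.86 + 7.40 + 7.40 + 7.40
estimate = monthly_cost(2, 10, 5, 20000)
```

`Decimal` is used instead of floats so that the per-unit prices add up exactly to two decimal places.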

Azure Pricing SOURCE: http://azure.microsoft.com/en-gb/pricing/details/machine-learning/