Machine Learning with MATLAB David Willingham Application Engineer



Similar documents
Is a Data Scientist the New Quant? Stuart Kozola MathWorks

How To Build A Trading Engine In A Microsoft Microsoft Matlab (A Trading Engine)

Turning Data into Actionable Insights: Predictive Analytics with MATLAB WHITE PAPER

Deploying MATLAB -based Applications David Willingham Senior Application Engineer

Predictive Modeling Techniques in Insurance

Data Analysis with MATLAB The MathWorks, Inc. 1

MATLAB for Use in Finance Portfolio Optimization (Mean Variance, CVaR & MAD) Market, Credit, Counterparty Risk Analysis and beyond

From Raw Data to. Actionable Insights with. MATLAB Analytics. Learn more. Develop predictive models. 1Access and explore data

Introduction to MATLAB for Data Analysis and Visualization

Maschinelles Lernen mit MATLAB

Introduction to MATLAB Gergely Somlay Application Engineer

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics

ANALYTICS CENTER LEARNING PROGRAM

Machine learning for algo trading

Credit Risk Modeling with MATLAB

Origins, Evolution, and Future Directions of MATLAB Loren Shure

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

A Case of Study on Hadoop Benchmark Behavior Modeling Using ALOJA-ML

Learning outcomes. Knowledge and understanding. Competence and skills

Session 62 TS, Predictive Modeling for Actuaries: Predictive Modeling Techniques in Insurance Moderator: Yonasan Schwartz, FSA, MAAA

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

Make Better Decisions Through Predictive Intelligence

MS1b Statistical Data Mining

Azure Machine Learning, SQL Data Mining and R

Predictive Analytics

Software Development Training Camp 1 (0-3) Prerequisite : Program development skill enhancement camp, at least 48 person-hours.

What s New in MATLAB and Simulink

Bayesian networks - Time-series models - Apache Spark & Scala

MATLAB as a Collaboration Platform Marta Wilczkowiak Senior Applications Engineer MathWorks

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics

Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

The Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS?

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

Machine Learning for Data Science (CS4786) Lecture 1

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

IBM SPSS Modeler Professional

Get to Know the IBM SPSS Product Portfolio

Bringing Big Data Modelling into the Hands of Domain Experts

Data Mining + Business Intelligence. Integration, Design and Implementation

Information and Decision Sciences (IDS)

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES

MathWorks Products and Prices North America Academic March 2013

Machine Learning Capacity and Performance Analysis and R

Chapter 12 Discovering New Knowledge Data Mining

Predictive Analytics Powered by SAP HANA. Cary Bourgeois Principal Solution Advisor Platform and Analytics

DATA MINING TECHNIQUES AND APPLICATIONS

Predict Influencers in the Social Network

Advanced In-Database Analytics

DATA SCIENCE CURRICULUM WEEK 1 ONLINE PRE-WORK INSTALLING PACKAGES COMMAND LINE CODE EDITOR PYTHON STATISTICS PROJECT O5 PROJECT O3 PROJECT O2

Name: Srinivasan Govindaraj Title: Big Data Predictive Analytics

Some vendors have a big presence in a particular industry; some are geared toward data scientists, others toward business users.

Solving Big Data Problems in Computer Vision with MATLAB Loren Shure

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone:

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Machine Learning. CUNY Graduate Center, Spring Professor Liang Huang.

Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining

Find the Hidden Signal in Market Data Noise

The Data Mining Process

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

Principles of Data Mining by Hand&Mannila&Smyth

CS 2750 Machine Learning. Lecture 1. Machine Learning. CS 2750 Machine Learning.

An Overview of Knowledge Discovery Database and Data mining Techniques

An In-Depth Look at In-Memory Predictive Analytics for Developers

Anomaly and Fraud Detection with Oracle Data Mining 11g Release 2

Prerequisites. Course Outline

Data Science, Predictive Analytics & Big Data Analytics Solutions. Service Presentation

Applications of Deep Learning to the GEOINT mission. June 2015

APPROACHABLE ANALYTICS MAKING SENSE OF DATA

The Scientific Data Mining Process

The Predictive Data Mining Revolution in Scorecards:

INDIAN STATISTICAL INSTITUTE announces Training Program on Statistical Techniques for Data Mining & Business Analytics

: Introduction to Machine Learning Dr. Rita Osadchy

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Data Mining Algorithms Part 1. Dejan Sarka

Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015

Data Mining and Neural Networks in Stata

2015 The MathWorks, Inc. 1

ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)

Machine Learning: Overview

2015 MATLAB Conference Perth 21 st May 2015 Nicholas Brown. Deploying Electricity Load Forecasts on MATLAB Production Server.

Statistical Models in Data Mining

Comparison of K-means and Backpropagation Data Mining Algorithms

TIETS34 Seminar: Data Mining on Biometric identification

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

Machine Learning using MapReduce

Machine Learning CS Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science

Predictive Dynamix Inc

Transcription:

Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1

Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the machine learning workflow with MATLAB 2

Motivation Do you want to create a model of a system? Understand dynamics Predict Outputs How do you create a model? Develop an equation Takes time to develop, sometimes even years Unknown if there is actually an equation at all Another option, Machine Learning 3

Used Across Many Application Areas Biology Financial Services Image & Video Processing Audio Processing Energy Tumor Detection, Drug Discovery Credit Scoring, Algorithm Trading, Bond Classification Pattern Recognition Speech Recognition Load, Price Forecasting, Trading 4

Machine Learning Characteristics and Examples Characteristics Lots of data (many variables) System too complex to know the governing equation (e.g., black-box modeling) Examples Pattern recognition (speech, images) Financial algorithms (credit scoring, algo trading) Energy forecasting (load, price) Biology (tumor detection, drug discovery) AAA 93.68% AA 2.44% A 0.14% BBB 0.03% BB 0.03% B 0.00% CCC 0.00% 5.55% 0.59% 92.60% 4.03% 4.18% 91.02% 0.23% 7.49% 0.12% 0.73% 0.00% 0.11% 0.00% 0.00% 0.18% 0.00% 0.73% 0.15% 3.90% 0.60% 87.86% 3.78% 8.27% 86.74% 0.82% 9.64% 0.37% 1.84% 0.00% 0.00% 0.00% 0.00% 0.08% 0.00% 0.39% 0.06% 3.28% 0.18% 85.37% 2.41% 6.24% 81.88% 0.00% 0.06% 0.08% 0.16% 0.64% 1.64% 9.67% D 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 100.00% AAA AA A BBB BB B CCC D 6

Challenges Machine Learning Significant technical expertise required No one size fits all solution Locked into Black Box solutions Time required to conduct the analysis 7

Overview Machine Learning Type of Learning Categories of Algorithms Unsupervised Learning Clustering Machine Learning Group and interpret data based only on input data Supervised Learning Develop predictive model based on both input and output data Classification Regression 8

Unsupervised Learning k-means, Fuzzy C-Means Hierarchical Clustering Neural Networks Gaussian Mixture Hidden Markov Model 9

Supervised Learning Regression Neural Networks Decision Trees Ensemble Methods Non-linear Reg. (GLM, Logistic) Linear Regression Classification Support Vector Machines Discriminant Analysis Naive Bayes Nearest Neighbor 10

Supervised Learning - Workflow Speed up Computations Select Model Data Train the Model Use for Prediction Import Data Explore Data Prepare Data Known data Known responses Model Model New Data Predicted Responses Measure Accuracy 11

Classification Overview What is classification? Predicting the best group for each point Learns from labeled observations Uses input features Why use classification? Accurately group data never seen before 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 Group 1 Group 2 Group 3 Group 4 Group 5 Group 6 Group 7 Group 8-0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 How is classification done? Can use several algorithms to build a predictive model Good training data is critical 12

Clustering Overview What is clustering? Segment data into groups, based on data similarity 1 0.9 0.8 0.7 0.6 Why use clustering? Identify outliers Resulting groups may be the matter of interest 0.5 0.4 0.3 0.2 0.1 How is clustering done? Can be achieved by various algorithms It is an iterative process (involving trial and error) 0-0.1 0 0.1 0.2 0.3 0.4 0.5 0.6 15

Deploying MATLAB Applications to Excel Toolboxes 3 1 MATLAB Desktop End-User Machine 2 MATLAB Compiler MATLAB Builder EX.dll.bas 18

Deployment Highlights Database Servers.exe HADOOP Excel Desktop Applications Spreadsheets Application Servers.NET C Client Front End Applications Web Applications Java CTF Batch/Cron Jobs Royalty-free deployment Point-and-click workflow Unified process for desktop and server apps 19

MATLAB for Machine Learning Challenges Time (loss of productivity) Extract value from data Computation speed Time to deploy & integrate Technology risk MATLAB Solution Rapid analysis and application development High productivity from data preparation, interactive exploration, visualizations. Machine learning, Video, Image, and Financial Depth and breadth of algorithms in classification, clustering, and regression Fast training and computation Parallel computation, Optimized libraries Ease of deployment and leveraging enterprise Push-button deployment into production High-quality libraries and support Industry-standard algorithms in use in production Access to support, training and advisory services when needed 20

Machine Learning with MATLAB Interactive environment Visual tools for exploratory data analysis Easy to evaluate and choose best algorithm Apps available to help you get started (e.g,. neural network tool, curve fitting tool) Multiple algorithms to choose from Clustering Classification Regression 21

Learn More : Machine Learning with MATLAB http://www.mathworks.com/discovery/ machine-learning.html Data Driven Fitting with MATLAB Classification with MATLAB Regression with MATLAB Multivariate Classification in the Life Sciences Electricity Load and Price Forecasting Credit Risk Modeling with MATLAB 22

Training Services Exploit the full potential of MathWorks products Flexible delivery options: Public training available worldwide Onsite training with standard or customized courses Web-based training with live, interactive instructor-led courses Self-paced interactive online training More than 30 course offerings: Introductory and intermediate training on MATLAB, Simulink, Stateflow, code generation, and Polyspace products Specialized courses in control design, signal processing, parallel computing, code generation, communications, financial analysis, and other areas 23

Continuous Improvement Consulting Services Accelerating return on investment A global team of experts supporting every stage of tool and process integration Process and Technology Standardization Process and Technology Automation Process Assessment Advisory Services Component Deployment Full Application Deployment Jumpstart Migration Planning Research Advanced Engineering Product Engineering Teams Supplier Involvement 24

Technical Support Resources Over 100 support engineers All with MS degrees (EE, ME, CS) Local support in North America, Europe, and Asia Comprehensive, product-specific Web support resources High customer satisfaction 95% of calls answered within three minutes 70% of issues resolved within 24 hours 80% of customers surveyed rate satisfaction at 80 100% 25

MATLAB Central Community for MATLAB and Simulink users Over 1 million visits per month File Exchange Upload/download access to free files including MATLAB code, Simulink models, and documents Ability to rate files, comment, and ask questions More than 12,500 contributed files, 300 submissions per month, 50,000 downloads per month Newsgroup Blogs Web forum for technical discussions about MathWorks products More than 300 posts per day Commentary from engineers who design, build, and support MathWorks products Open conversation at blogs.mathworks.com Based on February 2011 data 26

Questions? 27