# Introduction to Machine Learning

## Transcription

1 Introduction to Machine Learning Isabelle Guyon

2 What is Machine Learning? A learning algorithm uses TRAINING DATA to produce a trained machine, which then returns an answer to each query.

3 What for? Classification, time series prediction, regression, clustering.

4 Some Learning Machines: linear models, kernel methods, neural networks, decision trees.

5 Applications. [Figure: application domains (Ecology, System diagnosis, Market Analysis, OCR, HWR, Machine Vision, Text Categorization, Bioinformatics) placed by number of training examples vs. number of inputs.]

6 Banking / Telecom / Retail. Identify: prospective customers, dissatisfied customers, good customers, bad payers. Obtain: more effective advertising, less credit risk, less fraud, decreased churn rate.

7 Biomedical / Biometrics. Medicine: screening, diagnosis and prognosis, drug discovery. Security: face recognition, signature / fingerprint / iris verification, DNA fingerprinting.

8 Computer / Internet. Computer interfaces: troubleshooting wizards, handwriting and speech, brain waves. Internet: hit ranking, spam filtering, text categorization, text translation, recommendation.

9 Challenges: NIPS 2003 & WCCI. [Figure: the challenge datasets (Sylva, Ada, Dexter, Nova, Madelon, Gisette, Gina, Arcene, Dorothea, Hiva) placed by number of training examples vs. number of inputs.]

10 Ten Classification Tasks. [Figure: Test BER (%) for each of the ten datasets: ARCENE, DEXTER, DOROTHEA, GISETTE, MADELON, ADA, GINA, HIVA, NOVA, SYLVA.]

11 Challenge Winning Methods. [Figure: BER/<BER> by method family (Linear/Kernel, Neural Nets, Trees/RF, Naïve Bayes) for each dataset: Gisette (HWR), Gina (HWR), Dexter (Text), Nova (Text), Madelon (Artificial), Arcene (Spectral), Dorothea (Pharma), Hiva (Pharma), Ada (Marketing), Sylva (Ecology).]

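12 Conventions. Data matrix $X = \{x_{ij}\}$ with $m$ rows and $n$ columns; target vector $y = \{y_j\}$; pattern $x_i$ (a row of $X$); weight vectors $\alpha$ and $w$.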

13 Learning problem. Data matrix $X$: $m$ lines = patterns (data points, examples), e.g. samples, patients, documents, images; $n$ columns = features (attributes, input variables), e.g. genes, proteins, words, pixels. Example: colon cancer, Alon et al. 1999. Unsupervised learning: is there structure in the data? Supervised learning: predict an outcome $y$.

14 Linear Models. $f(x) = w \cdot x + b = \sum_{j=1}^{n} w_j x_j + b$. Linearity in the parameters, NOT in the input components: $f(x) = w \cdot \Phi(x) + b = \sum_j w_j \phi_j(x) + b$ (Perceptron); $f(x) = \sum_{i=1}^{m} \alpha_i k(x_i, x) + b$ (Kernel method).
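The three forms on this slide can be written side by side as a minimal Python sketch. The function names, the feature map `phi` and all numeric values below are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def linear_model(x, w, b):
    """f(x) = w . x + b"""
    return float(np.dot(w, x) + b)

def basis_model(x, w, b, phi):
    """f(x) = w . Phi(x) + b: linear in w, not necessarily in x."""
    return float(np.dot(w, phi(x)) + b)

def kernel_model(x, X_train, alpha, b, k):
    """f(x) = sum_i alpha_i k(x_i, x) + b"""
    return float(sum(a * k(xi, x) for a, xi in zip(alpha, X_train)) + b)

# Hypothetical inputs, chosen only to exercise the three functions.
x = np.array([1.0, 2.0])
w = np.array([0.5, -1.0])
f_lin = linear_model(x, w, b=0.25)

# Hypothetical basis functions: (x1, x2, x1 * x2).
phi = lambda v: np.array([v[0], v[1], v[0] * v[1]])
f_phi = basis_model(x, np.array([0.5, -1.0, 2.0]), 0.25, phi)

# Hypothetical training patterns and a plain dot product as kernel.
X_train = np.array([[1.0, 0.0], [0.0, 1.0]])
f_ker = kernel_model(x, X_train, alpha=[1.0, -1.0], b=0.0, k=np.dot)
```

The second model is non-linear in `x` because of the product term in `phi`, yet it is still a linear function of the weights.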

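15 Artificial Neurons (McCulloch and Pitts, 1943). [Figure: inputs $x_1, x_2, \dots, x_n$ (activation of other neurons) arrive through dendrites and are weighted at the synapses by $w_1, w_2, \dots, w_n$; a constant input 1 is weighted by $b$; the sum Σ is the cell potential, and an activation function gives the output $f(x)$ on the axon.] $f(x) = w \cdot x + b$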

16 Linear Decision Boundary. [Figure: a hyperplane separating two classes of points.]

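17 Perceptron (Rosenblatt, 1957). [Figure: the inputs $x_1, x_2, \dots, x_n$ are mapped to basis functions $\phi_1(x), \phi_2(x), \dots, \phi_N(x)$, which are weighted by $w_1, w_2, \dots, w_N$; a constant input 1 is weighted by $b$; the unit Σ outputs $f(x)$.] $f(x) = w \cdot \Phi(x) + b$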

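18 NL Decision Boundary. [Figure: a non-linear decision boundary in the plane $(x_1, x_2)$; axis labels include Hs.7780, the other labels are truncated.]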

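19 Kernel Method (potential functions, Aizerman et al., 1964). [Figure: the input $x$ is compared with the training patterns through $k(x_1, x), k(x_2, x), \dots, k(x_m, x)$, which are weighted by $\alpha_1, \alpha_2, \dots, \alpha_m$; a constant input 1 is weighted by $b$; the unit Σ outputs $f(x)$.] $f(x) = \sum_i \alpha_i k(x_i, x) + b$, where $k(\cdot, \cdot)$ is a similarity measure or kernel.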

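20 Hebb's Rule. $w_j \leftarrow w_j + y_i x_{ij}$. [Figure: the activation $x_j$ of another neuron arrives on its axon, crosses the synapse with weight $w_j$, and reaches through a dendrite the unit Σ with output $y$.] Link to Naïve Bayes.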

21 Kernel Trick (for Hebb's rule). Hebb's rule for the Perceptron: $w = \sum_i y_i \Phi(x_i)$, so $f(x) = w \cdot \Phi(x) = \sum_i y_i \Phi(x_i) \cdot \Phi(x)$. Define a dot product $k(x_i, x) = \Phi(x_i) \cdot \Phi(x)$; then $f(x) = \sum_i y_i k(x_i, x)$.
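The equivalence on this slide can be checked numerically. The sketch below accumulates Hebb's rule in weight space and compares the result with the kernel form; the feature map `phi` and the four training patterns are hypothetical, chosen only for illustration.

```python
import numpy as np

def phi(x):
    # Hypothetical feature map Phi(x) = (x1, x2, x1 * x2).
    return np.array([x[0], x[1], x[0] * x[1]])

def hebb_weights(X, y):
    # Hebb's rule summed over the training set: w = sum_i y_i Phi(x_i).
    w = np.zeros(3)
    for xi, yi in zip(X, y):
        w += yi * phi(xi)
    return w

def k(s, t):
    # Kernel defined as a dot product in feature space.
    return float(np.dot(phi(s), phi(t)))

def f_dual(x, X, y):
    # Kernel form: f(x) = sum_i y_i k(x_i, x); w is never built.
    return sum(yi * k(xi, x) for xi, yi in zip(X, y))

# Hypothetical training patterns with targets +1 / -1.
X = np.array([[1.0, 1.0], [2.0, 0.5], [-1.0, -1.0], [-0.5, -2.0]])
y = np.array([1, 1, -1, -1])

x_new = np.array([0.5, 1.5])
f_primal = float(np.dot(hebb_weights(X, y), phi(x_new)))
f_kernel = f_dual(x_new, X, y)
```

Both values agree, which is the point of the trick: the prediction only needs dot products between patterns.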

22 Kernel Trick (general). Dual forms: $f(x) = \sum_i \alpha_i k(x_i, x)$ with $k(x_i, x) = \Phi(x_i) \cdot \Phi(x)$, and $f(x) = w \cdot \Phi(x)$ with $w = \sum_i \alpha_i \Phi(x_i)$.

23 What is a Kernel? A kernel is a similarity measure and a dot product in some feature space: $k(s, t) = \Phi(s) \cdot \Phi(t)$. But we do not need to know the $\Phi$ representation. Examples: Gaussian kernel $k(s, t) = \exp(-\|s - t\|^2 / \sigma^2)$; polynomial kernel $k(s, t) = (s \cdot t)^q$.
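Both example kernels are short to write down. The sketch below also includes, as an added illustration that is not on the slide, an explicit feature map for the polynomial kernel with $q = 2$ in two dimensions, to show that the kernel value equals a dot product in feature space. Function names and test vectors are assumptions.

```python
import numpy as np

def gaussian_kernel(s, t, sigma=1.0):
    # k(s, t) = exp(-||s - t||^2 / sigma^2)
    d = np.asarray(s, dtype=float) - np.asarray(t, dtype=float)
    return float(np.exp(-np.dot(d, d) / sigma**2))

def polynomial_kernel(s, t, q=2):
    # k(s, t) = (s . t)^q
    return float(np.dot(s, t) ** q)

def poly2_features(v):
    # Explicit Phi for q = 2 in 2-D: (v1^2, sqrt(2) v1 v2, v2^2).
    return np.array([v[0] ** 2, np.sqrt(2.0) * v[0] * v[1], v[1] ** 2])

# Hypothetical test vectors.
s = np.array([1.0, 2.0])
t = np.array([3.0, -1.0])

k_poly = polynomial_kernel(s, t, q=2)
k_explicit = float(np.dot(poly2_features(s), poly2_features(t)))
k_gauss_self = gaussian_kernel(s, s)
```

For the Gaussian kernel the corresponding feature space is infinite-dimensional, so only the kernel form is practical there.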

24 Multi-Layer Perceptron (back-propagation, Rumelhart et al., 1986). [Figure: the inputs $x_j$ feed a layer of units Σ, the hidden units (internal latent variables), which feed an output unit Σ.]

25 Chessboard Problem

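26 Tree Classifiers: CART (Breiman, 1984) or C4.5 (Quinlan, 1993). At each step, choose the feature that reduces entropy most. Work towards node purity. [Figure: starting from all the data, the tree first chooses feature $f_1$, then feature $f_2$; the resulting partition is shown in the plane $(f_1, f_2)$.]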
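The splitting criterion stated on this slide can be sketched as follows. This is not the CART or C4.5 algorithm itself, only one greedy step under simple assumptions: numeric features, a threshold test at the midpoint between consecutive observed values, and Shannon entropy in bits. The data and function names are hypothetical.

```python
import numpy as np

def entropy(labels):
    # Shannon entropy (in bits) of the class labels in a node.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def best_split(X, y):
    """Return (feature index, threshold, entropy reduction) of the best split."""
    base = entropy(y)
    best = (None, None, 0.0)
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive values.
        for t in (values[:-1] + values[1:]) / 2:
            left = X[:, j] <= t
            frac = left.mean()
            children = frac * entropy(y[left]) + (1 - frac) * entropy(y[~left])
            gain = base - children
            if gain > best[2]:
                best = (j, float(t), gain)
    return best

# Hypothetical data: feature 0 separates the classes, feature 1 does not.
X = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])
feature, threshold, gain = best_split(X, y)
```

A full tree would apply `best_split` recursively to each child node until the nodes are pure or a stopping rule is met.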

27 Iris Data (Fisher, 1936). [Figure from Norbert Jankowski and Krzysztof Grabczewski: decision regions for the classes setosa, virginica and versicolor in four panels: linear discriminant, tree classifier, Gaussian mixture, kernel method (SVM).]

28 Fit / Robustness Tradeoff. [Figure: two panels with axes $x_1$ and $x_2$.]

29 Performance evaluation. [Figure: two panels with axes $x_1$ and $x_2$, showing the boundary $f(x) = 0$ and the regions $f(x) < 0$ and $f(x) > 0$.]

30 Performance evaluation. [Figure: the same two panels with the boundary $f(x) = -1$ and the regions $f(x) < -1$ and $f(x) > -1$.]

31 Performance evaluation. [Figure: the same two panels with the boundary $f(x) = 1$ and the regions $f(x) < 1$ and $f(x) > 1$.]

32 ROC Curve. For a given threshold on $f(x)$, you get a point on the ROC curve. [Figure: positive class success rate (hit rate, sensitivity) vs. 1 − negative class success rate (false alarm rate, 1 − specificity), both axes from 0 to 100%, showing the ideal ROC curve, the actual ROC and the random ROC.]

33 ROC Curve. For a given threshold on $f(x)$, you get a point on the ROC curve. [Figure: the same axes, with the ideal ROC curve (AUC = 1), the actual ROC and the random ROC (AUC = 0.5); the AUC, between 0 and 1, is the area under the actual ROC.]
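One way to trace the curve is to sort the examples by decreasing $f(x)$ and lower the threshold past one example at a time. The sketch below does this and integrates the area with the trapezoid rule; it assumes targets coded +1 / -1 and no tied scores, and the scores themselves are made up for illustration.

```python
import numpy as np

def roc_points(scores, y):
    """Return (false alarm rates, hit rates) as the threshold on f(x) is lowered."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(y)
    order = np.argsort(-scores)           # highest score first
    y_sorted = y[order]
    tp = np.cumsum(y_sorted == 1)         # positives selected so far
    fp = np.cumsum(y_sorted == -1)        # negatives selected so far
    hit = np.concatenate(([0.0], tp / tp[-1]))
    false_alarm = np.concatenate(([0.0], fp / fp[-1]))
    return false_alarm, hit

def auc(false_alarm, hit):
    # Area under the ROC curve by the trapezoid rule.
    widths = false_alarm[1:] - false_alarm[:-1]
    heights = (hit[1:] + hit[:-1]) / 2
    return float(np.sum(widths * heights))

# Hypothetical scores f(x) and targets y.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
y = [1, 1, -1, 1, -1, -1]

fa, hit = roc_points(scores, y)
auc_value = auc(fa, hit)
gini = 2 * auc_value - 1
```

With these six examples, 8 of the 9 positive/negative pairs are ranked correctly, which is why the area comes out at 8/9.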

34 Lift Curve. Customers ranked according to $f(x)$; selection of the top ranking customers. Gini = 2 AUC − 1, with 0 ≤ Gini ≤ 1. [Figure: hit rate (fraction of good customers selected) vs. fraction of customers selected, both axes from 0 to 100%, showing the ideal lift, the actual lift and the random lift; the points O and M are marked.]

35 Performance Assessment. Cost matrix (rows: truth $y$; columns: predictions $F(x)$):

| Truth \ Prediction | Class -1 | Class +1 | Total | Class +1 / Total |
|---|---|---|---|---|
| Class -1 | tn | fp | neg = tn + fp | False alarm = fp/neg |
| Class +1 | fn | tp | pos = fn + tp | Hit rate = tp/pos |
| Total | rej = tn + fn | sel = fp + tp | m = tn + fp + fn + tp | Frac. selected = sel/m |
| Class +1 / Total | | Precision = tp/sel | | |

False alarm rate = type I error rate = 1 − specificity. Hit rate = 1 − type II error rate = sensitivity = recall = test power.

Compare $F(x) = \mathrm{sign}(f(x))$ to the target $y$, and report: Error rate = (fn + fp)/m; {Hit rate, False alarm rate} or {Hit rate, Precision} or {Hit rate, Frac. selected}; Balanced error rate (BER) = (fn/pos + fp/neg)/2 = 1 − (sensitivity + specificity)/2; F measure = 2 precision · recall / (precision + recall).

Vary the decision threshold $\theta$ in $F(x) = \mathrm{sign}(f(x) + \theta)$, and plot: ROC curve (Hit rate vs. False alarm rate); Lift curve (Hit rate vs. Fraction selected); Precision/recall curve (Hit rate vs. Precision).
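The quantities on this slide follow directly from the four counts of the matrix. The sketch below computes them for a hypothetical set of counts; the function name and the numbers are assumptions made for illustration.

```python
def assess(tn, fp, fn, tp):
    """Compute the slide's performance measures from the four counts."""
    neg, pos = tn + fp, fn + tp      # totals per true class
    sel = fp + tp                    # examples predicted as class +1
    m = tn + fp + fn + tp            # total number of examples
    hit_rate = tp / pos              # sensitivity, recall
    false_alarm = fp / neg           # 1 - specificity
    precision = tp / sel
    return {
        "error_rate": (fn + fp) / m,
        "hit_rate": hit_rate,
        "false_alarm": false_alarm,
        "precision": precision,
        "frac_selected": sel / m,
        "ber": (fn / pos + fp / neg) / 2,
        "f_measure": 2 * precision * hit_rate / (precision + hit_rate),
    }

# Hypothetical counts.
report = assess(tn=80, fp=20, fn=10, tp=40)
```

Here the error rate and the BER happen to coincide at 0.2; on strongly unbalanced data they can differ widely, since the BER weights both classes equally.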

36 What is a Risk Functional? A function of the parameters of the learning machine, assessing how much it is expected to fail on a given task. Examples: Classification: error rate $(1/m) \sum_{i=1}^{m} \mathbf{1}(F(x_i) \neq y_i)$, or 1 − AUC (Gini index = 2 AUC − 1). Regression: mean square error $(1/m) \sum_{i=1}^{m} (f(x_i) - y_i)^2$.

37 How to train? Define a risk functional $R[f(x, w)]$. Optimize it w.r.t. $w$ (gradient descent, mathematical programming, simulated annealing, genetic algorithms, etc.). [Figure: $R[f(x, w)]$ plotted over the parameter space ($w$), with its minimum at $w^*$.] (To be continued in the next lecture.)

38 How to Train? Define a risk functional $R[f(x, w)]$. Find a method to optimize it, typically gradient descent, $w_j \leftarrow w_j - \eta \, \partial R / \partial w_j$, or any optimization method (mathematical programming, simulated annealing, genetic algorithms, etc.). (To be continued in the next lecture.)
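As a minimal sketch of the gradient descent update, the code below takes the mean square error as the risk functional and a linear model $f(x) = w \cdot x + b$ as the learning machine. The learning rate, the number of steps and the toy data are assumptions chosen so that the iteration converges; they are not prescribed by the slides.

```python
import numpy as np

def mse(w, b, X, y):
    # Risk functional: R = (1/m) sum_i (f(x_i) - y_i)^2 with f(x) = w . x + b.
    return float(np.mean((X @ w + b - y) ** 2))

def gradient_descent(X, y, eta=0.1, steps=2000):
    """Repeat w_j <- w_j - eta * dR/dw_j (and the same for b)."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(steps):
        residual = X @ w + b - y                 # f(x_i) - y_i for all i
        w -= eta * (2.0 / m) * (X.T @ residual)  # dR/dw
        b -= eta * (2.0 / m) * residual.sum()    # dR/db
    return w, b

# Hypothetical noise-free data generated by y = 2 x1 - 3 x2 + 1.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 3.0, -2.0, 0.0])

w_star, b_star = gradient_descent(X, y)
final_risk = mse(w_star, b_star, X, y)
```

If `eta` is set too large the updates diverge, and if it is too small convergence is slow, so in practice the step size is itself something to tune.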

39 Summary. With linear threshold units ("neurons") we can build: linear discriminants (including Naïve Bayes), kernel methods, neural networks, decision trees. The architectural hyper-parameters may include: the choice of basis functions $\phi$ (features), the kernel, the number of units. Learning means fitting: parameters (weights) and hyper-parameters. Be aware of the fit vs. robustness tradeoff.

40 Want to Learn More?
- Pattern Classification. R. Duda, P. Hart, and D. Stork. Standard pattern recognition textbook. Limited to classification problems. Matlab code.
- The Elements of Statistical Learning: Data Mining, Inference, and Prediction. T. Hastie, R. Tibshirani, J. Friedman. Standard statistics textbook. Includes all the standard machine learning methods for classification, regression, clustering. R code.
- Linear Discriminants and Support Vector Machines. I. Guyon and D. Stork. In Smola et al., Eds., Advances in Large Margin Classifiers, MIT Press.
- Feature Extraction: Foundations and Applications. I. Guyon et al., Eds. Book for practitioners with datasets of the NIPS 2003 challenge, tutorials, best performing methods, Matlab code, teaching material.
