Optimization for Machine Learning

Size: px
Start display at page:

Download "Optimization for Machine Learning"

Transcription

1 Optimization for Machine Learning Lecture 4: SMO-MKL S.V. N. (vishy) Vishwanathan Purdue University July 11, 2012 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 1 / 22

2 Binary Classification y i = +1 y i = 1 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 2 / 22

3 Binary Classification y i = +1 y i = 1 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 2 / 22

4 Binary Classification y i = +1 y i = 1 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 2 / 22

5 Binary Classification y i = +1 w, x 1 + b = +1 w, x 2 + b = 1 w, x 1 x 2 = 2 w w, x 1 x 2 = 2 w x 2 x 1 y i = 1 {x w, x + b = 1} {x w, x + b = 1} {x w, x + b = 0} S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 2 / 22

6 Linear Support Vector Machines Optimization Problem min w,b,ξ s.t. 1 2 w 2 + C m i=1 y i ( w, x i + b) 1 ξ i for all i ξ i 0 ξ i S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 3 / 22

7 The Kernel Trick y x S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 4 / 22

8 The Kernel Trick y x S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 4 / 22

9 The Kernel Trick x 2 + y 2 x S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 4 / 22

10 Kernel Trick Optimization Problem min w,b,ξ s.t. 1 2 w 2 + C m i=1 ξ i y i ( w, φ(x i ) + b) 1 ξ i for all i ξ i 0 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 5 / 22

11 Kernel Trick Optimization Problem max α s.t. 1 2 α Hα + 1 α 0 α i C α i y i = 0 i H ij = y i y j φ(x i ), φ(x j ) S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 5 / 22

12 Kernel Trick Optimization Problem max α s.t. 1 2 α Hα + 1 α 0 α i C α i y i = 0 i H ij = y i y j k(x i, x j ) S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 5 / 22

13 Key Question Which kernel should I use? The Multiple Kernel Learning Answer Cook up as many (base) kernels as you can Compute a data dependent kernel function as a linear combination of base kernels k(x, x ) = k d k k k (x, x ) s.t. d k 0 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 6 / 22

14 Key Question Which kernel should I use? The Multiple Kernel Learning Answer Cook up as many (base) kernels as you can Compute a data dependent kernel function as a linear combination of base kernels k(x, x ) = k d k k k (x, x ) s.t. d k 0 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 6 / 22

15 Object Detection Localize a specified object of interest if it exists in a given image S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 7 / 22

16 Some Examples of MKL Detection S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 8 / 22

17 Summary of Our Results Sonar Dataset with 800 kernels p Training Time (s) # Kernels Selected SMO-MKL Shogun SMO-MKL Shogun Web dataset: 50,000 points and 50 kernels 30 minutes Sonar with a hundred thousand kernels Precomputed: 8 minutes Kernels computed on-the-fly: 30 minutes S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 9 / 22

18 Setting up the Optimization Problem -I The Setup We are given K kernel functions k 1,..., k n with corresponding feature maps φ 1 ( ),..., φ n ( ) We are interested in deriving the feature map φ(x) = d1 φ 1 (x). dn φ n (x) S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 10 / 22

19 Setting up the Optimization Problem -I The Setup We are given K kernel functions k 1,..., k n with corresponding feature maps φ 1 ( ),..., φ n ( ) We are interested in deriving the feature map φ(x) = d1 φ 1 (x). dn φ n (x) = w = w 1. w n S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 10 / 22

20 Setting up the Optimization Problem Optimization Problem min w,b,ξ s.t. 1 2 w 2 + C m i=1 ξ i y i ( w, φ(x i ) + b) 1 ξ i for all i ξ i 0 S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 11 / 22

21 Setting up the Optimization Problem Optimization Problem min w,b,ξ,d s.t. 1 2 w k 2 + C k y i ( k m i=1 ξ i dk w k, φ k (x i ) + b ) 1 ξ i for all i ξ i 0 d k 0 for all k S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 11 / 22

22 Setting up the Optimization Problem Optimization Problem min w,b,ξ,d s.t. ( ) 2 1 m w k 2 + C ξ i + ρ d p p 2 2 k k i=1 k ( ) y i dk w k, φ k (x i ) + b 1 ξ i for all i k ξ i 0 d k 0 for all k S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 11 / 22

23 Setting up the Optimization Problem Optimization Problem min w,b,ξ,d s.t. 1 w k 2 + C 2 d k k y i ( k ( m ξ i + ρ 2 k ) i=1 w k, φ k (x i ) + b d p k ) 2 p 1 ξ i for all i ξ i 0 d k 0 for all k S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 11 / 22

24 Setting up the Optimization Problem Optimization Problem min max d α s.t. ( 1 d k α H k α + 1 α + ρ 2 2 k 0 α i C α i y i = 0 i d k 0 k d p k ) 2 p S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 11 / 22

25 Saddle Point Problem d α S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 12 / 22

26 Solving the Saddle Point Saddle Point Problem min max d α s.t. ( 1 d k α H k α + 1 α + ρ 2 2 k 0 α i C α i y i = 0 i d k 0 k d p k ) 2 p S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 13 / 22

27 Our Approach The Key Insight Eliminate d D(α) := max α s.t. 1 8ρ ( k 0 α i C α i y i = 0 i 1 p + 1 q = 1 ) 2 ( ) q q α H k α + 1 α Not a QP but very close to one! S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 14 / 22

28 Our Approach SMO-MKL: High Level Overview D(α) := max α s.t. 1 8ρ ( k 0 α i C α i y i = 0 i ) 2 ( ) q q α H k α + 1 α Algorithm Choose two variables α i and α j to optimize Solve the one dimensional reduced optimization problem Repeat until convergence S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 15 / 22

29 Our Approach SMO-MKL: High Level Overview Selecting the Working Set Compute directional derivative and directional Hessian Greedily select the variables Solving the Reduced Problem Analytic solution for p = q = 2 (one dimensional quartic) For other values of p use Newton Raphson S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 16 / 22

30 Our Approach SMO-MKL: High Level Overview Selecting the Working Set Compute directional derivative and directional Hessian Greedily select the variables Solving the Reduced Problem Analytic solution for p = q = 2 (one dimensional quartic) For other values of p use Newton Raphson S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 16 / 22

31 Experiments Generalization Performance Australian 90 Test Accuracy (%) SMO-MKL Shogun S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 17 / 22

32 Experiments Generalization Performance ionosphere 94 Test Accuracy (%) SMO-MKL Shogun S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 17 / 22

33 Experiments Scaling with Training Set Size Adult: 123 dimensions, 50 RBF kernels, p = 1.33, C = 1 CPU Time in seconds SMO-MKL Shogun Number of Training Examples S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 18 / 22

34 Experiments Scaling with Training Set Size Adult: 123 dimensions, 50 RBF kernels, p = 1.33, C = 1 CPU Time in seconds Observed O(n 1.9 ) Number of Training Examples S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 18 / 22

35 Experiments On Another Dataset Web: 300 dimensions, 50 RBF kernels, p = 1.33, C = 1 CPU Time in seconds Number of Training Examples S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 19 / 22

36 Experiments Scaling with Number of Kernels Sonar: 208 examples, 59 dimensions, p = 1.33, C = 1 CPU Time in seconds Number of Kernels S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 20 / 22

37 Experiments Scaling with Number of Kernels Sonar: 208 examples, 59 dimensions, p = 1.33, C = 1 CPU Time in seconds SMO-MKL O(n) Number of Kernels S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 20 / 22

38 Experiments On Another Dataset Real-sim: 72,309 examples, 20,958 dimensions, p = 1.33, C = 1 CPU Time in seconds 10 5 SMO-MKL O(n) Number of Training Examples S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 21 / 22

39 References References S. V. N. Vishwanathan, Zhaonan Sun, Theera-Ampornpunt, and Manik Varma. Multiple Kernel Learning and the SMO Algorithm. NIPS Pages Code available for download from com/en-us/um/people/manik/code/smo-mkl/download.html S.V. N. Vishwanathan (Purdue University) Optimization for Machine Learning 22 / 22

Introduction to Support Vector Machines. Colin Campbell, Bristol University

Introduction to Support Vector Machines. Colin Campbell, Bristol University Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.

More information

Support Vector Machines Explained

Support Vector Machines Explained March 1, 2009 Support Vector Machines Explained Tristan Fletcher www.cs.ucl.ac.uk/staff/t.fletcher/ Introduction This document has been written in an attempt to make the Support Vector Machines (SVM),

More information

Support Vector Machine (SVM)

Support Vector Machine (SVM) Support Vector Machine (SVM) CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin

More information

Support Vector Machines with Clustering for Training with Very Large Datasets

Support Vector Machines with Clustering for Training with Very Large Datasets Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

More information

A Simple Introduction to Support Vector Machines

A Simple Introduction to Support Vector Machines A Simple Introduction to Support Vector Machines Martin Law Lecture for CSE 802 Department of Computer Science and Engineering Michigan State University Outline A brief history of SVM Large-margin linear

More information

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

More information

An Introduction to Machine Learning

An Introduction to Machine Learning An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune,

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

A Study on the Comparison of Electricity Forecasting Models: Korea and China

A Study on the Comparison of Electricity Forecasting Models: Korea and China Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 675 683 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.675 Print ISSN 2287-7843 / Online ISSN 2383-4757 A Study on the Comparison

More information

Introduction: Overview of Kernel Methods

Introduction: Overview of Kernel Methods Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University

More information

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

More information

Online learning of multi-class Support Vector Machines

Online learning of multi-class Support Vector Machines IT 12 061 Examensarbete 30 hp November 2012 Online learning of multi-class Support Vector Machines Xuan Tuan Trinh Institutionen för informationsteknologi Department of Information Technology Abstract

More information

Support Vector Machines for Classification and Regression

Support Vector Machines for Classification and Regression UNIVERSITY OF SOUTHAMPTON Support Vector Machines for Classification and Regression by Steve R. Gunn Technical Report Faculty of Engineering, Science and Mathematics School of Electronics and Computer

More information

Forecasting and Hedging in the Foreign Exchange Markets

Forecasting and Hedging in the Foreign Exchange Markets Christian Ullrich Forecasting and Hedging in the Foreign Exchange Markets 4u Springer Contents Part I Introduction 1 Motivation 3 2 Analytical Outlook 7 2.1 Foreign Exchange Market Predictability 7 2.2

More information

KERNEL LOGISTIC REGRESSION-LINEAR FOR LEUKEMIA CLASSIFICATION USING HIGH DIMENSIONAL DATA

KERNEL LOGISTIC REGRESSION-LINEAR FOR LEUKEMIA CLASSIFICATION USING HIGH DIMENSIONAL DATA Rahayu, Kernel Logistic Regression-Linear for Leukemia Classification using High Dimensional Data KERNEL LOGISTIC REGRESSION-LINEAR FOR LEUKEMIA CLASSIFICATION USING HIGH DIMENSIONAL DATA S.P. Rahayu 1,2

More information

Distributed Machine Learning and Big Data

Distributed Machine Learning and Big Data Distributed Machine Learning and Big Data Sourangshu Bhattacharya Dept. of Computer Science and Engineering, IIT Kharagpur. http://cse.iitkgp.ac.in/~sourangshu/ August 21, 2015 Sourangshu Bhattacharya

More information

Making Sense of the Mayhem: Machine Learning and March Madness

Making Sense of the Mayhem: Machine Learning and March Madness Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University atran3@stanford.edu ginzberg@stanford.edu I. Introduction III. Model The goal of our research

More information

Fast Kernel Classifiers with Online and Active Learning

Fast Kernel Classifiers with Online and Active Learning Journal of Machine Learning Research 6 (2005) 1579 1619 Submitted 3/05; Published 9/05 Fast Kernel Classifiers with Online and Active Learning Antoine Bordes NEC Laboratories America 4 Independence Way

More information

A Hybrid Forecasting Methodology using Feature Selection and Support Vector Regression

A Hybrid Forecasting Methodology using Feature Selection and Support Vector Regression A Hybrid Forecasting Methodology using Feature Selection and Support Vector Regression José Guajardo, Jaime Miranda, and Richard Weber, Department of Industrial Engineering, University of Chile Abstract

More information

CSCI567 Machine Learning (Fall 2014)

CSCI567 Machine Learning (Fall 2014) CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /

More information

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval

Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!

More information

Data clustering optimization with visualization

Data clustering optimization with visualization Page 1 Data clustering optimization with visualization Fabien Guillaume MASTER THESIS IN SOFTWARE ENGINEERING DEPARTMENT OF INFORMATICS UNIVERSITY OF BERGEN NORWAY DEPARTMENT OF COMPUTER ENGINEERING BERGEN

More information

Online Passive-Aggressive Algorithms on a Budget

Online Passive-Aggressive Algorithms on a Budget Zhuang Wang Dept. of Computer and Information Sciences Temple University, USA zhuang@temple.edu Slobodan Vucetic Dept. of Computer and Information Sciences Temple University, USA vucetic@temple.edu Abstract

More information

Training and Testing Low-degree Polynomial Data Mappings via Linear SVM

Training and Testing Low-degree Polynomial Data Mappings via Linear SVM Journal of Machine Learning Research 11 (2010) 1471-1490 Submitted 8/09; Revised 1/10; Published 4/10 Training and Testing Low-degree Polynomial Data Mappings via Linear SVM Yin-Wen Chang Cho-Jui Hsieh

More information

Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues

Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Charlie Frogner 1 MIT 2011 1 Slides mostly stolen from Ryan Rifkin (Google). Plan Regularization derivation of SVMs. Analyzing the SVM problem: optimization, duality. Geometric

More information

Fóra Gyula Krisztián. Predictive analysis of financial time series

Fóra Gyula Krisztián. Predictive analysis of financial time series Eötvös Loránd University Faculty of Science Fóra Gyula Krisztián Predictive analysis of financial time series BSc Thesis Supervisor: Lukács András Department of Computer Science Budapest, June 2014 Acknowledgements

More information

O(n 3 ) O(4 d nlog 2d n) O(n) O(4 d log 2d n)

O(n 3 ) O(4 d nlog 2d n) O(n) O(4 d log 2d n) O(n 3 )O(4 d nlog 2d n) O(n)O(4 d log 2d n) O(n 3 ) O(n) O(n 3 ) O(4 d nlog 2d n) O(n)O(4 d log 2d n) 1. Fast Gaussian Filtering Algorithm 2. Binary Classification Using The Fast Gaussian Filtering Algorithm

More information

Online (and Offline) on an Even Tighter Budget

Online (and Offline) on an Even Tighter Budget Online (and Offline) on an Even Tighter Budget Jason Weston NEC Laboratories America, Princeton, NJ, USA jasonw@nec-labs.com Antoine Bordes NEC Laboratories America, Princeton, NJ, USA antoine@nec-labs.com

More information

Principles of Scientific Computing Nonlinear Equations and Optimization

Principles of Scientific Computing Nonlinear Equations and Optimization Principles of Scientific Computing Nonlinear Equations and Optimization David Bindel and Jonathan Goodman last revised March 6, 2006, printed March 6, 2009 1 1 Introduction This chapter discusses two related

More information

Trading regret rate for computational efficiency in online learning with limited feedback

Trading regret rate for computational efficiency in online learning with limited feedback Trading regret rate for computational efficiency in online learning with limited feedback Shai Shalev-Shwartz TTI-C Hebrew University On-line Learning with Limited Feedback Workshop, 2009 June 2009 Shai

More information

Lecture 2: The SVM classifier

Lecture 2: The SVM classifier Lecture 2: The SVM classifier C19 Machine Learning Hilary 2015 A. Zisserman Review of linear classifiers Linear separability Perceptron Support Vector Machine (SVM) classifier Wide margin Cost function

More information

Research Article Prediction of Banking Systemic Risk Based on Support Vector Machine

Research Article Prediction of Banking Systemic Risk Based on Support Vector Machine Mathematical Problems in Engineering Volume 203, Article ID 36030, 5 pages http://dx.doi.org/0.55/203/36030 Research Article Prediction of Banking Systemic Risk Based on Support Vector Machine Shouwei

More information

KEITH LEHNERT AND ERIC FRIEDRICH

KEITH LEHNERT AND ERIC FRIEDRICH MACHINE LEARNING CLASSIFICATION OF MALICIOUS NETWORK TRAFFIC KEITH LEHNERT AND ERIC FRIEDRICH 1. Introduction 1.1. Intrusion Detection Systems. In our society, information systems are everywhere. They

More information

Stock Market Volatility Prediction: A Service-Oriented Multi-Kernel Learning Approach

Stock Market Volatility Prediction: A Service-Oriented Multi-Kernel Learning Approach Stock Market Volatility Prediction: A Service-Oriented Multi-Kernel Learning Approach Feng Wang State Key Lab of Software Engineering Wuhan University Wuhan, China Email: fengwang@whu.edu.cn Ling Liu College

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Going For Large Scale Going For Large Scale 1

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

Applications of Support Vector-Based Learning

Applications of Support Vector-Based Learning Applications of Support Vector-Based Learning Róbert Ormándi The supervisors are Prof. János Csirik and Dr. Márk Jelasity Research Group on Artificial Intelligence of the University of Szeged and the Hungarian

More information

More Generality in Efficient Multiple Kernel Learning

More Generality in Efficient Multiple Kernel Learning Manik Varma manik@microsoft.com Microsoft Research India, Second Main Road, Sadashiv Nagar, Bangalore 560 080, India Bodla Rakesh Babu rakeshbabu@research.iiit.net CVIT, International Institute of Information

More information

MACHINE LEARNING IN HIGH ENERGY PHYSICS

MACHINE LEARNING IN HIGH ENERGY PHYSICS MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!

More information

Classification of high resolution satellite images

Classification of high resolution satellite images Thesis for the degree of Master of Science in Engineering Physics Classification of high resolution satellite images Anders Karlsson Laboratoire de Systèmes d Information Géographique Ecole Polytéchnique

More information

DUOL: A Double Updating Approach for Online Learning

DUOL: A Double Updating Approach for Online Learning : A Double Updating Approach for Online Learning Peilin Zhao School of Comp. Eng. Nanyang Tech. University Singapore 69798 zhao6@ntu.edu.sg Steven C.H. Hoi School of Comp. Eng. Nanyang Tech. University

More information

MERGING BUSINESS KPIs WITH PREDICTIVE MODEL KPIs FOR BINARY CLASSIFICATION MODEL SELECTION

MERGING BUSINESS KPIs WITH PREDICTIVE MODEL KPIs FOR BINARY CLASSIFICATION MODEL SELECTION MERGING BUSINESS KPIs WITH PREDICTIVE MODEL KPIs FOR BINARY CLASSIFICATION MODEL SELECTION Matthew A. Lanham & Ralph D. Badinelli Virginia Polytechnic Institute and State University Department of Business

More information

Data visualization and dimensionality reduction using kernel maps with a reference point

Data visualization and dimensionality reduction using kernel maps with a reference point Data visualization and dimensionality reduction using kernel maps with a reference point Johan Suykens K.U. Leuven, ESAT-SCD/SISTA Kasteelpark Arenberg 1 B-31 Leuven (Heverlee), Belgium Tel: 32/16/32 18

More information

Early defect identification of semiconductor processes using machine learning

Early defect identification of semiconductor processes using machine learning STANFORD UNIVERISTY MACHINE LEARNING CS229 Early defect identification of semiconductor processes using machine learning Friday, December 16, 2011 Authors: Saul ROSA Anton VLADIMIROV Professor: Dr. Andrew

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

Semi-Supervised Support Vector Machines and Application to Spam Filtering

Semi-Supervised Support Vector Machines and Application to Spam Filtering Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery

More information

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376 Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.

More information

Scalable Developments for Big Data Analytics in Remote Sensing

Scalable Developments for Big Data Analytics in Remote Sensing Scalable Developments for Big Data Analytics in Remote Sensing Federated Systems and Data Division Research Group High Productivity Data Processing Dr.-Ing. Morris Riedel et al. Research Group Leader,

More information

Classification using Intersection Kernel Support Vector Machines is Efficient

Classification using Intersection Kernel Support Vector Machines is Efficient To Appear in IEEE Computer Vision and Pattern Recognition 28, Anchorage Classification using Intersection Kernel Support Vector Machines is Efficient Subhransu Maji EECS Department U.C. Berkeley smaji@eecs.berkeley.edu

More information

Online Classification on a Budget

Online Classification on a Budget Online Classification on a Budget Koby Crammer Computer Sci. & Eng. Hebrew University Jerusalem 91904, Israel kobics@cs.huji.ac.il Jaz Kandola Royal Holloway, University of London Egham, UK jaz@cs.rhul.ac.uk

More information

MAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS

MAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS MAXIMIZING RETURN ON DIRET MARKETING AMPAIGNS IN OMMERIAL BANKING S 229 Project: Final Report Oleksandra Onosova INTRODUTION Recent innovations in cloud computing and unified communications have made a

More information

Linear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x)) To go the other way, you need to diagonalize S

Linear smoother. ŷ = S y. where s ij = s ij (x) e.g. s ij = diag(l i (x)) To go the other way, you need to diagonalize S Linear smoother ŷ = S y where s ij = s ij (x) e.g. s ij = diag(l i (x)) To go the other way, you need to diagonalize S 2 Online Learning: LMS and Perceptrons Partially adapted from slides by Ryan Gabbard

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Mathematical finance and linear programming (optimization)

Mathematical finance and linear programming (optimization) Mathematical finance and linear programming (optimization) Geir Dahl September 15, 2009 1 Introduction The purpose of this short note is to explain how linear programming (LP) (=linear optimization) may

More information

IFT3395/6390. Machine Learning from linear regression to Neural Networks. Machine Learning. Training Set. t (3.5, -2,..., 127, 0,...

IFT3395/6390. Machine Learning from linear regression to Neural Networks. Machine Learning. Training Set. t (3.5, -2,..., 127, 0,... IFT3395/6390 Historical perspective: back to 1957 (Prof. Pascal Vincent) (Rosenblatt, Perceptron ) Machine Learning from linear regression to Neural Networks Computer Science Artificial Intelligence Symbolic

More information

Nonlinear Optimization: Algorithms 3: Interior-point methods

Nonlinear Optimization: Algorithms 3: Interior-point methods Nonlinear Optimization: Algorithms 3: Interior-point methods INSEAD, Spring 2006 Jean-Philippe Vert Ecole des Mines de Paris Jean-Philippe.Vert@mines.org Nonlinear optimization c 2006 Jean-Philippe Vert,

More information

Local features and matching. Image classification & object localization

Local features and matching. Image classification & object localization Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

Privacy-Preserving Outsourcing Support Vector Machines with Random Transformation

Privacy-Preserving Outsourcing Support Vector Machines with Random Transformation Privacy-Preserving Outsourcing Support Vector Machines with Random Transformation Keng-Pei Lin Ming-Syan Chen Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan Research Center

More information

Designing a learning system

Designing a learning system Lecture Designing a learning system Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x4-8845 http://.cs.pitt.edu/~milos/courses/cs750/ Design of a learning system (first vie) Application or Testing

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

A fast multi-class SVM learning method for huge databases

A fast multi-class SVM learning method for huge databases www.ijcsi.org 544 A fast multi-class SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and Taleb-Ahmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,

More information

Large-Scale Sparsified Manifold Regularization

Large-Scale Sparsified Manifold Regularization Large-Scale Sparsified Manifold Regularization Ivor W. Tsang James T. Kwok Department of Computer Science and Engineering The Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong

More information

Simple and efficient online algorithms for real world applications

Simple and efficient online algorithms for real world applications Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,

More information

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method

The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method Robert M. Freund February, 004 004 Massachusetts Institute of Technology. 1 1 The Algorithm The problem

More information

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

More information

Multiclass Classification. 9.520 Class 06, 25 Feb 2008 Ryan Rifkin

Multiclass Classification. 9.520 Class 06, 25 Feb 2008 Ryan Rifkin Multiclass Classification 9.520 Class 06, 25 Feb 2008 Ryan Rifkin It is a tale Told by an idiot, full of sound and fury, Signifying nothing. Macbeth, Act V, Scene V What Is Multiclass Classification? Each

More information

Introduction to Online Learning Theory

Introduction to Online Learning Theory Introduction to Online Learning Theory Wojciech Kot lowski Institute of Computing Science, Poznań University of Technology IDSS, 04.06.2013 1 / 53 Outline 1 Example: Online (Stochastic) Gradient Descent

More information

Stock Trading Using Analytics

Stock Trading Using Analytics American Journal of Marketing Research Vol. 2, No. 2, 2016, pp. 27-37 http://www.aiscience.org/journal/ajmr ISSN: 2381-750X (Print); ISSN: 2381-7518 (Online) Stock Trading Using Analytics Chandrika Kadirvel

More information

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

More information

A semi-supervised Spam mail detector

A semi-supervised Spam mail detector A semi-supervised Spam mail detector Bernhard Pfahringer Department of Computer Science, University of Waikato, Hamilton, New Zealand Abstract. This document describes a novel semi-supervised approach

More information

Support Vector Machine. Tutorial. (and Statistical Learning Theory)

Support Vector Machine. Tutorial. (and Statistical Learning Theory) Support Vector Machine (and Statistical Learning Theory) Tutorial Jason Weston NEC Labs America 4 Independence Way, Princeton, USA. jasonw@nec-labs.com 1 Support Vector Machines: history SVMs introduced

More information

Core Vector Machines: Fast SVM Training on Very Large Data Sets

Core Vector Machines: Fast SVM Training on Very Large Data Sets Journal of Machine Learning Research 6 (2005) 363 392 Submitted 12/04; Published 4/05 Core Vector Machines: Fast SVM Training on Very Large Data Sets Ivor W. Tsang James T. Kwok Pak-Ming Cheung Department

More information

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.

AUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S. AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree

More information

Foundations of Machine Learning On-Line Learning. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu

Foundations of Machine Learning On-Line Learning. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Foundations of Machine Learning On-Line Learning Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Motivation PAC learning: distribution fixed over time (training and test). IID assumption.

More information

Fast Multipole Method for particle interactions: an open source parallel library component

Fast Multipole Method for particle interactions: an open source parallel library component Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,

More information

A User s Guide to Support Vector Machines

A User s Guide to Support Vector Machines A User s Guide to Support Vector Machines Asa Ben-Hur Department of Computer Science Colorado State University Jason Weston NEC Labs America Princeton, NJ 08540 USA Abstract The Support Vector Machine

More information

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop Music and Machine Learning (IFT6080 Winter 08) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher

More information

Active Learning SVM for Blogs recommendation

Active Learning SVM for Blogs recommendation Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the

More information

Machine Learning Final Project Spam Email Filtering

Machine Learning Final Project Spam Email Filtering Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE

More information

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center

Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center Distance Metric Learning in Data Mining (Part I) Fei Wang and Jimeng Sun IBM TJ Watson Research Center 1 Outline Part I - Applications Motivation and Introduction Patient similarity application Part II

More information

Hong Kong Stock Index Forecasting

Hong Kong Stock Index Forecasting Hong Kong Stock Index Forecasting Tong Fu Shuo Chen Chuanqi Wei tfu1@stanford.edu cslcb@stanford.edu chuanqi@stanford.edu Abstract Prediction of the movement of stock market is a long-time attractive topic

More information

Machine Learning in FX Carry Basket Prediction

Machine Learning in FX Carry Basket Prediction Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines

More information

Support Vector Machines

Support Vector Machines CS229 Lecture notes Andrew Ng Part V Support Vector Machines This set of notes presents the Support Vector Machine (SVM) learning algorithm. SVMs are among the best (and many believe are indeed the best)

More information

Steven C.H. Hoi. School of Computer Engineering Nanyang Technological University Singapore

Steven C.H. Hoi. School of Computer Engineering Nanyang Technological University Singapore Steven C.H. Hoi School of Computer Engineering Nanyang Technological University Singapore Acknowledgments: Peilin Zhao, Jialei Wang, Hao Xia, Jing Lu, Rong Jin, Pengcheng Wu, Dayong Wang, etc. 2 Agenda

More information

A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM

A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM Journal of Computational Information Systems 10: 17 (2014) 7629 7635 Available at http://www.jofcis.com A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM Tian

More information

An Internal Model for Operational Risk Computation

An Internal Model for Operational Risk Computation An Internal Model for Operational Risk Computation Seminarios de Matemática Financiera Instituto MEFF-RiskLab, Madrid http://www.risklab-madrid.uam.es/ Nicolas Baud, Antoine Frachot & Thierry Roncalli

More information

Employer Health Insurance Premium Prediction Elliott Lui

Employer Health Insurance Premium Prediction Elliott Lui Employer Health Insurance Premium Prediction Elliott Lui 1 Introduction The US spends 15.2% of its GDP on health care, more than any other country, and the cost of health insurance is rising faster than

More information

fml The SHOGUN Machine Learning Toolbox (and its python interface)

fml The SHOGUN Machine Learning Toolbox (and its python interface) fml The SHOGUN Machine Learning Toolbox (and its python interface) Sören Sonnenburg 1,2, Gunnar Rätsch 2,Sebastian Henschel 2,Christian Widmer 2,Jonas Behr 2,Alexander Zien 2,Fabio de Bona 2,Alexander

More information

(Quasi-)Newton methods

(Quasi-)Newton methods (Quasi-)Newton methods 1 Introduction 1.1 Newton method Newton method is a method to find the zeros of a differentiable non-linear function g, x such that g(x) = 0, where g : R n R n. Given a starting

More information

Data Mining for Business Analytics

Data Mining for Business Analytics Data Mining for Business Analytics Lecture 2: Introduction to Predictive Modeling Stern School of Business New York University Spring 2014 MegaTelCo: Predicting Customer Churn You just landed a great analytical

More information

Support Vector Pruning with SortedVotes for Large-Scale Datasets

Support Vector Pruning with SortedVotes for Large-Scale Datasets Support Vector Pruning with SortedVotes for Large-Scale Datasets Frerk Saxen, Konrad Doll and Ulrich Brunsmann University of Applied Sciences Aschaffenburg, Germany Email: {Frerk.Saxen, Konrad.Doll, Ulrich.Brunsmann}@h-ab.de

More information

Message-passing sequential detection of multiple change points in networks

Message-passing sequential detection of multiple change points in networks Message-passing sequential detection of multiple change points in networks Long Nguyen, Arash Amini Ram Rajagopal University of Michigan Stanford University ISIT, Boston, July 2012 Nguyen/Amini/Rajagopal

More information

Machine Learning Logistic Regression

Machine Learning Logistic Regression Machine Learning Logistic Regression Jeff Howbert Introduction to Machine Learning Winter 2012 1 Logistic regression Name is somewhat misleading. Really a technique for classification, not regression.

More information

Introduction to the Finite Element Method (FEM)

Introduction to the Finite Element Method (FEM) Introduction to the Finite Element Method (FEM) ecture First and Second Order One Dimensional Shape Functions Dr. J. Dean Discretisation Consider the temperature distribution along the one-dimensional

More information

From N to N+1: Multiclass Transfer Incremental Learning

From N to N+1: Multiclass Transfer Incremental Learning From N to N+1: Multiclass Transfer Incremental Learning Ilja Kuzborskij 1,2, Francesco Orabona 3, and Barbara Caputo 1 1 Idiap Research Institute, Switzerland, 2 École Polytechnique Fédérale de Lausanne

More information

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract - This paper presents,

More information

LCs for Binary Classification

LCs for Binary Classification Linear Classifiers A linear classifier is a classifier such that classification is performed by a dot product beteen the to vectors representing the document and the category, respectively. Therefore it

More information

Neural Networks and Support Vector Machines

Neural Networks and Support Vector Machines INF5390 - Kunstig intelligens Neural Networks and Support Vector Machines Roar Fjellheim INF5390-13 Neural Networks and SVM 1 Outline Neural networks Perceptrons Neural networks Support vector machines

More information