Optimization for Machine Learning
Optimization for Machine Learning, Lecture 4: SMO-MKL
S.V.N. (vishy) Vishwanathan, Purdue University, July 11, 2012

Binary Classification
[Figure: two classes of points labeled $y_i = +1$ and $y_i = -1$, separated by the hyperplane $\{x \mid \langle w, x \rangle + b = 0\}$, with margin hyperplanes $\{x \mid \langle w, x \rangle + b = +1\}$ and $\{x \mid \langle w, x \rangle + b = -1\}$.]
For points $x_1$ and $x_2$ on the two margin hyperplanes:
$$\langle w, x_1 \rangle + b = +1, \qquad \langle w, x_2 \rangle + b = -1$$
$$\langle w, x_1 - x_2 \rangle = 2, \qquad \left\langle \frac{w}{\|w\|}, x_1 - x_2 \right\rangle = \frac{2}{\|w\|}$$

Linear Support Vector Machines
Optimization Problem
$$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i$$
$$\text{s.t.} \quad y_i(\langle w, x_i \rangle + b) \ge 1 - \xi_i \ \text{ for all } i, \qquad \xi_i \ge 0$$
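
A minimal sketch of how the soft-margin objective can be evaluated for a fixed $(w, b)$. It relies on the standard fact that, for fixed $w$ and $b$, the smallest feasible slack is $\xi_i = \max(0, 1 - y_i(\langle w, x_i \rangle + b))$. The function name and the toy data are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Return (objective, slacks) for the linear soft-margin SVM primal.

    For fixed (w, b) the smallest slack satisfying both constraints is
    xi_i = max(0, 1 - y_i * (<w, x_i> + b)).
    """
    margins = y * (X @ w + b)
    xi = np.maximum(0.0, 1.0 - margins)
    objective = 0.5 * (w @ w) + C * xi.sum()
    return objective, xi

# Hypothetical toy data: two points outside the margin, one inside it.
X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, -1.0, 1.0])
objective, xi = soft_margin_objective(np.array([1.0, 0.0]), 0.0, X, y, C=1.0)
```

Only the third point violates the margin here, so it is the only one with a nonzero slack.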

The Kernel Trick
[Figure: data plotted in the original coordinates $(x, y)$, then re-plotted with $x^2 + y^2$ on the vertical axis against $x$.]
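
The axis label $x^2 + y^2$ suggests the usual illustration: points that no line can separate in $(x, y)$ become separable by a threshold once that feature is added. The sketch below reproduces the idea on a hypothetical two-ring dataset; the data, radii, and function names are assumptions made for illustration only.

```python
import numpy as np

def make_rings(n_per_ring=50, seed=0):
    """Hypothetical data: one ring of radius 1 and one ring of radius 3."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=2 * n_per_ring)
    unit = np.column_stack((np.cos(theta), np.sin(theta)))
    inner = 1.0 * unit[:n_per_ring]
    outer = 3.0 * unit[n_per_ring:]
    return inner, outer

def lift(points):
    """Map (x, y) to the single feature x^2 + y^2."""
    return points[:, 0] ** 2 + points[:, 1] ** 2

inner, outer = make_rings()
# In the lifted feature the inner ring sits near 1 and the outer ring near 9,
# so any threshold between them separates the classes.
threshold = 5.0
```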

Kernel Trick
Optimization Problem (primal)
$$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i$$
$$\text{s.t.} \quad y_i(\langle w, \phi(x_i) \rangle + b) \ge 1 - \xi_i \ \text{ for all } i, \qquad \xi_i \ge 0$$
Optimization Problem (dual)
$$\max_{\alpha} \; -\frac{1}{2}\alpha^\top H \alpha + \mathbf{1}^\top \alpha$$
$$\text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_i \alpha_i y_i = 0$$
$$H_{ij} = y_i y_j \langle \phi(x_i), \phi(x_j) \rangle = y_i y_j \, k(x_i, x_j)$$
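
To make the dual concrete, here is a small sketch that builds $H_{ij} = y_i y_j k(x_i, x_j)$ from a Gram matrix and evaluates the dual objective and its constraints. The RBF kernel, its `gamma` parameter, and all function names are assumptions chosen for illustration; the slides do not fix a particular kernel at this point.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gram matrix of k(x, x') = exp(-gamma * ||x - x'||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def dual_objective(alpha, K, y):
    """-1/2 alpha^T H alpha + 1^T alpha with H_ij = y_i y_j K_ij."""
    H = np.outer(y, y) * K
    return -0.5 * (alpha @ H @ alpha) + alpha.sum()

def is_feasible(alpha, y, C, tol=1e-9):
    """Check 0 <= alpha_i <= C and sum_i alpha_i y_i = 0."""
    in_box = np.all(alpha >= -tol) and np.all(alpha <= C + tol)
    return bool(in_box and abs(alpha @ y) <= tol)

# Hypothetical two-point problem.
X = np.array([[0.0, 0.0], [1.0, 0.0]])
y = np.array([1.0, -1.0])
alpha = np.array([0.5, 0.5])
value = dual_objective(alpha, rbf_kernel(X, gamma=1.0), y)
```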

Key Question
Which kernel should I use?
The Multiple Kernel Learning Answer
- Cook up as many (base) kernels as you can
- Compute a data dependent kernel function as a linear combination of base kernels
$$k(x, x') = \sum_k d_k \, k_k(x, x') \quad \text{s.t.} \quad d_k \ge 0$$
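
The linear combination of base kernels can be sketched directly on precomputed Gram matrices. The helper name `combine_kernels` and the two toy Gram matrices are assumptions for illustration; in MKL the weights $d_k$ would be learned rather than fixed by hand.

```python
import numpy as np

def combine_kernels(K_list, d):
    """Return sum_k d_k * K_k for Gram matrices K_k and weights d_k >= 0."""
    d = np.asarray(d, dtype=float)
    if d.shape != (len(K_list),):
        raise ValueError("need exactly one weight per base kernel")
    if np.any(d < 0):
        raise ValueError("kernel weights must be non-negative")
    # Contract the weight vector against the stacked (n_kernels, m, m) array.
    return np.tensordot(d, np.stack(K_list), axes=1)

# Two hypothetical base Gram matrices on the same two points.
K_identity = np.eye(2)
K_ones = np.ones((2, 2))
K = combine_kernels([K_identity, K_ones], [2.0, 0.5])
```

Keeping $d_k \ge 0$ guarantees that the combination of positive semidefinite Gram matrices is itself positive semidefinite, which is why the constraint appears in the definition.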

Object Detection
Localize a specified object of interest if it exists in a given image

Some Examples of MKL Detection
[Figure: example detections.]

Summary of Our Results
Sonar dataset with 800 kernels: [Table: training time (s) and number of kernels selected for SMO-MKL and Shogun at each value of p; the entries did not survive transcription.]
Web dataset: 50,000 points and 50 kernels: 30 minutes
Sonar with a hundred thousand kernels:
- Precomputed: 8 minutes
- Kernels computed on-the-fly: 30 minutes

Setting up the Optimization Problem - I
The Setup
- We are given $n$ kernel functions $k_1, \ldots, k_n$ with corresponding feature maps $\phi_1(\cdot), \ldots, \phi_n(\cdot)$
- We are interested in deriving the feature map
$$\phi(x) = \begin{pmatrix} \sqrt{d_1}\,\phi_1(x) \\ \vdots \\ \sqrt{d_n}\,\phi_n(x) \end{pmatrix} \implies w = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}$$
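
As a quick check on why each block of the stacked map is scaled by $\sqrt{d_k}$ (my own one-line working, not taken from the slides): assuming each base kernel satisfies $k_k(x, x') = \langle \phi_k(x), \phi_k(x') \rangle$, the inner product of two stacked feature vectors recovers the linear combination of base kernels.

```latex
\langle \phi(x), \phi(x') \rangle
  = \sum_{k=1}^{n} \left\langle \sqrt{d_k}\,\phi_k(x), \sqrt{d_k}\,\phi_k(x') \right\rangle
  = \sum_{k=1}^{n} d_k \, k_k(x, x')
```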

Setting up the Optimization Problem
Optimization Problem
Start from the kernelized primal:
$$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{s.t.} \quad y_i(\langle w, \phi(x_i) \rangle + b) \ge 1 - \xi_i \ \text{ for all } i, \quad \xi_i \ge 0$$
Substitute the stacked feature map and optimize over $d$ as well:
$$\min_{w, b, \xi, d} \; \frac{1}{2}\sum_k \|w_k\|^2 + C \sum_{i=1}^{m} \xi_i$$
$$\text{s.t.} \quad y_i\Big(\sum_k \sqrt{d_k}\,\langle w_k, \phi_k(x_i) \rangle + b\Big) \ge 1 - \xi_i \ \text{ for all } i, \quad \xi_i \ge 0, \quad d_k \ge 0 \ \text{ for all } k$$
Add a regularizer on $d$ to the objective:
$$\min_{w, b, \xi, d} \; \frac{1}{2}\sum_k \|w_k\|^2 + C \sum_{i=1}^{m} \xi_i + \frac{\rho}{2}\Big(\sum_k d_k^p\Big)^{2/p}$$
Rescale each $w_k$ by $\sqrt{d_k}$ so that $d$ leaves the constraint:
$$\min_{w, b, \xi, d} \; \frac{1}{2}\sum_k \frac{\|w_k\|^2}{d_k} + C \sum_{i=1}^{m} \xi_i + \frac{\rho}{2}\Big(\sum_k d_k^p\Big)^{2/p}$$
$$\text{s.t.} \quad y_i\Big(\sum_k \langle w_k, \phi_k(x_i) \rangle + b\Big) \ge 1 - \xi_i \ \text{ for all } i, \quad \xi_i \ge 0, \quad d_k \ge 0 \ \text{ for all } k$$
Dualize with respect to $(w, b, \xi)$:
$$\min_{d} \max_{\alpha} \; -\frac{1}{2}\sum_k d_k \, \alpha^\top H_k \alpha + \mathbf{1}^\top \alpha + \frac{\rho}{2}\Big(\sum_k d_k^p\Big)^{2/p}$$
$$\text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_i \alpha_i y_i = 0, \qquad d_k \ge 0$$

Saddle Point Problem
[Figure: saddle point illustration with axes $d$ and $\alpha$.]

Solving the Saddle Point
Saddle Point Problem
$$\min_{d} \max_{\alpha} \; -\frac{1}{2}\sum_k d_k \, \alpha^\top H_k \alpha + \mathbf{1}^\top \alpha + \frac{\rho}{2}\Big(\sum_k d_k^p\Big)^{2/p}$$
$$\text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_i \alpha_i y_i = 0, \qquad d_k \ge 0$$

Our Approach
The Key Insight: Eliminate $d$
$$D(\alpha) := \max_{\alpha} \; -\frac{1}{8\rho}\Big(\sum_k \big(\alpha^\top H_k \alpha\big)^q\Big)^{2/q} + \mathbf{1}^\top \alpha$$
$$\text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_i \alpha_i y_i = 0, \qquad \frac{1}{p} + \frac{1}{q} = 1$$
Not a QP but very close to one!

SMO-MKL: High Level Overview
Algorithm
- Choose two variables $\alpha_i$ and $\alpha_j$ to optimize
- Solve the one dimensional reduced optimization problem
- Repeat until convergence
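
The elimination of $d$ can be checked numerically. In the sketch below, the closed form used in `optimal_d` is my own working from Hölder's inequality (the minimizing $d$ has $d_k \propto (\alpha^\top H_k \alpha)^{q/p}$), not a formula quoted from the slides, and all function names are illustrative assumptions. Plugging that $d$ into the saddle objective should reproduce the reduced objective with the $\frac{1}{8\rho}$ factor, assuming $p > 1$ and positive semidefinite $H_k$.

```python
import numpy as np

def quad_forms(alpha, H_list):
    """a_k = alpha^T H_k alpha for every base kernel."""
    return np.array([alpha @ H @ alpha for H in H_list])

def saddle_objective(d, alpha, H_list, rho, p):
    """-1/2 sum_k d_k a_k + 1^T alpha + rho/2 (sum_k d_k^p)^(2/p)."""
    a = quad_forms(alpha, H_list)
    return -0.5 * (d @ a) + alpha.sum() + 0.5 * rho * np.sum(d ** p) ** (2.0 / p)

def optimal_d(alpha, H_list, rho, p):
    """Minimizer over d >= 0 for fixed alpha (assumed derivation via Hoelder)."""
    q = p / (p - 1.0)
    a = quad_forms(alpha, H_list)
    total = np.sum(a ** q)
    return total ** (1.0 / q - 1.0 / p) * a ** (q / p) / (2.0 * rho)

def reduced_dual(alpha, H_list, rho, p):
    """1^T alpha - 1/(8 rho) (sum_k a_k^q)^(2/q) with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    a = quad_forms(alpha, H_list)
    return alpha.sum() - np.sum(a ** q) ** (2.0 / q) / (8.0 * rho)

# Hypothetical random problem: three PSD matrices standing in for the H_k.
rng = np.random.default_rng(0)
H_list = [B @ B.T for B in rng.normal(size=(3, 5, 5))]
alpha = rng.uniform(0.0, 1.0, size=5)
rho, p = 1.0, 1.33
d_star = optimal_d(alpha, H_list, rho, p)
```

Because the inner minimization has this closed form, only $\alpha$ remains as an optimization variable, which is what makes an SMO-style method applicable.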

SMO-MKL: High Level Overview
Selecting the Working Set
- Compute directional derivative and directional Hessian
- Greedily select the variables
Solving the Reduced Problem
- Analytic solution for p = q = 2 (one dimensional quartic)
- For other values of p use Newton Raphson
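
A rough sketch of one two-variable update for the case $p = q = 2$, where the objective restricted to the line $\alpha + t u$ is a quartic polynomial in $t$. This is a stand-in under stated assumptions, not the authors' implementation: the pair $(i, j)$ is passed in by hand rather than selected greedily, the quartic is maximized by finding the real roots of its derivative with NumPy instead of a closed-form formula, and each $H_k$ is assumed symmetric. The direction $u = y_i e_i - y_j e_j$ keeps $\sum_i \alpha_i y_i$ unchanged.

```python
import numpy as np
from numpy.polynomial import Polynomial

def feasible_interval(alpha, y, i, j, C):
    """Range of t keeping alpha_i + y_i*t and alpha_j - y_j*t inside [0, C]."""
    lo, hi = -np.inf, np.inf
    for idx, sign in ((i, y[i]), (j, -y[j])):
        end_a = -alpha[idx] / sign
        end_b = (C - alpha[idx]) / sign
        lo, hi = max(lo, min(end_a, end_b)), min(hi, max(end_a, end_b))
    return lo, hi

def line_objective(alpha, y, H_list, rho, i, j):
    """Reduced objective along alpha + t*u as a quartic Polynomial in t (q = 2)."""
    u = np.zeros_like(alpha, dtype=float)
    u[i], u[j] = y[i], -y[j]
    quartic = Polynomial([0.0])
    for H in H_list:
        # (alpha + t u)^T H (alpha + t u), assuming H is symmetric.
        a_k = Polynomial([alpha @ H @ alpha, 2.0 * (u @ H @ alpha), u @ H @ u])
        quartic = quartic + a_k ** 2
    linear = Polynomial([alpha.sum(), u.sum()])
    return linear - quartic * (1.0 / (8.0 * rho))

def smo_step(alpha, y, H_list, rho, C, i, j):
    """Maximize the quartic over the feasible interval and update alpha_i, alpha_j."""
    lo, hi = feasible_interval(alpha, y, i, j, C)
    objective = line_objective(alpha, y, H_list, rho, i, j)
    roots = np.asarray(objective.deriv().roots())
    real = roots[np.abs(roots.imag) < 1e-9].real
    candidates = np.concatenate(([lo, hi], real[(real > lo) & (real < hi)]))
    t_best = candidates[np.argmax(objective(candidates))]
    new_alpha = np.array(alpha, dtype=float)
    new_alpha[i] += y[i] * t_best
    new_alpha[j] -= y[j] * t_best
    return new_alpha
```

Evaluating the quartic at both interval endpoints and at every interior stationary point is enough to find its maximum on the interval, so the step never decreases the objective. For other values of $p$ the slides use Newton Raphson on the one-dimensional problem instead.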

Experiments: Generalization Performance
[Figures: test accuracy (%) of SMO-MKL and Shogun on the Australian and ionosphere datasets.]

Experiments: Scaling with Training Set Size
Adult: 123 dimensions, 50 RBF kernels, p = 1.33, C = 1
[Figure: CPU time in seconds against number of training examples for SMO-MKL and Shogun, with a reference curve labeled "Observed $O(n^{1.9})$".]

Experiments: On Another Dataset
Web: 300 dimensions, 50 RBF kernels, p = 1.33, C = 1
[Figure: CPU time in seconds against number of training examples.]

Experiments: Scaling with Number of Kernels
Sonar: 208 examples, 59 dimensions, p = 1.33, C = 1
[Figure: CPU time in seconds against number of kernels for SMO-MKL, with an $O(n)$ reference curve.]

Experiments: On Another Dataset
Real-sim: 72,309 examples, 20,958 dimensions, p = 1.33, C = 1
[Figure: CPU time in seconds against number of training examples for SMO-MKL, with an $O(n)$ reference curve.]

References
S. V. N. Vishwanathan, Zhaonan Sun, Theera-Ampornpunt, and Manik Varma. Multiple Kernel Learning and the SMO Algorithm. NIPS.
Code available for download from com/en-us/um/people/manik/code/smo-mkl/download.html
Employer Health Insurance Premium Prediction Elliott Lui 1 Introduction The US spends 15.2% of its GDP on health care, more than any other country, and the cost of health insurance is rising faster than
More informationfml The SHOGUN Machine Learning Toolbox (and its python interface)
fml The SHOGUN Machine Learning Toolbox (and its python interface) Sören Sonnenburg 1,2, Gunnar Rätsch 2,Sebastian Henschel 2,Christian Widmer 2,Jonas Behr 2,Alexander Zien 2,Fabio de Bona 2,Alexander
More information(Quasi-)Newton methods
(Quasi-)Newton methods 1 Introduction 1.1 Newton method Newton method is a method to find the zeros of a differentiable non-linear function g, x such that g(x) = 0, where g : R n R n. Given a starting
More informationData Mining for Business Analytics
Data Mining for Business Analytics Lecture 2: Introduction to Predictive Modeling Stern School of Business New York University Spring 2014 MegaTelCo: Predicting Customer Churn You just landed a great analytical
More informationSupport Vector Pruning with SortedVotes for Large-Scale Datasets
Support Vector Pruning with SortedVotes for Large-Scale Datasets Frerk Saxen, Konrad Doll and Ulrich Brunsmann University of Applied Sciences Aschaffenburg, Germany Email: {Frerk.Saxen, Konrad.Doll, Ulrich.Brunsmann}@h-ab.de
More informationMessage-passing sequential detection of multiple change points in networks
Message-passing sequential detection of multiple change points in networks Long Nguyen, Arash Amini Ram Rajagopal University of Michigan Stanford University ISIT, Boston, July 2012 Nguyen/Amini/Rajagopal
More informationMachine Learning Logistic Regression
Machine Learning Logistic Regression Jeff Howbert Introduction to Machine Learning Winter 2012 1 Logistic regression Name is somewhat misleading. Really a technique for classification, not regression.
More informationIntroduction to the Finite Element Method (FEM)
Introduction to the Finite Element Method (FEM) ecture First and Second Order One Dimensional Shape Functions Dr. J. Dean Discretisation Consider the temperature distribution along the one-dimensional
More informationFrom N to N+1: Multiclass Transfer Incremental Learning
From N to N+1: Multiclass Transfer Incremental Learning Ilja Kuzborskij 1,2, Francesco Orabona 3, and Barbara Caputo 1 1 Idiap Research Institute, Switzerland, 2 École Polytechnique Fédérale de Lausanne
More informationFeature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier
Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract - This paper presents,
More informationLCs for Binary Classification
Linear Classifiers A linear classifier is a classifier such that classification is performed by a dot product beteen the to vectors representing the document and the category, respectively. Therefore it
More informationNeural Networks and Support Vector Machines
INF5390 - Kunstig intelligens Neural Networks and Support Vector Machines Roar Fjellheim INF5390-13 Neural Networks and SVM 1 Outline Neural networks Perceptrons Neural networks Support vector machines
More information