On sequence kernels for SVM classification of sets of vectors: application to speaker verification
|
|
- Christal Parsons
- 8 years ago
- Views:
Transcription
1 On sequence kernels for SVM classification of sets of vectors: application to speaker verification Major part of the Ph.D. work of In collaboration with Jérôme Louradour Francis Bach (ARMINES) within E-TEAM 11 1/36
2 Introduction Text independent speaker verification A binary classification task Determine if a speech sequence has been uttered by a target speaker Classical approach séquence test? locuteur cible imposteurs, Monde PRÉ-TRAITEMENT ATTRIBUTION DE SCORE GMM APPRENTISSAGE UBM Classifieur PRISE DE DÉCISION locuteur cible ou imposteur 2/36
3 Introduction Text independent speaker verification A binary classification task Determine if a speech sequence has been uttered by a target speaker Classical approach : UBM-GMM system séquence test? locuteur cible imposteurs, Monde PRÉ-TRAITEMENT ATTRIBUTION DE SCORE GMM APPRENTISSAGE UBM Classifieur PRISE DE DÉCISION locuteur cible ou imposteur Probabilistic GMM modeling (generative) 2/36
4 Introduction Text independent speaker verification A binary classification task Determine if a speech sequence has been uttered by a target speaker Classical approach : UBM-GMM system séquence test? locuteur cible imposteurs, Monde PRÉ-TRAITEMENT ATTRIBUTION DE SCORE GMM APPRENTISSAGE UBM Classifieur PRISE DE DÉCISION locuteur cible ou imposteur Probabilistic GMM modeling (generative) y Motivation : apply SVM, powerful for binary classification 2/36
5 Introduction Support Vector Machines (SVM) + Good theoretical foundations + Well mastered learning algorithms + Good results in static data classification In speech processing... Extension to dynamic data is not easy Size of training corpus time & memory consuming Bad results of SVM when applied at the frame level 3/36
6 Introduction Support Vector Machines (SVM) + Good theoretical foundations + Well mastered learning algorithms + Good results in static data classification In speech processing... Extension to dynamic data is not easy Size of training corpus time & memory consuming Bad results of SVM when applied at the frame level y Conception and study of sequence kernels for speaker verification 3/36
7 Sequence kernels Outline 1 Sequence kernels 2 FSNS sequence kernels 3 Experimental evaluation of sequence kernels 4/36
8 Sequence kernels Basics Basics on kernels Similarity measure Mercer property : symetric, positive definite = k(x, y) = φ(x) φ(y) φ : expansion in a Feature Space R D (dimension D + ) 5/36
9 Sequence kernels Basics Sequence kernels Three famillies of kernels 1 Mutual Information kernels Based on a priori data distribution 2 Kernels between probability densities Sequence distribution 3 Combination of vector kernels Function of kernel values between inter(intra)-sequence elements 6/36
10 Sequence kernels Basics Sequence kernels Three famillies of kernels 1 Mutual Information kernels Based on a priori data distribution 2 Kernels between probability densities Sequence distribution 3 Combination of vector kernels Function of kernel values between inter(intra)-sequence elements Definition of a sequence Variable-length set of acoustic vectors - same speaker, - same recording session 6/36
11 Sequence kernels Sequence kernels used in speaker verification Mutual Information kernels Exploit the a priori (generative) UBM which parameters θ o are esimated on non-labeled data Fisher kernel [Jaakkola and Haussler, 1998] Approximation of a Mutual Information kernel [Seeger, 2002] κ(x, Y) = φ(x) S 1 φ(y) φ(x) = θ log p(x θ) θ=θo (Fisher expantion) S : second moments of φ (Fisher information matrix) 7/36
12 Sequence kernels Sequence kernels used in speaker verification Mutual Information kernels Exploit the a priori (generative) UBM which parameters θ o are esimated on non-labeled data Fisher kernel [Jaakkola and Haussler, 1998] Approximation of a Mutual Information kernel [Seeger, 2002] κ(x, Y) = φ(x) S 1 φ(y) φ(x) = θ log p(x θ) θ=θo (Fisher expantion) S : second moments of φ (Fisher information matrix) 2 Gaussians GMM 7/36
13 Sequence kernels Sequence kernels used in speaker verification Mutual Information kernels Exploit the a priori (generative) UBM which parameters θ o are esimated on non-labeled data Fisher kernel [Jaakkola and Haussler, 1998] Approximation of a Mutual Information kernel [Seeger, 2002] κ(x, Y) = φ(x) S 1 φ(y) φ(x) = θ log p(x θ) θ=θo (Fisher expantion) S : second moments of φ (Fisher information matrix) Distance between Fisher expansions 7/36
14 Sequence kernels Sequence kernels used in speaker verification Kernels between probability densities X, Y GGGGGGGGGGGGGA Generative p X, p Y learning probability product kernels [Jebara and Kondor, 2003] κ(x, Y) = p X (z) q p Y (z) q dz Analytic form a GMM with degree q = 1 [Lyu, 2005] spherical normalization for more robustness : κ(x,y) κ(x,y)= κ(x,x)κ(y,y) Exponential embedding of divergences Analogous of the Gaussien kernel : κ(x, Y) = e D(pX,pY)2 2ρ 2 D : Distance between GMMs, approximation of KL divergence [Do, 2003] y supervectors GMM Approach [Campbell et al., 2006] 8/36
15 Sequence kernels Sequence kernels used in speaker verification Combination of vector kernels Sequences of vectors X = { x t t = 1 T X } Y = { y t t = 1 T Y } Similarity Similarity between between vectors sets of vectors k(x, y) = φ(x) φ(y) w κ(x, Y) Mercer kernel function of k(x t, y t ), k(x t, x t ), k(y t, y t ),... Simple example : κ(x, Y) = 1 T XT Y T X T Y t=1 t =1 k(x t, y t ) complexity O(T 2 ) for each kernel computation 9/36
16 Sequence kernels Sequence kernels used in speaker verification Combination of vector kernels GLDS kernel [Campbell, 2001] ( ) 1 κ(x, Y) = T X t ( ) φ 1 1 q(x t ) SB T Y t φ q(y t ) φ q : Polynomial expansion (R d R D ) D = (q+d)! q!d! monomes of degree q S B : Matrix of second moments of φ q estimated on B Normalization by S B : - Kernel scoring on X a discr. model learned on Y - Same amplitude for each component of the Feature Space Explicit expansion of data : + high efficiency during test (linear SVM model) Impossible to use expansions φ q of high or infinite dimension (in practice, maxi degree q = 3) 10/36
17 Sequence kernels Sequence kernels used in speaker verification Combination of vector kernels GLDS kernel [Campbell, 2001] ( ) 1 κ(x, Y) = T X t ( ) φ 1 1 q(x t ) SB T Y t φ q(y t ) φ q : Polynomial expansion (R d R D ) D = (q+d)! q!d! monomes of degree q S B : Matrix of second moments of φ q estimated on B Normalization by S B : - Kernel scoring on X a discr. model learned on Y - Same amplitude for each component of the Feature Space Explicit expansion of data : + high efficiency during test (linear SVM model) Impossible to use expansions φ q of high or infinite dimension (in practice, maxi degree q = 3) 10/36
18 FSNS sequence kernels Outline 1 Sequence kernels 2 FSNS sequence kernels 3 Experimental evaluation of sequence kernels 11/36
19 FSNS sequence kernels Definition Extension of the GLDS kernel : FSNS kernels FSNS kernels (Feature Space Normalized Sequence kernels) ( ) (SB 1 κ(x, Y) = T X t φ(x t) + εi ) ( 1 1 T Y t φ(y t ) ) φ : any Mercer expansion S B : second moments matrix of φ Regularization ε > 0 : necessary if φ is of high dimension { avoid to compute φ y Objectif : rewrite using the Mercer kernel k = φ φ 12/36
20 FSNS sequence kernels Definition Extension of the GLDS kernel : FSNS kernels FSNS kernels (Feature Space Normalized Sequence kernels) ( ) (SB 1 κ(x, Y) = T X t φ(x t) + εi ) ( 1 1 T Y t φ(y t ) ) φ : any Mercer expansion S B : second moments matrix of φ Regularization ε > 0 : necessary if φ is of high dimension { avoid to compute φ y Objectif : rewrite using the Mercer kernel k = φ φ FSMS kernels (Feature Space Mahalanobis Sequence kernels) ( ) (ΣB 1 κ(x, Y) = T X t φ(x t) + εi ) ( 1 ) 1 T Y t φ(y t ) Σ B : covariance matrix of φ SVM are invariant to translations in the Feature Space same as FSNS with centring of φ kernel KL divergence between Gaussians in the Feature Space 12/36
21 FSNS sequence kernels Dual Form Dual form of FSNS kernels Training (Background) data : B = { b i i = 1 N} Gram matrix Empirical map K = k(b1,b1) k(b1,b N) k(b1, x). k(b i,b j ). ψ B (x) =. k(b N,b 1) k(b N,b N ) k(b N, x) Proposition (without regularization, ε = 0) ( ) 1 κ(x, Y) = T X t ( φ(x 1 ) 1 t) SB T Y t φ(y t ) ( ) ( 1 = T X t ( ) ψ B(x t ) 1 N K2) 1 1 T Y t ψ B(y t ) 13/36
22 FSNS sequence kernels Dual Form Dual form of FSNS kernels Training (Background) data : B = { b i i = 1 N} Gram matrix Empirical map K = k(b1,b1) k(b1,b N) k(b1, x). k(b i,b j ). ψ B (x) =. k(b N,b 1) k(b N,b N ) k(b N, x) Proposition κ(x, Y) = = ( ) (SB 1 T X t φ(x t) + εi ) ( 1 ) 1 T Y t φ(y t ) ( ) ( 1 T X t ψ B(x t ) 1 N K2 + εk ) ( ) 1 1 T Y t ψ B(y t ) Hypothesis : the φ(b i ) span the training φ(x) 13/36
23 FSNS sequence kernels Dual Form Dual form of FSNS kernels Training (Background) data : B = { b i i = 1 N} Gram matrix Empirical map K = k(b1,b1) k(b1,b N) k(b1, x). k(b i,b j ). ψ B (x) =. k(b N,b 1) k(b N,b N ) k(b N, x) Proposition (with centering) ( ) (ΣB 1 κ(x, Y) = T X t φ(x t) + εi ) ( 1 ) 1 T Y t φ(y t ) ( ) ( 1 = T X t ( ) ψ B(x t ) 1 N KΠK + εk) 1 1 T Y t ψ B(y t ) Hypothesis : the φ(b i ) span the training φ(x) Centering : Π = I 1 N 1 (instead of I) 13/36
24 FSNS sequence kernels Dual Form Computational complexity Dot product of normalized expansions κ(x, Y) = φ(x) M B φ(y) = Uφ(X), Uφ(Y) (1) = ψ B (X) R B ψ B (Y) = Vψ B (X), Vψ B (Y) (2) form (1) form (2) Pre-computation of U/V O(D 3 ) O(N 3 ) Sequence expansion Uφ/Vψ B O(TD 2 ) O(T N 2 ) Dot product comput. O(D) O(N) D : dimension of the Feature Space (size of φ) N : number of background vectors (size of ψ) + Possibility to use expansions φ of infinite dim (Gaussian kernel) Complexity problem for large databases 14/36
25 FSNS sequence kernels Dual Form Computational complexity Dot product of normalized expansions κ(x, Y) = φ(x) M B φ(y) = Uφ(X), Uφ(Y) (1) = ψ B (X) R B ψ B (Y) = Vψ B (X), Vψ B (Y) (2) form (1) form (2) Pre-computation of U/V O(D 3 ) O(N 3 ) Sequence expansion Uφ/Vψ B O(TD 2 ) O(T N 2 ) Dot product comput. O(D) O(N) D : dimension of the Feature Space (size of φ) N : number of background vectors (size of ψ) + Possibility to use expansions φ of infinite dim (Gaussian kernel) Complexity problem for large databases y Objective : find an appropriate approximation 14/36
26 FSNS sequence kernels Approximation Kernel Approximation Goal 1 Reduce the size of the empirical map ψ 2 Keep a maximum of information ICD : Incomplete Cholesky Decomposition [Fine and Scheinberg, 2001] 1 Selection of a sub-population of backround vectors : codebook C = { b p1 b pi b pm } B index I = { p 1 p i p m } {1 N} (taille m) 2 Low-rank approximation of the Gram matrix : K L I = K(:, I)K(I, I) 1 K(:, I) (rang m) min tr ( ) K L I min φ(bi ) φ C (b i ) 2 I C + CPU and memory 15/36
27 FSNS sequence kernels Approximation Kernel Approximation Goal 1 Reduce the size of the empirical map ψ 2 Keep a maximum of information ICD : Incomplete Cholesky Decomposition Training data Codebook ICD, Gaussian kernel 16/36
28 FSNS sequence kernels Approximation Approximate Form of FSNS kernels Proposition ICD K w K(:, I)K(I, I) 1 K(:, I) κ(x, Y) = ψ B (X) R B ψ B (Y) w κ(x, Y) ψ C (X) R B C ψ C (Y) Expansion of size m N : ψ C (X) = 1 T X t k(bp 1, x t). k(b pm, x t) R B C = ( 1 N K(:, I)ΠK(:, I) + εk(i, I) 1 Complexity form (1) form (2) approx form Expansion norm. O(TD 2 ) O(TN 2 ) O(Tm 2 ) Dot product O(D) O(N) O(m) 17/36
29 Experimental evaluation of sequence kernels Outline 1 Sequence kernels 2 FSNS sequence kernels 3 Experimental evaluation of sequence kernels 18/36
30 Experimental evaluation of sequence kernels Data Speech corpus : NIST Speaker Recognition Evaluation Development : NIST 2003 and 2004 Background corpus (1/2) Validation corpus (2/2) - Hyper-parameters of kernels - Parameter C (SVM leraning) - Decision threshold Evaluation : NIST tests, 400 target speakers DCF = τ FRP locfr% + τ FAP impfa% 19/36
31 Experimental evaluation of sequence kernels Data Speech corpus : NIST Speaker Recognition Evaluation Development : NIST 2003 and 2004 Background corpus (1/2) Validation corpus (2/2) - Hyper-parameters of kernels - Parameter C (SVM leraning) - Decision threshold Evaluation : NIST tests, 400 target speakers DCF = τ FRP locfr% + τ FAP impfa% Preprocessing SVM GMM acoustic vectors MFCC+ MFCC+ loge LFCC+ LFCC+ loge silence suppression clustering of the energy normalization feature warping centring-reduction 19/36
32 Experimental evaluation of sequence kernels FSNS kernels Development : Centring & Regularization? κ(x, Y) = ψ C (X) [ 1 N K(:, I) PK(:, I) + εk(i, I)] 1 ψc (Y) Centring Regul. EER(%) DCFmin P = Π ε = P = Π ε = P = Π (ε = 0) (P = I) (ε = 0) No significant gain with centering Regularization degrades 20/36
33 Experimental evaluation of sequence kernels FSNS kernels Developement : Choice and tuning of the vector kernel κ(x, Y) = ψ C (X) [ 1 N K(:, I) PK(:, I) + εk(i, I)] 1 ψc (Y) EER(%) ρ = 2ρ ρ = ρ 0/ ρ = ρ DCFmin Best results are obtained with a Gaussian kernel k(x, y) = e x y 2 2ρ 2... using a good spread parameter ρ ρ 0 = dσ 21/36
34 Experimental evaluation of sequence kernels FSNS kernels Developement : Size of the codebook κ(x, Y) = ψ C (X) [ 1 N K(:, I) PK(:, I) + εk(i, I)] 1 ψc (Y) Codebook size EER(%) DCFmin m = m = m = m = m = m = Augmenting la codebook size improve the performances... until a certain point (m 5000) 22/36
35 Experimental evaluation of sequence kernels FSNS kernels Evaluation : FSNS kernels vs. Classical approches Validation corpus Evaluation corpus EER(%) DCFmin (1) GLDS SVM (2) FSNS SVM (3) UBM-GMM EER DCF (1) GLDS SVM (2) FSNS SVM (3) UBM-GMM FSNS kernel : better performances than GLDS kernel FSNS-SVM competitive with UBM-GMM 23/36
36 Experimental evaluation of sequence kernels FSNS kernels Evaluation : Perspectives of fusion Validation corpus Evaluation corpus EER(%) DCFmin (1) GLDS SVM (2) FSNS SVM (3) UBM-GMM (1+2) fusion no improvment (1+3) fusion (2+3) fusion EER DCF (1) GLDS SVM (2) FSNS SVM (3) UBM-GMM (1+2) fusion no improvment (1+3) fusion (2+3) fusion Gain in discriminative / generative fusion 24/36
37 Experimental evaluation of sequence kernels All sequence kernels Evaluation : All sequence kernels EER DCF Probability product kernel GLDS kernel Fisher kernel FSNS kernel UBM-GMM (ref) supervectors GMM Best results : exploiting GMM in SVM 25/36
38 Experimental evaluation of sequence kernels All sequence kernels Evaluation : Fusion EER DCF (1) FSNS kernel (2) UBM-GMM (3) Supervectors GMM (2+3) fusion no improvment (1+2) fusion (1+3) fusion Gain in GMM / SVM fusion 26/36
39 Conclusions and future work Conclusions and future work Theoretical and experimental exploration of kernels between variable-length sets A lot of possible kernels A novel kernel (FSNS) Experimental comparison on NIST SRE Best performance : fusion FSNS/ GMM supervectors Several ways to improve SVM for speaker verification Model adaptation String kernels for high-level features (prosody... ) How to handle SVM scores? 27/36
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationIEEE Proof. Web Version. PROGRESSIVE speaker adaptation has been considered
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 1 A Joint Factor Analysis Approach to Progressive Model Adaptation in Text-Independent Speaker Verification Shou-Chun Yin, Richard Rose, Senior
More informationIntroduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.
More informationSupport Vector Machine. Tutorial. (and Statistical Learning Theory)
Support Vector Machine (and Statistical Learning Theory) Tutorial Jason Weston NEC Labs America 4 Independence Way, Princeton, USA. jasonw@nec-labs.com 1 Support Vector Machines: history SVMs introduced
More informationAS indicated by the growing number of participants in
1960 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 7, SEPTEMBER 2007 State-of-the-Art Performance in Text-Independent Speaker Verification Through Open-Source Software Benoît
More informationSimple and efficient online algorithms for real world applications
Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,
More informationTraining Universal Background Models for Speaker Recognition
Odyssey 2010 The Speaer and Language Recognition Worshop 28 June 1 July 2010, Brno, Czech Republic Training Universal Bacground Models for Speaer Recognition Mohamed Kamal Omar and Jason Pelecanos IBM
More informationEmotion Detection from Speech
Emotion Detection from Speech 1. Introduction Although emotion detection from speech is a relatively new field of research, it has many potential applications. In human-computer or human-human interaction
More informationProbabilistic Discriminative Kernel Classifiers for Multi-class Problems
c Springer-Verlag Probabilistic Discriminative Kernel Classifiers for Multi-class Problems Volker Roth University of Bonn Department of Computer Science III Roemerstr. 164 D-53117 Bonn Germany roth@cs.uni-bonn.de
More informationAn Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune,
More informationLearning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu
Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of
More informationChristfried Webers. Canberra February June 2015
c Statistical Group and College of Engineering and Computer Science Canberra February June (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 829 c Part VIII Linear Classification 2 Logistic
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationSPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA
SPEAKER IDENTIFICATION FROM YOUTUBE OBTAINED DATA Nitesh Kumar Chaudhary 1 and Shraddha Srivastav 2 1 Department of Electronics & Communication Engineering, LNMIIT, Jaipur, India 2 Bharti School Of Telecommunication,
More informationSZTAKI @ ImageCLEF 2011
SZTAKI @ ImageCLEF 2011 Bálint Daróczy Róbert Pethes András A. Benczúr Data Mining and Web search Research Group, Informatics Laboratory Computer and Automation Research Institute of the Hungarian Academy
More informationAutomatic Evaluation Software for Contact Centre Agents voice Handling Performance
International Journal of Scientific and Research Publications, Volume 5, Issue 1, January 2015 1 Automatic Evaluation Software for Contact Centre Agents voice Handling Performance K.K.A. Nipuni N. Perera,
More informationMACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
More informationSupport Vector Machines with Clustering for Training with Very Large Datasets
Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano
More informationSearch Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
More informationClass #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationDirect Loss Minimization for Structured Prediction
Direct Loss Minimization for Structured Prediction David McAllester TTI-Chicago mcallester@ttic.edu Tamir Hazan TTI-Chicago tamir@ttic.edu Joseph Keshet TTI-Chicago jkeshet@ttic.edu Abstract In discriminative
More informationTagging with Hidden Markov Models
Tagging with Hidden Markov Models Michael Collins 1 Tagging Problems In many NLP problems, we would like to model pairs of sequences. Part-of-speech (POS) tagging is perhaps the earliest, and most famous,
More informationSupport Vector Machine (SVM)
Support Vector Machine (SVM) CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More information203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
More informationStatistical Models in Data Mining
Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of
More informationAutomatic parameter regulation for a tracking system with an auto-critical function
Automatic parameter regulation for a tracking system with an auto-critical function Daniela Hall INRIA Rhône-Alpes, St. Ismier, France Email: Daniela.Hall@inrialpes.fr Abstract In this article we propose
More informationIntroduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics
More informationThe p-norm generalization of the LMS algorithm for adaptive filtering
The p-norm generalization of the LMS algorithm for adaptive filtering Jyrki Kivinen University of Helsinki Manfred Warmuth University of California, Santa Cruz Babak Hassibi California Institute of Technology
More informationSemi-Supervised Support Vector Machines and Application to Spam Filtering
Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
More informationABC System description for NIST SRE 2010
ABC System description for NIST SRE 2010 May 6, 2010 1 Introduction The ABC submission is a collaboration between: Agnitio Labs, South Africa Brno University of Technology, Czech Republic CRIM, Canada
More informationOn the Path to an Ideal ROC Curve: Considering Cost Asymmetry in Learning Classifiers
On the Path to an Ideal ROC Curve: Considering Cost Asymmetry in Learning Classifiers Francis R. Bach Computer Science Division University of California Berkeley, CA 9472 fbach@cs.berkeley.edu Abstract
More informationLecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
More informationBEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents
More informationADAPTIVE AND DISCRIMINATIVE MODELING FOR IMPROVED MISPRONUNCIATION DETECTION. Horacio Franco, Luciana Ferrer, and Harry Bratt
ADAPTIVE AND DISCRIMINATIVE MODELING FOR IMPROVED MISPRONUNCIATION DETECTION Horacio Franco, Luciana Ferrer, and Harry Bratt Speech Technology and Research Laboratory, SRI International, Menlo Park, CA
More informationBig learning: challenges and opportunities
Big learning: challenges and opportunities Francis Bach SIERRA Project-team, INRIA - Ecole Normale Supérieure December 2013 Omnipresent digital media Scientific context Big data Multimedia, sensors, indicators,
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
More informationLocal Gaussian Process Regression for Real Time Online Model Learning and Control
Local Gaussian Process Regression for Real Time Online Model Learning and Control Duy Nguyen-Tuong Jan Peters Matthias Seeger Max Planck Institute for Biological Cybernetics Spemannstraße 38, 776 Tübingen,
More informationObject Recognition and Template Matching
Object Recognition and Template Matching Template Matching A template is a small image (sub-image) The goal is to find occurrences of this template in a larger image That is, you want to find matches of
More informationClassification Problems
Classification Read Chapter 4 in the text by Bishop, except omit Sections 4.1.6, 4.1.7, 4.2.4, 4.3.3, 4.3.5, 4.3.6, 4.4, and 4.5. Also, review sections 1.5.1, 1.5.2, 1.5.3, and 1.5.4. Classification Problems
More informationNovelty Detection in image recognition using IRF Neural Networks properties
Novelty Detection in image recognition using IRF Neural Networks properties Philippe Smagghe, Jean-Luc Buessler, Jean-Philippe Urban Université de Haute-Alsace MIPS 4, rue des Frères Lumière, 68093 Mulhouse,
More informationData Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
More informationInner Product Spaces
Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and
More informationTracking in flussi video 3D. Ing. Samuele Salti
Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationSupport Vector Machines
Support Vector Machines Charlie Frogner 1 MIT 2011 1 Slides mostly stolen from Ryan Rifkin (Google). Plan Regularization derivation of SVMs. Analyzing the SVM problem: optimization, duality. Geometric
More informationAcknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues
Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the
More informationMachine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.
Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,
More informationSeveral Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
More informationLecture 9: Introduction to Pattern Analysis
Lecture 9: Introduction to Pattern Analysis g Features, patterns and classifiers g Components of a PR system g An example g Probability definitions g Bayes Theorem g Gaussian densities Features, patterns
More informationCCNY. BME I5100: Biomedical Signal Processing. Linear Discrimination. Lucas C. Parra Biomedical Engineering Department City College of New York
BME I5100: Biomedical Signal Processing Linear Discrimination Lucas C. Parra Biomedical Engineering Department CCNY 1 Schedule Week 1: Introduction Linear, stationary, normal - the stuff biology is not
More informationDiscriminative Multimodal Biometric. Authentication Based on Quality Measures
Discriminative Multimodal Biometric Authentication Based on Quality Measures Julian Fierrez-Aguilar a,, Javier Ortega-Garcia a, Joaquin Gonzalez-Rodriguez a, Josef Bigun b a Escuela Politecnica Superior,
More informationMulti-class Classification: A Coding Based Space Partitioning
Multi-class Classification: A Coding Based Space Partitioning Sohrab Ferdowsi, Svyatoslav Voloshynovskiy, Marcin Gabryel, and Marcin Korytkowski University of Geneva, Centre Universitaire d Informatique,
More informationMIMO CHANNEL CAPACITY
MIMO CHANNEL CAPACITY Ochi Laboratory Nguyen Dang Khoa (D1) 1 Contents Introduction Review of information theory Fixed MIMO channel Fading MIMO channel Summary and Conclusions 2 1. Introduction The use
More informationArtificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
More informationIntroduction: Overview of Kernel Methods
Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University
More informationInteractive Machine Learning. Maria-Florina Balcan
Interactive Machine Learning Maria-Florina Balcan Machine Learning Image Classification Document Categorization Speech Recognition Protein Classification Branch Prediction Fraud Detection Spam Detection
More informationEstablishing the Uniqueness of the Human Voice for Security Applications
Proceedings of Student/Faculty Research Day, CSIS, Pace University, May 7th, 2004 Establishing the Uniqueness of the Human Voice for Security Applications Naresh P. Trilok, Sung-Hyuk Cha, and Charles C.
More informationClassifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang
Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental
More informationUser Authentication/Identification From Web Browsing Behavior
User Authentication/Identification From Web Browsing Behavior US Naval Research Laboratory PI: Myriam Abramson, Code 5584 Shantanu Gore, SEAP Student, Code 5584 David Aha, Code 5514 Steve Russell, Code
More informationIBM Research Report. CSR: Speaker Recognition from Compressed VoIP Packet Stream
RC23499 (W0501-090) January 19, 2005 Computer Science IBM Research Report CSR: Speaker Recognition from Compressed Packet Stream Charu Aggarwal, David Olshefski, Debanjan Saha, Zon-Yin Shae, Philip Yu
More informationSecure-Access System via Fixed and Mobile Telephone Networks using Voice Biometrics
Secure-Access System via Fixed and Mobile Telephone Networks using Voice Biometrics Anastasis Kounoudes 1, Anixi Antonakoudi 1, Vasilis Kekatos 2 1 The Philips College, Computing and Information Systems
More informationDiscuss the size of the instance for the minimum spanning tree problem.
3.1 Algorithm complexity The algorithms A, B are given. The former has complexity O(n 2 ), the latter O(2 n ), where n is the size of the instance. Let n A 0 be the size of the largest instance that can
More informationSemantic Video Annotation by Mining Association Patterns from Visual and Speech Features
Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering
More informationKEITH LEHNERT AND ERIC FRIEDRICH
MACHINE LEARNING CLASSIFICATION OF MALICIOUS NETWORK TRAFFIC KEITH LEHNERT AND ERIC FRIEDRICH 1. Introduction 1.1. Intrusion Detection Systems. In our society, information systems are everywhere. They
More informationDeveloping an Isolated Word Recognition System in MATLAB
MATLAB Digest Developing an Isolated Word Recognition System in MATLAB By Daryl Ning Speech-recognition technology is embedded in voice-activated routing systems at customer call centres, voice dialling
More informationThe van Hoeij Algorithm for Factoring Polynomials
The van Hoeij Algorithm for Factoring Polynomials Jürgen Klüners Abstract In this survey we report about a new algorithm for factoring polynomials due to Mark van Hoeij. The main idea is that the combinatorial
More informationMachine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu
Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel
More informationCS 688 Pattern Recognition Lecture 4. Linear Models for Classification
CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationCSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /
More informationILLUMINATION NORMALIZATION BASED ON SIMPLIFIED LOCAL BINARY PATTERNS FOR A FACE VERIFICATION SYSTEM. Qian Tao, Raymond Veldhuis
ILLUMINATION NORMALIZATION BASED ON SIMPLIFIED LOCAL BINARY PATTERNS FOR A FACE VERIFICATION SYSTEM Qian Tao, Raymond Veldhuis Signals and Systems Group, Faculty of EEMCS University of Twente, the Netherlands
More informationSupervised Feature Selection & Unsupervised Dimensionality Reduction
Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or
More informationAnomaly detection. Problem motivation. Machine Learning
Anomaly detection Problem motivation Machine Learning Anomaly detection example Aircraft engine features: = heat generated = vibration intensity Dataset: New engine: (vibration) (heat) Density estimation
More informationMusic Genre Classification
Music Genre Classification Michael Haggblade Yang Hong Kenny Kao 1 Introduction Music classification is an interesting problem with many applications, from Drinkify (a program that generates cocktails
More informationCHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES
CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES Claus Gwiggner, Ecole Polytechnique, LIX, Palaiseau, France Gert Lanckriet, University of Berkeley, EECS,
More informationBiometric Authentication using Online Signatures
Biometric Authentication using Online Signatures Alisher Kholmatov and Berrin Yanikoglu alisher@su.sabanciuniv.edu, berrin@sabanciuniv.edu http://fens.sabanciuniv.edu Sabanci University, Tuzla, Istanbul,
More informationUniversal Algorithm for Trading in Stock Market Based on the Method of Calibration
Universal Algorithm for Trading in Stock Market Based on the Method of Calibration Vladimir V yugin Institute for Information Transmission Problems, Russian Academy of Sciences, Bol shoi Karetnyi per.
More informationModeling and Prediction of Network Traffic Based on Hybrid Covariance Function Gaussian Regressive
Journal of Information & Computational Science 12:9 (215) 3637 3646 June 1, 215 Available at http://www.joics.com Modeling and Prediction of Network Traffic Based on Hybrid Covariance Function Gaussian
More informationAnalysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j
Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet
More informationFeature Engineering in Machine Learning
Research Fellow Faculty of Information Technology, Monash University, Melbourne VIC 3800, Australia August 21, 2015 Outline A Machine Learning Primer Machine Learning and Data Science Bias-Variance Phenomenon
More informationMathematical Models of Supervised Learning and their Application to Medical Diagnosis
Genomic, Proteomic and Transcriptomic Lab High Performance Computing and Networking Institute National Research Council, Italy Mathematical Models of Supervised Learning and their Application to Medical
More informationInvestigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition
, Lisbon Investigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition Wolfgang Macherey Lars Haferkamp Ralf Schlüter Hermann Ney Human Language Technology
More informationA fast multi-class SVM learning method for huge databases
www.ijcsi.org 544 A fast multi-class SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and Taleb-Ahmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,
More informationPredict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons
More informationLinear Discrimination. Linear Discrimination. Linear Discrimination. Linearly Separable Systems Pairwise Separation. Steven J Zeil.
Steven J Zeil Old Dominion Univ. Fall 200 Discriminant-Based Classification Linearly Separable Systems Pairwise Separation 2 Posteriors 3 Logistic Discrimination 2 Discriminant-Based Classification Likelihood-based:
More informationGender Identification using MFCC for Telephone Applications A Comparative Study
Gender Identification using MFCC for Telephone Applications A Comparative Study Jamil Ahmad, Mustansar Fiaz, Soon-il Kwon, Maleerat Sodanil, Bay Vo, and * Sung Wook Baik Abstract Gender recognition is
More informationOracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features
Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features Charlie Berger, MS Eng, MBA Sr. Director Product Management, Data Mining and Advanced Analytics charlie.berger@oracle.com www.twitter.com/charliedatamine
More informationEFFECTS OF BACKGROUND DATA DURATION ON SPEAKER VERIFICATION PERFORMANCE
Uludağ Üniversitesi Mühendislik-Mimarlık Fakültesi Dergisi, Cilt 18, Sayı 1, 2013 ARAŞTIRMA EFFECTS OF BACKGROUND DATA DURATION ON SPEAKER VERIFICATION PERFORMANCE Cemal HANİLÇİ * Figen ERTAŞ * Abstract:
More informationAn Introduction to Data Mining
An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
More informationTree based ensemble models regularization by convex optimization
Tree based ensemble models regularization by convex optimization Bertrand Cornélusse, Pierre Geurts and Louis Wehenkel Department of Electrical Engineering and Computer Science University of Liège B-4000
More informationMachine learning challenges for big data
Machine learning challenges for big data Francis Bach SIERRA Project-team, INRIA - Ecole Normale Supérieure Joint work with R. Jenatton, J. Mairal, G. Obozinski, N. Le Roux, M. Schmidt - December 2012
More informationUnsupervised Data Mining (Clustering)
Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in
More informationFeature Selection vs. Extraction
Feature Selection In many applications, we often encounter a very large number of potential features that can be used Which subset of features should be used for the best classification? Need for a small
More informationClass-specific Sparse Coding for Learning of Object Representations
Class-specific Sparse Coding for Learning of Object Representations Stephan Hasler, Heiko Wersing, and Edgar Körner Honda Research Institute Europe GmbH Carl-Legien-Str. 30, 63073 Offenbach am Main, Germany
More informationServer Load Prediction
Server Load Prediction Suthee Chaidaroon (unsuthee@stanford.edu) Joon Yeong Kim (kim64@stanford.edu) Jonghan Seo (jonghan@stanford.edu) Abstract Estimating server load average is one of the methods that
More informationA Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks
A Systemic Artificial Intelligence (AI) Approach to Difficult Text Analytics Tasks Text Analytics World, Boston, 2013 Lars Hard, CTO Agenda Difficult text analytics tasks Feature extraction Bio-inspired
More informationLogistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.
Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features
More information