Discriminative Decision Function Based Scoring Method Used in Speaker Verification


Chinese Journal of Electronics, Vol.21, No.4, Oct. 2012

LIANG Chunyan, ZHANG Xiang and YAN Yonghong
(The Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing, China)

Abstract: The log likelihood ratio decision function derived from classical hypothesis testing theory is widely used in Gaussian mixture model based speaker recognition systems. This paper introduces a discriminative decision function based scoring method for speaker recognition with the state-of-the-art Joint factor analysis (JFA) system. In the scoring module of the JFA system, an approximate form of the decision function is proposed. Based on this approximation, we present a discriminative decision function that re-estimates the contribution of each speech sound unit to the decision function in order to further improve speaker verification performance. The discriminative decision function exploits the individual Gaussian components for better classification. The experiments are carried out on the core conditions of the National institute of standards and technology (NIST) 2010 speaker recognition evaluation data. The experimental results show that, on the whole, the proposed scoring method outperforms the conventional frame-by-frame strategy.

Key words: Speaker verification, Joint factor analysis (JFA), Discriminative decision function.

I. Introduction

The task of speaker verification is to determine whether a given segment of speech is spoken by the hypothesized speaker [1,2]. The task can be treated as a hypothesis-testing problem. Given a trial consisting of a test utterance and a target speaker, the system decides True or False by comparing the log likelihood score of the trial with a threshold. Gaussian mixture models (GMMs) have long been the dominant approach in speaker verification [1].
In this approach, GMMs are used to model the data distribution, and the Log likelihood ratio (LLR) derived from hypothesis testing theory is used as the decision function. In recent years, Joint factor analysis (JFA) [3,4] has become the state-of-the-art technique in speaker verification. It was proposed to address the problem of speaker and session variability in the GMM framework. Many sites used JFA in the latest NIST evaluations, and several scoring strategies exist [5-7]. Frame-by-frame scoring is the most conventional one: the whole feature file of each utterance is processed with a full GMM log-likelihood evaluation. It treats the GMM simply as a probability density function of the feature vectors from a target speaker. In this study, we propose a scoring method based on a discriminative decision function, which expands a single GMM into a set of individual Gaussian components. In the proposed method, we re-estimate the contribution of each speech sound unit to the decision function to further improve speaker verification performance.

The rest of this paper is organized as follows. We briefly introduce the theory of JFA in Section II. The traditional frame-by-frame scoring method is presented in Section III. We propose the discriminative decision function based scoring strategy in Section IV. Experimental results are shown in Section V. Finally, we give the conclusion in Section VI.

II. Joint Factor Analysis

JFA has received wide attention during the last few years and has become the state-of-the-art system in the field of speaker recognition. The JFA model is used to address the problem of speaker and session variability in the GMM framework. In this model, the speaker and channel dependent mean supervector $M$ is represented as a sum of two supervectors:

$$M = s + c \tag{1}$$

where $s$ is the speaker supervector and $c$ is the channel supervector, both of which are normally distributed.
They can be represented respectively by

$$s = m + Vy + Dz \tag{2}$$

$$c = Ux \tag{3}$$

where $m$ is the speaker-independent mean supervector, that is, the mean supervector of the Universal background model (UBM), $V$ is the speaker loading matrix whose columns are the eigenvoices of high speaker variability, $D$ is the diagonal loading matrix describing the remaining speaker variability not covered by $V$, and $U$ is the channel loading matrix whose columns are the eigenchannels of high intersession variability. $y$, $z$ and $x$ are the speaker factor, diagonal factor and channel factor respectively, all of which are assumed to be standard normally distributed random variables.

The underlying task in JFA is to train the hyperparameters $U$, $V$ and $D$ on a large training set. In the Bayesian framework, the posterior distribution of the factors, given their priors, can be computed using the enrollment data. The likelihood of a test utterance $\chi$ is then computed by integrating over the posterior distribution of $y$ and $z$, and the prior distribution of $x$ [8].

Manuscript Received June 2011; Accepted Apr This work is supported by the National Natural Science Foundation of China No , No , No , No , No and the Strategic Priority Research Program of the Chinese Academy of Sciences No.XDA

III. Traditional Frame-by-Frame Scoring Method

The frame-by-frame scoring method is based on a full GMM log-likelihood evaluation [7]. The log-likelihood of test utterance $\chi$ given model $s'$ is computed as an average frame log-likelihood:

$$\log P(\chi \mid s') = \frac{1}{T}\sum_{t=1}^{T} \log \sum_{c=1}^{C} \omega_c \, \mathcal{N}(o_t;\, s'_c, \Sigma_c) = \frac{1}{T}\sum_{t=1}^{T} \log p(o_t \mid s') \tag{4}$$

where $o_t$ is the feature vector at frame $t$, $T$ is the length in frames of test utterance $\chi$, $C$ is the number of Gaussians in the GMM, $\omega_c$ and $\Sigma_c$ are the weight and covariance of component $c$, and $s' = s + Ux$ is the supervector of the target model after channel adaptation, with $Ux$ the channel supervector for the test utterance. Similarly, when calculating the log-likelihood of utterance $\chi$ given the UBM, the mean supervector of the UBM is also compensated as $m' = m + Ux$. This is equivalent to moving the mean supervectors of both the target model and the UBM into the channel space in which the test utterance lies, which effectively reduces the acoustic mismatch between the training and test environments. Thus, the average verification score is obtained by computing the log-likelihood ratio between the compensated target speaker model $s'$ and the compensated UBM $m'$ for the test utterance $\chi$:

$$\Lambda(\chi) = \frac{1}{T}\sum_{t=1}^{T} \left[ \log p(o_t \mid s') - \log p(o_t \mid m') \right] \tag{5}$$

IV. Discriminative Decision Function Based Scoring Method

1. The approximation of decision function

Let $p(o_t)$ denote the total probability of a feature frame $o_t$ under both the speaker model $s'$ and the UBM $m'$, that is

$$p(o_t) = p(o_t \mid s') + p(o_t \mid m') \tag{6}$$

Then Eq.5 can be written as

$$\Lambda(\chi) = \frac{1}{T}\sum_{t=1}^{T} \left[ \log \frac{p(o_t \mid s')}{p(o_t)} - \log \frac{p(o_t \mid m')}{p(o_t)} \right] = \frac{1}{T}\sum_{t=1}^{T} \left[ \log \frac{p(o_t \mid s')}{p(o_t \mid s') + p(o_t \mid m')} - \log \frac{p(o_t \mid m')}{p(o_t \mid s') + p(o_t \mid m')} \right] \tag{7}$$

In a GMM $\lambda$, the probability $p(o_t \mid \lambda)$ of an observed feature frame $o_t$ is

$$p(o_t \mid \lambda) = \sum_{c=1}^{C} \omega_c \, p(o_t \mid \lambda_c) = \sum_{c=1}^{C} g(o_t \mid \lambda_c) \tag{8}$$

Two terms of the Taylor series, $\log x \approx x - 1$, are used to approximate Eq.7, and the constant $-1$ is discarded since this change does not affect the classification accuracy:

$$\Lambda(\chi) \approx \frac{1}{T}\sum_{t=1}^{T} \left[ \frac{p(o_t \mid s')}{p(o_t \mid s') + p(o_t \mid m')} - \frac{p(o_t \mid m')}{p(o_t \mid s') + p(o_t \mid m')} \right] = \frac{1}{T}\sum_{t=1}^{T} \sum_{c=1}^{C} \frac{g(o_t \mid s'_c) - g(o_t \mid m'_c)}{\sum_{j=1}^{C} \left[ g(o_t \mid s'_j) + g(o_t \mid m'_j) \right]} \tag{9}$$

If we define

$$\Phi_c = \frac{1}{T}\sum_{t=1}^{T} \frac{g(o_t \mid s'_c) - g(o_t \mid m'_c)}{\sum_{j=1}^{C} \left[ g(o_t \mid s'_j) + g(o_t \mid m'_j) \right]}, \quad c = 1, 2, \ldots, C \tag{10}$$

as the difference, for Gaussian component $c$, between the average occupation probability over the whole observation series under the adapted speaker model and that under the UBM, then Eq.9 can be rewritten as an inner product

$$\Lambda(\chi) = w^t b_\eta \tag{11}$$

where $w = [1, \ldots, 1]^t$ is a unit weight vector and $b_\eta = [\Phi_1, \ldots, \Phi_C]^t$ denotes the difference vector of occupation probability for a trial $\eta$.

From Eq.11 we can see that, given a trial $\eta$, the value of the decision function, and hence the True or False decision for the trial, is completely determined by the weight vector $w$ and the difference vector $b_\eta$. The average occupation probability can be thought of as the occurrence frequency of Gaussian mixture component $c$ over the whole observation sequence. We call the difference vector $b_\eta$ the trial's information vector, since it maps the trial into a vector. The values in the weight vector $w$ can be viewed as the contributions to the decision function of the corresponding elements of the trial's information vector. Hence we name $w$ the contribution factor; it can also be regarded as a classifier that separates the true information vectors from the false ones.
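As a rough illustration of Eqs.10 and 11, the following Python sketch maps one trial to its information vector and scores it with the all-ones weight vector. It is a sketch under stated assumptions, not the authors' implementation: covariances are taken to be diagonal, the weights and covariances are assumed to be shared between the adapted speaker model and the UBM (only the means differ), and the function names `log_weighted_gauss`, `information_vector` and `ddf_score` as well as the toy numbers are hypothetical.

```python
import numpy as np


def log_weighted_gauss(frames, weights, means, variances):
    """Return log g(o_t | lambda_c) = log w_c + log N(o_t; mu_c, diag(var_c)).

    frames: (T, D), weights: (C,), means: (C, D), variances: (C, D).
    The result has shape (T, C).
    """
    diff = frames[:, None, :] - means[None, :, :]                      # (T, C, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)      # (C,)
    log_exp = -0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)   # (T, C)
    return np.log(weights)[None, :] + log_norm[None, :] + log_exp


def information_vector(frames, weights, spk_means, ubm_means, variances):
    """Compute b = [Phi_1, ..., Phi_C] as in Eq.10 (assumed diagonal covariances)."""
    log_gs = log_weighted_gauss(frames, weights, spk_means, variances)
    log_gm = log_weighted_gauss(frames, weights, ubm_means, variances)
    # Denominator of Eq.10: sum over all components of both models,
    # evaluated in the log domain for numerical stability.
    both = np.concatenate([log_gs, log_gm], axis=1)                    # (T, 2C)
    peak = both.max(axis=1, keepdims=True)
    log_den = peak + np.log(np.exp(both - peak).sum(axis=1, keepdims=True))
    occ_diff = np.exp(log_gs - log_den) - np.exp(log_gm - log_den)     # (T, C)
    return occ_diff.mean(axis=0)                                       # average over frames


def ddf_score(info_vec, w=None):
    """Score of Eq.11: inner product of the contribution factor and b."""
    if w is None:
        w = np.ones_like(info_vec)  # unit weights reproduce the approximate LLR
    return float(np.dot(w, info_vec))


# Toy trial (made-up numbers): two 1-D components, frames lie on the speaker means.
weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0], [4.0]])
spk_means = np.array([[1.0], [5.0]])
variances = np.ones((2, 1))
frames = np.array([[1.0], [5.0]])

phi = information_vector(frames, weights, spk_means, ubm_means, variances)
score = ddf_score(phi)
```

Because the per-frame terms of both models sum to one in the denominator, the unit-weight score always lies in [-1, 1]; replacing `w` with a trained vector is what the following subsections are about.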
In Eq.11, all values in $w$ are the same, which means that the differences of average occupation probability for all Gaussian components contribute equally. In GMMs for speaker verification, the Gaussian components can be considered to model the underlying broad phonetic sounds that characterize a person's voice [1]. Hence $\Phi_c$, $c = 1, \ldots, C$, can be thought of as the difference between the average occupation probability of the event that a feature vector of the test utterance is accounted for by the corresponding speech sound unit as characterized by the target speaker model, and the same quantity for the UBM. The contributions of the sound units to the decision function are determined by $w$. In practice, some sound units carry more discriminative information about speakers and should be given larger weights, while less discriminative sound units should be given smaller weights. In the following, we show how to obtain a discriminative contribution factor $w$ that further improves speaker verification performance.

2. MSE criterion

Suppose we have a training set consisting of $N_+ + N_-$ trials, in which true trials are denoted as $\{x_i\}$, $i = 1, \ldots, N_+$, and false trials as $\{y_j\}$, $j = 1, \ldots, N_-$. Each trial is mapped into a difference vector of occupation probability, $b_{x_i}$, $i = 1, \ldots, N_+$, and $b_{y_j}$, $j = 1, \ldots, N_-$. Thus, the score of the decision function for a trial $x$ can be written as $\mathrm{score} = w^t b_x$.

We first obtain the discriminative contribution factor $w$ by minimizing the sum-of-squares error (MSE) criterion [9]:

$$w^* = \arg\min_{w} E\{ (w^t b_x - y_x)^2 \} \tag{12}$$

where $E$ denotes expectation and $y_x$ is the ideal output for trial $x$. Let the ideal output be 1 for true trial vectors and 0 for false trial vectors, i.e. $y_{true} = 1$ and $y_{false} = 0$. The criterion above can then be approximated on the training set as

$$w^* = \arg\min_{w} \left[ \sum_{i=1}^{N_+} (w^t b_{x_i} - 1)^2 + \sum_{j=1}^{N_-} (w^t b_{y_j})^2 \right] \tag{13}$$

We construct the matrices $M_+$ and $M_-$ from the information vectors of all true and all false trials respectively:

$$M_+ = \begin{bmatrix} b_{x_1}^t \\ b_{x_2}^t \\ \vdots \\ b_{x_{N_+}}^t \end{bmatrix}, \quad M_- = \begin{bmatrix} b_{y_1}^t \\ b_{y_2}^t \\ \vdots \\ b_{y_{N_-}}^t \end{bmatrix} \tag{14}$$

and we define

$$M = \begin{bmatrix} M_+ \\ M_- \end{bmatrix} \tag{15}$$

Then the problem of Eq.13 becomes

$$w^* = \arg\min_{w} \| Mw - o \|^2 \tag{16}$$

where $o$ is a vector consisting of $N_+$ ones followed by $N_-$ zeros, i.e. the ideal outputs for the training trials. The problem of Eq.16 can be solved using the method of normal equations

$$M^t M w = M^t o \tag{17}$$

and Eq.17 can be rearranged as

$$M^t M w = M_+^t \mathbf{1} + M_-^t \mathbf{0} = M_+^t \mathbf{1} \tag{18}$$

where $\mathbf{1}$ is the vector of all ones and $\mathbf{0}$ is an all-zeros vector.
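The least-squares problem of Eq.16 and its normal equations (Eqs.17 and 18) can be sketched in a few lines of NumPy. This is only an illustration on random stand-in data, assuming that $M^t M$ is invertible (more trials than Gaussian components, no collinear columns); the helper name `mse_contribution_factor` and the matrix sizes are hypothetical, and the second solve via `np.linalg.lstsq` is included only as a cross-check.

```python
import numpy as np


def mse_contribution_factor(M_plus, M_minus):
    """Solve M^t M w = M_+^t 1 (Eq.18) for the contribution factor w.

    M_plus:  (N_plus, C) information vectors of true trials, one per row.
    M_minus: (N_minus, C) information vectors of false trials, one per row.
    Assumes M^t M is invertible.
    """
    M = np.vstack([M_plus, M_minus])              # Eq.15
    R = M.T @ M                                   # C x C Gram matrix
    rhs = M_plus.T @ np.ones(M_plus.shape[0])     # M_+^t 1
    return np.linalg.solve(R, rhs)


# Random stand-in data (not from the paper): 20 true and 30 false trials, C = 4.
rng = np.random.default_rng(0)
M_plus = rng.normal(0.2, 1.0, size=(20, 4))
M_minus = rng.normal(-0.2, 1.0, size=(30, 4))
w_mse = mse_contribution_factor(M_plus, M_minus)

# Cross-check: minimizing ||M w - o||^2 (Eq.16) directly gives the same w.
M = np.vstack([M_plus, M_minus])
o = np.concatenate([np.ones(20), np.zeros(30)])   # ideal outputs: 1 for true, 0 for false
w_lstsq, *_ = np.linalg.lstsq(M, o, rcond=None)
```

For a realistic number of components the Gram matrix can be poorly conditioned, in which case the direct least-squares solve is usually the safer of the two routes.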
If we define $R = M^t M$, $w$ can be obtained by

$$w = R^{-1} M_+^t \mathbf{1} \tag{19}$$

Under the MSE criterion, the classifier treats all training samples alike instead of concentrating on those that are easily misclassified, so the discriminability of the $w$ trained by Eq.19 is limited. Starting from Eq.19, we therefore use the Generalized linear discriminant sequence (GLDS) kernel based Support vector machine (SVM) to obtain the optimal $w$.

3. GLDS kernel method for the discriminative training of contribution factor w

Combining the solution of Eq.19 with the scoring equation Eq.11, we have

$$\mathrm{score} = b^t w = b^t R^{-1} M_+^t \mathbf{1} \tag{20}$$

The above equation can be rewritten as

$$\mathrm{score} = b^t \bar{R}^{-1} \bar{b}_+ \tag{21}$$

where $\bar{b}_+ = (1/N_+) M_+^t \mathbf{1}$ and $\bar{R} = (1/N_+) R$. We compare two trials $x$ and $y$ by first mapping them into the trial information vectors $b_x$ and $b_y$ and then computing the GLDS kernel as [10]

$$K_{GLDS} = b_x^t \bar{R}^{-1} b_y \tag{22}$$

To reduce training time, we factor $\bar{R}^{-1} = U^t U$ using the Cholesky decomposition (this $U$ is the Cholesky factor, not the channel loading matrix of Section II). Then

$$K_{GLDS} = (U b_x)^t (U b_y) \tag{23}$$

If all trial information vectors are transformed by $U b_x$, the kernel becomes a simple inner product, which dramatically reduces the time used in SVM training. Finally, the SVM training procedure finds the coefficient $\alpha_i$ for each support vector $b_i$ with label $y_i$, and a universal offset $d$. The optimal contribution factor $w$ is then obtained as

$$w = \sum_{i=1}^{l} \alpha_i y_i \bar{R}^{-1} b_i + \mathbf{d} \tag{24}$$

where $l$ is the number of support vectors and $\mathbf{d} = [d, 0, \ldots, 0]^t$. Given a new trial $z$, we first convert it to the corresponding information vector $b_z$. The discriminative decision function based on the optimal contribution factor $w$ can then be expressed as

$$\mathrm{score} = \left( \sum_{i=1}^{l} \alpha_i y_i \bar{R}^{-1} b_i + \mathbf{d} \right)^t b_z \tag{25}$$

V. Experiments

1. Experiments setup

The experiments for the JFA systems based on the two scoring methods, the traditional frame-by-frame method and the proposed discriminative decision function based method, are carried out on the NIST 2010 speaker recognition evaluation corpus.
NIST SRE 2010 is similar to SRE 2008 but differs from earlier evaluations in that the training and test conditions of the core test include not only conversational telephone speech recorded over ordinary telephone channels, but also such speech recorded over a room microphone channel, and conversational speech from an interview scenario recorded over a room microphone channel. We refer to these three conditions as telephone, microphone and interview respectively. In this study, we focus on three types of trials: telephone-telephone, interview-interview and interview-telephone. Equal error rate (EER) and the minimum Decision cost function (minDCF) are used as evaluation metrics [11,12].

In our experiments, we use Mel-frequency cepstral coefficients (MFCCs) as the acoustic features. 18 cepstral coefficients are computed, and first order derivatives over 5 frames are appended to each feature vector, which results in a dimensionality of 36. These feature vectors are modeled using GMMs, and JFA is used to handle speaker and session variability. The gender dependent UBMs with 1024 mixture components are trained using the NIST SRE side training corpus. The Switchboard II and Switchboard Cellular corpora as well as the telephone data from the NIST SRE 2005 and 2006 corpora are used to train the speaker loading matrix with 300 speaker factors, and the NIST SRE 2004 corpus is used to train the diagonal matrix. For the channel loading matrix, a telephone loading matrix with 100 channel factors is trained on the telephone data from the NIST SRE 2004, 2005 and 2006 corpora for the telephone-telephone condition. A common channel loading matrix, also with 100 channel factors, is trained for both the interview-interview and interview-telephone conditions on the telephone and microphone data from the NIST SRE 2004, 2005 and 2006 corpora as well as the MIXER5 interview development corpus. The true and false trials for the telephone-telephone, interview-interview and interview-telephone conditions provided in NIST SRE 2008 are used to train the contribution factor $w$ for the corresponding test conditions in NIST SRE.

2. Experiments of Taylor series approximation

Since the discriminative decision function based scoring method is derived from an approximate decision function, the effect of using the Taylor series should be examined.
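The kind of check described here can be sketched on synthetic numbers. The snippet below is a minimal illustration, not the experiment reported in the paper: the per-frame likelihoods are randomly generated under an assumed Gaussian model of frame log-likelihood ratios, and the helper names `llr_exact` and `llr_approx` are hypothetical. It compares the exact average LLR of Eq.5 with the two-term Taylor approximation of Eq.9 and measures how linearly the two are related across utterances.

```python
import numpy as np


def llr_exact(p_spk, p_ubm):
    """Average frame log-likelihood ratio, as in Eq.5."""
    p_spk = np.asarray(p_spk, dtype=float)
    p_ubm = np.asarray(p_ubm, dtype=float)
    return float(np.mean(np.log(p_spk) - np.log(p_ubm)))


def llr_approx(p_spk, p_ubm):
    """Two-term Taylor approximation of the LLR, as in Eq.9."""
    p_spk = np.asarray(p_spk, dtype=float)
    p_ubm = np.asarray(p_ubm, dtype=float)
    return float(np.mean((p_spk - p_ubm) / (p_spk + p_ubm)))


# Synthetic utterances (assumption): each has 300 frames whose log-likelihood
# ratios scatter around an utterance-level offset.
rng = np.random.default_rng(0)
exact_scores, approx_scores = [], []
for _ in range(200):
    offset = rng.normal(0.0, 0.5)
    frame_llr = offset + rng.normal(0.0, 1.0, size=300)
    p_ubm = np.ones(300)
    p_spk = np.exp(frame_llr)          # so that log(p_spk / p_ubm) = frame_llr
    exact_scores.append(llr_exact(p_spk, p_ubm))
    approx_scores.append(llr_approx(p_spk, p_ubm))

# Pearson correlation between the two scoring forms across utterances.
corr = np.corrcoef(exact_scores, approx_scores)[0, 1]
```

Per frame, the approximate term is a monotone function of the exact one, so the sign of a single-frame decision is preserved; the correlation indicates how far the utterance-level averages depart from a linear relationship on this synthetic data.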
Fig.1 shows the relationship between the LLR score obtained from the traditional decision function and the score from the approximate form with two terms of the Taylor series. We tested on utterances for male and female speakers respectively, and each utterance is scored with both Eq.5 and Eq.11. The relationship between the scores from the two scoring forms is nearly linear, which means that, for the purpose of classification, the effect of using the Taylor series can be ignored.

3. Experiments on NIST SRE 2010

In this subsection, we list the results of the JFA systems using the frame-by-frame and the Discriminative decision function (DDF) based scoring methods on the three test conditions in NIST SRE.

Table 1 lists the performance of the JFA systems based on the two scoring methods for the telephone-telephone condition. From Table 1, we can see that the proposed scoring method outperforms the conventional frame-by-frame strategy for both male and female speakers. Our system achieves a 14.85% relative improvement in EER and a 5.53% relative improvement in minDCF for male speakers, and relative gains of 16.12% in EER and 16.12% in minDCF for female speakers.

Table 1. Comparison of different scoring methods for the telephone-telephone task
Columns: EER%, minDCF, EER%, minDCF. Rows: Frame-by-frame, DDF.

The performance of the JFA systems based on our method and on the traditional frame-by-frame one for the interview-interview task is shown in Table 2. As can be seen from Table 2, our method achieves relative improvements of 11.27% in EER and 7.28% in minDCF for male speakers, and of 6.21% in EER and 4.22% in minDCF for female speakers.

Table 2. Comparison of different scoring methods for the interview-interview task
Columns: EER%, minDCF, EER%, minDCF. Rows: Frame-by-frame, DDF.

Table 3 compares the proposed system with the frame-by-frame one for the interview-telephone condition.
It shows that, except for the EER for male speakers, the performance of the proposed system is comparable to or better than that of the frame-by-frame one. Relative gains of 5.54% in minDCF for male speakers and 8.39% in EER for female speakers are obtained. For male speakers on the interview-telephone task, the EER falls short of the frame-by-frame baseline. This may be due to the fact that the number of interview-telephone trials, both true and false, from NIST SRE 2008 is too small to train the contribution factor $w$ well.

Table 3. Comparison of different scoring methods for the interview-telephone task
Columns: EER%, minDCF, EER%, minDCF. Rows: Frame-by-frame, DDF.

Fig. 1. Relationship of scores obtained from traditional decision function and approximate form. (a); (b)

4. Speed

The aim of this experiment is to show the approximate scoring time of the two systems in order to compare their complexity. The measured time includes reading the data connected with the trial and computing the likelihood ratio. Each measurement was repeated 5 times and averaged. Table 4 shows the average scoring time per trial. From Table 4, we can see that the proposed scoring method is faster than the traditional frame-by-frame one.

Table 4. Comparison of average scoring time per trial using frame-by-frame and DDF based scoring methods
Scoring method | Time cost (s)
Frame-by-frame | 3.75
DDF | 2.01

VI. Conclusion

In this paper, we have introduced a discriminative decision function based scoring method for speaker verification with the JFA system. Experiments show that the proposed method is effective and, on the whole, outperforms the traditional frame-by-frame scoring method. It is also cheaper to compute: the average scoring time per trial is 2.01 s, against 3.75 s for the frame-by-frame method.

References

[1] D.A. Reynolds, T.F. Quatieri and R.B. Dunn, "Speaker verification using adapted Gaussian mixture models", Digital Signal Processing, Vol.10, No.1-3, pp.19-41.
[2] X. Zhang, X. Xiao, H. Wang, J. Zhang and Y. Yan, "Multiclass maximum a posteriori linear regression for speaker verification", Chinese Journal of Electronics, Vol.19, No.4.
[3] M.H. Sanchez, L. Ferrer, E. Shriberg and A. Stolcke, "Constrained cepstral speaker recognition using matched UBM and JFA training", Proc. of Interspeech, Florence, Italy.
[4] P. Kenny, P. Ouellet, N. Dehak, V. Gupta and P. Dumouchel, "A study of inter-speaker variability in speaker verification", IEEE Trans. on Audio, Speech and Language Processing, Vol.16, No.5.
[5] N. Dehak, P. Kenny, R. Dehak, P. Ouellet and P. Dumouchel, "Front-end factor analysis for speaker verification", IEEE Trans. on Audio, Speech and Language Processing, Vol.19, No.4.
[6] N. Brümmer, L. Burget, J. Cernocky, O. Glembek et al., "Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006", IEEE Trans. on Audio, Speech and Language Processing, Vol.15, No.7.
[7] O. Glembek, L. Burget, N. Dehak, N. Brümmer and P. Kenny, "Comparison of scoring methods used in speaker recognition with joint factor analysis", Proceedings of the International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan.
[8] P. Kenny and P. Dumouchel, "Experiments in speaker verification using factor analysis likelihood ratios", Proceedings of Odyssey 2004, Toledo, Spain.
[9] R. Duda and P. Hart, Pattern Classification and Scene Analysis, Wiley, New York.
[10] W. Campbell, "Generalized linear discriminant sequence kernels for speaker recognition", Proceedings of the International Conference on Acoustics, Speech and Signal Processing, Orlando, Florida, USA, Vol.1.
[11] The NIST year 2008 speaker recognition evaluation plan.
[12] The NIST year 2010 speaker recognition evaluation plan.

LIANG Chunyan received the B.E. degree in Communication Engineering from Shandong Normal University. She is now an M.S. & Ph.D. candidate in the Key Laboratory of Speech Acoustics and Content Understanding at the Institute of Acoustics, Chinese Academy of Sciences. Her research interests include speaker recognition and language recognition. (Email: liangchunyan@hccl.ioa.ac.cn)

ZHANG Xiang received the B.E. degree in Electronic Information Engineering from Shandong University in 2006 and the Ph.D. degree from the Key Laboratory of Speech Acoustics and Content Understanding at the Institute of Acoustics, Chinese Academy of Sciences. His research interests include speaker recognition, language identification, speaker diarization, and audio watermarking.

YAN Yonghong received the B.E. degree from Tsinghua University in 1990, and the Ph.D. degree from Oregon Graduate Institute (OGI). He worked at OGI as an Assistant Professor (1995), Associate Professor (1998) and Associate Director (1997) of the Center for Spoken Language Understanding.
He worked in Intel from 1998 to 2001, chaired Human Computer Interface Research Council, worked as Principal Engineer of Microprocessor Research Laboratory and Director of Intel China Research Center. Currently he is a professor and director of Think IT Laboratory. His research interests include speech processing and recognition, language/speaker recognition, and human computer interface. He has published more than 100 papers and holds 40 patents.


More information

Convention Paper Presented at the 135th Convention 2013 October 17 20 New York, USA

Convention Paper Presented at the 135th Convention 2013 October 17 20 New York, USA Audio Engineering Society Convention Paper Presented at the 135th Convention 2013 October 17 20 New York, USA This Convention paper was selected based on a submitted abstract and 750-word precis that have

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations

Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Automatic Detection of Laughter and Fillers in Spontaneous Mobile Phone Conversations Hugues Salamin, Anna Polychroniou and Alessandro Vinciarelli University of Glasgow - School of computing Science, G128QQ

More information

Probability and Random Variables. Generation of random variables (r.v.)

Probability and Random Variables. Generation of random variables (r.v.) Probability and Random Variables Method for generating random variables with a specified probability distribution function. Gaussian And Markov Processes Characterization of Stationary Random Process Linearly

More information

Music Mood Classification

Music Mood Classification Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may

More information

BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION

BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION BLIND SOURCE SEPARATION OF SPEECH AND BACKGROUND MUSIC FOR IMPROVED SPEECH RECOGNITION P. Vanroose Katholieke Universiteit Leuven, div. ESAT/PSI Kasteelpark Arenberg 10, B 3001 Heverlee, Belgium Peter.Vanroose@esat.kuleuven.ac.be

More information

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers

Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers Automatic Detection of Emergency Vehicles for Hearing Impaired Drivers Sung-won ark and Jose Trevino Texas A&M University-Kingsville, EE/CS Department, MSC 92, Kingsville, TX 78363 TEL (36) 593-2638, FAX

More information

Direct Loss Minimization for Structured Prediction

Direct Loss Minimization for Structured Prediction Direct Loss Minimization for Structured Prediction David McAllester TTI-Chicago mcallester@ttic.edu Tamir Hazan TTI-Chicago tamir@ttic.edu Joseph Keshet TTI-Chicago jkeshet@ttic.edu Abstract In discriminative

More information

ALIZE/SpkDet: a state-of-the-art open source software for speaker recognition

ALIZE/SpkDet: a state-of-the-art open source software for speaker recognition ALIZE/SpkDet: a state-of-the-art open source software for speaker recognition Jean-François Bonastre 1, Nicolas Scheffer 1, Driss Matrouf 1, Corinne Fredouille 1, Anthony Larcher 1, Alexandre Preti 1,

More information

Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

More information

MUSICAL INSTRUMENT FAMILY CLASSIFICATION

MUSICAL INSTRUMENT FAMILY CLASSIFICATION MUSICAL INSTRUMENT FAMILY CLASSIFICATION Ricardo A. Garcia Media Lab, Massachusetts Institute of Technology 0 Ames Street Room E5-40, Cambridge, MA 039 USA PH: 67-53-0 FAX: 67-58-664 e-mail: rago @ media.

More information

Log-Likelihood Ratio-based Relay Selection Algorithm in Wireless Network

Log-Likelihood Ratio-based Relay Selection Algorithm in Wireless Network Recent Advances in Electrical Engineering and Electronic Devices Log-Likelihood Ratio-based Relay Selection Algorithm in Wireless Network Ahmed El-Mahdy and Ahmed Walid Faculty of Information Engineering

More information

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

More information

NTT DOCOMO Technical Journal. Shabette-Concier for Raku-Raku Smartphone Improvements to Voice Agent Service for Senior Users. 1.

NTT DOCOMO Technical Journal. Shabette-Concier for Raku-Raku Smartphone Improvements to Voice Agent Service for Senior Users. 1. Raku-Raku Smartphone Voice Agent UI Shabette-Concier for Raku-Raku Smartphone Improvements to Voice Agent Service for Senior Users We have created a new version of Shabette-Concier for Raku-Raku for the

More information

Towards better accuracy for Spam predictions

Towards better accuracy for Spam predictions Towards better accuracy for Spam predictions Chengyan Zhao Department of Computer Science University of Toronto Toronto, Ontario, Canada M5S 2E4 czhao@cs.toronto.edu Abstract Spam identification is crucial

More information

Classifying Manipulation Primitives from Visual Data

Classifying Manipulation Primitives from Visual Data Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if

More information

AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS

AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS AUTOMATIC PHONEME SEGMENTATION WITH RELAXED TEXTUAL CONSTRAINTS PIERRE LANCHANTIN, ANDREW C. MORRIS, XAVIER RODET, CHRISTOPHE VEAUX Very high quality text-to-speech synthesis can be achieved by unit selection

More information

QMeter Tools for Quality Measurement in Telecommunication Network

QMeter Tools for Quality Measurement in Telecommunication Network QMeter Tools for Measurement in Telecommunication Network Akram Aburas 1 and Prof. Khalid Al-Mashouq 2 1 Advanced Communications & Electronics Systems, Riyadh, Saudi Arabia akram@aces-co.com 2 Electrical

More information

Speech Recognition on Cell Broadband Engine UCRL-PRES-223890

Speech Recognition on Cell Broadband Engine UCRL-PRES-223890 Speech Recognition on Cell Broadband Engine UCRL-PRES-223890 Yang Liu, Holger Jones, John Johnson, Sheila Vaidya (Lawrence Livermore National Laboratory) Michael Perrone, Borivoj Tydlitat, Ashwini Nanda

More information

Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues

Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the

More information

ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING SHORT TEST AND TRAINING SESSIONS

ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING SHORT TEST AND TRAINING SESSIONS 18th European Signal Processing Conference (EUSIPCO-21) Aalborg, Denmark, August 23-27, 21 ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING SHORT TEST AND TRAINING SESSIONS Christos Tzagkarakis and

More information

Automatic Emotion Recognition from Speech

Automatic Emotion Recognition from Speech Automatic Emotion Recognition from Speech A PhD Research Proposal Yazid Attabi and Pierre Dumouchel École de technologie supérieure, Montréal, Canada Centre de recherche informatique de Montréal, Montréal,

More information

General Framework for an Iterative Solution of Ax b. Jacobi s Method

General Framework for an Iterative Solution of Ax b. Jacobi s Method 2.6 Iterative Solutions of Linear Systems 143 2.6 Iterative Solutions of Linear Systems Consistent linear systems in real life are solved in one of two ways: by direct calculation (using a matrix factorization,

More information

Tagging with Hidden Markov Models

Tagging with Hidden Markov Models Tagging with Hidden Markov Models Michael Collins 1 Tagging Problems In many NLP problems, we would like to model pairs of sequences. Part-of-speech (POS) tagging is perhaps the earliest, and most famous,

More information

Fault Analysis in Software with the Data Interaction of Classes

Fault Analysis in Software with the Data Interaction of Classes , pp.189-196 http://dx.doi.org/10.14257/ijsia.2015.9.9.17 Fault Analysis in Software with the Data Interaction of Classes Yan Xiaobo 1 and Wang Yichen 2 1 Science & Technology on Reliability & Environmental

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

Linear Classification. Volker Tresp Summer 2015

Linear Classification. Volker Tresp Summer 2015 Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong

More information

Circle Object Recognition Based on Monocular Vision for Home Security Robot

Circle Object Recognition Based on Monocular Vision for Home Security Robot Journal of Applied Science and Engineering, Vol. 16, No. 3, pp. 261 268 (2013) DOI: 10.6180/jase.2013.16.3.05 Circle Object Recognition Based on Monocular Vision for Home Security Robot Shih-An Li, Ching-Chang

More information

UNIVERSAL SPEECH MODELS FOR SPEAKER INDEPENDENT SINGLE CHANNEL SOURCE SEPARATION

UNIVERSAL SPEECH MODELS FOR SPEAKER INDEPENDENT SINGLE CHANNEL SOURCE SEPARATION UNIVERSAL SPEECH MODELS FOR SPEAKER INDEPENDENT SINGLE CHANNEL SOURCE SEPARATION Dennis L. Sun Department of Statistics Stanford University Gautham J. Mysore Adobe Research ABSTRACT Supervised and semi-supervised

More information

Less naive Bayes spam detection

Less naive Bayes spam detection Less naive Bayes spam detection Hongming Yang Eindhoven University of Technology Dept. EE, Rm PT 3.27, P.O.Box 53, 5600MB Eindhoven The Netherlands. E-mail:h.m.yang@tue.nl also CoSiNe Connectivity Systems

More information

The Artificial Prediction Market

The Artificial Prediction Market The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory

More information

Note on growth and growth accounting

Note on growth and growth accounting CHAPTER 0 Note on growth and growth accounting 1. Growth and the growth rate In this section aspects of the mathematical concept of the rate of growth used in growth models and in the empirical analysis

More information

TRAFFIC MONITORING WITH AD-HOC MICROPHONE ARRAY

TRAFFIC MONITORING WITH AD-HOC MICROPHONE ARRAY 4 4th International Workshop on Acoustic Signal Enhancement (IWAENC) TRAFFIC MONITORING WITH AD-HOC MICROPHONE ARRAY Takuya Toyoda, Nobutaka Ono,3, Shigeki Miyabe, Takeshi Yamada, Shoji Makino University

More information

Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features

Semantic Video Annotation by Mining Association Patterns from Visual and Speech Features Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

More information

The CUSUM algorithm a small review. Pierre Granjon

The CUSUM algorithm a small review. Pierre Granjon The CUSUM algorithm a small review Pierre Granjon June, 1 Contents 1 The CUSUM algorithm 1.1 Algorithm............................... 1.1.1 The problem......................... 1.1. The different steps......................

More information

How to Improve the Sound Quality of Your Microphone

How to Improve the Sound Quality of Your Microphone An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies Andreas Maier, Julian Exner, Stefan Steidl, Anton Batliner, Tino Haderlein, and Elmar Nöth Universität Erlangen-Nürnberg,

More information

Nonparametric Tests for Randomness

Nonparametric Tests for Randomness ECE 461 PROJECT REPORT, MAY 2003 1 Nonparametric Tests for Randomness Ying Wang ECE 461 PROJECT REPORT, MAY 2003 2 Abstract To decide whether a given sequence is truely random, or independent and identically

More information

Marketing Mix Modelling and Big Data P. M Cain

Marketing Mix Modelling and Big Data P. M Cain 1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored

More information

Automatic Calibration of an In-vehicle Gaze Tracking System Using Driver s Typical Gaze Behavior

Automatic Calibration of an In-vehicle Gaze Tracking System Using Driver s Typical Gaze Behavior Automatic Calibration of an In-vehicle Gaze Tracking System Using Driver s Typical Gaze Behavior Kenji Yamashiro, Daisuke Deguchi, Tomokazu Takahashi,2, Ichiro Ide, Hiroshi Murase, Kazunori Higuchi 3,

More information

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction

Speech: A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction : A Challenge to Digital Signal Processing Technology for Human-to-Computer Interaction Urmila Shrawankar Dept. of Information Technology Govt. Polytechnic, Nagpur Institute Sadar, Nagpur 440001 (INDIA)

More information

PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS

PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS The 21 st International Congress on Sound and Vibration 13-17 July, 2014, Beijing/China PERCENTAGE ARTICULATION LOSS OF CONSONANTS IN THE ELEMENTARY SCHOOL CLASSROOMS Dan Wang, Nanjie Yan and Jianxin Peng*

More information

Internet Traffic Prediction by W-Boost: Classification and Regression

Internet Traffic Prediction by W-Boost: Classification and Regression Internet Traffic Prediction by W-Boost: Classification and Regression Hanghang Tong 1, Chongrong Li 2, Jingrui He 1, and Yang Chen 1 1 Department of Automation, Tsinghua University, Beijing 100084, China

More information

Speech Signal Processing: An Overview

Speech Signal Processing: An Overview Speech Signal Processing: An Overview S. R. M. Prasanna Department of Electronics and Electrical Engineering Indian Institute of Technology Guwahati December, 2012 Prasanna (EMST Lab, EEE, IITG) Speech

More information

Author Gender Identification of English Novels

Author Gender Identification of English Novels Author Gender Identification of English Novels Joseph Baena and Catherine Chen December 13, 2013 1 Introduction Machine learning algorithms have long been used in studies of authorship, particularly in

More information

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents

More information

VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS

VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS VEHICLE TRACKING USING ACOUSTIC AND VIDEO SENSORS Aswin C Sankaranayanan, Qinfen Zheng, Rama Chellappa University of Maryland College Park, MD - 277 {aswch, qinfen, rama}@cfar.umd.edu Volkan Cevher, James

More information

Ericsson T18s Voice Dialing Simulator

Ericsson T18s Voice Dialing Simulator Ericsson T18s Voice Dialing Simulator Mauricio Aracena Kovacevic, Anna Dehlbom, Jakob Ekeberg, Guillaume Gariazzo, Eric Lästh and Vanessa Troncoso Dept. of Signals Sensors and Systems Royal Institute of

More information

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

More information

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Integration of Negative Emotion Detection into a VoIP Call Center System

Integration of Negative Emotion Detection into a VoIP Call Center System Integration of Negative Detection into a VoIP Call Center System Tsang-Long Pao, Chia-Feng Chang, and Ren-Chi Tsao Department of Computer Science and Engineering Tatung University, Taipei, Taiwan Abstract

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network

The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network , pp.67-76 http://dx.doi.org/10.14257/ijdta.2016.9.1.06 The Combination Forecasting Model of Auto Sales Based on Seasonal Index and RBF Neural Network Lihua Yang and Baolin Li* School of Economics and

More information

Reliable and Cost-Effective PoS-Tagging

Reliable and Cost-Effective PoS-Tagging Reliable and Cost-Effective PoS-Tagging Yu-Fang Tsai Keh-Jiann Chen Institute of Information Science, Academia Sinica Nanang, Taipei, Taiwan 5 eddie,chen@iis.sinica.edu.tw Abstract In order to achieve

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

Blog Post Extraction Using Title Finding

Blog Post Extraction Using Title Finding Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School

More information

DUOL: A Double Updating Approach for Online Learning

DUOL: A Double Updating Approach for Online Learning : A Double Updating Approach for Online Learning Peilin Zhao School of Comp. Eng. Nanyang Tech. University Singapore 69798 zhao6@ntu.edu.sg Steven C.H. Hoi School of Comp. Eng. Nanyang Tech. University

More information

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet

More information

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning. Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

More information

A Novel Decentralized Time Slot Allocation Algorithm in Dynamic TDD System

A Novel Decentralized Time Slot Allocation Algorithm in Dynamic TDD System A Novel Decentralized Time Slot Allocation Algorithm in Dynamic TDD System Young Sil Choi Email: choiys@mobile.snu.ac.kr Illsoo Sohn Email: sohnis@mobile.snu.ac.kr Kwang Bok Lee Email: klee@snu.ac.kr Abstract

More information

Speech recognition for human computer interaction

Speech recognition for human computer interaction Speech recognition for human computer interaction Ubiquitous computing seminar FS2014 Student report Niklas Hofmann ETH Zurich hofmannn@student.ethz.ch ABSTRACT The widespread usage of small mobile devices

More information

Neovision2 Performance Evaluation Protocol

Neovision2 Performance Evaluation Protocol Neovision2 Performance Evaluation Protocol Version 3.0 4/16/2012 Public Release Prepared by Rajmadhan Ekambaram rajmadhan@mail.usf.edu Dmitry Goldgof, Ph.D. goldgof@cse.usf.edu Rangachar Kasturi, Ph.D.

More information

Separation and Classification of Harmonic Sounds for Singing Voice Detection

Separation and Classification of Harmonic Sounds for Singing Voice Detection Separation and Classification of Harmonic Sounds for Singing Voice Detection Martín Rocamora and Alvaro Pardo Institute of Electrical Engineering - School of Engineering Universidad de la República, Uruguay

More information

Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network

Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network Forecasting of Economic Quantities using Fuzzy Autoregressive Model and Fuzzy Neural Network Dušan Marček 1 Abstract Most models for the time series of stock prices have centered on autoregressive (AR)

More information

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

More information

ARMORVOX IMPOSTORMAPS HOW TO BUILD AN EFFECTIVE VOICE BIOMETRIC SOLUTION IN THREE EASY STEPS

ARMORVOX IMPOSTORMAPS HOW TO BUILD AN EFFECTIVE VOICE BIOMETRIC SOLUTION IN THREE EASY STEPS ARMORVOX IMPOSTORMAPS HOW TO BUILD AN EFFECTIVE VOICE BIOMETRIC SOLUTION IN THREE EASY STEPS ImpostorMaps is a methodology developed by Auraya and available from Auraya resellers worldwide to configure,

More information