A Study of the Cosine Distance-Based Mean Shift for Telephone Speech Diarization


TASL

Mohammed Senoussaoui, Patrick Kenny, Themos Stafylakis and Pierre Dumouchel

Abstract: Speaker clustering is a crucial step for speaker diarization. The short duration of speech segments in telephone speech dialogue and the absence of prior information on the number of clusters dramatically increase the difficulty of this problem in diarizing spontaneous telephone speech conversations. We propose a simple iterative Mean Shift algorithm based on the cosine distance to perform speaker clustering under these conditions. Two variants of the cosine distance Mean Shift are compared in an exhaustive practical study. We report state-of-the-art results as measured by the Diarization Error Rate and the Number of Detected Speakers on the LDC CallHome telephone corpus.

Index Terms: Speaker diarization, clustering, Mean Shift, cosine distance.

I. INTRODUCTION

Speaker diarization consists in splitting an audio stream into homogeneous regions corresponding to the speech of participating speakers. As the problem is usually formulated, diarization requires performing two principal steps, namely segmentation and speaker clustering. The aim of segmentation is to find speaker change points in order to form segments (known as speaker turns) that contain speech of a given speaker. The aim of speaker clustering is to link unlabeled segments according to a given metric in order to determine the intrinsic grouping in the data. The challenge of speaker clustering increases by virtue of the absence of any prior knowledge about the number of speakers in the stream.

Model selection based on the Bayesian Information Criterion (BIC) is the most popular method for speaker segmentation [1][]. BIC can also be used to estimate the number of speakers in a recording, and other Bayesian methods have recently been proposed for this purpose [3][4]. Hierarchical Agglomerative Clustering (HAC) is by far the most widespread approach to the speaker clustering problem. Other methods, including hybrid approaches, continue to be developed [4][5].
Manuscript received June 1, 2013, revised August 31, 2013 and accepted September 4, 2013. This work was supported by the Natural Science and Engineering Research Council of Canada. Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE. M. Senoussaoui is with Centre de recherche informatique de Montréal (CRIM), Montréal, QC, H3A 1B9, Canada, and with École de technologie supérieure (ÉTS), Montréal, QC, Canada. P. Kenny and T. Stafylakis are with Centre de recherche informatique de Montréal (CRIM), Montréal, QC, H3A 1B9, Canada. P. Dumouchel is with École de technologie supérieure (ÉTS), Montréal, QC, Canada.

In this work, we focus on the speaker clustering task rather than speaker segmentation. We propose a clustering method which is capable of estimating the number of speakers participating in a telephone conversation, a challenging problem considering that speaker turns are generally of very short duration [4]. The method in question is the so-called Mean Shift (MS) algorithm. This approach is borrowed from the field of computer vision, where it is widely used to detect the number of colors and for image segmentation purposes. The MS algorithm is a nonparametric iterative mode-seeking algorithm introduced by Fukunaga [6]. Despite its first appearance in 1975, MS remained in oblivion except for works such as [7] that aimed to generalize the original version. The MS algorithm reappeared in 2002 with the work of Comaniciu [8] in image processing. Recently, Stafylakis et al. [9][10] have shown how to generalize the basic Euclidean space MS algorithm to non-Euclidean manifolds so that objects other than points in Euclidean space can be clustered. This generalized method was applied to the problem of speaker clustering in a context where speaker turns were characterized by multivariate Gaussian distributions.

Our choice of the MS algorithm is mainly motivated by its nonparametric nature.
This characteristic offers the major advantage of not having to make assumptions about the shape of the data distribution, in contrast to conventional probabilistic diarization methods.

Recently [11], we presented a new extension of the Euclidean Mean Shift that works with a cosine distance metric. This new algorithm was shown to be very effective for speaker clustering in large populations where each speaker was represented by a whole side of a telephone conversation. This work was motivated by the success of cosine similarity matching in the speaker verification field [1][13][14][15]. Cosine distance has also been successfully tested in speaker diarization of the CallHome telephone corpus [16][17].

In this work, firstly, we propose to test the cosine-based MS algorithm on the diarization of multi-speaker wire telephone recordings. We do not assume that the number of participating speakers is given. Secondly, we compare two clustering mechanisms that exploit the cosine-based MS algorithm with respect to diarization performance (as measured by the number of speakers detected and the standard diarization error metric) and with respect to execution times.

Although diarization of telephone conversations is an important and difficult task, there are not, to our knowledge, any published studies on the use of the MS algorithm to solve this problem. Unlike broadcast news speech, the shortness of
the speaker turn duration in telephone speech (typically one second) makes the task of properly representing these segments in a feature space more difficult. In order to deal with this problem, we represent each speaker turn by an i-vector (a representation of speech segments by vectors of fixed dimension, independent of segment durations [1]). I-vector features have been used successfully not only in speaker recognition [1][13][14][15] and speaker diarization and clustering [16][17][1][11] but also in language recognition [][3]. Although probabilistic classifiers such as Probabilistic Linear Discriminant Analysis have become predominant in applying i-vector methods to speaker recognition, simple cosine distance based classifiers remain competitive [1][13], and we will use this approach in developing the speaker diarization algorithms presented here. Note that in [4], the authors show that cosine distance provides a better metric than Euclidean distance in GMM-supervector space.

In [16], the authors introduced a diarization system where i-vectors were used to represent speaker turns and cosine distance based k-means clustering was used to associate speaker turns with individual speakers. Tested on two-speaker conversations, this approach outperformed a BIC-based hierarchical agglomerative clustering system by a wide margin. But in order for k-means clustering to work, the number of speakers in a given conversation needs to be known in advance, so it is not straightforward to extend this approach to the general diarization problem where the number of speakers participating in the conversation needs to be determined (a simple heuristic is presented in [17]). The main contribution of this paper is to show how using the Mean Shift algorithm in place of k-means enables this problem to be dealt with very effectively.

As our test bed we use the CallHome telephone speech corpus (development/test) provided by NIST in the year 2000 speaker recognition evaluation (SRE). This consists of spontaneous telephone conversations involving varying numbers of speakers.
The CallHome dataset has been the subject of several studies [17][18][19][][1].

The rest of this paper is organized as follows. In Section II we first provide some background material on the i-vector feature space that will be used in this work. In Section III, we give some preliminaries on the original version of the Mean Shift algorithm and explain how we include the cosine distance by introducing a simple modification. Two ways of exploiting the MS algorithm for clustering purposes will also be given. In Section IV, we present different methods of normalizing i-vectors for diarization (such normalizations turn out to be very important for our approach). Thereafter, we perform a detailed experimental study and analysis in Sections V and VI before concluding this work in Section VII.

II. I-VECTORS FOR SPEAKER DIARIZATION

The supervector representation has been applied with great success to the field of speaker recognition, especially when it was exploited in the well-known generative model named Joint Factor Analysis (JFA) [5]. In high-dimensional supervector space, JFA attempts to jointly model speaker and channel variabilities using a large amount of background data. When a relatively small amount of speaker data is available (i.e. during the enrolment and test stages), JFA enables effective speaker modeling by suppressing channel variability from the speech signal.

Fig. 1. Skeleton of the Mean Shift i-vector diarization system: segmentation of the speech signal is followed by extracting i-vectors for each segment, and then the i-vectors are clustered (using Mean Shift in our case). Blocks shown: input audio signal; feature extraction / initial segmentation; i-vector extraction; i-vector clustering (Mean Shift); output segmentation.

A major advance in the area of speaker recognition was the introduction of the low-dimensional feature vectors known as i-vectors [1].
We can define an i-vector as the mapping (using a Factor Analysis or a Probabilistic Principal Component Analysis) of a high-dimensional supervector to a low-dimensional space called the total variability space (here the word "total" is used to refer to both speaker and channel variabilities). Unlike JFA, which proposes to distinguish between speaker and channel effects in the supervector space, i-vector methods seek to model these effects in a low-dimensional space where many standard pattern recognition methods can be brought to bear on the problem.

Mathematically, the mapping of a supervector $X$ to an i-vector $x$ is expressed by the following formula:

$$X = X_{UBM} + T x \qquad (1)$$

where $X_{UBM}$ is the supervector of the Universal Background Model (UBM) and the rectangular matrix $T$ is the so-called Total Variability matrix. More mathematical details of i-vectors and their estimation can be found in [1][5][6].

I-vectors have successfully been deployed in many fields other than speaker recognition [16][17][1][][3]. Methods successful in one field can often be translated to other fields by identifying the sources of useful and nuisance variability. Thus in speaker recognition, speaker variability is useful, but it counts as nuisance variability in language recognition.

In the diarization problem, the speaker turn (represented by an i-vector in our case) is the fundamental representation unit, or what we usually call a sample in Pattern Recognition terminology. Moreover, an aggregation of homogeneous i-vectors within one conversation represents a cluster (a speaker in our case), or what is commonly known as a class. Thus, the diarization problem becomes one of clustering i-vectors [11][16][17][1].
III. THE MEAN SHIFT ALGORITHM

The Mean Shift algorithm can be viewed as a clustering algorithm or as a way of finding the modes in a nonparametric distribution. In this section we present the intuitive idea behind the Mean Shift mode-seeking process as well as the mathematical derivation of this algorithm. Additionally, we present two variants of this algorithm which can be applied for clustering purposes. Finally, the extension of the traditional MS to the cosine-based MS is presented.

A. The intuitive idea behind Mean Shift

The intuitive idea of Mean Shift is quite natural and simple. Starting from a given vector $x$ in a set $S = \{x_1, x_2, \ldots, x_n\}$ of unlabeled data (which are i-vectors in our case), we can reach a stationary point called a density mode through the iterative process depicted in Algorithm 1. Note that Algorithm 1 refers to the original Mean Shift process. The mathematical convergence proof of the sequence of successive positions $\{y_i\}_{i=1,2,\ldots}$ is found in [6][8].

Algorithm 1. Mean Shift: intuitive idea

    i = 1, y_i = x                          // initialization
    center a window around y_i
    repeat
        estimate µ_h(y_i)                   // sample mean of the data falling within the window
                                            // (i.e. the neighborhood of y_i in terms of Euclidean distance)
        y_{i+1} = µ_h(y_i)
        move the window from y_i to y_{i+1}
        i = i + 1
    until stabilization                     // a mode has been found

B. Mathematical development

Mean Shift is a member of the Kernel Density Estimation (KDE) family of algorithms (also known as Parzen windowing). Estimating the probability density function of a distribution using a limited sample of data is a fundamental problem in pattern recognition. The standard form of the estimated kernel density function $\hat{f}(x)$ at a randomly selected point $x$ is given by the following formula (see footnote 1):

$$\hat{f}(x) = \frac{1}{n h^{d}} \sum_{i=1}^{n} k\!\left(\frac{x - x_i}{h}\right) \qquad (2)$$

where $k(x)$ is a kernel function and $h$ is its radial width, referred to as the kernel bandwidth. Ignoring the selection of kernel type, $h$ is the only tunable parameter in the Mean Shift algorithm; its role is to smooth the estimated density function.
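The iteration of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `mean_shift_mode`, the stopping tolerance `tol` and the iteration cap `max_iter` are hypothetical choices, and the window is the flat Euclidean window of radius `h`.

```python
import numpy as np

def mean_shift_mode(start, data, h, tol=1e-6, max_iter=100):
    """Follow one flat-kernel Mean Shift trajectory from `start` (Algorithm 1)."""
    y = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Window around y: samples within Euclidean distance h of y.
        in_window = np.linalg.norm(data - y, axis=1) <= h
        if not in_window.any():
            break                             # empty window: nothing to average
        mu = data[in_window].mean(axis=0)     # sample mean of the window
        shift = np.linalg.norm(mu - y)        # how far the window moves
        y = mu                                # move the window to the mean
        if shift < tol:                       # stabilization: a mode is reached
            break
    return y

# Toy data (assumed for illustration): two well separated groups.
data = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
mode = mean_shift_mode(data[0], data, h=1.0)
```

Starting from the first sample, the trajectory settles on the sample mean of the first group, because the second group never enters a window of radius 1.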
In order to ensure certain properties such as asymptotic unbiasedness and consistency, the kernel and the bandwidth $h$ should satisfy some conditions that are discussed in detail in [6]. In general, the purpose of KDE is to estimate the density function, but the Mean Shift procedure is only concerned with locating the modes of the density function $f(x)$ and not the values of the density function at these points. To find the modes, the Mean Shift algorithm derivation requires calculating the gradient of the density function $f(x)$.

Footnote 1: Note that for simplicity we ignore some constants in the mathematical derivations.

The estimate of the gradient of the density function $f(x)$ is given by the gradient of the estimate of the density function $\hat{f}(x)$ as follows [6][8][7][8][9]:

$$\hat{\nabla} f(x) \equiv \nabla \hat{f}(x) = \frac{1}{n h^{d}} \sum_{i=1}^{n} \nabla k\!\left(\frac{x - x_i}{h}\right) = \frac{1}{n h^{d+2}} \sum_{i=1}^{n} (x - x_i)\, k'\!\left(\frac{x - x_i}{h}\right) \qquad (3)$$

A simple type of kernel is the Epanechnikov kernel, given by the following formula:

$$k(x) = \begin{cases} 1 - \|x\|^{2} & \|x\| \le 1 \\ 0 & \|x\| > 1 \end{cases} \qquad (4)$$

Let $g(x)$ be the uniform kernel:

$$g(x) = \begin{cases} 1 & \|x\| \le 1 \\ 0 & \|x\| > 1 \end{cases} \qquad (5)$$

Note that it satisfies:

$$k'(x) = -c\, g(x) \qquad (6)$$

where $c$ is a constant and the prime is the derivation operator. Then we can write $\nabla \hat{f}(x)$ as:

$$\nabla \hat{f}(x) = \frac{c}{n h^{d+2}} \sum_{i=1}^{n} (x_i - x)\, g\!\left(\frac{x - x_i}{h}\right) = \frac{c}{n h^{d+2}} \left[ \sum_{i=1}^{n} g\!\left(\frac{x - x_i}{h}\right) \right] \left[ \frac{\sum_{i=1}^{n} x_i\, g\!\left(\frac{x - x_i}{h}\right)}{\sum_{i=1}^{n} g\!\left(\frac{x - x_i}{h}\right)} - x \right] \qquad (7)$$

The expression:

$$m_h(x) = \frac{\sum_{i=1}^{n} x_i\, g\!\left(\frac{x - x_i}{h}\right)}{\sum_{i=1}^{n} g\!\left(\frac{x - x_i}{h}\right)} - x \qquad (8)$$

is what we refer to as the Mean Shift vector $m_h(x)$. Note that the Mean Shift vector $m_h(x)$ is just the difference between the next position, given by the weighted sample mean vector of all data, and the current position (instance vector $x$). Indeed, the weights in the mean formula are given by the binary outputs (i.e. 0 or 1) of the flat kernel $g(x)$.
For simplicity, let us denote the uniform kernel with bandwidth $h$ by $g(x, x_i, h)$, so that:

$$g(x, x_i, h) = \begin{cases} 1 & \|x - x_i\| \le h \\ 0 & \|x - x_i\| > h \end{cases} \qquad (9)$$

In other words, $g(x, x_i, h)$ selects a subset $S_h(x)$ of $n_x$ samples (by analogy with Parzen windows we refer to this subset as a window) in which the Euclidean pairwise distances with $x$ are less than or equal to the threshold (bandwidth) $h$:

$$S_h(x) \equiv \{ x_i : \|x - x_i\| \le h \} \qquad (10)$$

Therefore, we can rewrite the Mean Shift vector as:

$$m_h(x) = \mu_h(x) - x \qquad (11)$$

where $\mu_h(x)$ is the sample mean of the $n_x$ samples of $S_h(x)$:

$$\mu_h(x) = \frac{1}{n_x} \sum_{x_i \in S_h(x)} x_i \qquad (12)$$

The iterative process of calculating the sample mean followed by data shifting (which produces the sequence $\{y_i\}_{i=1,2,\ldots}$ referred to in Algorithm 1) converges to a mode of the data distribution.

C. Mean Shift for speaker clustering

The Mean Shift algorithm can be exploited to deal with the problem of speaker clustering in the case where the number of clusters (speakers in our case) is unknown, as well as with other problems such as the segmentation steps involved in image processing and object tracking [8]. In the following subsections, we present two clustering mechanisms based on the MS algorithm, namely the Full and the Selective clustering strategies.

Full strategy

One may apply the iterative Mean Shift procedure at each data instance. In general, some of the MS processes will converge to the same density mode. The number of density modes (after pruning) represents the number of detected clusters, and instances that converge to the same mode are deemed to belong to the same cluster (we call these points the basin of attraction of the mode). In this work we refer to this approach as the Full strategy.

Selective strategy

Unlike the Full Mean Shift clustering strategy, we can adapt this strategy to run the MS process on a subset of the data only. The idea is to keep track of the number of visits to each data point that occur during the evolution of a Mean Shift process. After the convergence of the first Mean Shift process, the samples that have been visited are assigned to the first cluster.
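As a rough illustration of the Full strategy described above, the following Python sketch runs one Mean Shift process per sample with the flat Euclidean window of (9)-(12) and groups the samples whose processes end at the same mode. The names `seek_mode` and `full_mean_shift`, and the tolerance `merge_tol` used to decide that two modes coincide, are assumptions made for this example; the paper does not specify how coinciding modes are detected.

```python
import numpy as np

def seek_mode(start, data, h, tol=1e-6, max_iter=100):
    """Iterate y <- mu_h(y), i.e. y <- y + m_h(y), until the shift vanishes."""
    y = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        window = np.linalg.norm(data - y, axis=1) <= h   # S_h(y), eq. (10)
        if not window.any():
            break
        mu = data[window].mean(axis=0)                   # mu_h(y), eq. (12)
        done = np.linalg.norm(mu - y) < tol              # |m_h(y)|, eq. (11)
        y = mu
        if done:
            break
    return y

def full_mean_shift(data, h, merge_tol=1e-3):
    """Full strategy: one Mean Shift process per sample, then group by mode."""
    data = np.asarray(data, dtype=float)
    modes, labels = [], []
    for x in data:
        m = seek_mode(x, data, h)
        for k, known in enumerate(modes):
            if np.linalg.norm(m - known) <= merge_tol:   # same basin of attraction
                labels.append(k)
                break
        else:
            modes.append(m)                              # a new mode, a new cluster
            labels.append(len(modes) - 1)
    return np.array(labels), np.array(modes)

data = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
labels, modes = full_mean_shift(data, h=1.0)
```

The number of rows of `modes` plays the role of the number of detected clusters; the cost grows with the number of samples because every sample starts its own process.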
We then run a second process starting from one of the unvisited samples and create a second cluster. We continue to run MS processes one after another until we have no unvisited data samples. Some of the samples may be allocated to more than one cluster by this procedure; majority voting is then needed to reconcile these conflicts.

Note that the computational complexity depends on the number of samples in the Full strategy, whereas it depends only on the number of clusters in the case of the Selective strategy. A MATLAB implementation of the Selective strategy can be found online. In this work the experimental results of the Full and Selective clustering strategies are compared in Section VI.

D. Mean Shift based on cosine distance

The success of the cosine distance in speaker recognition is well known [1][13][14][15]. A rationale for using cosine distance instead of Euclidean distance can be supplied by postulating a normal distribution for the speaker population, as in PLDA [30]. Suppose we are given a pair of i-vectors and we wish to test the hypothesis that they belong to the same speaker cluster against the hypothesis that they belong to different clusters. Because most of the population mass is concentrated in the neighborhood of the origin, speakers in this region are in danger of being confused with each other. In the case of a pair of i-vectors which are close to the origin, the same-speaker hypothesis will only be accepted if the i-vectors are relatively close together. On the other hand, if the i-vectors are far from the origin, they can be relatively far apart from each other without invalidating the same-speaker hypothesis. Hence, in order to incorporate this prior knowledge regarding the distribution of the speaker means into the MS algorithm, we may either use (a) the Euclidean distance and a variable bandwidth that increases with the distance from the origin or (b) a fixed bandwidth and the cosine similarity. The latter approach is evidently preferable.

The cosine distance between two vectors $x$ and $y$ is given by:

$$D(x, y) = 1 - \frac{x \cdot y}{\|x\|\, \|y\|} \qquad (13)$$

The original Mean Shift algorithm based on a flat kernel relies on the Euclidean distance to find the points falling within the window, as shown in (10). In [11] we proposed the use of the cosine metric instead of the Euclidean one to build a new version of the Mean Shift algorithm. Only one modification needs to be introduced in (10); we set

$$S_h(x) \equiv \{ x_i : D(x, x_i) \le h \} \qquad (14)$$

where $D(x, x_i)$ is the cosine distance between $x$ and $x_i$ given by formula (13). This corresponds to redefining the uniform kernel as:

$$g(x, x_i, h) = \begin{cases} 1 & D(x, x_i) \le h \\ 0 & D(x, x_i) > h \end{cases} \qquad (15)$$

E. Conversation-dependent bandwidth

It is known from the literature [31] that one of the practical limitations of the Mean Shift algorithm is the need to fix the bandwidth $h$. Using a fixed bandwidth is not generally appropriate, as the local structure of the samples can change with the data that needs to be clustered. We have found that varying the bandwidth from one conversation to another turns out to be useful in diarization based on the Mean Shift algorithm.

In order to deal with the disparity caused by the variable duration of conversations, we adopt a version of the variable bandwidth scheme proposed in [10]. This is designed to smooth the density estimator in the case of short conversations, where the number of segments to be clustered is small. The variable bandwidth is controlled by two parameters, $\lambda$ and the fixed bandwidth $h$. For a conversation $c$, the conversation-dependent bandwidth $\tilde{h}_c$ is given by

$$\tilde{h}_c = 1 - \frac{n_c \lambda (1 - h)}{n_c \lambda + (1 - h)} \qquad (16)$$

where $n_c$ is the number of segments in the conversation. Note that $\tilde{h}_c \ge h$, with equality if $n_c$ is very large.

F. Cluster pruning

An artifact of the Mean Shift algorithm is that there is nothing to prevent it from producing clusters with very small numbers of segments. To counter this tendency, we simply prune clusters containing a small number of samples (less than or equal to a constant $p$) by merging them with their nearest neighbors.

IV. I-VECTOR NORMALIZATION FOR DIARIZATION

By design, i-vectors are intended to represent a wide range of speech variabilities. Hence, raw i-vectors need to be normalized in ways which vary from one application to another. Based on the above definitions of class and sample in relation to our problem (see Section II), we present in the following sections some methods to normalize i-vectors which are suitable for speaker diarization.

A. Principal components analysis (PCA)

In [16] it was shown that projecting i-vectors onto the conversation-dependent PCA axes with high variance helps to compensate for intra-session variability.
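Returning briefly to the cosine-based Mean Shift of Section III, the Selective strategy combined with the cosine window of (13)-(14) might be sketched as follows. This is a hedged illustration rather than the authors' code: the function names, the choice of the lowest-index unvisited sample as the next starting point, the stopping tolerance, and the way votes are counted (one vote each time a sample falls inside the window of a given process) are all assumptions. The sketch also assumes nonzero vectors and a bandwidth `h` below 1, so that the window mean never collapses to the zero vector.

```python
import numpy as np
from collections import Counter

def cosine_distance(x, y):
    """Cosine distance D(x, y) of eq. (13)."""
    return 1.0 - float(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

def cosine_window(y, data, h):
    """Boolean mask of the samples in S_h(y) as defined in eq. (14)."""
    unit = data / np.linalg.norm(data, axis=1, keepdims=True)
    return 1.0 - unit @ (y / np.linalg.norm(y)) <= h

def selective_mean_shift(data, h, tol=1e-6, max_iter=100):
    """Selective strategy: start new processes only from unvisited samples."""
    data = np.asarray(data, dtype=float)
    votes = [Counter() for _ in data]        # votes[i][k]: visits of sample i by process k
    unvisited = set(range(len(data)))
    cluster = 0
    while unvisited:
        y = data[min(unvisited)]             # assumption: lowest unvisited index starts
        visits = Counter()
        for _ in range(max_iter):
            window = cosine_window(y, data, h)
            if not window.any():
                break
            visits.update(np.flatnonzero(window).tolist())
            mu = data[window].mean(axis=0)   # sample mean of the cosine window
            moved = cosine_distance(mu, y)
            y = mu
            if moved < tol:
                break
        for i, count in visits.items():
            votes[i][cluster] += count
        unvisited -= set(visits)
        cluster += 1
    # Majority vote for samples claimed by more than one process.
    return np.array([v.most_common(1)[0][0] for v in votes])

turns = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.05, 0.99]])
labels = selective_mean_shift(turns, h=0.1)
```

Because a process is launched only from samples that no earlier process has visited, the number of processes is tied to the number of clusters rather than to the number of samples, which is the source of the speed-up mentioned in Section III-C.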
A further weighting with the square root of the corresponding eigenvalues was also applied to these axes in order to emphasize their importance. The authors of [16] recommend choosing the PCA dimensionality so as to retain 50% of the data variance. We will denote this quantity by r. Ideally each retained PCA axis represents the variability due to a single speaker in the conversation. Note that this type of PCA is local, in the sense that the analysis is done on a file-by-file basis. This has the advantage that no background data is required to implement it.

B. Within Class Covariance Normalization (WCCN)

Normalizing data variances using a Within Class Covariance matrix has become common practice in the speaker recognition field [1][13][15]. The idea behind this normalization is to penalize axes with high intra-class variance by rotating the data using a decomposition of the inverse of the Within Class Covariance matrix.

C. Between Class Covariance Normalization (BCCN)

By analogy with the WCCN approach, we propose a new normalization method based on the maximization of the directions of between-class variance by normalizing the i-vectors with the decomposition of the between class covariance matrix $B$. The between class covariance matrix is given by the following formula:

$$B = \frac{1}{n} \sum_{i=1}^{I} n_i\, (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^{t} \qquad (17)$$

where the sum ranges over $I$ conversation sides in a background training set, $\bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_j^{i}$ is the sample mean of the $n_i$ speaker turns within conversation side $i$, and $\bar{x}$ is the sample mean of all i-vectors.

V. IMPLEMENTATION DETAILS

A. CallHome data

We use the CallHome dataset distributed by NIST during the year 2000 speaker recognition evaluation [18]. CallHome is a multi-lingual (6 languages) dataset of multi-speaker telephone recordings of 1 to 10 minutes duration. Fig. 2 depicts the development part of the dataset, which contains 38 conversations, broken down by the number of speakers (2 to 4 speakers). The CallHome test set contains 500 conversations, broken down by the number of speakers in Fig. 3. Note that the number of speakers ranges from 2 to 4 in the development set and from 2 to 7 in the test set, so that there is a danger of overtuning on the development set. For our purposes the development set serves to decide which types of i-vector normalization to use, to fix the bandwidth parameter $h$ in (15) and (16), and to determine a strategy for pruning sparsely populated clusters. Because there is essentially only one scalar parameter to be tuned, our approach is not at risk of overtuning on the development set.

Fig. 2. CallHome development data set broken down by categories representing the number of participating speakers in conversations.

Fig. 3. CallHome test set broken down by categories representing the number of participating speakers in conversations.

B. Feature extraction

1) Speech parameterization: Every 10 ms, Mel Frequency Cepstral Coefficients (MFCC) are extracted from a 25 ms Hamming window (19 MFC coefficients + energy). As is traditional in diarization, no feature normalization is applied.

2) Universal background model: We use a gender-independent UBM containing 512 Gaussians. This UBM is trained with the LDC releases of Switchboard II, Phases 2 and 3; Switchboard Cellular, Parts 1 and 2; and NIST SRE (telephone speech only).

3) I-vector extractor: We use a gender-independent i-vector extractor of dimension 100, trained on the same data as the UBM together with data from the Fisher corpus.

C. I-vector normalization

Among the normalization methods presented in Section IV, only the within and the between class covariance matrices need background data to be estimated. In order to estimate them we used telephone speech (whole conversation sides) from the 2004 and 2005 NIST speaker recognition evaluations.

D. Initial segmentation

The focus of this work is speaker clustering rather than segmentation. Following the authors of [4][16], we uniformly segmented the speech intervals found by a voice activity detector into segments of about one second in duration. This naïve approach to speaker turn segmentation is traditional in diarizing telephone speech, where speaker turns tend to be very short and Viterbi re-segmentation is generally applied in subsequent processing.
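For readers who want to experiment with the normalizations of Section IV, the following Python sketch chains a between class covariance projection, length normalization and a conversation-dependent PCA. Several details are assumptions not fixed by the text: `n` in (17) is taken to be the total number of speaker turns, the BCC projection multiplies each i-vector by the Cholesky factor of `B` (with a small ridge added so that a rank-deficient `B` can still be factorized), the PCA is computed on conversation-centered data, and all function names are hypothetical.

```python
import numpy as np

def between_class_covariance(sides):
    """Eq. (17); `sides` is a list of (n_i, d) arrays, one per conversation side."""
    all_vecs = np.vstack(sides)
    grand_mean = all_vecs.mean(axis=0)
    d = all_vecs.shape[1]
    B = np.zeros((d, d))
    for side in sides:
        diff = side.mean(axis=0) - grand_mean
        B += len(side) * np.outer(diff, diff)
    return B / len(all_vecs)       # assumption: n is the total number of turns

def bcc_project(X, B, ridge=1e-6):
    # Assumption: project with the Cholesky factor L of B = L L^T.
    L = np.linalg.cholesky(B + ridge * np.eye(B.shape[0]))
    return X @ L

def length_normalize(X):
    """Scale every row (i-vector) to unit Euclidean length."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def conversation_pca(X, r=0.5):
    """Keep the leading axes holding a fraction r of the variance,
    each weighted by the square root of its eigenvalue."""
    Xc = X - X.mean(axis=0)        # assumption: center within the conversation
    eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / len(X))
    order = np.argsort(eigval)[::-1]
    eigval = np.clip(eigval[order], 0.0, None)
    eigvec = eigvec[:, order]
    ratio = np.cumsum(eigval) / eigval.sum()
    k = min(int(np.searchsorted(ratio, r)) + 1, len(eigval))
    return (Xc @ eigvec[:, :k]) * np.sqrt(eigval[:k])

def normalize_conversation(X, B, r=0.5):
    """BCC projection, then length normalization, then local PCA."""
    return conversation_pca(length_normalize(bcc_project(X, B)), r)
```

The order of operations in `normalize_conversation` follows the sequence the paper reports as working best (BCC normalization, then length normalization, then PCA with r = 50%); everything else should be read as one possible realization.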
Note that the results presented in [16] show that using a reference silence detector offers no significant improvement in comparison to their own speech detector.

E. Evaluation protocol

In order to evaluate the performance of the different systems we use the NIST Diarization Error Rate (DER) as the principal measure of system performance. Using the NIST scoring script md-eval-v21.pl, we evaluate the DER of the concatenated .rttm files produced for all conversations in the development and test sets. As is traditional in speaker diarization of telephone speech, we ignore overlapping speech segments and we tolerate errors of less than 250 ms in locating segment boundaries.

In addition to DER, the Number of Detected Speakers (NDS) and its average calculated over all files (ANDS) are also useful performance evaluation metrics in the context of clustering with unknown numbers of speakers. We adopt a graphical illustration of DER vs. NDS to represent the systems' behavior (Figs. 4 and 5). These graphs are obtained by sweeping out the bandwidth parameter $h$. On these graphs, the actual number of speakers is given by the vertical solid line and the estimated number is given by the dashed line.

VI. RESULTS AND DISCUSSIONS

In this section we provide a detailed study of the effect of the i-vector normalization methods described in Section IV.

A. Parameter tuning on the development set

Fig. 4. Results on the development set obtained with PCA i-vector normalization: Full Mean Shift performances (DER / number of estimated speakers). The minimum of DER, the corresponding bandwidth (h) and the number of detected speakers (#Spk) are also given for each PCA reduction factor r = 80, 60, 50 and 30.

Fig. 5. Results on the development set obtained with PCA i-vector normalization: Selective Mean Shift performances (DER / number of estimated speakers). The minimum of DER, the corresponding bandwidth (h) and the number of detected speakers (#Spk) are also given for each PCA reduction factor r = 80, 60, 50 and 30.

In order to establish a benchmark we first ran the two versions of Mean Shift with PCA normalization of i-vectors. Each graph in Figs. 4 and 5 corresponds to a percentage of retained eigenvalues (r = 80, 60, 50 and 30 respectively). In Figs. 4 and 5 we observe that although the results for the two strategies with r = 30 are slightly better than those with r = 50, the graphs are irregular in the former case, so that taking r = 50 as in [16] seems to be the better course. Note that the optimal DER for all configurations is reached with an overestimation of the number of speakers. Fortunately, overestimation is preferable to underestimation, as it can be remedied by pruning sparsely populated clusters.

Impact of length normalization

We began by testing the effect of length normalization of raw i-vectors before applying PCA. Surprisingly, this simple operation improves the DER in absolute terms (row 3, "Len. n.", in Table 1). With length normalization and r = 50, DER decreases from 11.9 (see Fig. 4) to 10 (Full strategy) and from 1. (see Fig. 5) to 10. (Selective strategy). Furthermore, the number of detected speakers (NDS) in the case of the Selective strategy decreases from 33 to 81, thus approaching the actual value of 103. However, in the case of the Full strategy the detected NDS increases from 177 to 316.

TABLE 1. RESULTS ON THE DEVELOPMENT SET ILLUSTRATING THE EFFECT OF DIFFERENT NORMALIZATION METHODS (DER IS THE DIARIZATION ERROR RATE, NDS THE NUMBER OF DETECTED SPEAKERS, h THE BANDWIDTH AND p THE PRUNING PARAMETER). THE ACTUAL NUMBER OF SPEAKERS IS 103.
Columns: Norm. method | Full MS: DER, NDS, h, p | Selective MS: DER, NDS, h, p
Rows: Len. n. | WCC | BCC | Var. h | Prun.

Impact of within class covariance normalization

In this experiment we first normalize i-vectors using the Cholesky decomposition of the inverse of the WCC matrix, and follow this with length normalization and PCA projection. As we see in row 4 of Table 1, WCC normalization causes performance degradation. The DER increases from 10 to 11.7 in the Full case and from 10. to 11.7 in the Selective case compared to both previous normalization methods, namely PCA and length normalization.
These results were not in line with our expectations derived from our experience in speaker recognition; they may be due to an interaction between the PCA and WCC normalizations.

Impact of between class covariance normalization

We proceeded in a similar way to WCC normalization. We project the data using the Cholesky decomposition of the BCC matrix, followed by length normalization and PCA projection. In row 5 of Table 1 we notice a remarkable twofold improvement compared with the length normalization results. On the one hand, we obtain a DER decrease from 10 to 7.6 for the Full strategy case and from 10. to 7.7 for the Selective case. On the other hand, we detect a number of speakers much nearer to the actual value of 103, particularly in the Selective case (189 speakers).

Conversation-dependent bandwidth Mean Shift

We applied the variable bandwidth scheme given in formula (16) to the previous BCC normalization system. In row 6 of Table 1, we observe a slight improvement in DER for both strategies.

Cluster pruning

Although we succeed in reducing the DER from ~1 to ~7 for both strategies, the estimated number of speakers corresponding to the minimum of DER is still higher than the actual value. As discussed in Section III-F, we prune clusters containing a small number of samples (less than or equal to a constant p) in order to counter this tendency. The corresponding results appear in the last row of Table 1. We observe that for the Full strategy, merging clusters having one instance (p = 1) reduces the estimated number of speakers from 300 to 109 while the DER slightly increases from 7.5 to 8.3. For the Selective strategy, with p = 3 we get a clear improvement in the Number of Detected Speakers (111 speakers instead of 3) while the DER is essentially unaffected, decreasing from 7.6 to 7.5.

B. Results on the test set

As we explained when discussing the evaluation protocol (Section V-E), we now present the results obtained on the test set by using the parameters (bandwidth and pruning factor p) tuned on the development set. Table 2 presents the most important results. The term Fix.
h in row 3 of Table 2 refers to the best system using a fixed bandwidth, presented in row 5 of Table 1 (BCC). In this system we used, respectively, BCC normalization followed by length normalization and PCA projection with r = 50, as optimized on the development set. In row 4 of Table 2 (Var. h), the system is exactly the same as the previous one (the Fix. h system) but with a variable bandwidth. Finally, the last row of Table 2 (Prun.) shows the impact of cluster pruning on the variable bandwidth system (Var. h).

TABLE 2. RESULTS ON THE TEST DATA SET USING OPTIMAL PARAMETERS ESTIMATED ON THE DEVELOPMENT SET. THE TOTAL ACTUAL NUMBER OF SPEAKERS IS 183.
Columns: Norm. method | Full MS: DER, NDS, h, p | Selective MS: DER, NDS, h, p
Rows: Fix. h | Var. h | Prun.

From the results in Table 2 we observe the usefulness of the variable bandwidth in reducing the DER from 14.3 to 1.7 in the Full MS case and from 13.9 to 1.6 in the Selective case. Observe also that the number of detected speakers (NDS) is reduced from 3456 to 550 in the Full MS strategy and from 3089 to 310 in the Selective case. Finally, on the test set, cluster pruning leads to a degradation of the
DER from 1.6 to 14.3 using the Selective strategy, in contrast to what is observed on the development set, where the DER showed a slight improvement. However, cluster pruning on the test set for the Full case is surprisingly helpful, to the point that the DER (1.4) coincides perfectly with the one optimized on the development set (see the bandwidth h in row 7 of Table 1). Given that the above results are obtained using parameters tuned on an independent dataset, this confirms the generalization capability of the cosine-based Mean Shift for both clustering strategies.

Among the publications reporting results on the CallHome dataset [17][18][19][][1], only Vaquero's thesis presents results based on the total DER calculated over all files [1]. He uses speaker factors rather than i-vectors to represent speech segments, and he used a multistage system based principally on Hierarchical Agglomerative Clustering (HAC), k-means and Viterbi segmentation. He also estimated tunable parameters on an independent development set consisting solely of two-speaker recordings. However, he was constrained to provide the actual number of speakers as a stopping criterion for HAC in order to achieve a total DER of 13.7 on the CallHome test set. Without this constraint, the performance was worse. Compared to his results, we were able to achieve a 37% relative improvement in the total DER (see Table 2).

In summary, we presented some results on the development and test sets from which we can draw the following conclusions:

- Length normalization of the raw i-vectors before PCA projection helps in reducing DER.
- PCA with r = 50 offers the best configuration.
- WCC normalization degrades performance.
- BCC normalization, followed by length normalization and PCA, helps to decrease both DER and NDS.
- Variable bandwidth combined with cluster pruning (p = 1), applied after BCC normalization, length normalization and PCA projection, helps in reducing DER and NDS in the Full case.
- Both strategies, namely Full and Selective, perform equivalently well on the development and test sets.

In Fig. 6 we depict the best i-vector normalization protocol that we adopt in this study.

Fig. 6. The best protocol of i-vector normalization for the MS diarization systems. Blocks shown: raw i-vectors; BCC normalization; length normalization; PCA projection; diarization system.

C. Results broken down by the number of speakers

In order to compare our results with those of [17][18][19][] we need to adopt the same convention for presenting diarization results. As mentioned in Section V-E, which describes the evaluation protocol, these works present results broken down by the number of participating speakers. Indeed, the official development set consisted of conversations with just 2 to 4 speakers, so it is hard to avoid tuning on the test set if one wishes to optimize performance on conversations with large numbers of speakers.

TABLE 3. FULL MEAN SHIFT RESULTS ON THE TEST SET DEPICTED AS A FUNCTION OF THE NUMBER OF PARTICIPATING SPEAKERS.
Columns: number of speakers | h / p
Rows (Dev. param.): DER Fix. h | ANDS Fix. h | DER Var. h | ANDS Var. h
Rows (Test param.): DER Fix. h | ANDS Fix. h | DER Var. h | ANDS Var. h

In Tables 3 and 4 we present results broken down by the number of speakers on the test set for the Full and Selective Mean Shift algorithms, with two tunings, one on the development set (rows 2 to 5) and the other on the test set (rows 6 to 9). Recall that the tunable parameters are the nature of the bandwidth (i.e. fixed or variable), its value (i.e. h) and the pruning factor p. It is apparent from the tables that all of the Mean Shift implementations generalize well from the development set to the test set.

From Table 3 we observe firstly that the Full MS implementation does not need any final cluster pruning (i.e. p = 0) when we optimize taking account of the number of participating speakers (see the last column of Table 3). Second, estimating the number of speakers works better with a fixed bandwidth (see rows 3 and 7 of Table 3), and the DERs are almost comparable to those obtained with a conversation-dependent bandwidth (i.e. Var. h).
Generally speaking, variable bandwidth helps in reducing DER for recordings having a small number of speakers (2, 3, 4 speakers). Finally, the most important observation from Table 3 is the high generalization capability of the Full MS, especially in the fixed bandwidth case. Comparing rows 2 and 3 with rows 6 and 7 of Table 3, we see that the optimal parameters for the test set are the same as those for the development set.

TABLE 4
SELECTIVE MEAN SHIFT RESULTS ON TEST DATA SET DEPICTED AS A FUNCTION OF THE NUMBER OF PARTICIPATING SPEAKERS.
Columns: speakers number; h / p. Rows: Dev. param. (DER and ANDS for Fix. h; DER and ANDS for Var. h); Test param. (DER and ANDS for Fix. h; DER and ANDS for Var. h).
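The kind of per-category summary reported in Tables 3 and 4 can be illustrated with a minimal sketch. It assumes that, for every test file, the true speaker count, the detected speaker count, the error time and the scored speech time are available, and that the DER of a category is pooled over its files (total error time divided by total scored time). The function name, the dictionary keys and the toy numbers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def summarize_by_speaker_count(files):
    """Group per-file results by true speaker count.

    `files` is an iterable of dicts with the (hypothetical) keys
    n_true, n_detected, error_time and scored_time (times in seconds).
    Returns {n_true: {"der": percent, "ands": average detected speakers}}.
    """
    groups = defaultdict(list)
    for f in files:
        groups[f["n_true"]].append(f)

    summary = {}
    for n_true, members in sorted(groups.items()):
        scored = sum(m["scored_time"] for m in members)
        error = sum(m["error_time"] for m in members)
        summary[n_true] = {
            # DER pooled over the category (assumption: time-weighted)
            "der": 100.0 * error / scored,
            # ANDS: plain average of the detected speaker counts
            "ands": sum(m["n_detected"] for m in members) / len(members),
        }
    return summary

# Made-up toy values, for illustration only
toy_files = [
    {"n_true": 2, "n_detected": 2, "error_time": 10.0, "scored_time": 100.0},
    {"n_true": 2, "n_detected": 3, "error_time": 30.0, "scored_time": 100.0},
    {"n_true": 3, "n_detected": 3, "error_time": 5.0, "scored_time": 50.0},
]
toy_summary = summarize_by_speaker_count(toy_files)
```

Pooling error time over a category, rather than averaging per-file DERs, keeps long conversations from being under-weighted; whether the tables were computed exactly this way is an assumption.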
From the results depicted in Table 4 we observe that the final cluster pruning is necessary in the Selective MS case. Compared to the Full MS results in Table 3, we observe that DERs are similar but the Selective strategy outperforms the Full one regarding the average number of detected speakers (ANDS). The combination of the variable bandwidth with the final cluster pruning (p = 3) enables us to get the best results, both for DER and Average Number of Detected Speakers (see rows 4 and 5 and rows 8 and 9 in Table 4). The ANDS values are in fact very close to the actual numbers (row 6 vs. row 1) with a slight overestimation, except in the case of the 6-speaker files, where there is a slight underestimation (5.8). Finally, we observe that the Full strategy generalizes better than the Selective one, in the sense that we were able to reach the best performance on the test set using development tunable parameters.

Viterbi re-segmentation

Refining segment boundaries between speaker turns using Viterbi re-segmentation is a standard procedure for improving diarization system performance. Results reported in Table 5 show its effectiveness when combined with the Mean Shift algorithms. Note that the results without Viterbi re-segmentation (gray entries in Table 5) are the results presented in the 6th and 8th rows in Tables 3 and 4.

TABLE 5
IMPACT OF VITERBI RE-SEGMENTATION ON THE TEST SET RESULTS USING PARAMETERS ESTIMATED ON TEST DATA, DEPICTED AS A FUNCTION OF THE NUMBER OF PARTICIPATING SPEAKERS AND MEASURED WITH DER.
Rows: Full MS and Selective MS, each with Fix. h and Var. h, each without and with Viterbi re-segmentation.

Comparison with existing state-of-the-art results

We conclude this section with a comparison between our results and those obtained by other authors on the CallHome data, although there are several factors which make back-to-back comparisons difficult. Contrary to [21] and our work, the authors of [17][19][20] did not use a development set independent of the test set for parameter tuning. Furthermore, in [19] and [20], the authors assumed prior hypotheses about the maximum number of speakers within a slice of speech, and in [19] estimating the number of speakers was done separately from speaker clustering.

We compare graphically in Fig. 7 the results, as measured by DER, of our best configurations with Viterbi re-segmentation of the Full and Selective strategies (i.e. the Full and Selective systems presented in the 4th and 8th rows of Table 5, respectively) with those in [19][20][17]. It is evident that our results as measured by DER are in line with the state of the art. To be clear, since our results were taken from Tables 3 and 4, there was some tuning on the test set, as in Dalmasso et al. [19], Castaldo et al. [20], and Shum et al. [17]. Furthermore, the comparison based on the average number of detected speakers is not possible except in the case of Dalmasso et al. [19]. In Table 6 we compare our best results from Table 4 using this criterion with those of [19]. The results are similar, although the Mean Shift algorithm tends to overestimate the speaker number.

TABLE 6
COMPARISON WITH DALMASSO RESULTS BASED ON THE AVERAGE OF THE NUMBER OF DETECTED SPEAKERS (ANDS).
Rows: actual number of speakers; Dalmasso et al. [19]; Selective MS.

Fig. 7. Comparison of Full and Selective Mean Shift clustering algorithms with state-of-the-art results based on DER for each category of CallHome test set recordings having the same number of speakers.

D. Time complexity

Time complexity is not a major concern in this study, but Fig. 8 illustrates the difference between the Full and the Selective strategies in this regard. The average time for the Full case is [...] seconds per file vs. [...] seconds for the Selective case.

Fig. 8. Time complexities of the Full and Selective strategies calculated in seconds on each conversation of the development set (vertical axis: time in seconds; horizontal axis: recordings of the CallHome development set). The horizontal lines indicate the processing times averaged over all files.
VII. CONCLUSION

This paper provides a detailed study of the application of the nonparametric Mean Shift algorithm to the problem of speaker clustering in diarizing telephone speech conversations, using two variants of the basic clustering algorithm (the Full and Selective versions). We have supplied in the Appendix a convergence proof which justifies our extension of the Mean Shift algorithm from the Euclidean distance metric to the cosine distance metric. We have shown how, together with an i-vector representation of speaker turns, this simple approach to the speaker clustering problem can handle several difficult problems: short speaker turns, varying numbers of speakers and varying conversation durations. With a single-pass clustering strategy (that is, without Viterbi re-segmentation) we were able to achieve a 37% relative improvement as measured by global diarization error rate on the CallHome data, using as a benchmark [21], the only other study that evaluates performance in this way. We have seen how our results using other metrics are similar to the state of the art as reported by other authors [16][17][19][20].

We have seen that refining speaker boundaries with Viterbi re-segmentation is also helpful. Using segment boundaries obtained in this way could serve as a good initialization for a second pass of Mean Shift clustering. An interesting complication that would arise in exploring this avenue is that speaker turns would be of much more variable duration than in the first pass, which is based on the uniform segmentation described in Section V.D. Since the uncertainty entailed in estimating an i-vector is greater in the case of short speaker turns than in the case of long speaker turns, this suggests that taking account of this uncertainty as in [32] would be helpful.

APPENDIX

In this appendix we present the mathematical convergence proof of the cosine distance-based Mean Shift. Indeed, this proof is very similar to the one of Theorem 1 presented in [8].

Theorem 1 [8]: if the kernel k has a convex and monotonically decreasing profile, the sequence $\{\hat f_i\}_{i=1,2,\ldots}$ converges, and is monotonically increasing.
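Before turning to the proof, the iteration whose convergence is at issue can be illustrated with a minimal sketch. It assumes a flat kernel (every vector within cosine distance h of the current position receives the same weight), unit-length vectors, and a bandwidth h below 1 so that the window mean is never the zero vector. The function names and the toy data are hypothetical; the sketch is not the authors' implementation and covers only the shift of a single starting point, not the Full or Selective clustering strategies.

```python
import numpy as np

def length_normalize(x):
    """Project vectors onto the unit sphere (along the last axis)."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def cosine_mean_shift(start, data, h, max_iter=100, tol=1e-10):
    """Shift `start` towards a mode using a flat kernel on cosine distance.

    `data` is an (n, d) array of unit-length row vectors and `h` is the
    bandwidth expressed as a cosine distance (assumed to be below 1).
    """
    y = length_normalize(np.asarray(start, dtype=float))
    for _ in range(max_iter):
        dist = 1.0 - data @ y              # cosine distance to every vector
        window = data[dist <= h]           # flat kernel: keep the neighbours
        if window.shape[0] == 0:           # nothing in range: stop shifting
            break
        # weighted mean (equal weights here), pushed back onto the unit sphere
        y_next = length_normalize(window.mean(axis=0))
        converged = 1.0 - float(y_next @ y) < tol
        y = y_next
        if converged:
            break
    return y

# Toy data: two tight groups of 2-D unit vectors around angles 0 and pi/2
angles = np.array([-0.05, 0.0, 0.05,
                   np.pi / 2 - 0.05, np.pi / 2, np.pi / 2 + 0.05])
toy_data = np.stack([np.cos(angles), np.sin(angles)], axis=1)
mode_a = cosine_mean_shift(toy_data[0], toy_data, h=0.1)
mode_b = cosine_mean_shift(toy_data[5], toy_data, h=0.1)
```

Running the shift from every vector and grouping the vectors whose end points coincide would give a simple clustering; how end points are merged is left out of the sketch.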
Let us suppose that all vectors in our dataset are constrained to live on the unit sphere by normalizing their Euclidean norm during the MS convergence process. Then

$$\hat f_{i+1} - \hat f_i = c \sum_{j=1}^{n} \left[ k\!\left(\frac{1 - y_{i+1} \cdot x_j}{h}\right) - k\!\left(\frac{1 - y_i \cdot x_j}{h}\right) \right].$$

Due to the convexity of the profile,

$$k(x_2) \ge k(x_1) + k'(x_1)\,(x_2 - x_1),$$

and since $g(x) = -k'(x)$ from (6), then

$$k(x_2) - k(x_1) \ge g(x_1)\,(x_1 - x_2).$$

We obtain

$$\hat f_{i+1} - \hat f_i \ge c \sum_{j=1}^{n} g\!\left(\frac{1 - y_i \cdot x_j}{h}\right) \left[ \frac{1 - y_i \cdot x_j}{h} - \frac{1 - y_{i+1} \cdot x_j}{h} \right] = \frac{c}{h} \sum_{j=1}^{n} g\!\left(\frac{1 - y_i \cdot x_j}{h}\right) \left( y_{i+1} \cdot x_j - y_i \cdot x_j \right).$$

We know from (8) and (11) that the (i+1)-th position $y_{i+1}$ is equal to the weighted mean vector, so

$$y_{i+1} \sum_{j=1}^{n} g\!\left(\frac{1 - y_i \cdot x_j}{h}\right) = \sum_{j=1}^{n} g\!\left(\frac{1 - y_i \cdot x_j}{h}\right) x_j .$$

Thus:

$$\hat f_{i+1} - \hat f_i \ge \frac{c}{h} \sum_{j=1}^{n} g\!\left(\frac{1 - y_i \cdot x_j}{h}\right) \left( y_{i+1} \cdot y_{i+1} - y_i \cdot y_{i+1} \right) = \frac{c}{h} \sum_{j=1}^{n} g\!\left(\frac{1 - y_i \cdot x_j}{h}\right) \left( 1 - y_{i+1} \cdot y_i \right) \ge 0,$$

with equality iff $y_{i+1} = y_i$. The sequence $\{\hat f_i\}_{i=1,2,\ldots}$ is bounded and monotonically increasing, and so is convergent. This argument does not show that $\{y_i\}_{i=1,2,\ldots}$ is convergent (it may be possible to construct pathological examples in which $\{\hat f_i\}_{i=1,2,\ldots}$ converges but $\{y_i\}_{i=1,2,\ldots}$ does not), but it establishes convergence of the Mean Shift algorithm in the same sense as convergence of the EM algorithm is demonstrated in [33].

ACKNOWLEDGMENT

First we would like to thank the editor as well as the reviewers for their helpful comments. Then, we would like to thank Stephen Shum and Najim Dehak from MIT for their useful discussions and feedback and also for sharing their initial segmentation with us. We would also like to thank our colleague Vishwa Gupta for his help with the Viterbi re-segmentation software and our colleague Pierre Ouellet for his help with other software.

REFERENCES

[1] G. Schwarz, "Estimating the dimension of a model," Ann. Statist. 6.
[2] S. S. Chen and P. Gopalakrishnan, "Clustering via the Bayesian information criterion with applications in speech recognition," in ICASSP '98, Seattle, USA, 1998.
[3] F. Valente, "Variational Bayesian methods for audio indexing," Ph.D. dissertation, Eurecom, Sep. 2005.
[4] P. Kenny, D. Reynolds and F.
Castaldo, "Diarization of Telephone Conversations using Factor Analysis," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 6, Dec. 2010.
[5] Margarita Kotti, Vassiliki Moschou, Constantine Kotropoulos, "Speaker segmentation and clustering," Signal Processing, vol. 88, no. 5, May 2008.
[6] K. Fukunaga and L. Hostetler, "The estimation of the gradient of a density function, with applications in pattern recognition," IEEE Trans. on Information Theory, vol. 21, no. 1, pp. 32-40, January 1975.
[7] Y. Cheng, "Mean Shift, Mode Seeking, and Clustering," IEEE Trans. PAMI, vol. 17, no. 8.
[8] D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, May 2002.
[9] T. Stafylakis, V. Katsouros, and G. Carayannis, "Speaker clustering via the mean shift algorithm," in Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 2010.
[10] T. Stafylakis, V. Katsouros, P. Kenny, and P. Dumouchel, "Mean Shift Algorithm for Exponential Families with Applications to Speaker Clustering," Proc. Odyssey Speaker and Language Recognition Workshop, Singapore, June 2012.
[11] M. Senoussaoui, P. Kenny, P. Dumouchel and T. Stafylakis, "Efficient Iterative Mean Shift based Cosine Dissimilarity for Multi-Recording Speaker Clustering," in Proceedings of ICASSP, 2013.
[12] N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 4, May 2011.
[13] N. Dehak, R. Dehak, J. Glass, D. Reynolds, and P. Kenny, "Cosine Similarity Scoring without Score Normalization Techniques," Proc. IEEE Odyssey Workshop, Brno, Czech Republic, June 2010.
[14] N. Dehak, Z. Karam, D. Reynolds, R. Dehak, W. Campbell, and J. Glass, "A Channel-Blind System for Speaker Verification," Proc. ICASSP, Prague, Czech Republic, May 2011.
[15] M. Senoussaoui, P. Kenny, N. Dehak and P. Dumouchel, "An i-vector Extractor Suitable for Speaker Recognition with both Microphone and Telephone Speech," in Proc. Odyssey Speaker and Language Recognition Workshop, Brno, Czech Republic, June 2010.
[16] S. Shum, N. Dehak, E. Chuangsuwanich, D. Reynolds, and J. Glass, "Exploiting Intra-Conversation Variability for Speaker Diarization," Proc. Interspeech, Florence, Italy, August 2011.
[17] S. Shum, N. Dehak, and J. Glass, "On the Use of Spectral and Iterative Methods for Speaker Diarization," Proc. Interspeech, Portland, Oregon, September 2012.
[18] A. Martin and M.
Przybocki, "Speaker recognition in a multi-speaker environment," in Proceedings of Eurospeech, 2001.
[19] E. Dalmasso, P. Laface, D. Colibro, C. Vair, "Unsupervised Segmentation and Verification of Multi-Speaker Conversational Speech," Proc. Interspeech, 2005.
[20] F. Castaldo, D. Colibro, E. Dalmasso, P. Laface, and C. Vair, "Stream-based speaker segmentation using speaker factors and eigenvoices," in Proceedings of ICASSP, 2008.
[21] C. Vaquero Avilés-Casco, "Robust Diarization For Speaker Characterization (Diarización Robusta Para Caracterización De Locutores)," Ph.D. dissertation, Zaragoza University, 2011.
[22] N. Dehak, P. Torres-Carrasquillo, D. Reynolds, and R. Dehak, "Language Recognition via I-vectors and Dimensionality Reduction," Proc. Interspeech, Florence, Italy, August 2011.
[23] D. Martinez, Oldrich Plchot, Lukas Burget, Ondrej Glembek and Pavel Matejka, "Language Recognition in iVectors Space," Proceedings of Interspeech, Florence, Italy, August 2011.
[24] H. Tang, S. M. Chu, M. Hasegawa-Johnson and T. S. Huang, "Partially Supervised Speaker Clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 5, pp. 959-971, May 2012.
[25] P. Kenny, "Joint factor analysis of speaker and session variability: theory and algorithms," Technical report CRIM-06/08-14, 2006.
[26] P. Kenny, G. Boulianne, and P. Dumouchel, "Eigenvoice modeling with sparse training data," IEEE Transactions on Speech and Audio Processing, May 2005.
[27] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(5).
[28] B. Georgescu, I. Shimshoni, and P. Meer, "Mean shift based clustering in high dimensions: A texture classification example," in Proceedings of International Conference on Computer Vision.
[29] U. Ozertem, D. Erdogmus, R. Jenssen, "Mean shift spectral clustering," Pattern Recognition, vol. 41, no. 6, June 2008.
[30] D. Garcia-Romero, "Analysis of i-vector length normalization in Gaussian-PLDA speaker recognition systems," in Proceedings of Interspeech, Florence, Italy, Aug. 2011.
[31] D. Comaniciu, V. Ramesh, and P.
Meer, "The Variable Bandwidth Mean Shift and Data-Driven Scale Selection," Proc. Eighth Int'l Conf. Computer Vision, vol. I, July 2001.
[32] P. Kenny, T. Stafylakis, P. Ouellet, J. Alam, and P. Dumouchel, "PLDA for Speaker Verification with Utterances of Arbitrary Duration," in Proceedings of ICASSP, Vancouver, Canada, May 2013.
[33] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B (Methodological), vol. 39, no. 1, pp. 1-38.

M. Senoussaoui received the Engineer degree in Artificial Intelligence in 2005 and the Magister (Masters) degree in 2007 from Université des Sciences et de la Technologie d'Oran, Algeria. Currently he is a PhD student at the École de technologie supérieure (ÉTS) of Université du Québec, Canada, and is also with the Centre de recherche informatique de Montréal (CRIM), Canada. His research interests are concentrated on the application of Pattern Recognition and Machine Learning methods to the speaker verification and diarization problems.

P. Kenny received the BA degree in Mathematics from Trinity College, Dublin and the MSc and PhD degrees, also in Mathematics, from McGill University. He was a professor of Electrical Engineering at INRS-Telecommunications in Montreal from 1990 to 1995, when he started up a company (Spoken Word Technologies) to spin off INRS's speech recognition technology. He joined CRIM in 1998, where he now holds the position of principal research scientist. His current research interests are in text-dependent and text-independent speaker recognition, with particular emphasis on Bayesian methods such as Joint Factor Analysis and Probabilistic Linear Discriminant Analysis.

T. Stafylakis received the Diploma degree in electrical and computer engineering from the National Technical University of Athens (NTUA), Athens, Greece, and the M.Sc. degree in communication and signal processing from Imperial College London, London, U.K., in 2004 and 2005, respectively. He received his Ph.D. from NTUA on speaker diarization, while working for the Institute for Language and Speech Processing, Athens, as research assistant.
Since 2011, he has been a post-doc researcher at CRIM and ETS, under the supervision of Patrick Kenny and Pierre Dumouchel, respectively. His current interests are speaker recognition and diarization, Bayesian modeling and multimedia signal analysis.

P. Dumouchel received the B.Eng. (McGill University), M.Sc. (INRS-Télécommunications) and PhD (INRS-Télécommunications) degrees, and has over 5 years of experience in the field of speech recognition, speaker recognition and emotion detection. Pierre is Chairman and Professor at the Software Engineering and IT Department at École de technologie supérieure (ETS) of Université du Québec, Canada.
The EM Algorthm for Mxtures of Factor Analyzers Zoubn Ghahraman Georey E. Hnton Department of Computer Scence Unversty oftoronto 6 Kng's College Road Toronto, Canada M5S A4 Emal: zoubn@cs.toronto.edu Techncal
More informationCan Auto Liability Insurance Purchases Signal Risk Attitude?
Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? ChuShu L Department of Internatonal Busness, Asa Unversty, Tawan ShengChang
More informationFast Fuzzy Clustering of Web Page Collections
Fast Fuzzy Clusterng of Web Page Collectons Chrstan Borgelt and Andreas Nürnberger Dept. of Knowledge Processng and Language Engneerng OttovonGuerckeUnversty of Magdeburg Unverstätsplatz, D396 Magdeburg,
More informationIMPACT ANALYSIS OF A CELLULAR PHONE
4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng
More informationNasdaq Iceland Bond Indices 01 April 2015
Nasdaq Iceland Bond Indces 01 Aprl 2015 Fxed duraton Indces Introducton Nasdaq Iceland (the Exchange) began calculatng ts current bond ndces n the begnnng of 2005. They were a response to recent changes
More informationSIX WAYS TO SOLVE A SIMPLE PROBLEM: FITTING A STRAIGHT LINE TO MEASUREMENT DATA
SIX WAYS TO SOLVE A SIMPLE PROBLEM: FITTING A STRAIGHT LINE TO MEASUREMENT DATA E. LAGENDIJK Department of Appled Physcs, Delft Unversty of Technology Lorentzweg 1, 68 CJ, The Netherlands Emal: e.lagendjk@tnw.tudelft.nl
More informationLearning from Multiple Outlooks
Learnng from Multple Outlooks Maayan Harel Department of Electrcal Engneerng, Technon, Hafa, Israel She Mannor Department of Electrcal Engneerng, Technon, Hafa, Israel maayanga@tx.technon.ac.l she@ee.technon.ac.l
More informationA FASTER EXTERNAL SORTING ALGORITHM USING NO ADDITIONAL DISK SPACE
47 A FASTER EXTERAL SORTIG ALGORITHM USIG O ADDITIOAL DISK SPACE Md. Rafqul Islam +, Mohd. oor Md. Sap ++, Md. Sumon Sarker +, Sk. Razbul Islam + + Computer Scence and Engneerng Dscplne, Khulna Unversty,
More informationGender Classification for RealTime Audience Analysis System
Gender Classfcaton for RealTme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,
More informationAryabhata s Root Extraction Methods. Abhishek Parakh Louisiana State University Aug 31 st 2006
Aryabhata s Root Extracton Methods Abhshek Parakh Lousana State Unversty Aug 1 st 1 Introducton Ths artcle presents an analyss of the root extracton algorthms of Aryabhata gven n hs book Āryabhatīya [1,
More informationA hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm
Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):18841889 Research Artcle ISSN : 09757384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel
More informationExtending Probabilistic Dynamic Epistemic Logic
Extendng Probablstc Dynamc Epstemc Logc Joshua Sack May 29, 2008 Probablty Space Defnton A probablty space s a tuple (S, A, µ), where 1 S s a set called the sample space. 2 A P(S) s a σalgebra: a set
More informationTraffic State Estimation in the Traffic Management Center of Berlin
Traffc State Estmaton n the Traffc Management Center of Berln Authors: Peter Vortsch, PTV AG, Stumpfstrasse, D763 Karlsruhe, Germany phone ++49/72/965/35, emal peter.vortsch@ptv.de Peter Möhl, PTV AG,
More informationThe Analysis of Covariance. ERSH 8310 Keppel and Wickens Chapter 15
The Analyss of Covarance ERSH 830 Keppel and Wckens Chapter 5 Today s Class Intal Consderatons Covarance and Lnear Regresson The Lnear Regresson Equaton TheAnalyss of Covarance Assumptons Underlyng the
More informationPerformance Analysis and Coding Strategy of ECOC SVMs
Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.6776 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School
More informationEye Center Localization on a Facial Image Based on MultiBlock Local Binary Patterns
Eye Center Localzaton on a Facal Image Based on MultBloc Local Bnary Patterns Anatoly tn, Vladmr Khryashchev, Olga Stepanova Yaroslavl State Unversty Yaroslavl, Russa anatolyntnyar@gmal.com, vhr@yandex.ru,
More informationTHE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek
HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo
More informationA DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATIONBASED OPTIMIZATION. Michael E. Kuhl Radhamés A. TolentinoPeña
Proceedngs of the 2008 Wnter Smulaton Conference S. J. Mason, R. R. Hll, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATIONBASED OPTIMIZATION
More informationCalculating the high frequency transmission line parameters of power cables
< ' Calculatng the hgh frequency transmsson lne parameters of power cables Authors: Dr. John Dcknson, Laboratory Servces Manager, N 0 RW E B Communcatons Mr. Peter J. Ncholson, Project Assgnment Manager,
More informationVoIP Playout Buffer Adjustment using Adaptive Estimation of Network Delays
VoIP Playout Buffer Adjustment usng Adaptve Estmaton of Network Delays Mroslaw Narbutt and Lam Murphy* Department of Computer Scence Unversty College Dubln, Belfeld, Dubln, IRELAND Abstract The poor qualty
More informationECE544NA Final Project: Robust Machine Learning Hardware via Classifier Ensemble
1 ECE544NA Fnal Project: Robust Machne Learnng Hardware va Classfer Ensemble Sa Zhang, szhang12@llnos.edu Dept. of Electr. & Comput. Eng., Unv. of Illnos at UrbanaChampagn, Urbana, IL, USA Abstract In
More informationBayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending
Proceedngs of 2012 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 25 (2012) (2012) IACSIT Press, Sngapore Bayesan Network Based Causal Relatonshp Identfcaton and Fundng Success
More informationThe Current Employment Statistics (CES) survey,
Busness Brths and Deaths Impact of busness brths and deaths n the payroll survey The CES probabltybased sample redesgn accounts for most busness brth employment through the mputaton of busness deaths,
More informationA Fast Incremental Spectral Clustering for Large Data Sets
2011 12th Internatonal Conference on Parallel and Dstrbuted Computng, Applcatons and Technologes A Fast Incremental Spectral Clusterng for Large Data Sets Tengteng Kong 1,YeTan 1, Hong Shen 1,2 1 School
More informationDocument Clustering Analysis Based on Hybrid PSO+Kmeans Algorithm
Document Clusterng Analyss Based on Hybrd PSO+Kmeans Algorthm Xaohu Cu, Thomas E. Potok Appled Software Engneerng Research Group, Computatonal Scences and Engneerng Dvson, Oak Rdge Natonal Laboratory,
More informationThe OC Curve of Attribute Acceptance Plans
The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4
More informationOn Mean Squared Error of Hierarchical Estimator
S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta
More informationAnalysis of Premium Liabilities for Australian Lines of Business
Summary of Analyss of Premum Labltes for Australan Lnes of Busness Emly Tao Honours Research Paper, The Unversty of Melbourne Emly Tao Acknowledgements I am grateful to the Australan Prudental Regulaton
More informationMANY machine learning and pattern recognition applications
1 Trace Rato Problem Revsted Yangqng Ja, Fepng Ne, and Changshu Zhang Abstract Dmensonalty reducton s an mportant ssue n many machne learnng and pattern recognton applcatons, and the trace rato problem
More informationCausal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting
Causal, Explanatory Forecastng Assumes causeandeffect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of
More informationImplementation of Deutsch's Algorithm Using Mathcad
Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages  n "Machnes, Logc and Quantum Physcs"
More informationActivity Scheduling for CostTime Investment Optimization in Project Management
PROJECT MANAGEMENT 4 th Internatonal Conference on Industral Engneerng and Industral Management XIV Congreso de Ingenería de Organzacón Donosta San Sebastán, September 8 th 10 th 010 Actvty Schedulng
More informationDamage detection in composite laminates using cointap method
Damage detecton n composte lamnates usng contap method S.J. Km Korea Aerospace Research Insttute, 45 EoeunDong, YouseongGu, 35333 Daejeon, Republc of Korea yaeln@kar.re.kr 45 The contap test has the
More informationStudy on CET4 Marks in China s Graded English Teaching
Study on CET4 Marks n Chna s Graded Englsh Teachng CHE We College of Foregn Studes, Shandong Insttute of Busness and Technology, P.R.Chna, 264005 Abstract: Ths paper deploys Logt model, and decomposes
More informationSketching Sampled Data Streams
Sketchng Sampled Data Streams Florn Rusu, Aln Dobra CISE Department Unversty of Florda Ganesvlle, FL, USA frusu@cse.ufl.edu adobra@cse.ufl.edu Abstract Samplng s used as a unversal method to reduce the
More informationA Multimode Image Tracking System Based on Distributed Fusion
A Multmode Image Tracng System Based on Dstrbuted Fuson Ln zheng Chongzhao Han Dongguang Zuo Hongsen Yan School of Electroncs & nformaton engneerng, X an Jaotong Unversty X an, Shaanx, Chna Lnzheng@malst.xjtu.edu.cn
More informationRealistic Image Synthesis
Realstc Image Synthess  Combned Samplng and Path Tracng  Phlpp Slusallek Karol Myszkowsk Vncent Pegoraro Overvew: Today Combned Samplng (Multple Importance Samplng) Renderng and Measurng Equaton Random
More informationStatistical Approach for Offline Handwritten Signature Verification
Journal of Computer Scence 4 (3): 181185, 2008 ISSN 15493636 2008 Scence Publcatons Statstcal Approach for Offlne Handwrtten Sgnature Verfcaton 2 Debnath Bhattacharyya, 1 Samr Kumar Bandyopadhyay, 2
More informationNaive Rule Induction for Text Classification based on Keyphrases
Nave Rule Inducton for Text Classfcaton based on Keyphrases Nktas N. Karankolas & Chrstos Skourlas Department of Informatcs, Technologcal Educatonal Insttute of Athens, Greece. Abstract In ths paper,
More informationBERNSTEIN POLYNOMIALS
OnLne Geometrc Modelng Notes BERNSTEIN POLYNOMIALS Kenneth I. Joy Vsualzaton and Graphcs Research Group Department of Computer Scence Unversty of Calforna, Davs Overvew Polynomals are ncredbly useful
More informationEstimating the Number of Clusters in Genetics of Acute Lymphoblastic Leukemia Data
Journal of Al Azhar UnverstyGaza (Natural Scences), 2011, 13 : 109118 Estmatng the Number of Clusters n Genetcs of Acute Lymphoblastc Leukema Data Mahmoud K. Okasha, Khaled I.A. Almghar Department of
More informationThe eigenvalue derivatives of linear damped systems
Control and Cybernetcs vol. 32 (2003) No. 4 The egenvalue dervatves of lnear damped systems by YeongJeu Sun Department of Electrcal Engneerng IShou Unversty Kaohsung, Tawan 840, R.O.C emal: yjsun@su.edu.tw
More informationCapital asset pricing model, arbitrage pricing theory and portfolio management
Captal asset prcng model, arbtrage prcng theory and portfolo management Vnod Kothar The captal asset prcng model (CAPM) s great n terms of ts understandng of rsk decomposton of rsk nto securtyspecfc rsk
More information