ROBUST TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING SHORT TEST AND TRAINING SESSIONS
18th European Signal Processing Conference (EUSIPCO-2010), Aalborg, Denmark, August 23-27, 2010

Christos Tzagkarakis and Athanasios Mouchtaris
Department of Computer Science, University of Crete, and Institute of Computer Science (FORTH-ICS), Foundation for Research and Technology - Hellas, Heraklion, Crete, Greece
{tzagarak, mouchtar}@ics.forth.gr

This work has been funded in part by the PEOPLE-IAPP AVID-MODE grant, within the 7th European Community Framework Program, and in part by the University of Crete Research Committee, Grant KA2739.

ABSTRACT

In this paper two methods for noise-robust text-independent speaker identification are described and compared against a baseline method for speaker identification based on the Gaussian Mixture Model (GMM). The two methods proposed in this paper are: (a) a statistical approach based on the Generalized Gaussian Density (GGD), and (b) a Sparse Representation Classification (SRC) method. The performance of each method is evaluated on a database containing twelve speakers. The main contribution of the paper is to investigate whether the SRC and GGD approaches can achieve robust speaker identification performance under noisy conditions using short-duration testing and training data, relative to the baseline method. Our simulations indicate that the SRC approach significantly outperforms the other two methods under the short test and training sessions restriction, for all the signal-to-noise ratio (SNR) cases that were examined.

1. INTRODUCTION

Speaker recognition systems are essential in a variety of security and commercial applications, such as information retrieval, control of financial transactions, control of entrance into safe or reserved areas and buildings, etc. [1]. Speaker recognition can be based on the separate or combined use of several biometric features [2] (voice, face, fingerprints, etc.). In our study, we focus on speaker recognition using only voice patterns. Speaker recognition can be categorized into speaker verification and speaker identification. In speaker verification, a speaker claims to be of a certain identity and his/her voice is used to verify this claim. On the other hand, speaker identification is the task of determining an unknown speaker's identity. Generally speaking, speaker verification is a one-to-one match where one speaker's voice is matched to one template ("voice print" or "voice model"), whereas speaker identification is a one-to-N match where the voice is compared against N templates. Speaker recognition methods can also be divided into text-dependent and text-independent methods. The former require the speaker to provide utterances of keywords or sentences, the same text being used for both training and recognition. In text-independent recognition, the decision does not rely on a specific text being spoken. In our study we focus on text-independent speaker identification. In order to correctly identify a person, each speaker in the database is usually assigned a specific speaker model consistently describing the extracted speech features. During the identification process, the system returns the speaker's identity based on the closest match of the test utterance against all speaker models. This procedure has proven to be effective under matched training and testing acoustic conditions [3].
However, in practical applications where speech signals are corrupted by noise, due either to the environment in which the speaker is present (e.g. the user is crossing a busy street) or to the voice transmission medium (e.g. the user is speaking through a cell phone), robust identification is a challenging problem. The most popular approach for speaker identification is based on Gaussian Mixture Models (GMM) [3] (a brief description is given in Section 2.1). Other classifiers, such as Support Vector Machines (SVM) [4], have also been used for this task. Recently, the focus of the speaker recognition research community has been both on the study of features that are more robust in noisy environments and on finding more robust and efficient identification algorithms. Specifically, in [5] robust features based on mel-scale frequency cepstral coefficients (MFCCs [6]) are proposed, in combination with a projection measure technique for speaker identification. In [7], the speech features are based on a harmonic decomposition of the signal, where a reliable frame weighting method is adopted for noise compensation. In [8], the descriptors introduced are based on the AM-FM representation of the speech signal, while in [9] the proposed features are derived from auditory filtering and cepstral analysis (in both cases a GMM is used to model the feature space). In [10, 11] the noise-robust speaker identification problem under mismatched testing and training conditions is studied. In [10], identification is performed in the space of adapted GMMs, where the Bhattacharyya shape is used to measure the closeness of speaker models, while in [11] multicondition model training and missing-feature theory are adopted to deal with the training and testing mismatch, and this model is incorporated into a GMM for noise-robust speaker identification. An important aspect of speaker identification is that in real-time applications the system should be able to respond within a short time about the identity of the speaker. However, when the number of enrolled speakers in the database grows significantly, it is quite difficult for the system to quickly assign the speaker a specific identity. For addressing such real-time efficiency concerns, in [12] a method based on approximating GMM likelihood scoring with an approximated cross entropy is proposed. In [13], the GMM-based speaker models are clustered using a k-means algorithm so as to select only a small proportion of speaker models for the likelihood computations. These approaches achieve a more efficient operation compared to the state of the art, without degrading the identification performance in large-population databases. In this paper, we study the problem of noise-robust text-independent speaker identification under the assumption of short testing and training sessions. There are two reasons for following this approach: (i) it is often not feasible to have large amounts of training data from all the speakers, and (ii) in order to speed up the identification process, the testing data (i.e., the speaker utterance to be identified) should be as short as possible. In this direction, two methods are proposed and tested under noisy conditions (additive white Gaussian noise), and compared to a baseline GMM method [3].
The first approach is adopted from the music classification task [14], while the second method is based on sparse representation classification, which was recently proposed and applied to robust face recognition [15].
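All three classifiers compared in this paper operate on MFCC feature vectors extracted from overlapping frames (Section 2), using the settings reported in Section 3 (20 coefficients, 20 msec frames with 10 msec shift, pre-emphasis). A minimal extraction sketch under those settings is given below; it assumes the librosa library and a hypothetical file name, and is not the authors' original feature pipeline.

```python
import numpy as np
import librosa

def extract_mfcc(path, sr=22050, n_mfcc=20, frame_ms=20, shift_ms=10, preemph=0.97):
    """Return an (n_frames, n_mfcc) matrix of MFCC vectors for one utterance."""
    y, sr = librosa.load(path, sr=sr)
    # Pre-emphasis filter H(z) = 1 - 0.97 z^{-1}
    y = np.append(y[0], y[1:] - preemph * y[:-1])
    n_fft = int(sr * frame_ms / 1000)
    hop = int(sr * shift_ms / 1000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, n_fft=n_fft, hop_length=hop)
    return mfcc.T  # one feature vector x_t per row

# Example (hypothetical file name):
# X = extract_mfcc("speaker01_utt01.wav")
```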
2. CLASSIFICATION METHODS

In this section, a brief description of the methods used to perform the identification is given. For the feature extraction task it is assumed that the speech signal/utterance is segmented into overlapping frames. In this paper we use MFCC features [6].

2.1 Gaussian Mixture Model

Gaussian Mixture Models (GMMs) have been applied with great success to the text-independent speaker identification problem [3]. The approach is to model the probability density function (PDF) of the feature space of each speaker in the dataset (training phase) as a sum of Gaussian functions, and then use the maximum a-posteriori rule to identify the speaker. A Gaussian mixture density is a weighted sum of M multidimensional Gaussian densities, where the mixture model can be represented as

$\lambda_i = \{\, p_m^i, \mu_m^i, \Sigma_m^i \,\}, \quad m = 1,\dots,M,$ (1)

where, for the i-th speaker, $p_m^i$ is the weight of the m-th mixture (prior probability), $\mu_m^i$ is the corresponding mean vector, $\Sigma_m^i$ is the covariance matrix, and M is the total number of Gaussian mixtures. Each speaker is represented by a GMM and the corresponding model $\lambda$, whose parameters are computed via the Expectation-Maximization (EM) algorithm applied to the training features. For the speaker identification task (testing phase), the estimated speaker identity (speaker index) is obtained from the maximum a-posteriori probability for a given sequence of observations as follows:

$S_q = \arg\max_{1 \le i \le S} p(\lambda_i \mid X) = \arg\max_{1 \le i \le S} \frac{p(X \mid \lambda_i)\, p(\lambda_i)}{p(X)}.$ (2)

In the above equation, $X = [x_1, x_2, \dots, x_n]$ denotes the sequence of n feature vectors, and S is the total number of speakers. For equally likely speakers, and since $p(X)$ is the same for all speaker models, the above equation becomes

$S_q = \arg\max_{1 \le i \le S} p(X \mid \lambda_i).$ (3)

For independent observations, and using logarithms, the identification criterion becomes

$S_q = \arg\max_{1 \le i \le S} \sum_{t=1}^{n} \log p(x_t \mid \lambda_i),$ (4)

where

$p(x_t \mid \lambda_i) = \sum_{m=1}^{M} \frac{p_m^i}{(2\pi)^{K/2}\, |\Sigma_m^i|^{1/2}} \exp\!\Big\{ -\tfrac{1}{2} (x_t - \mu_m^i)^T (\Sigma_m^i)^{-1} (x_t - \mu_m^i) \Big\},$

K being the dimension of each feature vector.
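For concreteness, a minimal sketch of this baseline is given below. It assumes scikit-learn's GaussianMixture as the EM implementation (the paper specifies only EM-trained diagonal-covariance GMMs, not a particular toolkit): train_gmms fits one model $\lambda_i$ per speaker and identify_gmm applies the log-likelihood rule (4).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_gmms(train_features, n_mixtures=8):
    """train_features: list of (n_i, K) MFCC arrays, one per speaker.
    Returns one diagonal-covariance GMM (lambda_i) per speaker, fitted by EM."""
    return [GaussianMixture(n_components=n_mixtures, covariance_type="diag",
                            max_iter=200, random_state=0).fit(X)
            for X in train_features]

def identify_gmm(models, X_test):
    """Decision rule (4): pick the speaker maximizing sum_t log p(x_t | lambda_i)."""
    scores = [gmm.score_samples(X_test).sum() for gmm in models]
    return int(np.argmax(scores))
```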
2.2 Statistical Modeling based on Generalized Gaussian Density

In this subsection, we briefly describe a statistical approach which treats the speaker identification problem as a multiple hypothesis testing problem. We previously proposed this approach in the context of music genre classification in [14], and in this paper we are interested in testing its applicability to the speaker identification task. Let us assume that there are S speakers and that the speaker to be identified is denoted $S_q$, given a set X of feature vectors $x_t = (x_1, x_2, \dots, x_K)^T$. Each speaker is assigned a hypothesis $H_i$. The goal is to select the one hypothesis out of S which best describes the data from $S_q$. Under the common assumption of equal prior probabilities of the hypotheses, the optimal rule resulting in the minimum probability of classification error is to select the hypothesis with the highest likelihood among the S. Thus, $S_q$ is assigned to the speaker corresponding to hypothesis $H_j$ if

$p(x_t \mid H_j) \ge p(x_t \mid H_i), \quad \forall i \ne j, \; i = 1,\dots,S.$ (5)

To solve this problem, a parametric approach is adopted, where each conditional probability density $p(x \mid H_i)$ is modeled by a member of a family of PDFs, denoted by $p(x;\theta_i)$, where $\theta_i$ is a set of model parameters. Under this assumption, the extracted features of the i-th speaker are represented by the estimated model parameters $\hat{\theta}_i$, computed in the feature extraction stage. For assigning $S_q$ to the closest speaker identity:

1. Compute the Kullback-Leibler Divergence (KLD) between the density $p(x;\theta_q)$ of the speaker to be identified and the density $p(x;\theta_i)$ associated with the i-th speaker identity in the database, $i = 1,\dots,S$:

$D(p(x;\theta_q) \,\|\, p(x;\theta_i)) = \int p(x;\theta_q) \log \frac{p(x;\theta_q)}{p(x;\theta_i)}\, dx.$ (6)

2. Assign $S_q$ to the identity corresponding to the smallest value of the KLD.

A chain rule holds for the KLD and is applied in order to combine the KLDs from multiple data sets or dataset dimensions. This rule states that the KLD between two joint PDFs $p(X,Y)$ and $q(X,Y)$, where X, Y are assumed to be independent data sets, is given by

$D(p(X,Y) \,\|\, q(X,Y)) = D(p(X) \,\|\, q(X)) + D(p(Y) \,\|\, q(Y)).$ (7)

The proposed method is based on fitting a Generalized Gaussian Density (GGD) to the PDF of the data set (features). In fact, independence among the MFCC vector components is assumed, so a GGD is estimated for each scalar component. This task is achieved by estimating the two parameters $(\alpha,\beta)$ of the GGD, which is defined as

$p(x;\alpha,\beta) = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}\, e^{-(|x|/\alpha)^{\beta}},$ (8)

where $\Gamma(\cdot)$ is the Gamma function; the GGD parameters are computed using Maximum Likelihood (ML) estimation. Substitution of (8) into (6) gives the following closed form for the KLD between two GGDs:

$D(p_{\alpha_1,\beta_1} \,\|\, p_{\alpha_2,\beta_2}) = \log\!\left( \frac{\beta_1\, \alpha_2\, \Gamma(1/\beta_2)}{\beta_2\, \alpha_1\, \Gamma(1/\beta_1)} \right) + \left( \frac{\alpha_1}{\alpha_2} \right)^{\beta_2} \frac{\Gamma\!\left( \frac{\beta_2 + 1}{\beta_1} \right)}{\Gamma\!\left( \frac{1}{\beta_1} \right)} - \frac{1}{\beta_1}.$ (9)

Based on the independence assumption for the MFCC coefficients, (7) yields the following expression for the overall normalized distance between two test utterances $U_1$, $U_2$:

$D(U_1 \,\|\, U_2) = \frac{1}{K} \sum_{k=1}^{K} D(p_{U_1,k} \,\|\, p_{U_2,k}),$ (10)

where K is the order of the MFCCs (the dimension of a feature vector).
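The GGD in (8) is the same family as SciPy's generalized normal distribution, so its ML fit and the closed-form distances (9)-(10) can be sketched as below; this is an illustration under that assumption, not the authors' implementation.

```python
import numpy as np
from scipy.stats import gennorm
from scipy.special import gammaln

def fit_ggd(samples):
    """ML fit of the zero-mean GGD (8); returns (alpha, beta)."""
    beta, _, alpha = gennorm.fit(samples, floc=0.0)
    return alpha, beta

def kld_ggd(a1, b1, a2, b2):
    """Closed-form KLD (9) between two GGDs, using log-gammas for stability."""
    return (np.log(b1 / b2) + np.log(a2 / a1) + gammaln(1.0 / b2) - gammaln(1.0 / b1)
            + (a1 / a2) ** b2 * np.exp(gammaln((b2 + 1.0) / b1) - gammaln(1.0 / b1))
            - 1.0 / b1)

def ggd_distance(U1, U2):
    """Normalized distance (10): mean per-component KLD between two utterances
    given as (n_frames, K) MFCC matrices, assuming independent components."""
    K = U1.shape[1]
    d = 0.0
    for k in range(K):
        a1, b1 = fit_ggd(U1[:, k])
        a2, b2 = fit_ggd(U2[:, k])
        d += kld_ggd(a1, b1, a2, b2)
    return d / K
```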
2.3 Sparse Representation Classification

The approach of classification based on sparse representation is described in this subsection. This approach was initially applied to face recognition in [15], and is applied to speaker identification for the first time in this paper. Let us assume that the $n_i$ training samples corresponding to the feature vectors of the i-th speaker are arranged as the columns of a matrix

$V_i = [v_{i,1}\; v_{i,2}\; \dots\; v_{i,n_i}] \in \mathbb{R}^{K \times n_i},$ (11)

where K is the dimension of each (column) feature vector. Given a new test sample (feature vector) $x_t \in \mathbb{R}^{K}$ that belongs to the i-th class, $x_t$ can be expressed as a linear combination of the training samples associated with class i:

$x_t = c_{i,1} v_{i,1} + c_{i,2} v_{i,2} + \dots + c_{i,n_i} v_{i,n_i} = V_i c_i,$ (12)

where the $c_{i,j} \in \mathbb{R}$ are scalars. Let us also define the matrix V for the entire training set as the concatenation of the $N = n_1 + \dots + n_S$ training samples of all S classes (speakers):

$V = [V_1\; V_2\; \dots\; V_S] = [v_{1,1}\; v_{1,2}\; \dots\; v_{i,j}\; \dots\; v_{S,n_S}].$ (13)

The linear representation of $x_t$ can then be rewritten as $x_t = Vc$, where

$c = [0, \dots, 0, c_{i,1}, c_{i,2}, \dots, c_{i,n_i}, 0, \dots, 0]^T \in \mathbb{R}^{N}$ (14)

is a coefficient vector whose elements are zero except for those associated with the i-th class. As a result, if S is large enough, c will be sufficiently sparse. This observation motivates us to solve the following optimization problem for a sparse solution:

$\hat{c} = \arg\min_{c} \|c\|_0 \quad \text{s.t.} \quad x_t = Vc,$ (15)

where $\|\cdot\|_0$ denotes the $\ell_0$ norm, which counts the number of non-zero elements in a vector. The optimization problem in (15) is NP-hard. However, an approximate solution can be obtained if the $\ell_0$ norm is substituted by the $\ell_1$ norm as follows:

$\hat{c} = \arg\min_{c} \|c\|_1 \quad \text{s.t.} \quad x_t = Vc,$ (16)

where $\|\cdot\|_1$ denotes the $\ell_1$ norm of a vector. The efficient solution of the optimization problem in (16) has been studied extensively. Orthogonal Matching Pursuit (OMP) [16] is a popular solution to this problem, and this method is used in our simulations. In the ideal case, the non-zero entries in $\hat{c}$ will be associated with the columns of matrix V from a single class i, and the test sample will be assigned to that class. However, because of modeling errors and/or noise, there are small non-zero entries in $\hat{c}$ that correspond to multiple classes. To overcome this problem, we perform an iterative procedure where we classify $x_t$ to each one of the possible classes and use the training vectors of that class to reconstruct $x_t$. In other words, in each repetition we retain only the coefficients in $\hat{c}$ that correspond to a particular class, and use the training vectors of this class as a basis to represent $x_t$. We introduce for each class a function $\delta_i: \mathbb{R}^N \to \mathbb{R}^N$, which selects the coefficients associated only with class i. Then, each test feature vector is classified to the class that minimizes the $\ell_2$-norm residual over $i = 1,\dots,S$:

$\min_i r_i(x_t) = \|x_t - V \delta_i(\hat{c})\|_2.$ (17)
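A sketch of the SRC decision (15)-(17) follows: the training MFCC vectors of all speakers are stacked into the dictionary V, a sparse code for a test vector is obtained with Orthogonal Matching Pursuit (here scikit-learn's orthogonal_mp; the sparsity level is a hypothetical choice, since the paper does not report it), and the class with the smallest reconstruction residual is returned.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def build_dictionary(train_features):
    """train_features: list of (n_i, K) MFCC matrices, one per speaker.
    Returns V (K x N) with unit-norm columns and the class label of each column."""
    cols, labels = [], []
    for i, X in enumerate(train_features):
        cols.append(X.T)                     # columns v_{i,j} in R^K
        labels.extend([i] * X.shape[0])
    V = np.hstack(cols)
    V = V / np.linalg.norm(V, axis=0, keepdims=True)   # unit-norm atoms for OMP
    return V, np.asarray(labels)

def src_classify(V, labels, x_t, n_nonzero=10):
    """Sparse-code x_t over V with OMP, then pick the class i minimizing
    the residual ||x_t - V delta_i(c)||_2 as in (17)."""
    c = orthogonal_mp(V, x_t, n_nonzero_coefs=n_nonzero)
    residuals = []
    for i in np.unique(labels):
        ci = np.where(labels == i, c, 0.0)   # delta_i(c): keep class-i coefficients only
        residuals.append(np.linalg.norm(x_t - V @ ci))
    return int(np.argmin(residuals))
```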
3. EXPERIMENTAL RESULTS

In this section, we examine the identification performance of the three methods described in Section 2, in terms of the correct speaker identification rate. For this purpose, several simulations under noisy conditions were conducted. The speech signals used for the simulations were obtained from the VOICES corpus, available from OGI's CSLU [17], which consists of twelve speakers (seven male and five female). For all simulations, 20-dimensional MFCC coefficients were extracted from the speech utterances on a segment-by-segment basis. The frame duration was kept at 20 msec with 10 msec of frame shift. Before feature extraction, the training as well as the test utterances were pre-filtered using a pre-emphasis filter of the form $H(z) = 1 - 0.97 z^{-1}$, and then a silence detector algorithm based on the short-term energy and zero-crossing measures of the speech segments was applied. All the speech signals in the corpus have a sampling rate of 22050 Hz.

For the GMM-based identification results, a GMM with a diagonal covariance matrix was chosen for the simulations. The number of mixtures depended on the amount of training data (see the description of Experiment 1 below). For the GGD-based identification case, Amplitude Probability Density (APD) curves, $P(|X| > x)$, are adopted to show that the GGD best matches the actual density of the data. An example for a part of the VOICES corpus is given in Figure 1, where we compare the empirical APD (solid line) against the APD curves obtained for the GGD, Weibull, Gamma, Exponential and Gaussian models. The results in the figure correspond to the 8th MFCC coefficient of the training data (20 sec duration) of the 10th speaker (independence among feature vector components is assumed). Clearly, the GGD follows the empirical APD more closely than the other densities. This trend was observed in the majority of the training utterances used in our experiments. Thus, the GGD model is expected to give better results than the other densities when applied directly to the MFCC coefficients of the twelve speakers.

[Figure 1: Example Amplitude Probability Density curves of the 8th MFCC coefficient from the training data (20 sec) of the 10th speaker, comparing the empirical APD against the Normal, Exponential, Gamma, Weibull and GGD fits.]

The performance evaluation follows the philosophy described in [3]: each sequence of feature vectors $\{x_t\}$ is divided into overlapping segments of L feature vectors, with consecutive segments shifted by one feature vector, i.e., the first segment spans $x_1, x_2, \dots, x_L$ and the second segment spans $x_2, x_3, \dots, x_{L+1}$, and so on. The comparison between the identified speaker of each segment and the actual speaker of the test utterance is repeated for each speaker in the corpus, and the total correct identification rate is computed as the percentage of correctly identified segments of length L over all test utterances:

correct ident. rate = (# correctly identified segments / total # of segments) x 100%. (18)
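A sketch of this segment-based scoring is given below: overlapping segments of L consecutive feature vectors, shifted by one vector, are classified independently, and the rate (18) is the fraction of correctly labelled segments. The classify argument stands for any of the three decision rules of Section 2.

```python
import numpy as np

def segments(X, L):
    """Yield overlapping segments of L consecutive feature vectors,
    shifted by one vector: (x_1..x_L), (x_2..x_{L+1}), ..."""
    for start in range(X.shape[0] - L + 1):
        yield X[start:start + L]

def identification_rate(test_utterances, true_ids, classify, L):
    """Correct identification rate (18) over all test utterances.
    classify: function mapping an (L, K) segment to a speaker index."""
    correct = total = 0
    for X, true_id in zip(test_utterances, true_ids):
        for seg in segments(X, L):
            correct += int(classify(seg) == true_id)
            total += 1
    return 100.0 * correct / total
```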
In the previous sections, it was mentioned that the focus of this paper is on noise-robust speaker identification using short training and testing sessions. To this end, white Gaussian noise is added to the test utterances, with the SNR taking the values of 10, 15, 20 and 25 dB. In addition, the test segment lengths L vary from 10 to 500 with a step size of ΔL = 40. Length L = 10 corresponds to 0.1 sec, length L = 50 corresponds to 0.5 sec, and so forth. The training utterances have a duration of 5, 10, 15 and 20 seconds, corresponding to a quite short training session. The training for all methods is performed using the clean speech data. The testing data have a duration of approximately 20 sec.

3.1 Experiment 1: Identification using GMM

In this experiment, during the training process the MFCC coefficients for each speaker are collected. For each speaker, the corresponding MFCC data are modeled using a diagonal GMM. The number of mixtures was chosen to be 4 for the 5 and 10 sec training data, and 8 for the 15 and 20 sec training data. These parameter choices were found experimentally to produce the best performance for the GMM-based identification. Clearly, the number of mixtures is small due to the small size of the training dataset. During the identification process, the identification rule (4) is used, and the correct identification rate is computed as in (18).

3.2 Experiment 2: Identification using KLD based on GGD

The same experimental steps as in Experiment 1 are also followed here. Thus, for each speaker the MFCC vectors are collected during the training process. We estimate the GGD parameters (α, β) for each vector component, assuming independence among the MFCC components. During the identification process, a test utterance contains multiple MFCC vectors, as explained. For each MFCC component of the test vectors, the GGD parameters (α, β) are estimated. In order to identify a speaker, we compute the KLD between the GGD model of the test data and each of the GGD models of the speakers in the dataset (per vector component). This procedure results in 20 distance values (since each MFCC vector contains 20 components). The final step is to compute the mean of these distances, as in (10). The identity of the speaker whose data result in the minimum distance is returned as the final result. The correct identification rate is computed as in (18).

3.3 Experiment 3: Identification using SRC

In this subsection, the experimental procedure for the SRC approach is described. First, consider that a number $n_i$ of MFCC vectors is extracted from the training speech data of each speaker. Consider a test utterance length of L frames. Adopting the notation of the SRC theory in Section 2.3, the training matrix V has dimension 20 x (12 $n_i$) and the test sample (feature) vector $x_t$ is a 20 x 1 vector. The test segment consists of L distinct test samples $x_t$. Thus, the optimization problem

$(P_l): \quad \hat{c}_l = \arg\min_{c_l} \|c_l\|_1 \quad \text{s.t.} \quad x_{t,l} = V c_l, \quad l = 1,\dots,L$ (19)

is solved L times, once for each $x_{t,l}$. Orthogonal Matching Pursuit [16] is used to solve this problem. Each solution $\hat{c}_l$ of problem $(P_l)$ is used to obtain an identity i (for i = 1, ..., 12) of one of the 12 speakers in the dataset. Thus, a segment of L vectors provides L identification results. The predominant identity is found based on the majority of the decisions, and the identification rate is computed as in (18).
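As described at the beginning of this section, the test material is corrupted with additive white Gaussian noise at a prescribed SNR before feature extraction. A minimal sketch of that corruption step is given below, under the assumption that the SNR is defined with respect to the average power of the clean signal (the paper does not state the exact definition used).

```python
import numpy as np

def add_white_noise(signal, snr_db, rng=None):
    """Add zero-mean white Gaussian noise so that 10*log10(P_signal / P_noise) = snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)
    return signal + noise

# Example: corrupt a test waveform at 10 dB SNR before computing its MFCCs.
# y_noisy = add_white_noise(y, snr_db=10)
```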
3.4 Discussion

In this subsection, the main observations from the results in Figures 2(a)-2(d) are discussed. The percentage of correct identification results is given as a function of the length of the test utterance. We are mainly interested in examining the performance of the described methods for short test sessions. The four figures correspond to training data of duration 5, 10, 15, and 20 sec, respectively, so as to examine the effect of using a short training dataset. The correct identification rates are depicted as a function of the test utterance segment length L. The black, red and green curves correspond to the SRC, GMM and KLD-GGD methods, respectively. There are twelve curves in total, where the first part of each legend name indicates the corresponding method and the last part indicates the SNR value used for this method; e.g., "SRC 10dB" means that the black solid curve depicts the identification performance of the SRC approach under noise conditions of 10 dB. From Figures 2(a)-2(d) we notice that the SRC method is superior to the GMM and KLD-GGD approaches, especially for short test and training sessions, and is quite robust to noise. The GMM performance improves as the training and test data duration increases, because the larger amount of feature vectors increases the accuracy of the GMM model; however, its sensitivity to noise is clearly indicated. The KLD-GGD approach does not achieve high correct identification rates even in the case where the amount of training and test data is 20 and 5 sec, respectively. Based on the results, we can assume that the GGD parameters (α, β) are not well estimated when the test data have short duration. The main point regarding the SRC method that has to be highlighted is that even in the case where the training data duration is 5 sec and the test utterance segment length is as low as 2 sec, the performance is greater than 80% for SNR values of 15, 20 and 25 dB. Even in the extreme case of 10 dB SNR, the correct identification rate is above 70% for test utterance segments of at least 2 sec. Additionally, for test sessions shorter than 2 sec, the identification results for SRC are significantly better than the baseline method. For example, for 20 sec of training data and 1.5 sec of test data, the SRC method gives correct identification above 70% for all SNR values. For the same case, at 10 dB SNR, the GMM results in correct identification of only slightly more than 20%. This is important for applications where a decision must be made using a small amount of test data, without having enough training data for a given number of speakers, and the speaker is located in a noisy environment.

4. CONCLUSIONS

In this paper, we presented two methods for noise-robust speaker identification using short-duration training and testing data. They were both compared to a baseline GMM-based system. The first method was previously proposed for music genre classification and is based on modeling the MFCC coefficients of the speakers using the GGD model. The second identification method was based on the recently proposed SRC algorithm. It was shown through experimental evaluation that the SRC approach performs significantly better than the other two methods when the amount of testing and training data is small, and is very robust to noise. Our future research plans include testing the SRC method with a larger set of speakers and a wide variety of noise types.

REFERENCES

[1] F. Bimbot, J. F. Bonastre, C. Fredouille, G. Gravier, I. M. Chagnolleau, S. Meignier, T. Merlin, J. O. Garcia, D. P. Delacretaz, and D. A. Reynolds, "A tutorial on text-independent speaker verification," EURASIP Journal on Applied Signal Processing, vol. 4, 2004.

[2] M. E. Sargin, Y. Yemez, E. Erzin, and A. M. Tekalp, "Audiovisual synchronization and fusion using canonical correlation analysis," IEEE Trans. on Multimedia, vol. 9(7), November 2007.
[3] D. A. Reynolds and R. C. Rose, "Robust text-independent speaker identification using Gaussian mixture speaker models," IEEE Trans. on Speech and Audio Processing, vol. 3(1), January 1995.

[4] J. C. Wang, C. H. Yang, J. F. Wang, and H. P. Lee, "Robust speaker identification and verification," IEEE Comp. Intelligence Magazine, vol. 2(2), May 2007.

[5] K. H. Yuo, T. H. Hwang, and H. C. Wang, "Combination of autocorrelation-based features and projection measure technique for speaker identification," IEEE Trans. on Speech and Audio Processing, vol. 13(4), July 2005.

[6] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing. Prentice-Hall, Englewood Cliffs, NJ, USA.
[Figure 2: Speaker identification performance as a function of the test data duration for different SNR values. The duration of the training data is: (a) 5 sec, (b) 10 sec, (c) 15 sec and (d) 20 sec. The legend covers the KLD-GGD, GMM and SRC methods at 10, 15, 20 and 25 dB SNR.]

[7] H. Fujihara, T. Kitahara, M. Goto, K. Komatani, T. Ogata, and H. G. Okuno, "Speaker identification under noisy environments by using harmonic structure extraction and reliable frame weighting," in Proc. Int. Conf. on Spoken Language Processing (INTERSPEECH), Pittsburgh, Pennsylvania, USA, September 2006.

[8] M. Grimaldi and F. Cummins, "Speaker identification using instantaneous frequencies," IEEE Trans. on Audio, Speech and Language Processing, vol. 16(6), August 2008.

[9] Y. Shao and D. Wang, "Robust speaker identification using auditory features and computational auditory scene analysis," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, Nevada, USA, April 2008.

[10] K. Kumar, Q. Wu, Y. Wang, and M. Savvides, "Noise robust speaker identification using Bhattacharyya distance in adapted Gaussian models space," in Proc. European Signal Processing Conf. (EUSIPCO), Lausanne, Switzerland, August 2008.

[11] J. Ming, T. J. Hazen, J. R. Glass, and D. A. Reynolds, "Robust speaker recognition in noisy conditions," IEEE Trans. on Audio, Speech and Language Processing, vol. 15(5), July 2007.

[12] H. Aronowitz and D. Burshtein, "Efficient speaker recognition using approximated cross entropy (ACE)," IEEE Trans. on Audio, Speech and Language Processing, vol. 15(7), September 2007.

[13] V. R. Apsingekar and P. L. D. Leon, "Speaker model clustering for efficient speaker identification in large population applications," IEEE Trans. on Audio, Speech and Language Processing, vol. 17(4), May 2009.

[14] C. Tzagkarakis, A. Mouchtaris, and P. Tsakalides, "Musical genre classification via Generalized Gaussian and Alpha-Stable modeling," in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), vol. 5, May 2006.

[15] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. Patt. Anal. Mach. Intell., vol. 31(2), February 2009.

[16] J. A. Tropp and A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit," IEEE Trans. on Information Theory, vol. 53(12), December 2007.

[17] A. Kain, High Resolution Voice Transformation. PhD thesis, OGI School of Science and Engineering at Oregon Health and Science University, October 2001.
