Expert Systems with Applications 36 (2009)

Boosting selection of speech related features to improve performance of multi-class SVMs in emotion detection

Halis Altun *, Gökhan Polat
Nigde University, Faculty of Engineering, Department of Electrical and Electronics, Kampus, Nigde, Turkey

* Corresponding author. E-mail addresses: halisaltun@nigde.edu.tr, haltun@nigde.edu.tr (H. Altun), gpolat51@yahoo.com (G. Polat).

Keywords: Emotion detection; Feature selection; Speech analysis; Machine learning

Abstract

This paper deals with strategies for feature selection and multi-class classification in the emotion detection problem. The aim is two-fold: to increase the effectiveness of four feature selection algorithms and to improve the accuracy of multi-class classifiers for the emotion detection problem under different frameworks and strategies. Although a large amount of research has been conducted to determine the most informative features in emotion detection, it is still an open problem to identify reliably discriminating features. As highly informative features are believed to be a more critical factor than the classifier itself, recent studies have focused on identifying the features that contribute most to the classification problem. In this paper, in order to improve the performance of multi-class SVMs in emotion detection, 58 features extracted from recorded speech samples are processed in two new frameworks to boost the feature selection algorithms. Evaluation of the final feature sets validates that the frameworks are able to select a more informative subset of the features in terms of class separability. It is also found that, among the four feature selection algorithms, a recently proposed one, LSBOUND, significantly outperforms the others. The accuracy rate obtained in the proposed framework is the highest reported so far in the literature for the same dataset.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Emotion detection is currently a very active research field. It is considered an indispensable capability of future automated systems that are expected to interact with human beings. In this respect, emotion recognition from speech has attracted considerable interest from the research community (Altun, Shawe-Taylor, & Polat, 2007; Altun & Polat, 2007; Cowie et al., 2001; Fragopanagos & Taylor, 2005; Pantic, Sebe, Cohn, & Huang, 2005; Polat & Altun, 2007a; Reynolds, Ishikawa, & Tsujino, 2006; Shami & Verhelst, 2007; Ververidis & Kotropoulos, 2004, 2005; Wang & Guan, 2004). As it is considered a pattern classification problem, an automatic emotion detection system is typically composed of at least three main components: feature extraction, feature selection and classification. The final aim of emotion detection research is to build a system able to detect emotion in spoken speech. Although considerable effort towards identifying the best classifiers for emotion detection is witnessed in the literature (Polat & Altun, 2007a; Shami & Verhelst, 2007; Yacoub, Simske, Lin, & Burns, 2003), a recent tendency in emotion recognition is to seek a subset of speech-related features, from a high-dimensional feature space, that would give better accuracy in classifying emotional states (Fernandez & Picard, 2005; Polat & Altun, 2007b; Schuller, Arsic, Wallhoff, & Rigoll, 2006; Xiao, Dellandrea, Dou, & Chen, 2005).
However, despite a large amount of research, it is still an open problem to identify reliably informative features for this task (Juslin & Scherer, 2005). Xiao et al. (2005) employed Fisher's discriminant ratio to select the more informative features from 50 speech-related features. Their results indicate that the energy distribution in the frequency domain and the speech rate contribute significantly to emotion detection. In a study by Yacoub et al. (2003), 37 prosodic features related to pitch, loudness and segments were extracted. Out of the 37 features, 19 were selected using the standard Forward Selection algorithm, which ranks the features using 10-fold cross validation. Then a comparison between Neural Network (NN), Support Vector Machine (SVM), k-nearest neighbour and decision tree classifiers was carried out. They reported that, in classifying multiple emotions, prosodically close emotion classes should be grouped together and separate features and classifiers should be employed across groups. In a more recent study, Ververidis and Kotropoulos (2005) performed emotion detection using a Gaussian Mixture Model (GMM). A more sophisticated feature selection algorithm, Sequential Floating Forward Selection (SFFS), was employed to find the best 10 features among 65 global statistical features. Results showed that SFFS improved the accuracy of the Naïve Bayes classifier by 3%, compared to the results of their previous work, where SFS feature selection was used in the same setting.

Fernandez and Picard (2005) highlighted results from an extensive investigation developing new features and comparing them with classical features, using machine learning techniques to recognize five emotional states. SFFS with the leave-one-out (LOO) generalization error of a k-nearest neighbour (k-NN) classifier was used in the feature selection phase to rank 87 features. The discriminative ability of the selected feature set was evaluated by SVMs, with their generalization error estimated through 15-fold cross validation. Feature selection for emotion detection in noisy speech has been discussed by Schuller et al. (2006), who employed the Information Gain Ratio to select the best features from a larger set. The selected features were given to an SVM classifier in order to evaluate their impact on the performance of the classifier.

In the works reported above, a single wrapper or filter type feature selection algorithm was employed to find the most informative features. However, the quality of the selected features is highly dependent on the algorithm employed: the subset of selected best features is completely dependent on the ability of the algorithm used to rank the set of features, and each feature selection algorithm will end up with a different subset of features as the best feature subset. Therefore, there is an obvious need to define a framework within which it is more likely to obtain a reliable subset of features. In this paper, two recently proposed frameworks (Polat & Altun, 2007a) and three multi-class classifiers are employed to evaluate the ability of the frameworks to determine a more informative feature set. The underlying property of the frameworks is to decompose a multi-class classification problem into binary classification problems, in either a one-vs-rest or a one-vs-one manner. Although some variable selection methods, such as the Sequential Forward Selection (SFS) method, treat the multi-class case directly rather than decomposing it into several two-class problems, it will be shown that decomposing the problem into binary classifications and reconstructing a final feature subset from a set of candidate feature subsets results in improved classifier accuracy. The feature selection algorithms are chosen to be one wrapper type, one filter type and two recently proposed embedded feature selection algorithms. The SFS algorithm with the LOO Cross Validation (LOOCV) error of a k-NN classifier is chosen as the wrapper type algorithm. The embedded algorithms are two state-of-the-art feature selection algorithms, based on the Least Squares SVM Bound (LSBOUND) (Zhou & Mao, 2005) and on the R2W2 concept (Weston et al., 2000), respectively. Finally, a filter approach based on Mutual Information (MUTINF) is used (Zaffalon & Hutter, 2002).

2. Feature extraction

The data used in the experiments is extracted from the Berlin Emotional Speech Database (EmoDB) (Burkhardt, Paeschke, Rolfes, Sendlmeier, & Weiss, 2005). The utterances in the EmoDB are expressed by 10 professional actors (5 male and 5 female) in seven emotional styles: anger, boredom, disgust, fear, happiness, sadness, and neutral.
The total number of speech samples is 493; the set comprises 286 female and 207 male voices. In our experiments, 338 samples corresponding to 4 emotional classes, namely anger, happiness, neutral and sadness, have been used. Fifty-eight features have been extracted from the speech samples, as explained in Dogan (2006) and given in Table 1. There are four groups of features. Seventeen of them are prosodic features based on statistical properties of the fundamental frequency F0; among them, four new features have been defined. These four features are based on the concept of the maximum and minimum regions of F0. An F0 region is defined as a maximum F0 region if the fundamental frequency satisfies the criterion $F0 \geq F0_m$, where $F0_m$ is the mean of the pitch frequency given by

$F0_m = \frac{1}{n} \sum_{i=1}^{n} F0_i$

where $F0_i$ is the estimated pitch frequency in the current speech frame of 20 ms. Otherwise, the region is defined as a minimum F0 region, where F0 satisfies the criterion $F0 < F0_m$. Another five features are formed from the sub-band energies of the utterances, using sixth order elliptic filters with center frequencies of 400, 800, 1200, 1600 and 3200 Hz, respectively. Furthermore, 20 Mel-Frequency Cepstrum Coefficients (MFCC) and 16th order Linear Predictive Coding (LPC) parameters have been extracted from the speech samples as feature vectors.

Table 1. The set of speech related features (# indicates the number of features in each group).

  PROSODIC (17): Maximum, minimum, mean and standard deviation of F0; mean and standard deviation of F0 in the maximum region; mean and standard deviation of F0 in the minimum region; max, min, mean and standard deviation of the positive slope in F0; max, min, mean and standard deviation of the negative slope in F0; voiced/unvoiced ratio.
  SUBBAND (5): Sub-band energies.
  MFCC (20): Mel-frequency cepstrum coefficients.
  LPC (16): 16th order LPC coefficients.
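To make the maximum and minimum F0 region features concrete, the following is a minimal sketch that computes the basic F0 statistics and the region means and deviations. It assumes an F0 contour is already available as one pitch estimate per 20 ms voiced frame (the pitch tracker itself is not described here); the function name and input format are illustrative only.

    import numpy as np

    def f0_region_features(f0):
        # f0: 1-D array of per-frame pitch estimates in Hz (voiced frames only)
        f0_m = f0.mean()                 # F0_m = (1/n) * sum_i F0_i
        max_region = f0[f0 >= f0_m]      # frames where F0 >= F0_m
        min_region = f0[f0 < f0_m]       # frames where F0 < F0_m
        return {
            "f0_max": f0.max(), "f0_min": f0.min(),
            "f0_mean": f0_m, "f0_std": f0.std(),
            "max_region_mean": max_region.mean(),
            "max_region_std": max_region.std(),
            "min_region_mean": min_region.mean(),
            "min_region_std": min_region.std(),
        }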
3. Feature selection algorithms

As is well known, the presence of irrelevant features may reduce the accuracy of classifiers. In this sense, feature selection is typically used to achieve three objectives: to reduce the size of the feature set in order to improve the prediction performance of a classifier, to provide a faster and more computationally efficient classifier, and to provide a better understanding of the underlying process that generated the data (Guyon & Elisseeff, 2003). Two common methods are employed in feature selection algorithms: the filter and the wrapper methods (Jain & Zongker, 1997). The main difference between the two arises from the evaluation criterion. Feature subset selection in the wrapper method relies on the performance of the classifier, while the filter method employs intrinsic properties of the data, such as mutual information or a Mahalanobis class separability measure, as the criterion for feature subset evaluation. In most pattern recognition applications, the wrapper method is reported to outperform the filter method; however, wrapper methods are computationally expensive compared to filter methods. Mladenić, Brank, Grobelnik, and Milic-Frayling (2004) show that feature scoring and selection based on the normal vector obtained from a linear SVM combines very well with all the classifiers considered in their study. As the linear SVM classifier has an output prediction of the form $F(x) = \mathrm{sgn}(w^T x + b)$, where $F(x)$ is the prediction function, a feature with a weight close to zero has a small effect on the prediction, so the feature can be removed to obtain a more predictive subset of features. Similar approaches have been proposed in the literature for feature selection (Bennett, Embrechts, Breneman, & Song, 2003; Guyon & Elisseeff, 2003; Abe & Kudo, 2006). The idea in these approaches is to employ a relatively simple linear predictor in the feature selection algorithm and then to train a more complex non-linear predictor on the selected subset of features.
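As a hedged illustration of this weight-based pruning idea (a generic sketch, not the exact procedure of any of the cited works), the snippet below trains a linear SVM with scikit-learn and ranks features by the magnitude of their weights; X and y are placeholder training data.

    import numpy as np
    from sklearn.svm import LinearSVC

    def rank_features_by_svm_weight(X, y):
        # For a binary problem, coef_ holds the weight vector w of F(x) = sgn(w^T x + b)
        clf = LinearSVC(C=1.0).fit(X, y)
        w = np.abs(clf.coef_).ravel()
        return np.argsort(w)[::-1]  # feature indices from largest |w_j| to smallest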

In this study, we have employed four feature selection algorithms, which are detailed below: a wrapper, a filter and two embedded feature selection algorithms, respectively.

3.1. Sequential Forward Selection (SFS)

In SFS, features that have not already been selected are considered for selection on the basis of their impact on the objective function. Let $J(\cdot)$ be the objective function. Given a feature set $X = \{x_i \mid i = 1, 2, \ldots, m\}$, the aim of the SFS algorithm is to find a subset $Y_k \subseteq X$, where $k < m$, starting from the empty subset $Y_0 = \{\}$. A feature $x^+$ that has not been selected before becomes a candidate to be added to the subset based on the condition

$x^+ = \arg\max_{x \notin Y_k} \left[ J(Y_k + x) \right]$

On each iteration, exactly one feature is added to the feature subset. The process stops after an iteration in which no feature addition results in an improvement in accuracy, or when a pre-specified number of features has been reached. A more sophisticated version of SFS is Sequential Floating Forward Selection (SFFS); however, as shown by Reunanen (2003), contrary to common belief, intensive search techniques like SFFS do not necessarily outperform simpler and faster methods like SFS. SFS is suitable for performing feature selection in a multi-class classification problem directly. We refer to the algorithm as SFS (classic) if it is employed directly, without the proposed frameworks, and as SFS (proposed) if it is employed within the proposed frameworks. We have used SFS with the LOOCV error of a k-NN classifier as the objective function in our experiments.
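The following is a minimal sketch of SFS with the k-NN leave-one-out objective described above, using scikit-learn as a stand-in for the authors' implementation; for simplicity it stops only at a target subset size k.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score, LeaveOneOut

    def loo_accuracy(X, y, subset):
        # Objective J: LOOCV accuracy of a k-NN classifier on the candidate subset
        knn = KNeighborsClassifier(n_neighbors=1)
        return cross_val_score(knn, X[:, subset], y, cv=LeaveOneOut()).mean()

    def sfs(X, y, k):
        selected, remaining = [], list(range(X.shape[1]))
        while len(selected) < k:
            # x+ = argmax over unselected features of J(Y_k + x)
            best_j = max(remaining, key=lambda j: loo_accuracy(X, y, selected + [j]))
            selected.append(best_j)
            remaining.remove(best_j)
        return selected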
3.2. Least Square Bound Feature Selection (LSBOUND)

Recently, a new feature selection algorithm has been proposed (Zhou & Mao, 2005). It is based on a new filter-like evaluation criterion, called the Least Square (LS) Bound measure, and combines the advantages of both filter and wrapper methods. The criterion is derived from the leave-one-out cross validation (LOOCV) procedure of the least squares support vector machine (LS-SVM) and is closely related to an upper bound on the LOOCV classification results. As estimating this bound implicitly involves training the classifier only once, without repeated cross validation, the computational complexity of the algorithm is significantly reduced compared with the classical wrapper method. It has been proved that, when an LS-SVM classifier is trained on the entire training set, if the corresponding Lagrange multiplier $\alpha_p^0$ of the training sample $x_p$ is positive, the following inequality holds in the leave-one-out procedure (Zhou & Mao, 2005):

$y_p f_p(x_p) \geq 1 - \alpha_p^0 \left[ (D_p^{\min})^2 + 2/\gamma \right]$    (1)

where $\alpha_p^0$ is the Lagrange multiplier of $x_p$ when the LS-SVM classifier is trained on the entire training set, $D_p^{\min}$ is the distance between $x_p$ and its nearest neighbour, and $\gamma$ is a positive value which penalizes errors. Here $f_p(x_p)$ is the validation result for the sample $x_p$ in the leave-one-out procedure: if $y_p f_p(x_p)$ is negative, the sample $x_p$ is counted as a leave-one-out error, and if $y_p f_p(x_p)$ is positive, $x_p$ is correctly classified. The bound $\alpha_p^0 [(D_p^{\min})^2 + 2/\gamma] - 1$ is related to both the corresponding training result and the nearest neighbour. A negative value of the bound indicates that the sample must be correctly classified in the leave-one-out procedure; a positive value indicates, to some extent, the probability of misclassification, and hence can be used to evaluate the goodness of the feature set. Combining the bounds for all training samples, the following measure, called the LS Bound measure, is proposed as the evaluation criterion for feature selection:

$M = \sum_{p=1}^{l} \left( \alpha_p^0 \left[ (D_p^{\min})^2 + 2/\gamma \right] - 1 \right)_+$    (2)

where $(x)_+ = \max(0, x)$. A simple greedy heuristic search algorithm such as SFS, or a more complex search algorithm such as SFFS, can be used with the LS Bound measure $M$ in Eq. (2) to form a new feature selection algorithm. In our study we use this approach with a linear kernel.

3.3. Mutual Information Based Feature Selection (MUTINF)

Mutual information (MUTINF) is a widely used information-theoretic measure of the stochastic dependency of random variables. In the context of a filter approach, one may employ mutual information to discard irrelevant features, retaining a small subset of features and dropping those with low mutual information. This approach relies on empirical estimates of the mutual information between each variable and the target:

$I(i) = \int_{x_i} \int_{y} p(x_i, y) \log \frac{p(x_i, y)}{p(x_i)\, p(y)} \, dx\, dy$    (3)

where $p(x_i)$ and $p(y)$ are the probability densities of $x_i$ and $y$, and $p(x_i, y)$ is their joint density. In the case of discrete or nominal variables, the integral becomes a sum:

$I(i) = \sum_{x_i} \sum_{y} P(X = x_i, Y = y) \log \frac{P(X = x_i, Y = y)}{P(X = x_i)\, P(Y = y)}$    (4)

where the probabilities are estimated from frequency counts of the input variables and the class distribution. In our study we have used the implementation of mutual information based feature selection explained in Zaffalon and Hutter (2002).

3.4. R2W2

R2W2 is a state-of-the-art feature selection algorithm especially designed for binary classification tasks using an SVM classifier (Weston et al., 2000). It can be considered a wrapper approach and indirectly exploits the maximal margin principle for feature selection. The idea is to find a weight vector over the features that minimizes the objective function of an SVM. For a training set of size $m$ whose samples lie within a sphere of radius $R$ and are separable with margin $M$, a bound on the expected generalization error of the SVM is given by

$E P_{\mathrm{err}} \leq \frac{1}{m} E\left\{ \frac{R^2}{M^2} \right\} = \frac{1}{m} E\left\{ R^2 W^2(\alpha^0) \right\}$    (5)

As the optimization of the objective function is accomplished using gradient descent, a new SVM optimization problem has to be constructed after each gradient step, making the procedure computationally expensive for large datasets. In our study, R2W2 is used together with an RBF kernel, defined as

$K(x_1, x_2) = \exp\left( -\frac{\|x_1 - x_2\|^2}{2\sigma^2} \right)$    (6)

as implemented in the Spider package (Weston, Elisseeff, Bakır, & Sinz, 2006).
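As a concrete illustration of the LS Bound measure of Eq. (2), the sketch below evaluates M for a candidate feature subspace. It assumes the multipliers alpha come from an LS-SVM trained once on the full data (the training step is not shown) and that gamma is the error penalty; a smaller M suggests fewer potential leave-one-out errors.

    import numpy as np
    from scipy.spatial.distance import cdist

    def ls_bound_measure(X, alpha, gamma):
        # D_p_min: distance from each sample to its nearest neighbour
        d = cdist(X, X)
        np.fill_diagonal(d, np.inf)          # exclude the sample itself
        d_min = d.min(axis=1)
        bound = alpha * (d_min ** 2 + 2.0 / gamma) - 1.0
        return np.maximum(bound, 0.0).sum()  # (x)_+ = max(0, x), summed over samples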

4. Feature selection frameworks in the emotion detection problem

The feature selection approach used in this work is based on two recently proposed frameworks, whose effectiveness has been evaluated in the literature (Altun & Polat, 2007; Altun et al., 2007). Feature selection for emotion detection is performed as depicted in Fig. 1: the underlying idea of the frameworks is to decompose the multi-class classification problem into binary classification problems and then to perform feature selection for each sub-problem using one of the four feature selection algorithms. Two feature construction operators, namely intersection and unification, are then defined to construct the final feature set from the resulting subsets. At the final stage, multi-class classifiers are employed to determine the emotional state using the final feature set.

Fig. 1. Framework for feature selection in emotion detection.

In the first framework, called FRM1, the emotion detection problem is treated in a one-vs-rest manner: the problem is to discriminate one class of emotion from the rest, and class-specific features are expected to be selected by the feature selection algorithms. As there are M classes of emotional states, M subsets of features will be selected by each of the feature selection algorithms. In the second framework, labeled FRM2, the problem is organized in a one-vs-one manner. In this approach the feature selection algorithms are expected to select highly class-specific features which are informative in discriminating one class of emotion from another. The number of subsets produced in FRM2 is M(M - 1)/2.

In FRM1, each feature selection algorithm produces four subsets of features $S_i$, each corresponding to one of the four decomposed binary classification problems. In the feature construction stage, these subsets are processed to obtain the best feature set, following two strategies. In the first strategy, the intersection operator given in (7) is applied to the subsets $S_i$: every feature that occurs in more than one subset is selected to form the final set, labeled SET1. The operation is defined as

$\mathrm{SET1} = \bigcup_{\substack{i,j=1 \\ i \neq j}}^{N} \left( S_i \cap S_j \right)$    (7)

where N = M is the total number of subsets $S_i$. In the second strategy, the subsets $S_i$ are simply combined to form the final set, labeled SET2. This corresponds to performing a unification operation on the subsets:

$\mathrm{SET2} = \bigcup_{i=1}^{N} S_i$    (8)

where N = M again equals the total number of subsets. Due to the intersection operation, the number of selected features in SET1 is expected to be smaller than in SET2.

In FRM2, as emotion detection is organized as one-vs-one binary classification problems, each feature selection algorithm produces six subsets of features. The same steps are then followed as in FRM1: two final sets are formed from the subsets $S_i$ by performing the intersection and unification operations, respectively, and the final feature sets are labeled SET1 and SET2, following the convention employed in FRM1.
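The two construction operators are easy to state in code. The sketch below builds SET1 (Eq. (7): features occurring in at least two subsets) and SET2 (Eq. (8): the plain union) from the per-problem subsets S_i, here represented as Python sets of feature indices.

    from itertools import combinations

    def build_set1(subsets):
        # Union of pairwise intersections: features selected in more than one subset
        return set().union(*(a & b for a, b in combinations(subsets, 2)))

    def build_set2(subsets):
        # Plain unification of all subsets
        return set().union(*subsets)

    # Example with M = 4 one-vs-rest subsets:
    S = [{1, 2, 3}, {2, 3, 4}, {5}, {3, 6}]
    print(build_set1(S))  # {2, 3}
    print(build_set2(S))  # {1, 2, 3, 4, 5, 6}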
The final stage of the proposed frameworks is to detect the emotional states using multi-class classifiers, as illustrated in Fig. 1. Support Vector Machines (SVM) have been chosen as classifiers for this task. A classic way of performing multi-class classification with a binary SVM is to cast the multi-class problem into binary ones, either as one-vs-one classification (referred to here as SVM1) or as one-vs-rest classification (referred to as SVM2). In addition, we have also used a recently proposed multi-class classification approach, the maximal margin robot (MMR), proposed by Szedmak, Saunders, Shawe-Taylor, and Rousu (2005). For all SVM classifiers, the radial basis kernel with the same hyper-parameters has been used, and the five-fold Cross Validation (CV) error is calculated to evaluate the accuracy of the classifiers. The four feature selection algorithms produce four SET1s and four SET2s in each framework; as a result, eight final feature sets are employed to train a multi-class classifier in FRM1 and FRM2. The total number of classification tasks therefore amounts to 48, which is enough to carry out a fair comparison.
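A hedged sketch of this evaluation stage using scikit-learn is given below: both decompositions (SVM1: one-vs-one; SVM2: one-vs-rest) wrap an RBF-kernel SVM, and five-fold cross-validation accuracy is reported for a given final feature set. The hyper-parameter choices here are placeholders, not the settings used in the experiments.

    from sklearn.svm import SVC
    from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
    from sklearn.model_selection import cross_val_score

    def evaluate_feature_set(X, y, feature_set):
        Xs = X[:, sorted(feature_set)]
        svm1 = OneVsOneClassifier(SVC(kernel="rbf"))   # one-vs-one decomposition
        svm2 = OneVsRestClassifier(SVC(kernel="rbf"))  # one-vs-rest decomposition
        acc1 = cross_val_score(svm1, Xs, y, cv=5).mean()
        acc2 = cross_val_score(svm2, Xs, y, cv=5).mean()
        return acc1, acc2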

Figs. 2 and 3 illustrate the average percentage of features selected from each feature group, namely the prosodic, sub-band energy, MFCC and LPC groups, in FRM1 and FRM2, respectively. Also, in order to illustrate the effectiveness of the proposed frameworks, the behavior of SFS (classic) and SFS (proposed) is compared in Table 2; for a fair comparison, the number of features selected by SFS (proposed) and by SFS (classic) is set to be identical. The figures are an indication of the informative power assigned to each feature group by the feature selection algorithms.

Fig. 2. Normalized percentage of features selected by the feature selection algorithms in FRM1.

Fig. 3. Normalized percentage of features selected by the feature selection algorithms in FRM2.

Table 2. Comparison of the SFS (classic) and SFS (proposed) algorithms: number of features and average accuracy (%) of SVM1 and SVM2 for each feature set. (SET1: the intersection operator is used to obtain the final feature set. SET2: the unification operator is used to obtain the final feature set. FRM1: feature selection is performed in the one-vs-rest framework. FRM2: feature selection is performed in the one-vs-one framework.)

A close inspection of Figs. 2 and 3 reveals that SFS (proposed) tends to select more features from the prosodic and sub-band energy feature groups than SFS (classic), while the number of features selected from the LPC group is decreased. As shown in Table 2, this tendency of the algorithm considerably reduces the CV error in general. This behavior is also consistent with findings in the literature that prosodic and energy related features are more informative (Cowie et al., 2001; Xiao et al., 2005).

The accuracy of the classifiers, in terms of the five-fold CV error of the SFS algorithm in the proposed and classic implementations, is illustrated in Table 2 using two multi-class classifiers (SVM1 and SVM2). In the table, the first row illustrates the case where no feature selection has been performed; the total number of features in this case is 58, and the baseline accuracy is 79.9% and 78.1% for SVM1 and SVM2, respectively. As shown in Table 2, SFS (classic) successfully reduces the number of features, but in most cases the accuracy of the classifiers deteriorates, producing a higher CV error than in the no-feature-selection case. On the other hand, the classifiers produce outperforming results when the features selected by SFS (proposed) are employed: the accuracy of the SVM1 and SVM2 classifiers is improved by up to 17.4% and 17.3%, respectively. As indicated above, the SFS algorithm in the proposed frameworks tends to emphasize the prosodic and sub-band energy related features, which improves the performance of the classifiers. These results indicate that the SFS algorithm is able to select more informative features in the proposed frameworks.

Table 3a. Average accuracy of the classifiers (MMR, SVM1, SVM2) associated with the SFS algorithm, for the feature sets NONE, SET1-FRM1, SET1-FRM2, SET2-FRM1 and SET2-FRM2.

Table 3b. Average accuracy of the classifiers (MMR, SVM1, SVM2) associated with the LSBOUND algorithm, for the feature sets SET1-FRM1, SET1-FRM2, SET2-FRM1 and SET2-FRM2.

Figs. 2 and 3 also reveal that the LSBOUND algorithm in the proposed frameworks tends to select more MFCC features, both in FRM1 and in FRM2, than the rest of the algorithms, while the number of LPC parameters selected by LSBOUND is in general decreased. As shown in Tables 3b and 6, this behaviour of the algorithm is the main contributor to its success: LSBOUND is the most successful algorithm in further reducing the average CV error. This tendency of the LSBOUND algorithm seems highly plausible, as MFCC related features are expected to distinguish clearly between characteristics of speech related to the voice source and those related to the vocal tract, as a result of the log operation in the frequency domain.

Fig. 4 highlights the performance of the proposed frameworks in terms of the average number of features selected from each distinct feature group. The figure reveals that the feature selection algorithms in FRM1 generally place more emphasis on the MFCC and sub-band energy feature groups, while the number of features selected from the LPC feature group is reduced in FRM1.

Fig. 4. Comparison of FRM1 and FRM2 in terms of the average number of selected features.

Table 3c. Average accuracy of the classifiers (MMR, SVM1, SVM2) associated with the R2W2 algorithm, for the feature sets SET1-FRM1, SET1-FRM2, SET2-FRM1 and SET2-FRM2.

Table 3d. Average accuracy of the classifiers (MMR, SVM1, SVM2) associated with the MUTINF algorithm, for the feature sets SET1-FRM1, SET1-FRM2, SET2-FRM1 and SET2-FRM2.

Table 4. Average accuracy of the classifiers (MMR, SVM1, SVM2) associated with the feature selection frameworks FRM1 and FRM2.

This behavior of the algorithms in each framework explains the higher success rate of the classifiers in FRM1 compared to FRM2, which is summarized in Table 4.

5. Emotion detection using multi-class SVM classifiers

After the successful application of the proposed frameworks with the SFS algorithm, a comprehensive comparison is carried out for all three multi-class classifiers using the feature sets selected by the LSBOUND, R2W2 and MUTINF algorithms, respectively. The accuracy of the classifiers associated with each feature selection algorithm is illustrated in Table 3. As seen from Table 3, the accuracy of the classifiers is improved considerably; the LSBOUND algorithm in particular seems to be the most successful in reducing the CV error of the classifiers (see Table 3b). In order to clearly illustrate the effects of the proposed frameworks, the feature construction strategies and the feature selection algorithms, Table 3 is summarized in Tables 4, 5 and 6, respectively.

As seen in Table 4, the multi-class classifiers in FRM1 are more successful in reducing the average CV error. Among them, SVM1 produces the lowest average CV error (82.5% accuracy). The success of SVM1 is also apparent in the second framework (81.3% accuracy). The classifiers MMR and SVM2 are comparable to each other in terms of average CV error.

In terms of the unification and intersection operators, the classifier SVM1 clearly produces less error than the other multi-class classifiers, as seen in Table 5; this is especially remarkable when SVM1 is trained using the final feature sets obtained by the unification approach. There is no clear evidence, however, for a decisive conclusion on which feature construction strategy is preferable: although, in general, the unification operation reduces the average CV error further than the intersection operation, the SET1s contain fewer features on average than the SET2s (27.75 features on average), a finding which favors the intersection operator. It is therefore difficult to prefer one approach over the other.

Table 5. Average accuracy (%) of the classifiers (MMR, SVM1, SVM2) associated with the feature construction strategies SET1 (intersection) and SET2 (unification), together with the average number of selected features.

Lastly, a comparison between the feature selection algorithms with respect to the two feature selection frameworks is illustrated in Table 6. In both frameworks, the newly proposed LSBOUND based feature selection algorithm clearly outperforms the rest of the algorithms, generating significantly lower average CV errors. The best accuracy, 85.5%, is obtained by SVM1 when the LSBOUND based feature selection algorithm is employed (see Table 3b).

Table 6. Average accuracy (%) of the classifiers associated with particular feature selection algorithms (SFS, LSBOUND, MUTINF, R2W2) in Framework 1 and Framework 2.

Table 7. Average computation time of the SFS, LSBOUND and R2W2 feature selection algorithms (in s).

The obtained accuracy rate is slightly lower than the human classification accuracy of 87.4% (Burkhardt et al., 2005). Furthermore, our result is the most successful achievement reported for the same number of emotional classes of the Berlin Emotional Speech Database, compared to a recent study by Shami and Verhelst (2007). These results show that the feature selection and multi-class classification strategy is very effective in the multi-class emotion detection problem. A successful application of the newly proposed LSBOUND feature selection algorithm has also been demonstrated: the lowest cross validation error (with a standard deviation of ±0.02) is achieved when the LSBOUND algorithm is used in the frameworks. Furthermore, compared to the wrapper type algorithm SFS and the state-of-the-art algorithm R2W2, the computational cost of LSBOUND in the frameworks is not heavy, as seen from Table 7.

6. Conclusion

In this paper, an evaluation of the newly proposed feature selection frameworks and construction approaches is carried out using four feature selection algorithms and three different multi-class classifiers in the emotion detection problem. Results show that the first framework, in which features are selected to distinguish one emotional state from the rest, is more successful in achieving higher accuracy. It has also been shown that, among the four feature selection algorithms, the recently proposed LSBOUND based feature selection is superior in terms of reducing the average CV error. Furthermore, the results have shown that SVM1 (the one-vs-one approach to multi-class classification) outperforms the other multi-class classifiers investigated and is consistently able to give higher accuracy in emotion detection. It is found that, among all feature groups, the prosodic and sub-band energy features are the ones most frequently selected by all the algorithms in each framework. The success of the LSBOUND algorithm also suggests that MFCC features are more informative than LPC features in terms of reducing the CV error of multi-class classifiers. The classification accuracy achieved validates that the proposed frameworks boost the selection of more informative features from the speech related feature groups.

Acknowledgments

This work has been sponsored by the TUBITAK Project under contract 104E179. The corresponding author would also like to thank Prof. Dr. J. Shawe-Taylor for his hospitality and guidance during the academic visit, granted by TUBITAK, at the University of Southampton and University College London in the summer of 2006. We would also like to thank Dr. S. Szedmak for providing the MMR algorithm.

References

Abe, N., & Kudo, M. (2006). Non-parametric classifier-independent feature selection. Pattern Recognition, 39.
Altun, H., & Polat, G. (2007). New frameworks to boost feature selection algorithms in emotion detection for improved human computer interaction. In Brain, vision and artificial intelligence, Lecture Notes in Computer Science (Vol. 4729). Berlin, Heidelberg: Springer-Verlag.
Altun, H., Shawe-Taylor, J., & Polat, G. (2007). New feature selection frameworks in emotion recognition to evaluate the informative power of speech related features. In ISSPA07, ninth international symposium on signal processing and its applications.
Bennett, B. J., Embrechts, K., Breneman, C. M., & Song, M. (2003). Dimensionality reduction via sparse support vector machines. Journal of Machine Learning Research, 3.
Burkhardt, F., Paeschke, A., Rolfes, M., Sendlmeier, W., & Weiss, B. (2005). A database of German emotional speech. In Proceedings of Interspeech.
Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W., & Taylor, J. G. (2001). Emotion recognition in human computer interaction. IEEE Signal Processing Magazine, 18(1).
Dogan, G. (2006). Emotion detection using neural networks. M.Sc. thesis, Nigde University.
Fernandez, R., & Picard, R. W. (2005). Classical and novel discriminant features for affect recognition from speech. In Interspeech 2005, Eurospeech 9th European conference on speech communication and technology.
Fragopanagos, N., & Taylor, J. G. (2005). Emotion recognition in human computer interaction. In International symposium on neural networks.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3.
Jain, A., & Zongker, D. (1997). Feature selection: Evaluation, application and small sample performance. IEEE Transactions on PAMI, 19(2).
Juslin, P. N., & Scherer, K. R. (Eds.). (2005). The new handbook of methods in nonverbal behavior research. Oxford, UK: Oxford University Press.
Mladenić, D., Brank, J., Grobelnik, M., & Milic-Frayling, N. (2004). Feature selection using linear classifier weights: Interaction with classification models. In Proceedings of the 27th annual international ACM SIGIR conference on research and development.
Pantic, M., Sebe, N., Cohn, J. F., & Huang, T. (2005). Affective multimodal human computer interaction. In Proceedings of the ACM international conference on multimedia.
Polat, G., & Altun, H. (2007a). Evaluation of performance of KNN, MLP and RBF classifiers in emotion detection problem. In IEEE 15th signal processing and communications applications. doi: /siu
Polat, G., & Altun, H. (2007b). Determining efficiency of speech feature groups in emotion detection. In IEEE 15th signal processing and communications applications. doi: /siu
Reunanen, J. (2003). Overfitting in making comparisons between variable selection methods. Journal of Machine Learning Research, 3.
Reynolds, C., Ishikawa, M., & Tsujino, H. (2006). Realizing affect in speech classification in real-time. In Aurally informed performance: Integrating machine listening and auditory presentation in robotic systems, in conjunction with the AAAI Fall Symposia.
Schuller, B., Arsic, D., Wallhoff, F., & Rigoll, G. (2006). Emotion recognition in the noise applying large acoustic feature sets. In Speech Prosody, Dresden.
Shami, M., & Verhelst, W. (2007). An evaluation of the robustness of existing supervised machine learning approaches to the classification of emotions in speech. Speech Communication, 49.
Szedmak, S., Saunders, C. J., Shawe-Taylor, J., & Rousu, J. (2005). Learning hierarchies at two-class complexity. In Workshop on kernel methods and structured domains.
Ververidis, D., & Kotropoulos, D. (2004). Automatic speech classification to five emotional states based on gender information. In Proceedings of the 12th European signal processing conference (EUSIPCO).
Ververidis, D., & Kotropoulos, D. (2005). Emotional speech classification using Gaussian mixture models and the sequential floating forward selection algorithm. In IEEE international conference on multimedia and expo (ICME).
Wang, Y., & Guan, L. (2004). An investigation of speech based human emotion recognition. In IEEE sixth workshop on multimedia signal processing.
Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T., & Vapnik, V. (2000). Feature selection for SVMs. In Advances in neural information processing systems (NIPS).
Weston, J., Elisseeff, A., Bakır, G., & Sinz, F. (2006). The Spider software package.
Xiao, Z., Dellandrea, E., Dou, W., & Chen, L. (2005). Features extraction and selection for emotional speech classification. In Proceedings of the IEEE conference on advanced video and signal based surveillance (AVSS).
Yacoub, S., Simske, S., Lin, X., & Burns, J. (2003). Recognition of emotions in interactive voice response systems. In Eighth European conference on speech communication and technology.
Zaffalon, M., & Hutter, M. (2002). Robust feature selection by mutual information distributions. In A. Darwiche & N. Friedman (Eds.), UAI-2002: Proceedings of the 18th conference on uncertainty in artificial intelligence. San Francisco, USA: Morgan Kaufmann.
Zhou, X., & Mao, K. Z. (2005). LS bound based gene selection for DNA microarray data. Bioinformatics, 21(8).


More information

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT

More information

Knowledge Discovery from patents using KMX Text Analytics

Knowledge Discovery from patents using KMX Text Analytics Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers

More information

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo 71251911@mackenzie.br,nizam.omar@mackenzie.br

More information

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

More information

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet

More information

Classifying Manipulation Primitives from Visual Data

Classifying Manipulation Primitives from Visual Data Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if

More information

Using artificial intelligence for data reduction in mechanical engineering

Using artificial intelligence for data reduction in mechanical engineering Using artificial intelligence for data reduction in mechanical engineering L. Mdlazi 1, C.J. Stander 1, P.S. Heyns 1, T. Marwala 2 1 Dynamic Systems Group Department of Mechanical and Aeronautical Engineering,

More information

Music Mood Classification

Music Mood Classification Music Mood Classification CS 229 Project Report Jose Padial Ashish Goel Introduction The aim of the project was to develop a music mood classifier. There are many categories of mood into which songs may

More information

Decompose Error Rate into components, some of which can be measured on unlabeled data

Decompose Error Rate into components, some of which can be measured on unlabeled data Bias-Variance Theory Decompose Error Rate into components, some of which can be measured on unlabeled data Bias-Variance Decomposition for Regression Bias-Variance Decomposition for Classification Bias-Variance

More information

How to Improve the Sound Quality of Your Microphone

How to Improve the Sound Quality of Your Microphone An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies Andreas Maier, Julian Exner, Stefan Steidl, Anton Batliner, Tino Haderlein, and Elmar Nöth Universität Erlangen-Nürnberg,

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

Annotated bibliographies for presentations in MUMT 611, Winter 2006

Annotated bibliographies for presentations in MUMT 611, Winter 2006 Stephen Sinclair Music Technology Area, McGill University. Montreal, Canada Annotated bibliographies for presentations in MUMT 611, Winter 2006 Presentation 4: Musical Genre Similarity Aucouturier, J.-J.

More information

SUPPORT VECTOR MACHINE (SVM) is the optimal

SUPPORT VECTOR MACHINE (SVM) is the optimal 130 IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 19, NO. 1, JANUARY 2008 Multiclass Posterior Probability Support Vector Machines Mehmet Gönen, Ayşe Gönül Tanuğur, and Ethem Alpaydın, Senior Member, IEEE

More information

Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features

Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features 22 Psychological Motivated Multi-Stage Emotion Classification Exploiting Voice Quality Features Marko Lugger and Bin Yang University of Stuttgart Germany Open Access Database www.intechweb.org 1. Introduction

More information

Chapter 7. Feature Selection. 7.1 Introduction

Chapter 7. Feature Selection. 7.1 Introduction Chapter 7 Feature Selection Feature selection is not used in the system classification experiments, which will be discussed in Chapter 8 and 9. However, as an autonomous system, OMEGA includes feature

More information

Data Mining: A Preprocessing Engine

Data Mining: A Preprocessing Engine Journal of Computer Science 2 (9): 735-739, 2006 ISSN 1549-3636 2005 Science Publications Data Mining: A Preprocessing Engine Luai Al Shalabi, Zyad Shaaban and Basel Kasasbeh Applied Science University,

More information

How To Solve The Kd Cup 2010 Challenge

How To Solve The Kd Cup 2010 Challenge A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China catch0327@yahoo.com yanxing@gdut.edu.cn

More information

Cross-Validation. Synonyms Rotation estimation

Cross-Validation. Synonyms Rotation estimation Comp. by: BVijayalakshmiGalleys0000875816 Date:6/11/08 Time:19:52:53 Stage:First Proof C PAYAM REFAEILZADEH, LEI TANG, HUAN LIU Arizona State University Synonyms Rotation estimation Definition is a statistical

More information

Cross-validation for detecting and preventing overfitting

Cross-validation for detecting and preventing overfitting Cross-validation for detecting and preventing overfitting Note to other teachers and users of these slides. Andrew would be delighted if ou found this source material useful in giving our own lectures.

More information

Advanced Ensemble Strategies for Polynomial Models

Advanced Ensemble Strategies for Polynomial Models Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer

More information

Semi-Supervised Support Vector Machines and Application to Spam Filtering

Semi-Supervised Support Vector Machines and Application to Spam Filtering Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery

More information

Feature Selection vs. Extraction

Feature Selection vs. Extraction Feature Selection In many applications, we often encounter a very large number of potential features that can be used Which subset of features should be used for the best classification? Need for a small

More information

Support Vector Machine. Tutorial. (and Statistical Learning Theory)

Support Vector Machine. Tutorial. (and Statistical Learning Theory) Support Vector Machine (and Statistical Learning Theory) Tutorial Jason Weston NEC Labs America 4 Independence Way, Princeton, USA. jasonw@nec-labs.com 1 Support Vector Machines: history SVMs introduced

More information

Journal of Industrial Engineering Research. Adaptive sequence of Key Pose Detection for Human Action Recognition

Journal of Industrial Engineering Research. Adaptive sequence of Key Pose Detection for Human Action Recognition IWNEST PUBLISHER Journal of Industrial Engineering Research (ISSN: 2077-4559) Journal home page: http://www.iwnest.com/aace/ Adaptive sequence of Key Pose Detection for Human Action Recognition 1 T. Sindhu

More information

Domain Classification of Technical Terms Using the Web

Domain Classification of Technical Terms Using the Web Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using

More information

Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

More information

Choosing Multiple Parameters for Support Vector Machines

Choosing Multiple Parameters for Support Vector Machines Machine Learning, 46, 131 159, 2002 c 2002 Kluwer Academic Publishers. Manufactured in The Netherlands. Choosing Multiple Parameters for Support Vector Machines OLIVIER CHAPELLE LIP6, Paris, France olivier.chapelle@lip6.fr

More information

Ensemble Approach for the Classification of Imbalanced Data

Ensemble Approach for the Classification of Imbalanced Data Ensemble Approach for the Classification of Imbalanced Data Vladimir Nikulin 1, Geoffrey J. McLachlan 1, and Shu Kay Ng 2 1 Department of Mathematics, University of Queensland v.nikulin@uq.edu.au, gjm@maths.uq.edu.au

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Predicting Student Performance by Using Data Mining Methods for Classification

Predicting Student Performance by Using Data Mining Methods for Classification BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance

More information

Classification by Pairwise Coupling

Classification by Pairwise Coupling Classification by Pairwise Coupling TREVOR HASTIE * Stanford University and ROBERT TIBSHIRANI t University of Toronto Abstract We discuss a strategy for polychotomous classification that involves estimating

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Machine Learning in Spam Filtering

Machine Learning in Spam Filtering Machine Learning in Spam Filtering A Crash Course in ML Konstantin Tretyakov kt@ut.ee Institute of Computer Science, University of Tartu Overview Spam is Evil ML for Spam Filtering: General Idea, Problems.

More information

Model Trees for Classification of Hybrid Data Types

Model Trees for Classification of Hybrid Data Types Model Trees for Classification of Hybrid Data Types Hsing-Kuo Pao, Shou-Chih Chang, and Yuh-Jye Lee Dept. of Computer Science & Information Engineering, National Taiwan University of Science & Technology,

More information

Hardware Implementation of Probabilistic State Machine for Word Recognition

Hardware Implementation of Probabilistic State Machine for Word Recognition IJECT Vo l. 4, Is s u e Sp l - 5, Ju l y - Se p t 2013 ISSN : 2230-7109 (Online) ISSN : 2230-9543 (Print) Hardware Implementation of Probabilistic State Machine for Word Recognition 1 Soorya Asokan, 2

More information

Comparison of machine learning methods for intelligent tutoring systems

Comparison of machine learning methods for intelligent tutoring systems Comparison of machine learning methods for intelligent tutoring systems Wilhelmiina Hämäläinen 1 and Mikko Vinni 1 Department of Computer Science, University of Joensuu, P.O. Box 111, FI-80101 Joensuu

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

1 Maximum likelihood estimation

1 Maximum likelihood estimation COS 424: Interacting with Data Lecturer: David Blei Lecture #4 Scribes: Wei Ho, Michael Ye February 14, 2008 1 Maximum likelihood estimation 1.1 MLE of a Bernoulli random variable (coin flips) Given N

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

Microsoft Azure Machine learning Algorithms

Microsoft Azure Machine learning Algorithms Microsoft Azure Machine learning Algorithms Tomaž KAŠTRUN @tomaz_tsql Tomaz.kastrun@gmail.com http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation

More information

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan

More information

Local features and matching. Image classification & object localization

Local features and matching. Image classification & object localization Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

More information

Visualization of large data sets using MDS combined with LVQ.

Visualization of large data sets using MDS combined with LVQ. Visualization of large data sets using MDS combined with LVQ. Antoine Naud and Włodzisław Duch Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland. www.phys.uni.torun.pl/kmk

More information

MHI3000 Big Data Analytics for Health Care Final Project Report

MHI3000 Big Data Analytics for Health Care Final Project Report MHI3000 Big Data Analytics for Health Care Final Project Report Zhongtian Fred Qiu (1002274530) http://gallery.azureml.net/details/81ddb2ab137046d4925584b5095ec7aa 1. Data pre-processing The data given

More information