Sparse Representation using Nonnegative Curds and Whey


Yanan Liu, Fei Wu, Zhihua Zhang, Yueting Zhuang
College of Computer Science and Technology, Zhejiang University, China
{liuyn, wufei, zhzhang,

Shuicheng Yan
Department of Electrical and Computer Engineering, National University of Singapore, Singapore

Abstract

It has been of great interest to find sparse and/or nonnegative representations in the computer vision literature. In this paper we propose a novel method for this purpose and refer to it as nonnegative curds and whey (NNCW). The NNCW procedure consists of two stages. In the first stage we compute a set of sparse and nonnegative representations of a test image, each a linear combination of the images within a certain class, by solving a set of regression-type nonnegative matrix factorization problems. In the second stage we merge these representations into a new sparse and nonnegative representation using the group nonnegative garrote. This procedure is particularly appropriate for discriminant analysis owing to the supervised and nonnegative nature of its sparsity pursuit. Experiments on several benchmark face databases and the Caltech 101 image dataset demonstrate the efficiency and effectiveness of our nonnegative curds and whey method.

1. Introduction

The problem of finding a sparse representation of data has recently become an important topic in computer vision and pattern recognition. The essential challenge in sparse representation is to develop an efficient approach with which each sample can be reconstructed from its sparse representation. Nonnegative matrix factorization (NMF) [12, 13] is an important technique for finding such representations, and it is well established that NMF produces sparse representations in a collective way [10, 11]. Moreover, the nonnegativity constraint makes the representation easy to interpret, since it is a purely additive combination of nonnegative basis vectors. NMF has been successfully applied in computer vision and pattern recognition, especially for image analysis [12]. Many of these applications are, however, in an unsupervised setting and hence ignore the correlations within a class and the disparities between classes. Under a supervised setting, NMF can be regarded as a nonnegative garrote [2].

Figure 1. An exemplar illustration of sparse representation using nonnegative curds and whey.

In this paper we consider a supervised setting for image representation as well as image classification. Given the empirically validated discriminative power of sparse representations in classification, a test image outside the training set can ideally be represented in terms of the training images alone, with nonzero coefficients only on the samples belonging to the same class as the test image. That is, a valid test image can be sufficiently represented by the training samples of its own class. Sparse representation can expedite classification when the number of classes is reasonably large: the sparser the coefficients, the more easily the test sample is assigned its correct class label. Therefore, when the test image is expressed as a linear superposition of all the training images, the coefficient vector is expected to be sparse and nonnegative.
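As a toy illustration of this premise (synthetic data, not the method developed below), the following Python sketch reconstructs a nonnegative test vector from nonnegative training columns with SciPy's nonnegative least squares; even without an explicit sparsity penalty, the nonnegativity constraint alone already drives most coefficients to zero.

```python
# Toy illustration: a nonnegative test vector reconstructed from
# nonnegative "training" columns; nonnegativity alone yields sparsity.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
d, n = 64, 40                            # feature dimension, training set size
X = rng.random((d, n))                   # columns play the role of training images
y = X[:, [3, 7]] @ np.array([0.6, 0.4])  # test vector built from two columns

coef, residual = nnls(X, y)              # min ||X c - y||_2  s.t.  c >= 0
print("nonzero coefficients:", np.flatnonzero(coef > 1e-8))
print("reconstruction residual:", residual)
```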

In particular, we model the sparse representation with a nonnegative curds and whey (NNCW) method. The key idea is to exploit the similarity within each class and the disparity between classes by formulating the classification problem as two consecutive linear regressions. That is, a test image is represented as a nonnegative weighted combination of all the training images, with two sets of sparse nonnegative weights: one over the training images within each class, and one over the classes.

Our work is motivated by the recent work of Wright et al. [21], which casts face recognition as a linear regression problem with sparsity constraints on the regression coefficients. To solve this problem, Wright et al. [21] reformulated it as the lasso problem [19]. Lasso-based sparse representation has also been used for image annotation with multiple tags [20], classification [7], and clustering [6]. For example, Wang et al. [20] proposed a multi-label sparse coding framework for automatic image annotation that exploits $\ell_1$-norm based reconstruction coefficients. In [7], an empirical Bayesian approach to sparse regression and classification is presented that involves no parameters controlling the degree of sparseness. Elhamifar and Vidal [6] introduced a sparse representation-based method to cluster data drawn from multiple low-dimensional subspaces embedded in a high-dimensional space.

However, the lasso does not force the representation to be additive, so the representation may not be as interpretable as that of NMF. Moreover, the class label (discriminant) information in the training set is not explicitly used when constructing the sparse representation, which may limit the ultimate classification accuracy. Our proposed method circumvents these limitations, since its two linear regression steps both exploit the discriminative class information and impose a nonnegativity constraint on every coefficient.

Beyond the image classification problem at hand, nonnegative curds and whey is also related to the group nonnegative garrote [22], a grouped extension of the conventional nonnegative garrote. In the group nonnegative garrote the regression coefficients of individual variables are estimated by least squares, so these coefficients are neither guaranteed to be nonnegative nor sparse.

Figure 1 illustrates the overall procedure of the proposed NNCW method. Intuitively, lasso-based representation methods learn a single regression model, without using any discriminative label information or imposing nonnegativity constraints during the convex optimization. NNCW, in contrast, first obtains m independent representations (called curds), one from each class, and then uses the curds to define a new representation (called whey). The two regression models are constructed in sequence, and the latter step directly outputs the class label.

The rest of this paper is organized as follows. Section 2 details the nonnegative curds and whey (NNCW) method for image representation and classification. Section 3 reviews related work. Experimental results are reported in Section 4. Finally, we conclude in Section 5.

2. Methodology

We are given a set of n training samples $X = \{x_1, \ldots, x_n\} \subset \mathbb{R}^d$, where each $x_i$ is a d-dimensional feature vector representing an image.
The images are assumed to be grouped into m disjoint classes, with each $x_i$ belonging to one and only one class. Let $n_j$ be the cardinality of the jth class, so that $\sum_{j=1}^{m} n_j = n$. Without loss of generality, we collect the samples of the jth class into a $d \times n_j$ matrix $X_j$ and accordingly form the $d \times n$ training data matrix $X = [X_1, \ldots, X_m]$. Our concern is to train a classifier such that, once the sparse representation of a test image $y \in \mathbb{R}^d$ has been constructed from the training data, we can predict its class label. The basic idea is to devise a sparse representation approach for the development of classifiers.

Before formally presenting our method, we fix some notation. For a p-vector $a = (a_1, \ldots, a_p)^T$, we denote by $\|a\|_2$ its $\ell_2$-norm (i.e., $\|a\|_2 = \sqrt{\sum_{j=1}^{p} a_j^2}$), by $\|a\|_1$ its $\ell_1$-norm (i.e., $\|a\|_1 = \sum_{j=1}^{p} |a_j|$), and by $\|a\|_0$ its $\ell_0$-norm (the number of nonzero entries of a).

The sparse representation-based classification approach proposed in this paper learns two inseparable linear regression models with nonnegative coefficient constraints under a supervised learning framework. The central aim is to use discriminative information to make the classifier interpretable (and hence structured) and additive.

2.1. Nonnegative Curds and Whey Procedure

Our proposed sparse representation approach consists of two stages. In the first stage we consider m linear regression models, treating y as the response and each image in $X_j$ as a basis vector. That is, the jth regression problem is based on
$$y = X_j b_j + \epsilon_j, \qquad (1)$$
where $\epsilon_j$ is an error term and $b_j = (b_{j,1}, \ldots, b_{j,n_j})^T \in \mathbb{R}^{n_j}$ for $j = 1, \ldots, m$. Recall that y and the $x_i$ represent test and training images, respectively, so they are typically encoded with nonnegative values. The idea behind nonnegative matrix factorization for learning parts-based representations [12] inspires us to impose nonnegativity on the regression vectors $b_j$.

As a result, we obtain the following optimization problems: for $j = 1, \ldots, m$,
$$\min_{b_j} \; \frac{1}{2}\|y - X_j b_j\|_2^2 + \lambda_j \sum_{l=1}^{n_j} b_{jl} \quad \text{s.t.} \quad b_{jl} \ge 0, \; \forall l, \qquad (2)$$
where the $\lambda_j \ge 0$ are tunable weighting parameters. Each optimization problem in (2) is a nonnegative garrote model [2]. The nonnegative garrote can be solved efficiently by classical numerical methods such as least angle regression (LARS) [5] and the pathwise coordinate method [8]; however, we follow Breiman's original implementation [2], which shrinks each ordinary least squares (OLS) estimated coefficient by a nonnegative amount whose sum is subject to an upper-bound constraint (the garrote).

Let $\hat{b}_j = (\hat{b}_{j1}, \ldots, \hat{b}_{jn_j})^T$ be the estimate of $b_j$. Because the penalty bounds $\sum_{l=1}^{n_j} b_{jl}$, which equals the $\ell_1$-norm of the nonnegative $b_j$, the estimates $\hat{b}_j$ are sparse, and we obtain m sparse representations of y, namely $z_j = X_j \hat{b}_j$ for $j = 1, \ldots, m$. Since $\hat{b}_j$ contains the reconstruction coefficients learned from the samples of the jth class, the collection of all the $\hat{b}_j$ reflects the disparities among the classes. Therefore, to capture the class label information in the training samples and exploit these disparities, we consider the optimization problem
$$\min_{c_1, \ldots, c_m} \; \frac{1}{2}\Big\|y - \sum_{j=1}^{m} c_j z_j\Big\|_2^2 + \lambda \sum_{j=1}^{m} p_j c_j \quad \text{s.t.} \quad c_j \ge 0, \; \forall j, \qquad (3)$$
where $\lambda \ge 0$ is a tunable weighting parameter and the $p_j > 0$ are degrees of penalization. In all experiments we set $p_j = n_j / n$. The optimization problem in (3) is again a nonnegative garrote model, which we solve once more with Breiman's implementation [2].

The second stage thus refines the representation of y by using the representations obtained in the first stage. In particular, y is now represented as
$$u = \sum_{j=1}^{m} \hat{c}_j z_j = \sum_{j=1}^{m} \hat{c}_j X_j \hat{b}_j. \qquad (4)$$
Since $\sum_{j=1}^{m} p_j c_j$ can be regarded as a weighted $\ell_1$-norm of $c = [c_1, c_2, \ldots, c_m]^T$, some of the $\hat{c}_j$ are zero, so the representation is sparse. Moreover, if $\hat{c}_j = 0$ for some $j \in \{1, \ldots, m\}$, all samples of the jth class are eliminated from the representation because $\hat{c}_j \hat{b}_j = 0$, and the test image y evidently does not belong to the jth class.

Algorithm 1 NNCW (nonnegative curds and whey)
1: procedure NNCW($\{X_1, \ldots, X_m\} \subset \mathbb{R}^d$; $y \in \mathbb{R}^d$)
2:   Curds: solve the m optimization problems in (2), yielding $\hat{b}_j$ and $z_j = X_j \hat{b}_j$ for $j = 1, \ldots, m$.
3:   Whey: solve the optimization problem in (3), yielding $\hat{c}_1, \ldots, \hat{c}_m$.
4:   Output: the class label of the test sample y, $k = \arg\min_j \|y - \hat{c}_j X_j \hat{b}_j\|_2^2$.
5: end procedure

The optimization problems in (2) define m independent representations of y, which we call curds. The optimization problem in (3) then takes advantage of these m curds to define a new representation, which we call whey. Since we impose nonnegativity constraints on the $b_j$ as well as the $c_j$, we refer to our method for sparse image representation as the nonnegative curds and whey (NNCW) method.

2.2. Classification Procedure

Given the test sample y and the corresponding $\hat{b}_j$ and $\hat{c}_j$ for $j = 1, \ldots, m$ obtained from the NNCW method, we are now concerned with the class label of y. Ideally, the nonzero $\hat{c}_j$ indicates the class to which y belongs. However, there is not always exactly one nonzero $\hat{c}_j$. We therefore allocate y to the kth class with
$$k = \arg\min_j \|y - \hat{c}_j X_j \hat{b}_j\|_2^2. \qquad (5)$$
The entire NNCW method is summarized in Algorithm 1.
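As a minimal computational sketch of Algorithm 1, the NumPy code below substitutes a generic projected-gradient solver for Breiman's garrote implementation [2] used in the paper; both the curds step (2) and the whey step (3) are instances of $\min_b \frac{1}{2}\|y - Xb\|_2^2 + w^T b$ subject to $b \ge 0$. The function names and the step-size rule are our own choices.

```python
# A sketch of NNCW (Algorithm 1); the subproblem solver is a plain
# projected-gradient method, not Breiman's original garrote procedure.
import numpy as np

def nonneg_l1_regression(X, y, weights, n_iter=2000):
    """Solve min_b 0.5*||y - X b||^2 + weights^T b  s.t.  b >= 0."""
    b = np.zeros(X.shape[1])
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)   # 1/L with L = ||X||_2^2
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) + weights             # gradient of the objective
        b = np.maximum(b - step * grad, 0.0)           # project onto b >= 0
    return b

def nncw(class_mats, y, lams, lam):
    """class_mats: list of d x n_j matrices X_j. Returns (b_hats, c_hat)."""
    # Curds: one sparse nonnegative regression per class, as in (2).
    b_hats = [nonneg_l1_regression(Xj, y, lj * np.ones(Xj.shape[1]))
              for Xj, lj in zip(class_mats, lams)]
    Z = np.column_stack([Xj @ bj for Xj, bj in zip(class_mats, b_hats)])
    # Whey: weighted penalty with p_j = n_j / n, as in (3).
    n = sum(Xj.shape[1] for Xj in class_mats)
    p = np.array([Xj.shape[1] / n for Xj in class_mats])
    c_hat = nonneg_l1_regression(Z, y, lam * p)
    return b_hats, c_hat

def classify(class_mats, y, b_hats, c_hat):
    """Rule (5): assign y to the class minimizing ||y - c_j X_j b_j||_2^2."""
    errs = [np.linalg.norm(y - cj * (Xj @ bj)) ** 2
            for Xj, bj, cj in zip(class_mats, b_hats, c_hat)]
    return int(np.argmin(errs))
```

In practice the parameters $\lambda$ and $\lambda_j$ would be chosen by cross-validation, as done in the experiments of Section 4.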
3. Related Work

The so-called curds and whey method was first proposed by Breiman and Friedman [3] as a form of multivariate shrinkage. The main purpose of the method in [3], however, is to improve predictive accuracy in multiple linear regression by exploiting correlations between the response variables. As discussed above, NNCW instead first uses intra-class information to generate m independent representations (curds) and then uses inter-class information to build a linear regression model (whey) for discriminative learning.

To some extent, the proposed NNCW can be regarded as a variant of the group lasso [22]. The group lasso is a natural extension of the lasso in which the covariates are assumed to be clustered in groups. Intuitively, the group lasso drives all the weights in a group to zero together and thus leads to group selection. Unlike NNCW, which puts nonnegativity constraints on the coefficients of each training sample and of each class, the group lasso imposes no nonnegativity constraints.

More specifically, NNCW is closely related to the group nonnegative garrote [22]. The main difference is that the group nonnegative garrote instead uses $z_j = X_j b_j^{LS}$, where $b_j^{LS}$ is the least squares estimate. In this case $z_j$ is not guaranteed to be nonnegative even though $X_j$ is, so $z_j$ may no longer represent a real image. Moreover, owing to its explicit dependence on the full least squares estimates, the group nonnegative garrote may perform suboptimally when the sample size is small relative to the total number of variables. Consequently, the group nonnegative garrote is not robust to image noise and occlusion, so we do not compare our algorithm against it and instead focus on sparsity-related algorithms.

It is worth pointing out that the sparse representation in [21] aims to solve
$$\min_\beta \|\beta\|_0 \quad \text{subject to} \quad X\beta = y, \qquad (6)$$
where $\beta = (b_1^T, \ldots, b_m^T)^T$. This problem, however, is NP-hard [1]. Based on the sparsity theory of Donoho [4], Wright et al. [21] therefore consider the alternative
$$\min_\beta \; \frac{1}{2}\|y - X\beta\|_2^2 + \lambda \sum_{j=1}^{m} \sum_{i=1}^{n_j} |b_{ji}|, \qquad (7)$$
which is essentially the lasso model. On one hand, this sparse model does not exploit the discriminative class information, which is clearly useful for classification; in many regression problems we are interested in identifying important explanatory factors for the categorical response, where each factor may be represented by a group of derived variables. On the other hand, since $\beta$ is not constrained to be nonnegative, such sparse representations lack the interpretability of NMF and NNCW.

The strength of this work is to integrate sparse coding, nonnegative data factorization, and supervised learning in the NNCW framework. The two jointly learned linear regression models encode the similarity and disparity information useful for classification, and the nonnegativity constraints together with the natural sparsity make NNCW more interpretable.
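For contrast with (7), the following sketch fits the lasso baseline with scikit-learn on synthetic placeholder data. Note that sklearn's Lasso minimizes $\frac{1}{2 n_{\text{samples}}}\|y - X\beta\|_2^2 + \alpha\|\beta\|_1$, so $\alpha$ corresponds to $\lambda$ only up to scaling, and the fitted coefficients are generally signed, which is exactly the interpretability issue raised above.

```python
# Lasso baseline corresponding to (7), on synthetic data; coefficients
# can be negative, unlike NNCW's nonnegative representation.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.random((64, 40))    # columns: training images (placeholder data)
y = rng.random(64)          # test image (placeholder data)

lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000)
beta = lasso.fit(X, y).coef_
print("negative coefficients:", int(np.sum(beta < 0)))

# Setting positive=True in Lasso gives the nonnegative variant, which
# recovers the additivity/interpretability argument made for NNCW.
```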
4. Experiments

In this section we investigate applications of our nonnegative curds and whey (NNCW) method in face recognition and image classification. We compare NNCW with three popular classification methods, nearest neighbor (NN), naive Bayes (NB), and linear support vector machines (SVM), as well as with the sparse-representation based classification (SRC) proposed by Wright et al. [21] and the group lasso (glasso) [22]. The tuning parameters $\lambda$ and $\lambda_j$ ($j = 1, \ldots, m$) are selected by 10-fold cross-validation to avoid over-fitting.

Four face databases and one image dataset were used. The face databases are the ORL database [17], the Extended Yale B database [9], the AR face database [16], and the CMU PIE face database [18]; these focus on frontal faces, illumination variations, occlusions, and pose variations, respectively. We also conducted experiments on the Caltech 101 image dataset [15].

4.1. Visualization on Face Dataset

We first give a visual comparison of nonnegative curds and whey (NNCW) with lasso-based sparse representation (LSR) and group lasso (glasso) on a subset of the ORL dataset. We chose 10 persons from the ORL database and 9 images per subject as training data; the one remaining image per person is treated as the test sample. Figures 2, 3, and 4 show the sparse representations obtained by LSR, glasso, and NNCW, respectively, for a test sample from the first individual. Figure 4(a) shows the first stage of NNCW, which computes $z_j$, $1 \le j \le 10$, and Figure 4(b) illustrates the second stage, which refines the sparse representation of y from the first stage. By incorporating the class label information in the second stage of NNCW, the optimized sparse estimates of the class weights $c_j$ ($1 \le j \le 10$) lead to very sparse coefficients for the test sample y. Specifically, NNCW yields an optimized $c_1$ (i.e., 0.86) and $b_1$ (i.e., [0, 0, 0, 0, 0.58, 0, 0, 0.0, 0]) to estimate y; these two sets of estimated parameters reconstruct y effectively. In Figure 2, except for the first class, to which the test sample belongs, the class weights computed by LSR are not sparse enough, which is why its reconstruction of y is not as good as NNCW's. Although glasso selects the correct class in Figure 3, NNCW's reconstruction is much better than glasso's; a possible explanation is that the negative coefficients of glasso visually degrade the representation of the images.

4.2. Recognition on frontal faces

The ORL database consists of 400 face images of 40 people (10 samples per person). The 92 × 112 images were captured at different times and exhibit variations in expression (open or closed eyes, smiling or non-smiling) and facial details (glasses or no glasses).

Figure 2. Visualization of LSR on the sampled ORL dataset. The test image y belongs to the first subject.

Figure 3. Visualization of glasso on the sampled ORL dataset. The test image y belongs to the first subject.

To compute the recognition rates, the images are downsampled to 48-, 99-, 220-, and 644-dimensional feature vectors. For each subject, 6 images are randomly selected for training and the rest are used for testing.

Figure 5. Comparison of face recognition accuracy rates on the ORL database.

Figure 5 shows the face recognition results on the ORL database for the six classification methods. All six algorithms perform well, since the ORL database contains almost exclusively frontal faces with little pose or illumination variation. The proposed NNCW achieves the best recognition accuracy rate of 96.53%, compared with 96.04% for glasso, 95.47% for SRC, 93.64% for SVM, 94.38% for NB, and 93.44% for NN.

4.3. Recognition with illumination variations

The Extended Yale B database consists of 2414 frontal-face images of 38 persons. The cropped face images were captured under various illumination conditions [14]. The illumination type is determined uniquely by the azimuth and elevation of the light source, where the azimuth ranges from -130° to 130° and the elevation from -40° to 90°. First, we randomly select half of the images (about 32 per individual) for training and use the other half for testing, as in [21]. The images are downsampled to 30, 56, 120, and 504 feature dimensions, corresponding to downsampling ratios of 1/32, 1/24, 1/16, and 1/8, respectively. From Figure 6 we can see that NNCW improves the best recognition accuracy rate to 95.29%, compared with 80.63% for NN, 87.25% for NB, 85.54% for SVM, 94.3% for SRC, and 94.58% for glasso.
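As a protocol note, the sketch below shows one way (our reading; the exact interpolation used in the paper is unspecified) to produce the downsampled feature vectors used throughout these experiments. For instance, a 92 × 112 ORL image at ratio 1/4 becomes a 23 × 28 = 644-dimensional vector.

```python
# Downsample an image by a given per-axis ratio and flatten it into a
# feature vector; the interpolation method here is an assumption.
import numpy as np
from PIL import Image

def downsample_features(img_path, ratio):
    img = Image.open(img_path).convert("L")   # load as grayscale
    w, h = img.size
    small = img.resize((max(1, round(w * ratio)), max(1, round(h * ratio))))
    return np.asarray(small, dtype=float).ravel()

# e.g. downsample_features("orl_s1_1.pgm", 1/4) -> 644-dimensional vector
# (the file name is hypothetical).
```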

Figure 4. Visualization of NNCW on the sampled ORL dataset: (a) the first stage of NNCW; (b) the second stage of NNCW. The test image y belongs to the first subject.

Figure 6. Comparison of face recognition accuracy rates on the Extended Yale B face database.

Second, we divide the images with 504 feature dimensions into five subsets of increasing azimuthal lighting angle: the frontally illuminated images are used as the training set, and test sets 1, 2, 3, and 4 contain face images under illumination conditions whose lighting angle varies from 5° to 15°, from 20° to 45°, from 50° to 75°, and from 75° to 130°, respectively. Figure 7 shows the face recognition accuracy rates under the different illumination conditions. The recognition accuracy rates decline as the lighting angle increases, which indicates that illumination variations hurt face recognition performance, especially for the nearest neighbor (NN) classifier. Our NNCW achieves recognition accuracy rates between 87.22% and 98.75%, much better than the other methods.

Figure 7. Comparison of face recognition accuracy rates under different illuminations on the Extended Yale B database.

4.4. Recognition with occlusions

The AR face database comprises over 4,000 color images of the faces of 126 people (70 male and 56 female). The dataset includes frontal-view faces with different facial expressions, illumination conditions, and occlusions (sunglasses and scarf). Each person participated in two sessions separated by two weeks (14 days), for a total of 26 pictures per person. In this experiment, as in [21], we first chose a subset of the database consisting of 50 male and 50 female individuals. For each individual, the 13 images from Session 1 were used for training and the 13 images from Session 2 for testing. The images are first cropped and converted into grayscale, and then downsampled to 30-, 54-, 130-, and 540-dimensional feature spaces, with downsampling

ratios of 1/24, 1/18, 1/12, and 1/6, respectively. Figure 8 shows the recognition accuracy rates for this experiment. NNCW achieves a recognition accuracy rate of 90.5% with 540-dimensional features, higher than NN and the other methods, e.g., 83.32% for NB, 84.8% for SVM, 88.33% for SRC, and 88.98% for glasso.

Figure 8. Comparison of face recognition accuracy rates on the AR database.

Moreover, we test the classification performance under different occlusions on a subset of the AR face database (70 male and 55 female individuals; women-027 is excluded because of a corrupted image, w bmp). We use 1750 unoccluded frontal face images (14 per individual) as the training set. Test set 1, with sunglasses occlusion, contains 750 images (6 per individual), and test set 2, with scarf occlusion, also consists of 750 images (6 per individual). Table 1 lists the face recognition accuracy rates under both occlusion types for the six methods. NNCW achieves the best recognition accuracy rates under both occlusion conditions, although with scarf occlusion the overall accuracy rate is not very high.

        Sunglasses   Scarves
NN      69.87%       41.2%
NB      75.39%       40.66%
SVM     81.33%       45.48%
SRC     86.28%       59.2%
glasso  86.93%       61.37%
NNCW    88.44%       62.9%

Table 1. Comparison of face recognition accuracy rates with different occlusions (sunglasses and scarves) on the AR database.

4.5. Recognition under different poses

In this experiment we evaluate the six methods under different poses using the CMU PIE face database. We use the frontal faces (c27) as training data and four test sets with increasing pose angle: test set 1 (c05, c29), test set 2 (c11, c37), test set 3 (c02, c14), and test set 4 (c22, c34).

Figure 9. Comparison of face recognition accuracy rates on the CMU PIE database.

Figure 9 shows the face recognition results of the six methods. Test set 1 contains the most nearly frontal images, so its recognition accuracy rates are the best. The results on test sets 2 and 3 are worse because the pose variations are larger than those for test set 1. Test set 4 consists almost entirely of profile faces, which yields the worst recognition accuracy rates. The proposed NNCW method still achieves better performance than the others.

4.6. Image classification

The Caltech 101 image database contains 9,144 images from 101 object categories, collected from Google image search by Li et al. [15]. Most objects are centered and in the foreground. To make the comparison robust, we selected the 50 categories containing the most sample images, ranging from 60 to 800 images per category. We then randomly chose 50 images per category for training and used the remaining images for testing. The images were downsampled to 100 (10 × 10), 225 (15 × 15), 400 (20 × 20), and 625 (25 × 25)-dimensional feature vectors. From Figure 10, we observe that NNCW outperforms the other methods for image classification on the Caltech 101 image database. The classification accuracies on Caltech 101 are lower than those on the face databases, since general image classification is more complicated and challenging.

4.7. Exploring sparseness

To verify the sparsity of the proposed NNCW method, we also investigate the sparsity ratio of the estimated $\hat{b}$, defined as
$$\text{Sparsity ratio} = \frac{\text{number of zeros in } \hat{b}}{\text{number of elements in } \hat{b}}.$$
Table 2 lists the average sparsity ratio of the estimated coefficients $\hat{b}$ for the different databases with SRC, glasso, and NNCW. As can be seen, all the sparsity ratios are larger than 0.5 for SRC and larger than 0.6 for glasso and NNCW. This indicates that the sparse representation indeed takes effect in the classification task and that NNCW generally achieves better sparsity than SRC and glasso.
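The sparsity ratio just defined is straightforward to compute; a minimal helper (the tolerance for treating an entry as zero is our choice) is:

```python
# Fraction of (numerically) zero entries in an estimated coefficient vector.
import numpy as np

def sparsity_ratio(b_hat, tol=1e-8):
    b_hat = np.asarray(b_hat, dtype=float)
    return np.count_nonzero(np.abs(b_hat) <= tol) / b_hat.size

print(sparsity_ratio([0.0, 0.58, 0.0, 0.01, 0.0]))   # -> 0.6
```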

Figure 10. Comparison of image classification accuracies on the Caltech 101 image database.

Table 2. Comparison of average sparsity ratio between SRC, glasso and NNCW on the ORL, Extended Yale B, AR, PIE, and Caltech 101 databases.

5. Conclusions and Future Work

This paper proposed a novel sparse nonnegative image representation method, called the nonnegative curds and whey (NNCW). The NNCW method is attractive for its natural sparsity along with its nonnegativity property and discriminating capability. It consists of a set of nonnegative garrote models, which are solved using the numerical approach developed by Breiman [2]. In recent years, more sophisticated approaches to the nonnegative garrote have appeared, such as least angle regression [5] and the pathwise coordinate method [8]; implementing our method via these approaches is an interesting direction for future work.

Acknowledgements

This work is supported by the 973 Program (2009CB32080), the National Natural Science Foundation of China, the National Key Technology R&D Program (2007BAHB0), and the Program for Changjiang Scholars and Innovative Research Team in University (IRT0652, PCSIRT).

References

[1] E. Amaldi and V. Kann. On the approximability of minimizing nonzero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 1998.
[2] L. Breiman. Better subset regression using the nonnegative garrote. Technometrics, 1995.
[3] L. Breiman and J. Friedman. Predicting multivariate responses in multiple linear regression (with discussion). J. R. Statist. Soc. B, 1997.
[4] D. Donoho. For most large underdetermined systems of equations, the minimal l1-norm near-solution approximates the sparsest near-solution. Comm. Pure Appl. Math., 2006.
[5] B. Efron, I. Johnstone, T. Hastie, and R. Tibshirani. Least angle regression. Ann. Statist., 2004.
[6] E. Elhamifar and R. Vidal. Sparse subspace clustering. In CVPR, 2009.
[7] M. Figueiredo. Adaptive sparseness for supervised learning. IEEE TPAMI, 2003.
[8] J. Friedman, T. Hastie, H. Hoefling, and R. Tibshirani. Pathwise coordinate optimization. Ann. Appl. Stat., 2007.
[9] A. Georghiades, P. Belhumeur, and D. Kriegman. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE TPAMI, 2001.
[10] P. Hoyer. Nonnegative matrix factorization with sparseness constraints. JMLR, 2004.
[11] J. Kim and H. Park. Sparse nonnegative matrix factorization for clustering. Technical Report GT-CSE-08-01, Georgia Institute of Technology, 2008.
[12] D. Lee and H. Seung. Learning the parts of objects by nonnegative matrix factorization. Nature, 1999.
[13] D. Lee and H. Seung. Algorithms for non-negative matrix factorization. In NIPS, 2001.
[14] K. Lee, J. Ho, and D. Kriegman. Acquiring linear subspaces for face recognition under variable lighting. IEEE TPAMI, 2005.
[15] F. Li, R. Fergus, and P. Perona. Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. In IEEE CVPR 2004, Workshop on Generative-Model Based Vision, 2004.
[16] A. Martinez and R. Benavente. The AR face database. Technical Report 24, CVC, 1998.
[17] F. Samaria and A. Harter. Parameterisation of a stochastic model for human face identification. In 2nd IEEE Workshop on Applications of Computer Vision, 1994.
[18] T. Sim, S. Baker, and M. Bsat. The CMU pose, illumination, and expression database. IEEE TPAMI, 2003.
[19] R. Tibshirani. Regression shrinkage and selection via the lasso. J. R. Statist. Soc. B, 1996.
[20] C. Wang, S. Yan, L. Zhang, and H. J. Zhang. Multi-label sparse coding for automatic image annotation. In CVPR, 2009.
[21] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE TPAMI, 2009.
[22] M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables. J. R. Statist. Soc. B, 2006.


More information

15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review .6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

More information

What is the Right Illumination Normalization for Face Recognition?

What is the Right Illumination Normalization for Face Recognition? What is the Right Illumination Normalization for Face Recognition? Aishat Mahmoud Dan-ali Department of Computer Science and Engineering The American University in Cairo AUC Avenue, P.O. Box 74, New Cairo

More information

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents

More information

Sparse Nonnegative Matrix Factorization for Clustering

Sparse Nonnegative Matrix Factorization for Clustering Sparse Nonnegative Matrix Factorization for Clustering Jingu Kim and Haesun Park College of Computing Georgia Institute of Technology 266 Ferst Drive, Atlanta, GA 30332, USA {jingu, hpark}@cc.gatech.edu

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal

Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Big Data Techniques Applied to Very Short-term Wind Power Forecasting

Big Data Techniques Applied to Very Short-term Wind Power Forecasting Big Data Techniques Applied to Very Short-term Wind Power Forecasting Ricardo Bessa Senior Researcher (ricardo.j.bessa@inesctec.pt) Center for Power and Energy Systems, INESC TEC, Portugal Joint work with

More information

An Initial Study on High-Dimensional Data Visualization Through Subspace Clustering

An Initial Study on High-Dimensional Data Visualization Through Subspace Clustering An Initial Study on High-Dimensional Data Visualization Through Subspace Clustering A. Barbosa, F. Sadlo and L. G. Nonato ICMC Universidade de São Paulo, São Carlos, Brazil IWR Heidelberg University, Heidelberg,

More information

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform

More information

Steven C.H. Hoi. School of Computer Engineering Nanyang Technological University Singapore

Steven C.H. Hoi. School of Computer Engineering Nanyang Technological University Singapore Steven C.H. Hoi School of Computer Engineering Nanyang Technological University Singapore Acknowledgments: Peilin Zhao, Jialei Wang, Hao Xia, Jing Lu, Rong Jin, Pengcheng Wu, Dayong Wang, etc. 2 Agenda

More information

Distributed forests for MapReduce-based machine learning

Distributed forests for MapReduce-based machine learning Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

More information

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS. + + x 2. x n. a 11 a 12 a 1n b 1 a 21 a 22 a 2n b 2 a 31 a 32 a 3n b 3. a m1 a m2 a mn b m MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

More information