CATEGORIZATION OF SIMILAR OBJECTS USING BAG OF VISUAL WORDS AND k NEAREST NEIGHBOUR CLASSIFIER
TECHNICAL SCIENCES
Abbrev.: Techn. Sc., No 15(2), Y 2012

Piotr Artiemjew, Przemysław Górecki, Krzysztof Sopyła
Chair of Mathematical Methods in Computer Science
University of Warmia and Mazury, Olsztyn

K e y w o r d s: image categorization, k Nearest Neighbour classifier, Bag of Visual Words.

Abstract

Image categorization is one of the fundamental tasks in computer vision, with wide applications in artificial intelligence, robotic vision and many other areas. Computer vision poses many difficulties to overcome, and one of them appears during image recognition and classification. The difficulty arises from image variance, which may be caused by scaling, rotation, changes in perspective, illumination levels, or partial occlusions. For these reasons, the main task is to represent images in such a way that they can be recognized even if they have been modified. The Bag of Visual Words (BoVW) approach, which describes local characteristic features of images, has recently gained much attention in the computer vision community. In this article we present the results of image classification using BoVW and the k Nearest Neighbour classifier with different kinds of metrics and similarity measures. Additionally, the results of k NN classification are compared with those obtained from a Support Vector Machine classifier.

Introduction

One of the most popular supervised classification methods is based on searching for the k Nearest Neighbours (k NN) of an object using a fixed similarity measure or metric. In classification by means of the k NN method, the main problems are to identify the best method for computing similarity between objects and to find an optimal number of neighbours k. It is necessary to identify the measure which works best, but it is obvious that the best method may differ for different data.
In this article we investigate the problem of selecting a proper measure and the parameter k in the domain of images represented by means of Bag of Visual Words (BoVW).
Our methodology involves using the SIFT (Scale-Invariant Feature Transform) (LOWE 2004) feature extractor to obtain a collection of keypoints from the considered images. Successively, the k-means clustering method is used for quantizing the keypoints into visual words (dictionary construction), which allows us to represent the images by means of the frequencies of the visual words present in the image. In the classification process we use different kinds of metrics and similarity measures: the Chi-square, Euclidean, Canberra, Manhattan, Normalized and Chebyshev distances, and similarity measures such as the Cosine measure or the Pearson product-moment correlation coefficient (see DEZA E., DEZA M. 2009). Our results are evaluated by the Leave One Out (LOO) method and compared with the results of our recent experiments obtained by applying different kernel functions in a Support Vector Machine classifier (GÓRECKI et al. 2012c).
As concerns image representation, BoVW is employed, which is well known in generic visual categorization (CSURKA et al. 2004, DESELAERS et al. 2008, HERBRICH et al. 2000) and subcategorization (GÓRECKI et al. 2012b). In particular, an image is represented using frequencies of distinctive image patches (visual words), obtained in a two-step process. In the first step, a keypoint detector is employed to identify local interest points in a dataset of different images. Successively, the keypoints are quantized into visual words, so that one visual word corresponds to a number of visually similar keypoints. In most cases, k-means clustering is used to carry out the quantization, so that each centroid represents a visual word, and a set of visual words is called a visual dictionary. By counting the visual words in a particular image, a feature vector encoding the frequencies of visual words is obtained.
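The dictionary-construction step described above can be sketched as follows. This is a minimal, illustrative k-means (Lloyd's algorithm) in plain Python; the toy two-dimensional "descriptors" and the deterministic initialization are assumptions made for the example (real SIFT descriptors are 128-dimensional, and the paper does not specify these details).

```python
def kmeans(points, k, iters=20):
    """Lloyd's algorithm: quantize keypoint descriptors into k visual words.
    Toy deterministic initialization: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        # Assignment step: attach each descriptor to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # Empty clusters are left unchanged here; in the paper's experiments
        # empty clusters are discarded from the dictionary.
        for i, cl in enumerate(clusters):
            if cl:
                centroids[i] = [sum(col) / len(cl) for col in zip(*cl)]
    return centroids

# Two well-separated groups of toy 2-D "descriptors".
points = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2),
          (5.0, 5.0), (5.2, 5.0), (5.0, 5.2)]
words = kmeans(points, 2)  # one centroid per group
```

Each returned centroid plays the role of one visual word of the dictionary.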
Given the feature vector, an image can be further classified into a predefined set of categories using a supervised machine learning algorithm, such as k NN or SVM.

Methodology

There are two main issues in BoVW image categorization. The first one is the choice of the keypoint detector/descriptor. Many descriptors have been proposed in the literature, such as SIFT (LOWE 2004), SURF (BAY et al. 2006), and more recently BRISK (LEUTENEGGER et al. 2011) and FREAK (ALAHI et al. 2012). Their common feature is robustness to changes in image rotation, scale, perspective and illumination. A comprehensive survey of keypoint detectors can be found in (TUYTELAARS, MIKOLAJCZYK 2008). Another important
aspect is the choice of the classifier. In this paper the SIFT detector and the k NN classifier were chosen. Our process of image categorization consists of the typical steps (CSURKA et al. 2004):
1. Identification of image keypoints - SIFT keypoints were detected for all images in the dataset.
2. Building the visual dictionary - all keypoints identified in the previous step were clustered into K visual words using the K-means algorithm. During the experimental session we use dictionaries of different sizes.
3. Building the image representation - for each image, the keypoints are mapped to the most similar visual words and the image feature vector v = (v_1, ..., v_K) is obtained, where v_i encodes the frequency of the i-th visual word.
4. Classification - we use the k NN classifier and the LOO method to evaluate the effectiveness of the classification; the details are given in the classification section.

Data

In our experiments we have clustered the keypoints into different numbers of visual words. In every case the empty clusters are discarded, therefore the number of obtained visual words (attributes) can be smaller than the number of clusters considered originally. Our datasets contain the following numbers of conditional attributes (visual words): 50, 100, 250, 499, 983, 2442, 4736, where the original numbers of considered visual words were 50, 100, 250, 500, 1000, 2500, 5000.

Fig. 1. Exemplary shoes from the five distinctive classes of the examined dataset
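Step 3 above, mapping each keypoint to its most similar visual word and counting the frequencies, can be sketched as follows. The centroids and descriptors here are hypothetical toy values standing in for a real k-means dictionary and SIFT output.

```python
from collections import Counter

def nearest_word(descriptor, centroids):
    """Index of the visual word (centroid) closest to a keypoint descriptor."""
    dists = [sum((d - c) ** 2 for d, c in zip(descriptor, centroid))
             for centroid in centroids]
    return dists.index(min(dists))

def bovw_vector(descriptors, centroids):
    """Frequency vector v = (v_1, ..., v_K) over the visual dictionary."""
    counts = Counter(nearest_word(d, centroids) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(centroids))]

# Toy dictionary of K = 3 visual words and 4 keypoint descriptors.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.1), (0.9, 0.1), (0.95, -0.05), (0.05, 1.1)]
print(bovw_vector(descriptors, centroids))  # → [1, 2, 1]
```

The resulting vector is the image representation that is later fed to the classifier.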
Fig. 2. An exemplary set of keypoints

Classification

In this article we use a classic way of classification based on the k NN methodology. We search for the neighbours of a test object in the whole dataset and assign the majority class to the test object, where ties are resolved hierarchically. Having obtained all training objects {y_i}, we classify the test object x in the following way:
(i) We compute the distance between objects based on the chosen similarity measure or metric, that is g(x, y_i), where g is a metric (d) or a similarity measure (p).
(ii) For the fixed test object, we choose the k nearest training objects.
(iii) The most numerous class assigns the decision to the test object. In the case of a draw, we choose the last conflicted class.

Similarity measures and metrics

It is really important to identify the metric or similarity measure which works best for the considered data. For our k NN classifier we compute the distance between objects according to the following functions. The first one, d : X × X → [0, ∞), fulfils the conditions
(i) d(x, y) = 0 ⇔ x = y
(ii) d(x, y) = d(y, x)
(iii) d(x, y) ≤ d(x, z) + d(z, y)
which define a metric, and the second one, p : X × X → [0, 1], with
(i) p(x, y) = 1 ⇔ x = y
(ii) p(x, y) = p(y, x)
(iii) p(x, y) ∈ [0, 1]
is a similarity measure. The Cosine measure applied in this article gives similarity values from the range [-1, 1], which is an exception to the above definition.
One of the most popular is the Euclidean metric (DEZA E., DEZA M. 2009), defined in the following way:
d(x, y) = √(Σ_i (x_i - y_i)²)
The Cosine measure (DEZA E., DEZA M. 2009) works as follows:
p(x, y) = (x · y) / (|x| · |y|)
where the scalar product is defined as
x · y = Σ_i x_i · y_i
and the lengths of the vectors are
|x| = √(Σ_i x_i²); |y| = √(Σ_i y_i²)
One of the simplest is the Manhattan metric (DEZA E., DEZA M. 2009), defined below:
d(x, y) = Σ_i |x_i - y_i|
The normalized distance between objects based on division by the sum of attribute values is called the Canberra metric (DEZA E., DEZA M. 2009):
d(x, y) = Σ_i |x_i - y_i| / (x_i + y_i)
A modification of the Canberra metric is the Normalized metric, normalized by the maximal and minimal values of the attributes in their domains:
d(x, y) = Σ_i |x_i - y_i| / (max_i - min_i)
An interesting metric, commonly used as a kernel of the Support Vector Machine, is the Chi-square metric, defined in the following way:
d(x, y) = Σ_i (x_i - y_i)² / (x_i + y_i)
Another metric, which determines the distance between objects as the maximal distance between attributes of the objects, is the Chebyshev distance (DEZA E., DEZA M. 2009):
d(x, y) = max_i |x_i - y_i|, i = 1, 2, ..., n
The Pearson product-moment correlation coefficient, which measures the linear correlation of objects, is defined as follows:
p(x, y) = r_{x,y}
r_{x,y} = Σ_i (x_i - x̄)(y_i - ȳ) / √(Σ_i (x_i - x̄)² · Σ_i (y_i - ȳ)²)
x̄ = (1/n) Σ_i x_i, ȳ = (1/n) Σ_i y_i

Data preprocessing

The scalar length of the feature vectors extracted from the dataset could disturb classification in the case of data with a large number of visual words. For this reason we have normalized the feature vectors by scaling their length to unit length, which is done by dividing all of an object's attributes by the scalar length of the object. It means that for all x ∈ U and for all a ∈ A, we perform the following operation:
a(x) := a(x) / |x|, where |x| = √(Σ_i x_i²)
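The metrics and measures above, together with the unit-length normalization, can be written directly from their definitions. The following plain-Python sketch is illustrative only, not the authors' implementation; zero-sum attribute pairs are simply skipped in the Canberra and Chi-square sums, which is one common convention.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def canberra(x, y):
    # Terms with a + b == 0 are skipped to avoid division by zero.
    return sum(abs(a - b) / (a + b) for a, b in zip(x, y) if a + b != 0)

def chi_square(x, y):
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b != 0)

def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def unit_normalize(x):
    """Scale a feature vector to unit length, as in the preprocessing step."""
    length = math.sqrt(sum(a * a for a in x))
    return [a / length for a in x]

x, y = [1.0, 2.0, 3.0], [2.0, 2.0, 4.0]
print(round(euclidean(x, y), 3))  # → 1.414
print(chebyshev(x, y))            # → 1.0
```

Any of these functions can then be plugged into the k NN classifier as the distance g(x, y_i).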
Error evaluation

In this article we have evaluated the classification quality by using the standard Leave One Out method, where for the n objects of the decision system we create n tests; each time a different object is treated as the test set and the rest of the objects are the training set. The effectiveness of the Leave One Out method was investigated, among others, in (MOLINARO et al. 2005). The LOO method is an almost unbiased classification error estimator, hence the evaluation of classification quality is close to the real result.

Results of Experiments

The main goal of our experimental session was to identify the best similarity measure or metric and the best classification parameters for the prepared dataset with use of the global k NN method. Our dataset consists of 200 objects (images), which represent five categories of shoes; the cardinalities of the decision classes are 59, 20, 34, 29, and 58 images respectively. Exemplary images of each class and exemplary keypoints of a selected image are shown in Figure 1 and Figure 2.
After data normalization, the classification results for the Cosine measure and the Euclidean metric are the same, because the distance between objects in the Cosine measure reduces to the computation of a scalar product of objects, which corresponds to the Euclidean distance between the normalized objects. The Cosine measure gives the same result before and after normalization, because the applied normalization is an internal part of this measure. In Table 1 we can see the best results of classification for normalized and non-normalized data and the considered dictionary sizes.
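The global k NN classification (steps i-iii) combined with the Leave One Out protocol described above can be sketched as follows. The toy data and the simple tie-breaking (the first-encountered class wins) are assumptions made for the example; the paper resolves ties hierarchically.

```python
from collections import Counter
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_predict(train, labels, x, k, dist=euclidean):
    """Steps (i)-(iii): rank training objects by distance, vote among the k nearest.
    Ties are resolved by the first-encountered class (a simplification)."""
    order = sorted(range(len(train)), key=lambda i: dist(x, train[i]))[:k]
    votes = Counter(labels[i] for i in order)
    return votes.most_common(1)[0][0]

def leave_one_out(data, labels, k):
    """Each object in turn is the test set; the remaining objects train the model."""
    hits = 0
    for i in range(len(data)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += knn_predict(train, train_labels, data[i], k) == labels[i]
    return hits / len(data)

# Toy feature vectors from two well-separated classes.
data = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],
        [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
labels = ["A", "A", "A", "B", "B", "B"]
print(leave_one_out(data, labels, k=1))  # → 1.0
```

Replacing the `dist` argument with any of the metrics defined earlier reproduces the experimental setup of varying the distance function and the parameter k.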
Table 1. Leave One Out; the result of classification for the best parameter k and measure of distance between objects, before and after normalization (chi = Chi-square metric, euk = Euclidean metric, man = Manhattan metric, cos = Cosine measure, pea = Pearson product-moment correlation coefficient)
Before norm., best measure: chi euk pea euk pea pea pea pea
After norm., best measure: chi man pea cos pea man pea pea pea cos pea euk euk
For a smaller number of visual words, in the range of 50 and 100, the classification
is better without normalization. The reason is that the number of visual words, not their types, plays the dominant role in distinguishing the decision classes. The optimal value of the parameter k is in the set {1, 2, 3}; the most effective measure turns out to be the Pearson product-moment correlation coefficient. The normalization does not have any influence on the Pearson product-moment correlation coefficient, since the linear correlation between the objects is maintained, so the best result for a higher number of visual words is the same regardless of the normalization method. What is interesting, the best metric for 50 visual words, before and after normalization, is the Chi-square metric, and additionally the Manhattan metric after normalization. For 100 words, the Euclidean metric is the best (both before and after normalization), and Pearson's measure performs equally well if normalization is applied. For a higher number of words Pearson's measure is the best, but in a few cases the Euclidean and Manhattan metrics have accuracy equal to Pearson's measure.
In Table 2 we have shown the best results and parameters for all the metrics, for data with different numbers of visual words. It turns out that we achieve the best results before normalization for the Chi-square, Normalized, Canberra, and Chebyshev metrics, and after normalization the Manhattan and Euclidean metrics work better. As we mentioned before, Pearson's and Cosine measures work equally well before and after normalization.
Table 2. Leave One Out; the best result for all metrics and the parameter k, before and after normalization
Metrics: Pearson, Chi-square, Manhattan, Cosine, Euclidean, Normalized, Canberra, Chebyshev
Considering the best results of classification, we made the assumption that the best parameter k has the values 1, 2 or 3, and for this reason, in Figures 3, 4, 5 and 6, we present the average of the classification results for these three parameters. In particular, in Figures 3 and 4 we have separate results for all dictionaries and metrics, before and after normalization. These generalized
results lead us to the conclusion that for 50 and 100 words all metrics work better before normalization, except for the Manhattan and Normalized metrics in the case of 100 words. For a higher number of words, we obtain the best results for all metrics and measures after normalization, except for the Chebyshev metric for 250 words.
Fig. 3. Leave One Out; k NN; average result of accuracy for k = 1, 2, 3 and visual words
In Figures 5 and 6, we have the results for non-normalized and normalized data respectively. In the plots we have the results for all metrics vs all dictionaries (the data from Figures 3 and 4 shown in a different way). In Figure 7, we have an exemplary detailed result for Pearson's measure with data after normalization. The conclusion is that for 50, 100 and 250 words all the metrics work quite stably before and after normalization. Pearson's and the Cosine measure work optimally for all the dictionaries. In the case of the Chebyshev metric after normalization, the result of classification is more consistent for all the dictionaries. The Euclidean metric works better after normalization. The results before and after normalization for the rest of the metrics are comparable.
In addition to our results, we have compared the results of an SVM classifier (CHAPELLE et al. 1997, FAN et al. 2005) with different kernel functions
Fig. 4. Leave One Out; k NN; average result of accuracy for k = 1, 2, 3 and visual words
Fig. 5. Leave One Out; k NN; for data before normalization; average result of accuracy for k = 1, 2, 3 and visual words
Fig. 6. Leave One Out; k NN; for data after normalization; average result of accuracy for k = 1, 2, 3 and visual words
Fig. 7. Leave One Out; k NN; for Pearson's measure with data after normalization; the result of accuracy for k = 1, 2, ..., 10 and visual words
(see Tab. 3) (GÓRECKI et al. 2012c) with the global k NN classifier using different similarity measures and metrics. We can summarize that for a number of visual words in the range of 50, 100, 2500 and 5000, the SVM results are comparable with the k NN method (and the k NN is even better for a smaller number of words, in the range of 50 and 100), but in the range of 250, 500 and 1000 visual words, the SVM classifier wins by about four percent of classification accuracy.
Table 3. Cross Validation 5; overall accuracy of Support Vector Machine kernels in relation to visual dictionary size
Kernels: Linear, Chi^2, Histogram, RBF, Cauchy

Conclusions

The results of these experiments show an interesting dependence between the quality of classification by means of the global k NN method, the number of visual words, and the normalization method. It has turned out that for a smaller number of visual words, in the range of 50 and 100, we get better results for non-normalized data. In the case of 50 visual words the best is the Chi-square metric, for 100 visual words the best is the Euclidean metric, and starting from 250 words the best similarity measures for the considered data turn out to be the Pearson product-moment correlation coefficient and the Cosine measure. For a higher number of visual words we achieve better results after normalization. The optimal parameter k for the global k NN classifier and the considered dataset is in the set {1, 2, 3}. In the future we are planning to check the effectiveness of other methods of classification on other publicly available datasets. Another goal is to apply other keypoint detectors in our research, so that their effectiveness can be compared with our current results.
Acknowledgements

The research has been supported by grant N N from the National Science Center of the Republic of Poland and a grant from the Ministry of Science and Higher Education of the Republic of Poland.

Translated by MATT PURLAND
Accepted for print

References

ALAHI A., ORTIZ R., VANDERGHEYNST P. 2012. FREAK: Fast Retina Keypoint. IEEE Conference on Computer Vision and Pattern Recognition, Providence, Rhode Island, USA.
BAY H., TUYTELAARS T., VAN GOOL L. 2006. SURF: Speeded Up Robust Features. ECCV.
CHAPELLE O., HAFFNER P., VAPNIK V.N. 1997. Support vector machines for histogram-based image classification. IEEE Transactions on Neural Networks, 10(5).
CSURKA G., DANCE CH.R., FAN L., WILLAMOWSKI J., BRAY C. 2004. Visual categorization with bags of keypoints. Workshop on Statistical Learning in Computer Vision, ECCV.
DESELAERS T., PIMENIDIS L., NEY H. 2008. Bag-of-visual-words models for adult image classification and filtering. ICPR, p. 1-4.
DEZA E., DEZA M. 2009. Encyclopedia of Distances. Springer.
FAN R., CHEN P., LIN CH. 2005. Working set selection using second order information for training SVM. Journal of Machine Learning Research, 6.
GÓRECKI P., ARTIEMJEW P., DROZDA P., SOPYŁA K. 2012a. Categorization of Similar Objects using Bag of Visual Words and Support Vector Machines. Fourth International Conference on Agents and Artificial Intelligence. IEEE.
GÓRECKI P., SOPYŁA K., DROZDA P. 2012b. Ranking by k-means voting algorithm for similar image retrieval. In: ICAISC, Lecture Notes in Computer Science 7267, eds. L. Rutkowski, M. Korytkowski, R. Scherer, R. Tadeusiewicz, L.A. Zadeh, J.M. Zurada. Springer.
GÓRECKI P., SOPYŁA K., DROZDA P. 2012c. Different SVM Kernels for Bag of Visual Words. In: International Congress of Young Scientists, SMI 2012. Springer.
HERBRICH R., GRAEPEL T., OBERMAYER K. 2000. Large margin rank boundaries for ordinal regression. MIT Press, Cambridge, MA.
LEUTENEGGER S., CHLI M., SIEGWART R. 2011. BRISK: Binary Robust Invariant Scalable Keypoints. Proceedings of the IEEE International Conference on Computer Vision (ICCV).
LEWIS D.D. 1998. Naive (Bayes) at forty: The independence assumption in information retrieval. Springer Verlag.
LOWE D.G. 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 60.
MAK M.W., GUO J., KUNG S.-Y. 2008. PairProSVM: Protein subcellular localization based on local pairwise profile alignment and SVM. IEEE/ACM Trans. Comput. Biol. Bioinformatics, 5(3).
MIKOLAJCZYK K., LEIBE B., SCHIELE B. 2005. Local Features for Object Class Recognition. Tenth IEEE International Conference on Computer Vision (ICCV '05), 1.
MOLINARO A.M., SIMON R., PFEIFFER R.M. 2005. Prediction error estimation: a comparison of resampling methods. Bioinformatics, 21(15).
NISTÉR D., STEWÉNIUS H. 2006. Scalable recognition with a vocabulary tree. In: CVPR.
THORSTEN J. 1998. Text categorization with support vector machines: learning with many relevant features. Proceedings of ECML-98, 10th European Conference on Machine Learning, 1398. Eds. C. Nédellec, C. Rouveirol. Springer Verlag, Heidelberg, DE.
TUYTELAARS T., MIKOLAJCZYK K. 2008. Local invariant feature detectors: a survey. Found. Trends Comput. Graph. Vis., 3.
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
IMAGE PROCESSING BASED APPROACH TO FOOD BALANCE ANALYSIS FOR PERSONAL FOOD LOGGING
IMAGE PROCESSING BASED APPROACH TO FOOD BALANCE ANALYSIS FOR PERSONAL FOOD LOGGING Keigo Kitamura, Chaminda de Silva, Toshihiko Yamasaki, Kiyoharu Aizawa Department of Information and Communication Engineering
Predict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, [email protected] Department of Electrical Engineering, Stanford University Abstract Given two persons
203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19
PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
Object Recognition. Selim Aksoy. Bilkent University [email protected]
Image Classification and Object Recognition Selim Aksoy Department of Computer Engineering Bilkent University [email protected] Image classification Image (scene) classification is a fundamental
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms
A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms H. Kandil and A. Atwan Information Technology Department, Faculty of Computer and Information Sciences, Mansoura University,El-Gomhoria
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent
A Logistic Regression Approach to Ad Click Prediction
A Logistic Regression Approach to Ad Click Prediction Gouthami Kondakindi [email protected] Satakshi Rana [email protected] Aswin Rajkumar [email protected] Sai Kaushik Ponnekanti [email protected] Vinit Parakh
Support Vector Machines with Clustering for Training with Very Large Datasets
Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France [email protected] Massimiliano
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
Module II: Multimedia Data Mining
ALMA MATER STUDIORUM - UNIVERSITÀ DI BOLOGNA Module II: Multimedia Data Mining Laurea Magistrale in Ingegneria Informatica University of Bologna Multimedia Data Retrieval Home page: http://www-db.disi.unibo.it/courses/dm/
BRIEF: Binary Robust Independent Elementary Features
BRIEF: Binary Robust Independent Elementary Features Michael Calonder, Vincent Lepetit, Christoph Strecha, and Pascal Fua CVLab, EPFL, Lausanne, Switzerland e-mail: [email protected] Abstract.
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
Biometric Authentication using Online Signatures
Biometric Authentication using Online Signatures Alisher Kholmatov and Berrin Yanikoglu [email protected], [email protected] http://fens.sabanciuniv.edu Sabanci University, Tuzla, Istanbul,
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II
! E6893 Big Data Analytics Lecture 5:! Big Data Analytics Algorithms -- II Ching-Yung Lin, Ph.D. Adjunct Professor, Dept. of Electrical Engineering and Computer Science Mgr., Dept. of Network Science and
Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j
Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet
Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
Introduction to Support Vector Machines. Colin Campbell, Bristol University
Introduction to Support Vector Machines Colin Campbell, Bristol University 1 Outline of talk. Part 1. An Introduction to SVMs 1.1. SVMs for binary classification. 1.2. Soft margins and multi-class classification.
Surgical Tools Recognition and Pupil Segmentation for Cataract Surgical Process Modeling
Surgical Tools Recognition and Pupil Segmentation for Cataract Surgical Process Modeling David Bouget, Florent Lalys, Pierre Jannin To cite this version: David Bouget, Florent Lalys, Pierre Jannin. Surgical
How To Identify A Churner
2012 45th Hawaii International Conference on System Sciences A New Ensemble Model for Efficient Churn Prediction in Mobile Telecommunication Namhyoung Kim, Jaewook Lee Department of Industrial and Management
Tracking Groups of Pedestrians in Video Sequences
Tracking Groups of Pedestrians in Video Sequences Jorge S. Marques Pedro M. Jorge Arnaldo J. Abrantes J. M. Lemos IST / ISR ISEL / IST ISEL INESC-ID / IST Lisbon, Portugal Lisbon, Portugal Lisbon, Portugal
Clustering. Data Mining. Abraham Otero. Data Mining. Agenda
Clustering 1/46 Agenda Introduction Distance K-nearest neighbors Hierarchical clustering Quick reference 2/46 1 Introduction It seems logical that in a new situation we should act in a similar way as in
Cross-Validation. Synonyms Rotation estimation
Comp. by: BVijayalakshmiGalleys0000875816 Date:6/11/08 Time:19:52:53 Stage:First Proof C PAYAM REFAEILZADEH, LEI TANG, HUAN LIU Arizona State University Synonyms Rotation estimation Definition is a statistical
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
Active Learning SVM for Blogs recommendation
Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the
Open-Set Face Recognition-based Visitor Interface System
Open-Set Face Recognition-based Visitor Interface System Hazım K. Ekenel, Lorant Szasz-Toth, and Rainer Stiefelhagen Computer Science Department, Universität Karlsruhe (TH) Am Fasanengarten 5, Karlsruhe
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES
BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 123 CHAPTER 7 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES 7.1 Introduction Even though using SVM presents
Cees Snoek. Machine. Humans. Multimedia Archives. Euvision Technologies The Netherlands. University of Amsterdam The Netherlands. Tree.
Visual search: what's next? Cees Snoek University of Amsterdam The Netherlands Euvision Technologies The Netherlands Problem statement US flag Tree Aircraft Humans Dog Smoking Building Basketball Table
Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall
Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin
Machine Learning. 01 - Introduction
Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge
Clustering Technique in Data Mining for Text Documents
Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor
MS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow
, pp.233-237 http://dx.doi.org/10.14257/astl.2014.51.53 A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow Giwoo Kim 1, Hye-Youn Lim 1 and Dae-Seong Kang 1, 1 Department of electronices
The Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL
The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University
Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors
Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
Statistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts
BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an
Self Organizing Maps for Visualization of Categories
Self Organizing Maps for Visualization of Categories Julian Szymański 1 and Włodzisław Duch 2,3 1 Department of Computer Systems Architecture, Gdańsk University of Technology, Poland, [email protected]
Class-specific Sparse Coding for Learning of Object Representations
Class-specific Sparse Coding for Learning of Object Representations Stephan Hasler, Heiko Wersing, and Edgar Körner Honda Research Institute Europe GmbH Carl-Legien-Str. 30, 63073 Offenbach am Main, Germany
Data, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
Machine learning for algo trading
Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with
Data Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH
1 SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH Y, HONG, N. GAUTAM, S. R. T. KUMARA, A. SURANA, H. GUPTA, S. LEE, V. NARAYANAN, H. THADAKAMALLA The Dept. of Industrial Engineering,
