Weak Hypotheses and Boosting for Generic Object Detection and Recognition

Size: px
Start display at page:

Download "Weak Hypotheses and Boosting for Generic Object Detection and Recognition"

Transcription

1 Weak Hypotheses and Boosting for Generic Object Detection and Recognition A. Opelt 1,2, M. Fussenegger 1,2, A. Pinz 2, and P. Auer 1 1 Institute of Computer Science, 8700 Leoben, Austria {auer,andreas.opelt}@unileoben.ac.at 2 Institute of Electrical Measurement and Measurement Signal Processing, 8010 Graz, Austria {fussenegger,opelt,pinz}@emt.tugraz.at Abstract. In this paper we describe the first stage of a new learning system for object detection and recognition. For our system we propose Boosting [5] as the underlying learning technique. This allows the use of very diverse sets of visual features in the learning process within a common framework: Boosting together with a weak hypotheses finder may choose very inhomogeneous features as most relevant for combination into a final hypothesis. As another advantage the weak hypotheses finder may search the weak hypotheses space without explicit calculation of all available hypotheses, reducing computation time. This contrasts the related work of Agarwal and Roth [1] where Winnow was used as learning algorithm and all weak hypotheses were calculated explicitly. In our first empirical evaluation we use four types of local descriptors: two basic ones consisting of a set of grayvalues and intensity moments and two high level descriptors: moment invariants [8] and SIFTs [12]. The descriptors are calculated from local patches detected by an interest point operator. The weak hypotheses finder selects one of the local patches and one type of local descriptor and efficiently searches for the most discriminative similarity threshold. This differs from other work on Boosting for object recognition where simple rectangular hypotheses [22] or complex classifiers [20] have been used. In relatively simple images, where the objects are prominent, our approach yields results comparable to the state-of-the-art [3]. But we also obtain very good results on more complex images, where the objects are located in arbitrary positions, poses, and scales in the images. These results indicate that our flexible approach, which also allows the inclusion of features from segmented regions and even spatial relationships, leads us a significant step towards generic object recognition. 1 Introduction We believe that a learning component is a necessary part of any generic object recognition system. In this paper we investigate a principle approach for learning objects in still images which allows the use of flexible and extendible T. Pajdla and J. Matas (Eds.): ECCV 2004, LNCS 3022, pp , c Springer-Verlag Berlin Heidelberg 2004

2 72 A. Opelt et al. sets of features for describing objects and object categories. Objects should be recognized even if they occur at abitrary scale, shown from different perspective views on highly textured backgrounds. Our main learning technique relies on Boosting [5]. Boosting is a technique for combining several weak classifiers into a final strong classifier. The weak classifiers are calculated on different weightings of the training examples to emphasize different portions of the training set. Since any classification function can potentially serve as a weak classifier we can use classifiers based on arbitrary and inhomogeneous sets of image features. A further advantage of Boosting is that weak classifiers may be calculated when needed instead of calculating unnecessary hypotheses a priori. In our learning setting, the learning algorithm needs to learn an object category. It is provided with a set of labeled training images, where a positive label indicates that a relevant object appears in the image. The objects are not segmented and pose and location are unknown. As output, the learning algorithm delivers a final classifier which predicts if a relevant object is present in a new image. Having such a classifier, the localization of the object in the image is straightforward. The image analysis transforms images to greyvalues and extracts normalised regions around interest (salient) points to obtain reduced representations of images. As an appropriate representation for the learning procedure we calculate local descriptors of these patches. The result of the training procedure is saved as the final hypothesis which is later used for testing (see figure 1). Images (Labeled) Region Detection Local Descriptors Boosting Labels Preprocessing Grayscale Images Scaled Harris /Laplace Affine Invariant Low Level Moment Inv. AdaBoost SIFT Detector SIFTs Interest Point Final Hypothesis Learning Testing Pred. Labels Calculate Fig. 1. Overview showing the framework for our approach for generic object recognition. The solid arrows show the training cycle, the dotted ones the testing procedure. We describe our general learning approach in detail in section 2. In section 3, we discuss the image analysis steps, including illumination and size normalisation, interest point detection, and the extraction of the local descriptors. An explicit explanation of how we calculate the weak hypotheses used by the Boosting algorithm, is given in section 4. Section 5 contains a description of the setup we used for our experiments. The results are presented and compared with other

3 Weak Hypotheses and Boosting for Generic Object Detection 73 approaches for object recognition. We regard the present system as a first step and further work is outlined in section Related Work Clearly there is an extensive body of literature on object recognition (e.g. [3], [2], [22], [24], [6], [14]). In general, these approaches use image databases which show the object of interest at prominent scales and with only little variation in pose. We discuss only some of the most relevant and most recent results related to our approach. Boosting was successfully used by Viola and Jones [22] as the ingredient for a fast face detector. The weak hypotheses were the thresholded average brightness of collections of up to four rectangular regions. In our approach we experiment with much larger sets of features to be able to perform recognition of a wider class of objects. Schneiderman and Kanade [20] used Boosting to improve an already complex classifier. In contrast, we are using Boosting to combine rather simple classifiers by selecting the most discriminative features. Agarwal and Roth [1] used Winnow as the underlying learning algorithm for the recognition of cars from side views. For this purpose images were represented as binary feature vectors. The bits of such a feature vector can be seen as the outcomes of weak classifiers, one weak classifier for each position in the binary vector. Thus for learning it is required that the outcomes of all weak classifiers are calculated a priori. In contrast, Boosting only needs to find the few weak classifiers which actually appear in the final classifier. This substantially speeds up learning, if the space of weak classifiers carries a structure which allows the efficient search for discriminative weak classifiers. A simple example is a weak classifier which compares a real valued feature against a threshold. For Winnow, one weak classifier needs to be calculated for each possible threshold a priori 1, whereas for Boosting the optimal threshold can be determined efficiently when needed. A different approach to object class recognition was presented by Fergus, Perona, and Zisserman [3]. They used a generative probabilistic model for objects built as constellations of parts. Using an EM-type learning algorithm they achieved very good recognition performance. In our work we have chosen a model-free approach for flexibility. If at all, the sets of weak classifiers we use can be seen as model classes, but with much less structure than in [3]. Furthermore, we propose Boosting as a very different learning algorithm from EM. Dorko and Schmid [2] introduced an approach for constructing and selecting scale-invariant object parts. These parts are subsequently used to learn a classifier. They show a robust detection under scale changes and variations in viewing conditions, but in contrast to our approach, the objects of interest are manually pre-segmented. This dramatically reduces the complexity of distinguishing between relevant patches on the objects and background clutter. 1 More efficient techniques for Winnow like using virtual threshold gates [13] do not improve the situation much.

4 74 A. Opelt et al. 2 Our Learning Model for Object Recognition In our setup, a learning algorithm has to recognize objects from a certain category in still images. For this purpose, the learning algorithm delivers a classifier that predicts whether a given image contains an object from this category or not. As training data, labeled images (I 1,l 1 ),...,(I m,l m ) are provided for the learning algorithm where l k =+1ifI k contains a relevant object and l k = 1 if I k contains no relevant object. Now the learning algorithm delivers a function H : I ˆl which predicts the label of image I. To calculate this classification function H we use the classical AdaBoost algorithm [5]. AdaBoost puts weights w k on the training images and requires the construction of a weak hypothesis h which has some discriminative power relative to these weights, i.e. w k > w k, (1) k:h(i k )=l k k:h(i k ) l k such that more images are correctly classified than misclassified, relative to the weights w k. (Such a hypothesis is called weak since it needs to satisfy only a very weak requirement.) The process of putting weights and constructing a weak hypothesis is iterated for several rounds t = 1,...,T, and the weak hypotheses h t of each round are combined into the final hypothesis H. In each round t the weight w k is decreased if the prediction for I k was correct (h t (I k )=l k ), and increased if the prediction was incorrect. Different to the standard AdaBoost algorithm we vary the factor β t to trade off precision and recall. We set β t = 1 ε ε η if l k = +1 and l k h t (I k ). 1 ε ε else with ε being the error of the weak hypothesis in this round and η as an additional weight factor to control the update of wrongly classified positive examples. Here two general comments are in place. First, it is intuitively quite clear that weak hypotheses with high discriminative power with a large difference of the sums in (1) are preferable, and indeed this is shown in the convergence proof of AdaBoost [5]. Second, the adaptation of the weights w k in each round performs some sort of adaptive decorrelation of the weak hypotheses: if an image was correctly classified in round t, then its weight is decreased and less emphasis is put on this image in the next round, yielding quite different hypotheses h t and h t+1. 2 Thus it can be expected that the first few weak hypotheses characterize the object category under consideration quite well. This is particularly interesting when a sparse representation of the object category is needed. Obviously AdaBoost is a very general learning technique for obtaining classification functions. To adapt it for a specific application, suitable weak hypotheses 2 In fact AdaBoost sets the weights in such a way that h t is not discriminative in respect to the new weights. Thus h t is in some sense oblivious to the predictions of h t+1.

5 Weak Hypotheses and Boosting for Generic Object Detection 75 have to be constructed. For the purpose of object recognition we need to extract suitable features from images and use these features to construct the weak hypotheses. Since AdaBoost is a general learning technique we are free to choose any type of features we like, as long as we are able to provide an effective weak hypotheses finder which returns discriminative weak hypotheses based on this set of features. The chosen features should be able to represent the content of images, at least in respect to the object category under consideration. Since we may choose several types of features, we represent an image I by a set of pairs R (I) ={(τ,v)} where τ denotes the type of a feature and v denotes a value of this feature, typically a vector of reals. Then for AdaBoost a weak hypothesis is constructed from the representations R (I k ), labels l k, and weights w k of the training images. In the next section we describe the types of features we are currently using, although many other features could be used, too. In Section 4 we describe the effective construction of the weak hypotheses. 3 Image Analysis and Feature Construction We extract features from raw images, ignoring the labels used for learning. To lower the number of the points in an image we have to attend to, we use an interest point detector to get salient points. We evaluate three different detectors, a scale invariant interest point detector, an affine invariant interest point detector, and the SIFT interest point detector ([15], [16], [12], see section 3.1). Using these salient points we can reduce the content of an image to a number of points (and their surroundings) while being robust against irrelevant variations in illumination and scale. Since the most salient points 3 may not belong to the relevant objects, we have to take a rather large number of points into account, which implies choosing a low threshold in the interest point detectors. The number of SIFTs is reduced by a vector quantization using k-means (similarly to Fergus et al. [3]). The pixels enclosing an interest point are refered to as a patch. Due to different illumination conditions we normalise each patch before the local descriptors are calculated. Representing patches through a local descriptor can be done in different ways. We use subsampled grayvalues, intensity moments, Moment Invariants and SIFTs here. 3.1 Interest Point Detection There is a variety of work on interest point detection at fixed (e.g. [9,21,25,10]), and at varying scales (e.g. [11,15,16]). Based on the evaluation of interest point detectors by Schmid et al. [19], we decided to use the scale invariant Harris- Laplace detector [15] and the affine invariant interest point detector [16], both by Mikolajczyk and Schmid. In addition we use the interest point detector used by Lowe [12] because it is strongly interrelated with SIFTs as local descriptors. 3 E.g. by measuring the entropy of the histogram in the surrounding [3] or doing a Principal Component Analysis.

6 76 A. Opelt et al. The scale invariant detector finds interest points by calculating a scaled version of the second moment matrix M and localizing points where the Harris Measure H = det(m) αtrace 2 (M) is above a certain threshold th. The characteristic scale for each of these points is found in scale-space by calculating the Laplacians L(x,σ)= σ 2 (L xx (x,σ)+l yy (x,σ)) for each desired scale σ and taking the one at which L has a maximum in an 8-neighbourhood of the point. The affine invariant detector is also based on the second moment matrix computed at a point which can be used to normalise a region in an affine invariant way. The characteristic scale is again obtained by selecting the scale at which the Laplacian has a maximum. An iterative algorithm is then used which converges to affine invariant points by modifying the location, scale and neighbourhood of each point. Lowe introduced an interest point detector invariant to translation, scaling and rotation and minimally affected by small distortions and noise [12]. He also uses the scale-space but built with a difference of Gaussian (DoG). Additionally, a scale pyramid achieved by bilinear interpolation is employed. Calculating the image gradient magnitude and the orientation at each point of the scale pyramid, salient points with characteristic scales and orientations are achieved. 3.2 Region Normalisation To normalise the patches we have to consider illumination, scale and affine transformations. For the size normalisation we have decided to use quadratic patches with a side of l pixels. The value of l is a variable we vary in our experiments. We extract a window of size w =6 σ I where σ I is the characteristic scale of the interest point delivered by the interest point detector. Scale normalisation is done by smoothing and subsampling in cases of l<wand by linear interpolation otherwise. In order to obtain affine invariant patches the values of the transformation matrix resulting from the affine invariant interest point detector are used to normalise the window to the shape of a square, before the size normalisation. For illumination normalisation we use Homomorphic Filtering (see e.g. [7], chapter 4.5). The Homomorphic Filter is based on an image formation model where the image intensity I(x, y) =i(x, y)r(x, y) is modeled as the product of illumination i(x, y) and reflectance r(x, y). Elimination of the illumination part leads to a normalisation. This is achieved by applying a Fast Fourier Transform to the logarithm image ln(i). Now the reflectance component can be separated by a high pass filter. After a back transformation and an exponentiation we get the desired normalised patch. 3.3 Feature Extraction To represent each patch we have to choose some local descriptors. Local descriptors have been researched quite well (e.g. [4], [12], [18], [8]). We selected four local descriptors for our patches. Our first descriptor is simply a vector of all pixels in a patch subsampled by two. The dimension of this vector is l 2 4 which is rather high and increases computational complexity. As a second descriptor we use intensity moments MI a pq = ω i(x, y)a x p y q dx dy with a as the degree and

7 Weak Hypotheses and Boosting for Generic Object Detection 77 p + q as the order, up to degree 2 and order 2. Without using the moments of degree 0 we get a feature vector with a dimension of 10. This reduces the computational costs dramatically. With respect to the performance evaluation of local descriptors done by Mikolajczyk and Schmid [17] we took SIFTs (see [12]) as a third and Moment Invariants (see [8]) as a fourth choice. In this evaluation the SIFTs outmatched the others in nearly all tests and the Moment Invariants were in the middle ground for all aspects considered. According to [8] we selected first and second order Moment Invariants. We chose the first order affine Invariant and four first order affine and photometric Invariants. Additionally we took all five second order Invariants described in [8]. Since the Invariants require two contours, the whole square patch is taken as one contour and rectangles corresponding to one half of the patch are used as a second contour. All four possibilities of the second contour are calculated and used to obtain the Invariants. The dimenson of the Moment Invariants description vector is 10. As shown in [12] the description of the patches with SIFTs is done by multiple representations in various orientation planes. These orientation planes are blurred and resampled to allow larger shifts in positions of the gradients. A local descriptor with a dimension of 128 is obtained here for a circular region around the point with a radius of 8 pixels, 8 orientation planes and sampling over a 4x4 and a 2x2 grid of locations. 4 Calculation of Weak Hypotheses Using the features constructed in the previous section, an image is represented by a list of features (τ f,v f ), f = 1,...,F, where τ f denotes the type of a feature, v f denotes its value as real vector, and F is the number of extracted features in an image. The weak hypotheses for AdaBoost are calculated from these features. For object recognition we have chosen weak hypotheses which indicate if certain feature values appear in images. For this a weak hypothesis h has to select a feature type τ, its value v, and a similarity threshold θ. The threshold θ decides if an image contains a feature value v f that is sufficiently similar to v. The similarity between v f and v is calculated by the Mahalanobis distance for Moment Invariants and by the Euclidean distance for SIFTs. The weak hypotheses finder searches for the optimal weak hypothesis given labeled representations of the training images (R (I 1 ),l 1 ),...,(R (I m ),l m ) and their weights w 1,...,w m calculated by AdaBoost among all possible feature values and corresponding thresholds. The main computational burden is the calculation of the distances between v f and v, since they both range over all feature values that appear in the training images. 4 Given these distances which can be calculated prior to Boosting, the remaining calculations are relatively inexpensive. Details for the weak hypotheses finder are given in Figure 2. After sorting the optimal threshold for feature (τ k,f,v k,f ) can now be calculated in time O(m) by scanning through the weights 4 We discuss possible improvements in Section 6.

8 78 A. Opelt et al. Input: Labeled representations (R (I k ),l k ), k =1,...,m, R (I k )={(τ k,f,v k,f ):f =1,...,F k }. Distance functions: Let d τ (, ) be the distance in respect to the feature values of type τ in the training images. Minimal distance matrix: For all features (τ k,f,v k,f ) and all images I j calculate the minimal distance between v k,f and features in I j, d k,f,j = min d τk,f (v k,f,v j,g). 1 g F j :τ j,g =τ k,f Sorting: For each k, f let π k,f (1),...,π k,f (m) be a permutation such that d k,f,πk,f (1) d k,f,πk,f (m). Select best weak hypothesis (Scanline): For all features (τ k,f,v k,f ) calculate over all images I j s max w πk,f (i) l πk,f (i). s i=1 and select the feature (τ i,f,v i,f ) where the maximum is achieved. Select threshold θ: With the position s where the scanline reached a maxium sum the threshold θ is set to θ = d k,f,π k,f (s) d k,f,πk,f (s+1). 2 Fig. 2. Explanation of the weak hypotheses finder. w 1,...,w m in the order of the distances d k,f,j. Searching over all features, the calculation of the optimal weak hypothesis takes O(Fm) time. To give an example of the absolute computation times we used a dataset of 150 positive and 150 negative images. Each image has an average number of approximately 400 patches. Using SIFTs one iteration after preprocessing requires about one minute computation time on a P4, 2.4GHz PC. 5 Experimental Setup and Results We carried out our experiments as follows: the whole approach was first tested on the database used by Fergus et al. [3]. After demonstrating a comparable performance, the approach was tested on a new, more difficult database 5, see figure 5. These images contain the objects at arbitrary scales and poses. The images also contain highly textured background. Testing on these images shows that our approach still performs well. We have used two categories of objects, persons (P) and bikes (B), and images containing none of these objects (N). Our database contains 450 images of category P, 350 of B and 250 of category N. The recognition was based on deciding presence or absence of a relevant object. 5 Available at http : // pinz/data/

9 Weak Hypotheses and Boosting for Generic Object Detection 79 Preparing our data set we randomly chose a number of images, half belonging to the object category we want to learn and half not. From each of these two piles we take one third of the images as a set of images for testing the achieved model. The performance was measured with the receiver-operating characteristic (ROC) corresponding error rate. We tested the images containing the object (e.g. category B) against non-object images from the database (e.g. categories P and N). Our training set contains 100 positive and 100 negative images. The tests are carried out on 100 new images, half belonging to the learned class and half not. Each experiment was done using just one type of local descriptor. Figure 3(a) shows the recall-precision curve (RPC) of our approach (obtained by varying η), the approach of Fergus et al. [3] and the one of Agarwal and Roth [1], trained on the dataset used by Fergus et al. [3] 6. Our approach performs better than the one of Agarwal and Roth but slightly worse than the approach of Fergus et al. Table 1 shows the results of our approach (using the affine invariant interest point detection and Moment Invariants) compared with the ones of Fergus et al. and other methods [23], [24], [1]. While they use a kind of scale and viewing direction normalisation (see [3]), we work on the original images. Our results are almost as good as the results of Fergus et al. for the motorbikes dataset. For the other datasets our error rate is somewhat higher than the one of Fergus et al., but mostly lower than the error rate of the other methods. Table 1. The table gives the ROC equal error rates on a number of datasets from the database used by Fergus et al. [3]. Our results (using the affine invariant interest point detection and Moment Invariants) are compared with the results of the approach of Fergus et al. and other methods [23], [24], [1]. The error rates of our algorithm are between the other approaches and the ones of Fergus et al. in all cases except for the faces where the algorithm of Weber et al. [24] is also slightly better. Dataset Ours Fergus et al. [3] Others Ref. Motorbikes [23] Airplanes [23] Faces [24] Cars(Side) [1] This comparison shows that our approach performs well on the Fergus et al. database. We proceed with experiments on our own dataset and show some effects of parameter tuning 7. Figure 3(b) shows the influence of the additional weighting of right positive examples in the Boosting algorithm (η). We can see that with a factor η smaller than 1.8, the recall increases faster than the precision 6 Available at http : // vgg/data/ 7 Parametes not given in these tests are set to η =1.8, T = 50, l =16px, th = 30000, smallest scale is skipped. Depending on textured/homogenous background, the number of interest points detected in an image varies between 50 and 1000.

10 80 A. Opelt et al. Fig. 3. The curves in (a) and (b) are obtained by varying the factor η. In (a) the diagram shows the recall-precision curve for [3], [1] and our approach on the cars (side) dataset. Our approach is superior to the one of Agarwal and Roth but slightly worse than the one of Fergus et al. The diagram (b) shows the influence of an additional factor η for the weights of correctly positive classified examples. The recall increases faster than the precision drops until a factor of 1.8. drops. Then both curves have nearly the same (but inverse) gradient up to a factor of 3. For η>3 the precision decreases rapidly with no relevant gain of recall. Table 2 presents the performance of the Moment Invarants as local descriptor, compared with our low level descriptors (using the affine invariant interest point detector). Moment Invariants delivered the best results but the other low level descriptors did not perform badly, either. This behaviour might be explained by the fact that the extracted regions are already normalised against the same set of transformations as the Moment Invariants. Table 2. The table shows the results we reached with the three different kinds of local descriptors. We used an additional weight factor η = 1.7 here. Moment Invariants delivered the best results. Local Descriptor recall precision Moment Invariants Intensity Moments Subsampled Grayvalues In table 3 the results of our approach using the scale invariant interest point detector compared with the use of the affine invariant interest point detector are shown. We also vary the additional weight for right positive classified examples η. The affine invariant interest point detector achieves better results for the recall but precision is higher when we use the scale invariant version of the interest

11 Weak Hypotheses and Boosting for Generic Object Detection 81 1 Recall Precision Curve of our approach 1 Recall Precision Curve of our approach recall 0.5 recall Moment Invariants + Aff. Inv. SIFT precision (a) 0.1 Moment Invariants + Aff. Inv. SIFT precision (b) Fig. 4. In (a) the recall-precision curve of our approach with Moment Invariants and the affine invariant interest point detection, and the recall-precision curve of our approach using SIFTs for category bike are shown. (b) shows the recall-precision curves with the same methods for the category person. point detector. This is to be expected since the affine invariant detector allows for more variation in the image, which implies higher recall but less precision. Table 3. The table shows the results of our approach using the scale invariant interest point detector compared with using the affine invariant interest point detector varying the additional weight for right positive classified examples η. η recall (scale inv.) precision (scale inv.) recall (affine inv.) precision (affine inv.) We skipped the smallest scale in our experiments because experiments show that this reduction of number of points does not have relevant influence to the error rates. Again, using the parameters that performed best, figure 4(a) shows an example of a recall-precision curve (RPC) of our approach trained on the bike dataset from our image database with Moment Invariants and the affine invariant interest point detection compared with our approach using SIFTs. Using the same methods we obtain the recall-precision curves (RPC) shown in figure 4(b) for the category person. For directly comparing the results reached using the Moment Invariants with the affine invariant interest point detector or using SIFTs, the ROC equal error rates on various datasets are shown in table 4. As seen here the SIFTs perform better on our database. Tested on a category of the database from Fergus et al. one can see that the Moment Invariants perform better in that case.

12 82 A. Opelt et al. Table 4. This table shows a comparison of the ROC equal error rates reached with the two high level features. On our database the SIFTs perform better, but on the database of Fergus et al. the Moment Invariants reach the better error rate. Dataset Moment Invariants SIFTs Airplanes Bikes Persons Discussion and Outlook In conclusion, we have presented a novel approach for the detection and recognition of object categories in still images. Our system uses several steps of image analysis and feature extraction, which have been previously described, but succeeds on rather complex images with a lot of background structure. Objects are shown in substantially different poses and scales, and in many of the images the objects (bikes or persons) cover only a small portion of the whole image. The main contribution of the paper, however, lies in the new concept of learning. We use Boosting as the underlying learning technique and combine it with a weak hypothesis finder. In addition to several other advantages of this approach, which have already been mentioned, we want to emphasize that this approach allows the formation of very diverse visual features into a final hypothesis. We think that this capability is the main reason for the good experimental results on our complex database. Furthermore, experimental comparison on the database used by Fergus et al. [3] shows that our approach performs similarly well to state-of-the-art object categorization on simpler images. We are currently investigating extensions of our approach in several directions. Maybe the most obvious is the addition of more features to our image analysis. This includes not only other local descriptors like differential invariants [12], but also regional features 8 and geometric features 9. To reduce the complexity of our approach we are considering a reduction of the number of features by clustering methods. As the next step we will use spatial relations between features to improve the accuracy of our object detector. To handle the complexity of many possible relations between features, we will use the features constructed in our current approach (with parameters set for high recall) as starting points. Boosting will again be the underlying method for learning object representations as spatial combinations of features. This will allow the construction of weak hypotheses for discriminative spatial relations. Acknowledgements. This work was supported by the European project LAVA (IST ) and by the Austrian Science Foundation (FWF, project S Regional features describe regions found by appearance based clustering. 9 A geometric feature describes the appearance of geometric shapes, e.g. ellipses, in images.

13 Weak Hypotheses and Boosting for Generic Object Detection 83 Fig. 5. Examples from our image data base. The first column shows three images from the object class bike, the second column contains objects from the class person and the images in the last column belong to none of the classes (called nobikenoperson). The second example in the last column shows a moped as a very difficult counter-example to the category of bikes. N04). We are grateful to David Lowe and Cordelia Schmid for providing the code for their detectors/descriptors. References 1. S. Agarwal and D. Roth. Learning a sparse representation for object detection. In Proc. ECCV, pages , Gy. Dorko and C. Schmid. Selection of scale-invariant parts for object class recognition. In Proc. International Conference on Computer Vision, R. Fergus, P. Perona, and A. Zisserman. Object class recognition by unsupervised scale-invariant learning. In Proc. CVPR, W. Freeman and E. Adelson. The design and use of steerable filters. In PAMI, pages , Y. Freund and R. E. Schapire. A decision-theoretic generalisation of on-line learning. Computer and System Sciences, 55(1), A. Garg, S. Agarwal, and T. S. Huang. Fusion of global and local information for object detection. In Proc. CVPR, volume 2, pages , R. C. Gonzalez and R.E. Woods. Digital Image Processing. Addison-Wesley, 2001.

14 84 A. Opelt et al. 8. L. Van Gool, T. Moons, and D. Ungureanu. Affine / photometric invariants for planar intensity patterns. In Proc. ECCV, pages , C. Harris and M. Stephens. A combined corner and edge detector. In Proc. of the 4th ALVEY vision conference, pages , R. Laganiere. A morphological operator for corner detection. Pattern Recognition, 31(11): , T. Lindeberg. Feature detection with automatic scale selection. International Journal of Computer Vision, D. G. Lowe. Object recognition from local scale-invariant features. In Proc. ICCV, pages , W. Maass and M. Warmuth. Efficient learning with virtual threshold gates. Information and Computation, 141(1):66 83, S. Mahamud, M. Hebert, and J. Shi. Object recognition using boosted discriminants. In Proc. CVPR, volume 1, pages , K. Mikolajczyk and C. Schmid. Indexing based on scale invariant interest points. In Proc. ICCV, pages , K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. In Proc. ECCV, pages , K. Mikolajczyk and C. Schmid. A performance evaluation of local descriptors. In Proc. CVPR, C. Schmid and R. Mohr. Local grayvalue invariants for image retrieval. In PAMI, volume 19, pages , C. Schmid, R. Mohr, and C. Bauckhage. Evaluation of interest point detectors. International Journal of Computer Vision, pages , H. Schneiderman and T. Kanade. Object detection using the statistics of parts. International Journal of Computer Vision, to appear. 21. E. Shilat, M. Werman, and Y. Gdalyahu. Ridge s corner detection and correspondence. In Computer Vision and Pattern Recognition, pages , P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Proc. CVPR, M. Weber. Unsupervised Learning of Models for Object Recognition. PhD thesis, California Institute of Technology, Pasadena, CA, M. Weber, M. Welling, and P. Perona. Unsupervised learning of models for recognition. In Proc. ECCV, R. P. Wuertz and T. Lourens. Corner detection in color images by multiscale combination of end-stopped cortical cells. In International Conference on Artificial Neuronal Networks, pages , 1997.

3D Model based Object Class Detection in An Arbitrary View

3D Model based Object Class Detection in An Arbitrary View 3D Model based Object Class Detection in An Arbitrary View Pingkun Yan, Saad M. Khan, Mubarak Shah School of Electrical Engineering and Computer Science University of Central Florida http://www.eecs.ucf.edu/

More information

Local features and matching. Image classification & object localization

Local features and matching. Image classification & object localization Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to

More information

Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

More information

Object class recognition using unsupervised scale-invariant learning

Object class recognition using unsupervised scale-invariant learning Object class recognition using unsupervised scale-invariant learning Rob Fergus Pietro Perona Andrew Zisserman Oxford University California Institute of Technology Goal Recognition of object categories

More information

Object Class Recognition by Unsupervised Scale-Invariant Learning

Object Class Recognition by Unsupervised Scale-Invariant Learning Object Class Recognition by Unsupervised Scale-Invariant Learning R. Fergus 1 P. Perona 2 A. Zisserman 1 1 Dept. of Engineering Science 2 Dept. of Electrical Engineering University of Oxford California

More information

Scale-Invariant Object Categorization using a Scale-Adaptive Mean-Shift Search

Scale-Invariant Object Categorization using a Scale-Adaptive Mean-Shift Search in DAGM 04 Pattern Recognition Symposium, Tübingen, Germany, Aug 2004. Scale-Invariant Object Categorization using a Scale-Adaptive Mean-Shift Search Bastian Leibe ½ and Bernt Schiele ½¾ ½ Perceptual Computing

More information

A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow

A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow , pp.233-237 http://dx.doi.org/10.14257/astl.2014.51.53 A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow Giwoo Kim 1, Hye-Youn Lim 1 and Dae-Seong Kang 1, 1 Department of electronices

More information

More Local Structure Information for Make-Model Recognition

More Local Structure Information for Make-Model Recognition More Local Structure Information for Make-Model Recognition David Anthony Torres Dept. of Computer Science The University of California at San Diego La Jolla, CA 9093 Abstract An object classification

More information

Analecta Vol. 8, No. 2 ISSN 2064-7964

Analecta Vol. 8, No. 2 ISSN 2064-7964 EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,

More information

Canny Edge Detection

Canny Edge Detection Canny Edge Detection 09gr820 March 23, 2009 1 Introduction The purpose of edge detection in general is to significantly reduce the amount of data in an image, while preserving the structural properties

More information

A Bayesian Framework for Unsupervised One-Shot Learning of Object Categories

A Bayesian Framework for Unsupervised One-Shot Learning of Object Categories A Bayesian Framework for Unsupervised One-Shot Learning of Object Categories Li Fei-Fei 1 Rob Fergus 1,2 Pietro Perona 1 1. California Institute of Technology 2. Oxford University Single Object Object

More information

Image Segmentation and Registration

Image Segmentation and Registration Image Segmentation and Registration Dr. Christine Tanner (tanner@vision.ee.ethz.ch) Computer Vision Laboratory, ETH Zürich Dr. Verena Kaynig, Machine Learning Laboratory, ETH Zürich Outline Segmentation

More information

Feature Tracking and Optical Flow

Feature Tracking and Optical Flow 02/09/12 Feature Tracking and Optical Flow Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Many slides adapted from Lana Lazebnik, Silvio Saverse, who in turn adapted slides from Steve

More information

Jiří Matas. Hough Transform

Jiří Matas. Hough Transform Hough Transform Jiří Matas Center for Machine Perception Department of Cybernetics, Faculty of Electrical Engineering Czech Technical University, Prague Many slides thanks to Kristen Grauman and Bastian

More information

Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006

Practical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006 Practical Tour of Visual tracking David Fleet and Allan Jepson January, 2006 Designing a Visual Tracker: What is the state? pose and motion (position, velocity, acceleration, ) shape (size, deformation,

More information

Distinctive Image Features from Scale-Invariant Keypoints

Distinctive Image Features from Scale-Invariant Keypoints Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe Computer Science Department University of British Columbia Vancouver, B.C., Canada lowe@cs.ubc.ca January 5, 2004 Abstract This paper

More information

Randomized Trees for Real-Time Keypoint Recognition

Randomized Trees for Real-Time Keypoint Recognition Randomized Trees for Real-Time Keypoint Recognition Vincent Lepetit Pascal Lagger Pascal Fua Computer Vision Laboratory École Polytechnique Fédérale de Lausanne (EPFL) 1015 Lausanne, Switzerland Email:

More information

Self Organizing Maps: Fundamentals

Self Organizing Maps: Fundamentals Self Organizing Maps: Fundamentals Introduction to Neural Networks : Lecture 16 John A. Bullinaria, 2004 1. What is a Self Organizing Map? 2. Topographic Maps 3. Setting up a Self Organizing Map 4. Kohonen

More information

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05

Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05 Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification

More information

Build Panoramas on Android Phones

Build Panoramas on Android Phones Build Panoramas on Android Phones Tao Chu, Bowen Meng, Zixuan Wang Stanford University, Stanford CA Abstract The purpose of this work is to implement panorama stitching from a sequence of photos taken

More information

Face Recognition in Low-resolution Images by Using Local Zernike Moments

Face Recognition in Low-resolution Images by Using Local Zernike Moments Proceedings of the International Conference on Machine Vision and Machine Learning Prague, Czech Republic, August14-15, 014 Paper No. 15 Face Recognition in Low-resolution Images by Using Local Zernie

More information

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data

Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

CS 534: Computer Vision 3D Model-based recognition

CS 534: Computer Vision 3D Model-based recognition CS 534: Computer Vision 3D Model-based recognition Ahmed Elgammal Dept of Computer Science CS 534 3D Model-based Vision - 1 High Level Vision Object Recognition: What it means? Two main recognition tasks:!

More information

Incremental PCA: An Alternative Approach for Novelty Detection

Incremental PCA: An Alternative Approach for Novelty Detection Incremental PCA: An Alternative Approach for Detection Hugo Vieira Neto and Ulrich Nehmzow Department of Computer Science University of Essex Wivenhoe Park Colchester CO4 3SQ {hvieir, udfn}@essex.ac.uk

More information

Visual Categorization with Bags of Keypoints

Visual Categorization with Bags of Keypoints Visual Categorization with Bags of Keypoints Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray Xerox Research Centre Europe 6, chemin de Maupertuis 38240 Meylan, France

More information

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report

Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles mjhustc@ucla.edu and lunbo

More information

Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability

Classification of Fingerprints. Sarat C. Dass Department of Statistics & Probability Classification of Fingerprints Sarat C. Dass Department of Statistics & Probability Fingerprint Classification Fingerprint classification is a coarse level partitioning of a fingerprint database into smaller

More information

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform

More information

Discovering objects and their location in images

Discovering objects and their location in images Discovering objects and their location in images Josef Sivic Bryan C. Russell Alexei A. Efros Andrew Zisserman William T. Freeman Dept. of Engineering Science CS and AI Laboratory School of Computer Science

More information

Classifying Manipulation Primitives from Visual Data

Classifying Manipulation Primitives from Visual Data Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if

More information

The Scientific Data Mining Process

The Scientific Data Mining Process Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In

More information

Colour Image Segmentation Technique for Screen Printing

Colour Image Segmentation Technique for Screen Printing 60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone

More information

Finding people in repeated shots of the same scene

Finding people in repeated shots of the same scene Finding people in repeated shots of the same scene Josef Sivic 1 C. Lawrence Zitnick Richard Szeliski 1 University of Oxford Microsoft Research Abstract The goal of this work is to find all occurrences

More information

Novelty Detection in image recognition using IRF Neural Networks properties

Novelty Detection in image recognition using IRF Neural Networks properties Novelty Detection in image recognition using IRF Neural Networks properties Philippe Smagghe, Jean-Luc Buessler, Jean-Philippe Urban Université de Haute-Alsace MIPS 4, rue des Frères Lumière, 68093 Mulhouse,

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr

Object Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr Image Classification and Object Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Image classification Image (scene) classification is a fundamental

More information

TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service

TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service Feng Tang, Daniel R. Tretter, Qian Lin HP Laboratories HPL-2012-131R1 Keyword(s): image recognition; cloud service;

More information

Open-Set Face Recognition-based Visitor Interface System

Open-Set Face Recognition-based Visitor Interface System Open-Set Face Recognition-based Visitor Interface System Hazım K. Ekenel, Lorant Szasz-Toth, and Rainer Stiefelhagen Computer Science Department, Universität Karlsruhe (TH) Am Fasanengarten 5, Karlsruhe

More information

MVA ENS Cachan. Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr

MVA ENS Cachan. Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Machine Learning for Computer Vision 1 MVA ENS Cachan Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Department of Applied Mathematics Ecole Centrale Paris Galen

More information

QUALITY TESTING OF WATER PUMP PULLEY USING IMAGE PROCESSING

QUALITY TESTING OF WATER PUMP PULLEY USING IMAGE PROCESSING QUALITY TESTING OF WATER PUMP PULLEY USING IMAGE PROCESSING MRS. A H. TIRMARE 1, MS.R.N.KULKARNI 2, MR. A R. BHOSALE 3 MR. C.S. MORE 4 MR.A.G.NIMBALKAR 5 1, 2 Assistant professor Bharati Vidyapeeth s college

More information

Environmental Remote Sensing GEOG 2021

Environmental Remote Sensing GEOG 2021 Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class

More information

THE development of methods for automatic detection

THE development of methods for automatic detection Learning to Detect Objects in Images via a Sparse, Part-Based Representation Shivani Agarwal, Aatif Awan and Dan Roth, Member, IEEE Computer Society 1 Abstract We study the problem of detecting objects

More information

Convolution. 1D Formula: 2D Formula: Example on the web: http://www.jhu.edu/~signals/convolve/

Convolution. 1D Formula: 2D Formula: Example on the web: http://www.jhu.edu/~signals/convolve/ Basic Filters (7) Convolution/correlation/Linear filtering Gaussian filters Smoothing and noise reduction First derivatives of Gaussian Second derivative of Gaussian: Laplacian Oriented Gaussian filters

More information

Make and Model Recognition of Cars

Make and Model Recognition of Cars Make and Model Recognition of Cars Sparta Cheung Department of Electrical and Computer Engineering University of California, San Diego 9500 Gilman Dr., La Jolla, CA 92093 sparta@ucsd.edu Alice Chu Department

More information

Determining optimal window size for texture feature extraction methods

Determining optimal window size for texture feature extraction methods IX Spanish Symposium on Pattern Recognition and Image Analysis, Castellon, Spain, May 2001, vol.2, 237-242, ISBN: 84-8021-351-5. Determining optimal window size for texture feature extraction methods Domènec

More information

Signature verification using Kolmogorov-Smirnov. statistic

Signature verification using Kolmogorov-Smirnov. statistic Signature verification using Kolmogorov-Smirnov statistic Harish Srinivasan, Sargur N.Srihari and Matthew J Beal University at Buffalo, the State University of New York, Buffalo USA {srihari,hs32}@cedar.buffalo.edu,mbeal@cse.buffalo.edu

More information

On-line Boosting and Vision

On-line Boosting and Vision On-line Boosting and Vision Helmut Grabner and Horst Bischof Institute for Computer Graphics and Vision Graz University of Technology {hgrabner, bischof}@icg.tu-graz.ac.at Abstract Boosting has become

More information

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28

Recognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28 Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object

More information

Solving Geometric Problems with the Rotating Calipers *

Solving Geometric Problems with the Rotating Calipers * Solving Geometric Problems with the Rotating Calipers * Godfried Toussaint School of Computer Science McGill University Montreal, Quebec, Canada ABSTRACT Shamos [1] recently showed that the diameter of

More information

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

More information

Fast Matching of Binary Features

Fast Matching of Binary Features Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been

More information

The goal is multiply object tracking by detection with application on pedestrians.

The goal is multiply object tracking by detection with application on pedestrians. Coupled Detection and Trajectory Estimation for Multi-Object Tracking By B. Leibe, K. Schindler, L. Van Gool Presented By: Hanukaev Dmitri Lecturer: Prof. Daphna Wienshall The Goal The goal is multiply

More information

Image Classification for Dogs and Cats

Image Classification for Dogs and Cats Image Classification for Dogs and Cats Bang Liu, Yan Liu Department of Electrical and Computer Engineering {bang3,yan10}@ualberta.ca Kai Zhou Department of Computing Science kzhou3@ualberta.ca Abstract

More information

jorge s. marques image processing

jorge s. marques image processing image processing images images: what are they? what is shown in this image? What is this? what is an image images describe the evolution of physical variables (intensity, color, reflectance, condutivity)

More information

Morphological segmentation of histology cell images

Morphological segmentation of histology cell images Morphological segmentation of histology cell images A.Nedzved, S.Ablameyko, I.Pitas Institute of Engineering Cybernetics of the National Academy of Sciences Surganova, 6, 00 Minsk, Belarus E-mail abl@newman.bas-net.by

More information

Document Image Retrieval using Signatures as Queries

Document Image Retrieval using Signatures as Queries Document Image Retrieval using Signatures as Queries Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Harish Srinivasan, Chen Huang CEDAR, University at Buffalo(SUNY) Amherst, New York 14228 Gady Agam and

More information

Journal of Industrial Engineering Research. Adaptive sequence of Key Pose Detection for Human Action Recognition

Journal of Industrial Engineering Research. Adaptive sequence of Key Pose Detection for Human Action Recognition IWNEST PUBLISHER Journal of Industrial Engineering Research (ISSN: 2077-4559) Journal home page: http://www.iwnest.com/aace/ Adaptive sequence of Key Pose Detection for Human Action Recognition 1 T. Sindhu

More information

IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS

IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS Alexander Velizhev 1 (presenter) Roman Shapovalov 2 Konrad Schindler 3 1 Hexagon Technology Center, Heerbrugg, Switzerland 2 Graphics & Media

More information

Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

More information

3 An Illustrative Example

3 An Illustrative Example Objectives An Illustrative Example Objectives - Theory and Examples -2 Problem Statement -2 Perceptron - Two-Input Case -4 Pattern Recognition Example -5 Hamming Network -8 Feedforward Layer -8 Recurrent

More information

SURF: Speeded Up Robust Features

SURF: Speeded Up Robust Features SURF: Speeded Up Robust Features Herbert Bay 1, Tinne Tuytelaars 2, and Luc Van Gool 12 1 ETH Zurich {bay, vangool}@vision.ee.ethz.ch 2 Katholieke Universiteit Leuven {Tinne.Tuytelaars, Luc.Vangool}@esat.kuleuven.be

More information

L25: Ensemble learning

L25: Ensemble learning L25: Ensemble learning Introduction Methods for constructing ensembles Combination strategies Stacked generalization Mixtures of experts Bagging Boosting CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna

More information

Principal components analysis

Principal components analysis CS229 Lecture notes Andrew Ng Part XI Principal components analysis In our discussion of factor analysis, we gave a way to model data x R n as approximately lying in some k-dimension subspace, where k

More information

Cees Snoek. Machine. Humans. Multimedia Archives. Euvision Technologies The Netherlands. University of Amsterdam The Netherlands. Tree.

Cees Snoek. Machine. Humans. Multimedia Archives. Euvision Technologies The Netherlands. University of Amsterdam The Netherlands. Tree. Visual search: what's next? Cees Snoek University of Amsterdam The Netherlands Euvision Technologies The Netherlands Problem statement US flag Tree Aircraft Humans Dog Smoking Building Basketball Table

More information

VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS

VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS Norbert Buch 1, Mark Cracknell 2, James Orwell 1 and Sergio A. Velastin 1 1. Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE,

More information

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall

Assessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

Bildverarbeitung und Mustererkennung Image Processing and Pattern Recognition

Bildverarbeitung und Mustererkennung Image Processing and Pattern Recognition Bildverarbeitung und Mustererkennung Image Processing and Pattern Recognition 1. Image Pre-Processing - Pixel Brightness Transformation - Geometric Transformation - Image Denoising 1 1. Image Pre-Processing

More information

Crater detection with segmentation-based image processing algorithm

Crater detection with segmentation-based image processing algorithm Template reference : 100181708K-EN Crater detection with segmentation-based image processing algorithm M. Spigai, S. Clerc (Thales Alenia Space-France) V. Simard-Bilodeau (U. Sherbrooke and NGC Aerospace,

More information

A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms

A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms H. Kandil and A. Atwan Information Technology Department, Faculty of Computer and Information Sciences, Mansoura University,El-Gomhoria

More information

A Learning Based Method for Super-Resolution of Low Resolution Images

A Learning Based Method for Super-Resolution of Low Resolution Images A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method

More information

Medical Image Segmentation of PACS System Image Post-processing *

Medical Image Segmentation of PACS System Image Post-processing * Medical Image Segmentation of PACS System Image Post-processing * Lv Jie, Xiong Chun-rong, and Xie Miao Department of Professional Technical Institute, Yulin Normal University, Yulin Guangxi 537000, China

More information

3D OBJECT MODELING AND RECOGNITION IN PHOTOGRAPHS AND VIDEO

3D OBJECT MODELING AND RECOGNITION IN PHOTOGRAPHS AND VIDEO 3D OBJECT MODELING AND RECOGNITION IN PHOTOGRAPHS AND VIDEO Fredrick H. Rothganger, Ph.D. Computer Science University of Illinois at Urbana-Champaign, 2004 Jean Ponce, Adviser This thesis introduces a

More information

The use of computer vision technologies to augment human monitoring of secure computing facilities

The use of computer vision technologies to augment human monitoring of secure computing facilities The use of computer vision technologies to augment human monitoring of secure computing facilities Marius Potgieter School of Information and Communication Technology Nelson Mandela Metropolitan University

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

Lecture 2: The SVM classifier

Lecture 2: The SVM classifier Lecture 2: The SVM classifier C19 Machine Learning Hilary 2015 A. Zisserman Review of linear classifiers Linear separability Perceptron Support Vector Machine (SVM) classifier Wide margin Cost function

More information

Tracking in flussi video 3D. Ing. Samuele Salti

Tracking in flussi video 3D. Ing. Samuele Salti Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking

More information

Machine Learning Final Project Spam Email Filtering

Machine Learning Final Project Spam Email Filtering Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE

More information

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining

Extend Table Lens for High-Dimensional Data Visualization and Classification Mining Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du fdu@cs.ubc.ca University of British Columbia

More information

Admin stuff. 4 Image Pyramids. Spatial Domain. Projects. Fourier domain 2/26/2008. Fourier as a change of basis

Admin stuff. 4 Image Pyramids. Spatial Domain. Projects. Fourier domain 2/26/2008. Fourier as a change of basis Admin stuff 4 Image Pyramids Change of office hours on Wed 4 th April Mon 3 st March 9.3.3pm (right after class) Change of time/date t of last class Currently Mon 5 th May What about Thursday 8 th May?

More information

Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics

Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics Part I: Factorizations and Statistical Modeling/Inference Amnon Shashua School of Computer Science & Eng. The Hebrew University

More information

siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service

siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service Ahmad Pahlavan Tafti 1, Hamid Hassannia 2, and Zeyun Yu 1 1 Department of Computer Science, University of Wisconsin -Milwaukee,

More information

Segmentation & Clustering

Segmentation & Clustering EECS 442 Computer vision Segmentation & Clustering Segmentation in human vision K-mean clustering Mean-shift Graph-cut Reading: Chapters 14 [FP] Some slides of this lectures are courtesy of prof F. Li,

More information

Visual Structure Analysis of Flow Charts in Patent Images

Visual Structure Analysis of Flow Charts in Patent Images Visual Structure Analysis of Flow Charts in Patent Images Roland Mörzinger, René Schuster, András Horti, and Georg Thallinger JOANNEUM RESEARCH Forschungsgesellschaft mbh DIGITAL - Institute for Information

More information

Application of Face Recognition to Person Matching in Trains

Application of Face Recognition to Person Matching in Trains Application of Face Recognition to Person Matching in Trains May 2008 Objective Matching of person Context : in trains Using face recognition and face detection algorithms With a video-surveillance camera

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

High Quality Image Magnification using Cross-Scale Self-Similarity

High Quality Image Magnification using Cross-Scale Self-Similarity High Quality Image Magnification using Cross-Scale Self-Similarity André Gooßen 1, Arne Ehlers 1, Thomas Pralow 2, Rolf-Rainer Grigat 1 1 Vision Systems, Hamburg University of Technology, D-21079 Hamburg

More information

Vision based Vehicle Tracking using a high angle camera

Vision based Vehicle Tracking using a high angle camera Vision based Vehicle Tracking using a high angle camera Raúl Ignacio Ramos García Dule Shu gramos@clemson.edu dshu@clemson.edu Abstract A vehicle tracking and grouping algorithm is presented in this work

More information

Footwear Print Retrieval System for Real Crime Scene Marks

Footwear Print Retrieval System for Real Crime Scene Marks Footwear Print Retrieval System for Real Crime Scene Marks Yi Tang, Sargur N. Srihari, Harish Kasiviswanathan and Jason J. Corso Center of Excellence for Document Analysis and Recognition (CEDAR) University

More information

Tracking-Learning-Detection

Tracking-Learning-Detection IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 6, NO., JANUARY 200 Tracking-Learning-Detection Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas, Abstract This paper investigates

More information

Vector and Matrix Norms

Vector and Matrix Norms Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty

More information

Least-Squares Intersection of Lines

Least-Squares Intersection of Lines Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a

More information

Euler Vector: A Combinatorial Signature for Gray-Tone Images

Euler Vector: A Combinatorial Signature for Gray-Tone Images Euler Vector: A Combinatorial Signature for Gray-Tone Images Arijit Bishnu, Bhargab B. Bhattacharya y, Malay K. Kundu, C. A. Murthy fbishnu t, bhargab, malay, murthyg@isical.ac.in Indian Statistical Institute,

More information

Surgical Tools Recognition and Pupil Segmentation for Cataract Surgical Process Modeling

Surgical Tools Recognition and Pupil Segmentation for Cataract Surgical Process Modeling Surgical Tools Recognition and Pupil Segmentation for Cataract Surgical Process Modeling David Bouget, Florent Lalys, Pierre Jannin To cite this version: David Bouget, Florent Lalys, Pierre Jannin. Surgical

More information

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus

Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information

More information

Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC

Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain

More information

Data, Measurements, Features

Data, Measurements, Features Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

More information