Weak Hypotheses and Boosting for Generic Object Detection and Recognition
|
|
- Cora Gray
- 7 years ago
- Views:
Transcription
1 Weak Hypotheses and Boosting for Generic Object Detection and Recognition A. Opelt 1,2, M. Fussenegger 1,2, A. Pinz 2, and P. Auer 1 1 Institute of Computer Science, 8700 Leoben, Austria {auer,andreas.opelt}@unileoben.ac.at 2 Institute of Electrical Measurement and Measurement Signal Processing, 8010 Graz, Austria {fussenegger,opelt,pinz}@emt.tugraz.at Abstract. In this paper we describe the first stage of a new learning system for object detection and recognition. For our system we propose Boosting [5] as the underlying learning technique. This allows the use of very diverse sets of visual features in the learning process within a common framework: Boosting together with a weak hypotheses finder may choose very inhomogeneous features as most relevant for combination into a final hypothesis. As another advantage the weak hypotheses finder may search the weak hypotheses space without explicit calculation of all available hypotheses, reducing computation time. This contrasts the related work of Agarwal and Roth [1] where Winnow was used as learning algorithm and all weak hypotheses were calculated explicitly. In our first empirical evaluation we use four types of local descriptors: two basic ones consisting of a set of grayvalues and intensity moments and two high level descriptors: moment invariants [8] and SIFTs [12]. The descriptors are calculated from local patches detected by an interest point operator. The weak hypotheses finder selects one of the local patches and one type of local descriptor and efficiently searches for the most discriminative similarity threshold. This differs from other work on Boosting for object recognition where simple rectangular hypotheses [22] or complex classifiers [20] have been used. In relatively simple images, where the objects are prominent, our approach yields results comparable to the state-of-the-art [3]. But we also obtain very good results on more complex images, where the objects are located in arbitrary positions, poses, and scales in the images. These results indicate that our flexible approach, which also allows the inclusion of features from segmented regions and even spatial relationships, leads us a significant step towards generic object recognition. 1 Introduction We believe that a learning component is a necessary part of any generic object recognition system. In this paper we investigate a principle approach for learning objects in still images which allows the use of flexible and extendible T. Pajdla and J. Matas (Eds.): ECCV 2004, LNCS 3022, pp , c Springer-Verlag Berlin Heidelberg 2004
2 72 A. Opelt et al. sets of features for describing objects and object categories. Objects should be recognized even if they occur at abitrary scale, shown from different perspective views on highly textured backgrounds. Our main learning technique relies on Boosting [5]. Boosting is a technique for combining several weak classifiers into a final strong classifier. The weak classifiers are calculated on different weightings of the training examples to emphasize different portions of the training set. Since any classification function can potentially serve as a weak classifier we can use classifiers based on arbitrary and inhomogeneous sets of image features. A further advantage of Boosting is that weak classifiers may be calculated when needed instead of calculating unnecessary hypotheses a priori. In our learning setting, the learning algorithm needs to learn an object category. It is provided with a set of labeled training images, where a positive label indicates that a relevant object appears in the image. The objects are not segmented and pose and location are unknown. As output, the learning algorithm delivers a final classifier which predicts if a relevant object is present in a new image. Having such a classifier, the localization of the object in the image is straightforward. The image analysis transforms images to greyvalues and extracts normalised regions around interest (salient) points to obtain reduced representations of images. As an appropriate representation for the learning procedure we calculate local descriptors of these patches. The result of the training procedure is saved as the final hypothesis which is later used for testing (see figure 1). Images (Labeled) Region Detection Local Descriptors Boosting Labels Preprocessing Grayscale Images Scaled Harris /Laplace Affine Invariant Low Level Moment Inv. AdaBoost SIFT Detector SIFTs Interest Point Final Hypothesis Learning Testing Pred. Labels Calculate Fig. 1. Overview showing the framework for our approach for generic object recognition. The solid arrows show the training cycle, the dotted ones the testing procedure. We describe our general learning approach in detail in section 2. In section 3, we discuss the image analysis steps, including illumination and size normalisation, interest point detection, and the extraction of the local descriptors. An explicit explanation of how we calculate the weak hypotheses used by the Boosting algorithm, is given in section 4. Section 5 contains a description of the setup we used for our experiments. The results are presented and compared with other
3 Weak Hypotheses and Boosting for Generic Object Detection 73 approaches for object recognition. We regard the present system as a first step and further work is outlined in section Related Work Clearly there is an extensive body of literature on object recognition (e.g. [3], [2], [22], [24], [6], [14]). In general, these approaches use image databases which show the object of interest at prominent scales and with only little variation in pose. We discuss only some of the most relevant and most recent results related to our approach. Boosting was successfully used by Viola and Jones [22] as the ingredient for a fast face detector. The weak hypotheses were the thresholded average brightness of collections of up to four rectangular regions. In our approach we experiment with much larger sets of features to be able to perform recognition of a wider class of objects. Schneiderman and Kanade [20] used Boosting to improve an already complex classifier. In contrast, we are using Boosting to combine rather simple classifiers by selecting the most discriminative features. Agarwal and Roth [1] used Winnow as the underlying learning algorithm for the recognition of cars from side views. For this purpose images were represented as binary feature vectors. The bits of such a feature vector can be seen as the outcomes of weak classifiers, one weak classifier for each position in the binary vector. Thus for learning it is required that the outcomes of all weak classifiers are calculated a priori. In contrast, Boosting only needs to find the few weak classifiers which actually appear in the final classifier. This substantially speeds up learning, if the space of weak classifiers carries a structure which allows the efficient search for discriminative weak classifiers. A simple example is a weak classifier which compares a real valued feature against a threshold. For Winnow, one weak classifier needs to be calculated for each possible threshold a priori 1, whereas for Boosting the optimal threshold can be determined efficiently when needed. A different approach to object class recognition was presented by Fergus, Perona, and Zisserman [3]. They used a generative probabilistic model for objects built as constellations of parts. Using an EM-type learning algorithm they achieved very good recognition performance. In our work we have chosen a model-free approach for flexibility. If at all, the sets of weak classifiers we use can be seen as model classes, but with much less structure than in [3]. Furthermore, we propose Boosting as a very different learning algorithm from EM. Dorko and Schmid [2] introduced an approach for constructing and selecting scale-invariant object parts. These parts are subsequently used to learn a classifier. They show a robust detection under scale changes and variations in viewing conditions, but in contrast to our approach, the objects of interest are manually pre-segmented. This dramatically reduces the complexity of distinguishing between relevant patches on the objects and background clutter. 1 More efficient techniques for Winnow like using virtual threshold gates [13] do not improve the situation much.
4 74 A. Opelt et al. 2 Our Learning Model for Object Recognition In our setup, a learning algorithm has to recognize objects from a certain category in still images. For this purpose, the learning algorithm delivers a classifier that predicts whether a given image contains an object from this category or not. As training data, labeled images (I 1,l 1 ),...,(I m,l m ) are provided for the learning algorithm where l k =+1ifI k contains a relevant object and l k = 1 if I k contains no relevant object. Now the learning algorithm delivers a function H : I ˆl which predicts the label of image I. To calculate this classification function H we use the classical AdaBoost algorithm [5]. AdaBoost puts weights w k on the training images and requires the construction of a weak hypothesis h which has some discriminative power relative to these weights, i.e. w k > w k, (1) k:h(i k )=l k k:h(i k ) l k such that more images are correctly classified than misclassified, relative to the weights w k. (Such a hypothesis is called weak since it needs to satisfy only a very weak requirement.) The process of putting weights and constructing a weak hypothesis is iterated for several rounds t = 1,...,T, and the weak hypotheses h t of each round are combined into the final hypothesis H. In each round t the weight w k is decreased if the prediction for I k was correct (h t (I k )=l k ), and increased if the prediction was incorrect. Different to the standard AdaBoost algorithm we vary the factor β t to trade off precision and recall. We set β t = 1 ε ε η if l k = +1 and l k h t (I k ). 1 ε ε else with ε being the error of the weak hypothesis in this round and η as an additional weight factor to control the update of wrongly classified positive examples. Here two general comments are in place. First, it is intuitively quite clear that weak hypotheses with high discriminative power with a large difference of the sums in (1) are preferable, and indeed this is shown in the convergence proof of AdaBoost [5]. Second, the adaptation of the weights w k in each round performs some sort of adaptive decorrelation of the weak hypotheses: if an image was correctly classified in round t, then its weight is decreased and less emphasis is put on this image in the next round, yielding quite different hypotheses h t and h t+1. 2 Thus it can be expected that the first few weak hypotheses characterize the object category under consideration quite well. This is particularly interesting when a sparse representation of the object category is needed. Obviously AdaBoost is a very general learning technique for obtaining classification functions. To adapt it for a specific application, suitable weak hypotheses 2 In fact AdaBoost sets the weights in such a way that h t is not discriminative in respect to the new weights. Thus h t is in some sense oblivious to the predictions of h t+1.
5 Weak Hypotheses and Boosting for Generic Object Detection 75 have to be constructed. For the purpose of object recognition we need to extract suitable features from images and use these features to construct the weak hypotheses. Since AdaBoost is a general learning technique we are free to choose any type of features we like, as long as we are able to provide an effective weak hypotheses finder which returns discriminative weak hypotheses based on this set of features. The chosen features should be able to represent the content of images, at least in respect to the object category under consideration. Since we may choose several types of features, we represent an image I by a set of pairs R (I) ={(τ,v)} where τ denotes the type of a feature and v denotes a value of this feature, typically a vector of reals. Then for AdaBoost a weak hypothesis is constructed from the representations R (I k ), labels l k, and weights w k of the training images. In the next section we describe the types of features we are currently using, although many other features could be used, too. In Section 4 we describe the effective construction of the weak hypotheses. 3 Image Analysis and Feature Construction We extract features from raw images, ignoring the labels used for learning. To lower the number of the points in an image we have to attend to, we use an interest point detector to get salient points. We evaluate three different detectors, a scale invariant interest point detector, an affine invariant interest point detector, and the SIFT interest point detector ([15], [16], [12], see section 3.1). Using these salient points we can reduce the content of an image to a number of points (and their surroundings) while being robust against irrelevant variations in illumination and scale. Since the most salient points 3 may not belong to the relevant objects, we have to take a rather large number of points into account, which implies choosing a low threshold in the interest point detectors. The number of SIFTs is reduced by a vector quantization using k-means (similarly to Fergus et al. [3]). The pixels enclosing an interest point are refered to as a patch. Due to different illumination conditions we normalise each patch before the local descriptors are calculated. Representing patches through a local descriptor can be done in different ways. We use subsampled grayvalues, intensity moments, Moment Invariants and SIFTs here. 3.1 Interest Point Detection There is a variety of work on interest point detection at fixed (e.g. [9,21,25,10]), and at varying scales (e.g. [11,15,16]). Based on the evaluation of interest point detectors by Schmid et al. [19], we decided to use the scale invariant Harris- Laplace detector [15] and the affine invariant interest point detector [16], both by Mikolajczyk and Schmid. In addition we use the interest point detector used by Lowe [12] because it is strongly interrelated with SIFTs as local descriptors. 3 E.g. by measuring the entropy of the histogram in the surrounding [3] or doing a Principal Component Analysis.
6 76 A. Opelt et al. The scale invariant detector finds interest points by calculating a scaled version of the second moment matrix M and localizing points where the Harris Measure H = det(m) αtrace 2 (M) is above a certain threshold th. The characteristic scale for each of these points is found in scale-space by calculating the Laplacians L(x,σ)= σ 2 (L xx (x,σ)+l yy (x,σ)) for each desired scale σ and taking the one at which L has a maximum in an 8-neighbourhood of the point. The affine invariant detector is also based on the second moment matrix computed at a point which can be used to normalise a region in an affine invariant way. The characteristic scale is again obtained by selecting the scale at which the Laplacian has a maximum. An iterative algorithm is then used which converges to affine invariant points by modifying the location, scale and neighbourhood of each point. Lowe introduced an interest point detector invariant to translation, scaling and rotation and minimally affected by small distortions and noise [12]. He also uses the scale-space but built with a difference of Gaussian (DoG). Additionally, a scale pyramid achieved by bilinear interpolation is employed. Calculating the image gradient magnitude and the orientation at each point of the scale pyramid, salient points with characteristic scales and orientations are achieved. 3.2 Region Normalisation To normalise the patches we have to consider illumination, scale and affine transformations. For the size normalisation we have decided to use quadratic patches with a side of l pixels. The value of l is a variable we vary in our experiments. We extract a window of size w =6 σ I where σ I is the characteristic scale of the interest point delivered by the interest point detector. Scale normalisation is done by smoothing and subsampling in cases of l<wand by linear interpolation otherwise. In order to obtain affine invariant patches the values of the transformation matrix resulting from the affine invariant interest point detector are used to normalise the window to the shape of a square, before the size normalisation. For illumination normalisation we use Homomorphic Filtering (see e.g. [7], chapter 4.5). The Homomorphic Filter is based on an image formation model where the image intensity I(x, y) =i(x, y)r(x, y) is modeled as the product of illumination i(x, y) and reflectance r(x, y). Elimination of the illumination part leads to a normalisation. This is achieved by applying a Fast Fourier Transform to the logarithm image ln(i). Now the reflectance component can be separated by a high pass filter. After a back transformation and an exponentiation we get the desired normalised patch. 3.3 Feature Extraction To represent each patch we have to choose some local descriptors. Local descriptors have been researched quite well (e.g. [4], [12], [18], [8]). We selected four local descriptors for our patches. Our first descriptor is simply a vector of all pixels in a patch subsampled by two. The dimension of this vector is l 2 4 which is rather high and increases computational complexity. As a second descriptor we use intensity moments MI a pq = ω i(x, y)a x p y q dx dy with a as the degree and
7 Weak Hypotheses and Boosting for Generic Object Detection 77 p + q as the order, up to degree 2 and order 2. Without using the moments of degree 0 we get a feature vector with a dimension of 10. This reduces the computational costs dramatically. With respect to the performance evaluation of local descriptors done by Mikolajczyk and Schmid [17] we took SIFTs (see [12]) as a third and Moment Invariants (see [8]) as a fourth choice. In this evaluation the SIFTs outmatched the others in nearly all tests and the Moment Invariants were in the middle ground for all aspects considered. According to [8] we selected first and second order Moment Invariants. We chose the first order affine Invariant and four first order affine and photometric Invariants. Additionally we took all five second order Invariants described in [8]. Since the Invariants require two contours, the whole square patch is taken as one contour and rectangles corresponding to one half of the patch are used as a second contour. All four possibilities of the second contour are calculated and used to obtain the Invariants. The dimenson of the Moment Invariants description vector is 10. As shown in [12] the description of the patches with SIFTs is done by multiple representations in various orientation planes. These orientation planes are blurred and resampled to allow larger shifts in positions of the gradients. A local descriptor with a dimension of 128 is obtained here for a circular region around the point with a radius of 8 pixels, 8 orientation planes and sampling over a 4x4 and a 2x2 grid of locations. 4 Calculation of Weak Hypotheses Using the features constructed in the previous section, an image is represented by a list of features (τ f,v f ), f = 1,...,F, where τ f denotes the type of a feature, v f denotes its value as real vector, and F is the number of extracted features in an image. The weak hypotheses for AdaBoost are calculated from these features. For object recognition we have chosen weak hypotheses which indicate if certain feature values appear in images. For this a weak hypothesis h has to select a feature type τ, its value v, and a similarity threshold θ. The threshold θ decides if an image contains a feature value v f that is sufficiently similar to v. The similarity between v f and v is calculated by the Mahalanobis distance for Moment Invariants and by the Euclidean distance for SIFTs. The weak hypotheses finder searches for the optimal weak hypothesis given labeled representations of the training images (R (I 1 ),l 1 ),...,(R (I m ),l m ) and their weights w 1,...,w m calculated by AdaBoost among all possible feature values and corresponding thresholds. The main computational burden is the calculation of the distances between v f and v, since they both range over all feature values that appear in the training images. 4 Given these distances which can be calculated prior to Boosting, the remaining calculations are relatively inexpensive. Details for the weak hypotheses finder are given in Figure 2. After sorting the optimal threshold for feature (τ k,f,v k,f ) can now be calculated in time O(m) by scanning through the weights 4 We discuss possible improvements in Section 6.
8 78 A. Opelt et al. Input: Labeled representations (R (I k ),l k ), k =1,...,m, R (I k )={(τ k,f,v k,f ):f =1,...,F k }. Distance functions: Let d τ (, ) be the distance in respect to the feature values of type τ in the training images. Minimal distance matrix: For all features (τ k,f,v k,f ) and all images I j calculate the minimal distance between v k,f and features in I j, d k,f,j = min d τk,f (v k,f,v j,g). 1 g F j :τ j,g =τ k,f Sorting: For each k, f let π k,f (1),...,π k,f (m) be a permutation such that d k,f,πk,f (1) d k,f,πk,f (m). Select best weak hypothesis (Scanline): For all features (τ k,f,v k,f ) calculate over all images I j s max w πk,f (i) l πk,f (i). s i=1 and select the feature (τ i,f,v i,f ) where the maximum is achieved. Select threshold θ: With the position s where the scanline reached a maxium sum the threshold θ is set to θ = d k,f,π k,f (s) d k,f,πk,f (s+1). 2 Fig. 2. Explanation of the weak hypotheses finder. w 1,...,w m in the order of the distances d k,f,j. Searching over all features, the calculation of the optimal weak hypothesis takes O(Fm) time. To give an example of the absolute computation times we used a dataset of 150 positive and 150 negative images. Each image has an average number of approximately 400 patches. Using SIFTs one iteration after preprocessing requires about one minute computation time on a P4, 2.4GHz PC. 5 Experimental Setup and Results We carried out our experiments as follows: the whole approach was first tested on the database used by Fergus et al. [3]. After demonstrating a comparable performance, the approach was tested on a new, more difficult database 5, see figure 5. These images contain the objects at arbitrary scales and poses. The images also contain highly textured background. Testing on these images shows that our approach still performs well. We have used two categories of objects, persons (P) and bikes (B), and images containing none of these objects (N). Our database contains 450 images of category P, 350 of B and 250 of category N. The recognition was based on deciding presence or absence of a relevant object. 5 Available at http : // pinz/data/
9 Weak Hypotheses and Boosting for Generic Object Detection 79 Preparing our data set we randomly chose a number of images, half belonging to the object category we want to learn and half not. From each of these two piles we take one third of the images as a set of images for testing the achieved model. The performance was measured with the receiver-operating characteristic (ROC) corresponding error rate. We tested the images containing the object (e.g. category B) against non-object images from the database (e.g. categories P and N). Our training set contains 100 positive and 100 negative images. The tests are carried out on 100 new images, half belonging to the learned class and half not. Each experiment was done using just one type of local descriptor. Figure 3(a) shows the recall-precision curve (RPC) of our approach (obtained by varying η), the approach of Fergus et al. [3] and the one of Agarwal and Roth [1], trained on the dataset used by Fergus et al. [3] 6. Our approach performs better than the one of Agarwal and Roth but slightly worse than the approach of Fergus et al. Table 1 shows the results of our approach (using the affine invariant interest point detection and Moment Invariants) compared with the ones of Fergus et al. and other methods [23], [24], [1]. While they use a kind of scale and viewing direction normalisation (see [3]), we work on the original images. Our results are almost as good as the results of Fergus et al. for the motorbikes dataset. For the other datasets our error rate is somewhat higher than the one of Fergus et al., but mostly lower than the error rate of the other methods. Table 1. The table gives the ROC equal error rates on a number of datasets from the database used by Fergus et al. [3]. Our results (using the affine invariant interest point detection and Moment Invariants) are compared with the results of the approach of Fergus et al. and other methods [23], [24], [1]. The error rates of our algorithm are between the other approaches and the ones of Fergus et al. in all cases except for the faces where the algorithm of Weber et al. [24] is also slightly better. Dataset Ours Fergus et al. [3] Others Ref. Motorbikes [23] Airplanes [23] Faces [24] Cars(Side) [1] This comparison shows that our approach performs well on the Fergus et al. database. We proceed with experiments on our own dataset and show some effects of parameter tuning 7. Figure 3(b) shows the influence of the additional weighting of right positive examples in the Boosting algorithm (η). We can see that with a factor η smaller than 1.8, the recall increases faster than the precision 6 Available at http : // vgg/data/ 7 Parametes not given in these tests are set to η =1.8, T = 50, l =16px, th = 30000, smallest scale is skipped. Depending on textured/homogenous background, the number of interest points detected in an image varies between 50 and 1000.
10 80 A. Opelt et al. Fig. 3. The curves in (a) and (b) are obtained by varying the factor η. In (a) the diagram shows the recall-precision curve for [3], [1] and our approach on the cars (side) dataset. Our approach is superior to the one of Agarwal and Roth but slightly worse than the one of Fergus et al. The diagram (b) shows the influence of an additional factor η for the weights of correctly positive classified examples. The recall increases faster than the precision drops until a factor of 1.8. drops. Then both curves have nearly the same (but inverse) gradient up to a factor of 3. For η>3 the precision decreases rapidly with no relevant gain of recall. Table 2 presents the performance of the Moment Invarants as local descriptor, compared with our low level descriptors (using the affine invariant interest point detector). Moment Invariants delivered the best results but the other low level descriptors did not perform badly, either. This behaviour might be explained by the fact that the extracted regions are already normalised against the same set of transformations as the Moment Invariants. Table 2. The table shows the results we reached with the three different kinds of local descriptors. We used an additional weight factor η = 1.7 here. Moment Invariants delivered the best results. Local Descriptor recall precision Moment Invariants Intensity Moments Subsampled Grayvalues In table 3 the results of our approach using the scale invariant interest point detector compared with the use of the affine invariant interest point detector are shown. We also vary the additional weight for right positive classified examples η. The affine invariant interest point detector achieves better results for the recall but precision is higher when we use the scale invariant version of the interest
11 Weak Hypotheses and Boosting for Generic Object Detection 81 1 Recall Precision Curve of our approach 1 Recall Precision Curve of our approach recall 0.5 recall Moment Invariants + Aff. Inv. SIFT precision (a) 0.1 Moment Invariants + Aff. Inv. SIFT precision (b) Fig. 4. In (a) the recall-precision curve of our approach with Moment Invariants and the affine invariant interest point detection, and the recall-precision curve of our approach using SIFTs for category bike are shown. (b) shows the recall-precision curves with the same methods for the category person. point detector. This is to be expected since the affine invariant detector allows for more variation in the image, which implies higher recall but less precision. Table 3. The table shows the results of our approach using the scale invariant interest point detector compared with using the affine invariant interest point detector varying the additional weight for right positive classified examples η. η recall (scale inv.) precision (scale inv.) recall (affine inv.) precision (affine inv.) We skipped the smallest scale in our experiments because experiments show that this reduction of number of points does not have relevant influence to the error rates. Again, using the parameters that performed best, figure 4(a) shows an example of a recall-precision curve (RPC) of our approach trained on the bike dataset from our image database with Moment Invariants and the affine invariant interest point detection compared with our approach using SIFTs. Using the same methods we obtain the recall-precision curves (RPC) shown in figure 4(b) for the category person. For directly comparing the results reached using the Moment Invariants with the affine invariant interest point detector or using SIFTs, the ROC equal error rates on various datasets are shown in table 4. As seen here the SIFTs perform better on our database. Tested on a category of the database from Fergus et al. one can see that the Moment Invariants perform better in that case.
12 82 A. Opelt et al. Table 4. This table shows a comparison of the ROC equal error rates reached with the two high level features. On our database the SIFTs perform better, but on the database of Fergus et al. the Moment Invariants reach the better error rate. Dataset Moment Invariants SIFTs Airplanes Bikes Persons Discussion and Outlook In conclusion, we have presented a novel approach for the detection and recognition of object categories in still images. Our system uses several steps of image analysis and feature extraction, which have been previously described, but succeeds on rather complex images with a lot of background structure. Objects are shown in substantially different poses and scales, and in many of the images the objects (bikes or persons) cover only a small portion of the whole image. The main contribution of the paper, however, lies in the new concept of learning. We use Boosting as the underlying learning technique and combine it with a weak hypothesis finder. In addition to several other advantages of this approach, which have already been mentioned, we want to emphasize that this approach allows the formation of very diverse visual features into a final hypothesis. We think that this capability is the main reason for the good experimental results on our complex database. Furthermore, experimental comparison on the database used by Fergus et al. [3] shows that our approach performs similarly well to state-of-the-art object categorization on simpler images. We are currently investigating extensions of our approach in several directions. Maybe the most obvious is the addition of more features to our image analysis. This includes not only other local descriptors like differential invariants [12], but also regional features 8 and geometric features 9. To reduce the complexity of our approach we are considering a reduction of the number of features by clustering methods. As the next step we will use spatial relations between features to improve the accuracy of our object detector. To handle the complexity of many possible relations between features, we will use the features constructed in our current approach (with parameters set for high recall) as starting points. Boosting will again be the underlying method for learning object representations as spatial combinations of features. This will allow the construction of weak hypotheses for discriminative spatial relations. Acknowledgements. This work was supported by the European project LAVA (IST ) and by the Austrian Science Foundation (FWF, project S Regional features describe regions found by appearance based clustering. 9 A geometric feature describes the appearance of geometric shapes, e.g. ellipses, in images.
13 Weak Hypotheses and Boosting for Generic Object Detection 83 Fig. 5. Examples from our image data base. The first column shows three images from the object class bike, the second column contains objects from the class person and the images in the last column belong to none of the classes (called nobikenoperson). The second example in the last column shows a moped as a very difficult counter-example to the category of bikes. N04). We are grateful to David Lowe and Cordelia Schmid for providing the code for their detectors/descriptors. References 1. S. Agarwal and D. Roth. Learning a sparse representation for object detection. In Proc. ECCV, pages , Gy. Dorko and C. Schmid. Selection of scale-invariant parts for object class recognition. In Proc. International Conference on Computer Vision, R. Fergus, P. Perona, and A. Zisserman. Object class recognition by unsupervised scale-invariant learning. In Proc. CVPR, W. Freeman and E. Adelson. The design and use of steerable filters. In PAMI, pages , Y. Freund and R. E. Schapire. A decision-theoretic generalisation of on-line learning. Computer and System Sciences, 55(1), A. Garg, S. Agarwal, and T. S. Huang. Fusion of global and local information for object detection. In Proc. CVPR, volume 2, pages , R. C. Gonzalez and R.E. Woods. Digital Image Processing. Addison-Wesley, 2001.
14 84 A. Opelt et al. 8. L. Van Gool, T. Moons, and D. Ungureanu. Affine / photometric invariants for planar intensity patterns. In Proc. ECCV, pages , C. Harris and M. Stephens. A combined corner and edge detector. In Proc. of the 4th ALVEY vision conference, pages , R. Laganiere. A morphological operator for corner detection. Pattern Recognition, 31(11): , T. Lindeberg. Feature detection with automatic scale selection. International Journal of Computer Vision, D. G. Lowe. Object recognition from local scale-invariant features. In Proc. ICCV, pages , W. Maass and M. Warmuth. Efficient learning with virtual threshold gates. Information and Computation, 141(1):66 83, S. Mahamud, M. Hebert, and J. Shi. Object recognition using boosted discriminants. In Proc. CVPR, volume 1, pages , K. Mikolajczyk and C. Schmid. Indexing based on scale invariant interest points. In Proc. ICCV, pages , K. Mikolajczyk and C. Schmid. An affine invariant interest point detector. In Proc. ECCV, pages , K. Mikolajczyk and C. Schmid. A performance evaluation of local descriptors. In Proc. CVPR, C. Schmid and R. Mohr. Local grayvalue invariants for image retrieval. In PAMI, volume 19, pages , C. Schmid, R. Mohr, and C. Bauckhage. Evaluation of interest point detectors. International Journal of Computer Vision, pages , H. Schneiderman and T. Kanade. Object detection using the statistics of parts. International Journal of Computer Vision, to appear. 21. E. Shilat, M. Werman, and Y. Gdalyahu. Ridge s corner detection and correspondence. In Computer Vision and Pattern Recognition, pages , P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Proc. CVPR, M. Weber. Unsupervised Learning of Models for Object Recognition. PhD thesis, California Institute of Technology, Pasadena, CA, M. Weber, M. Welling, and P. Perona. Unsupervised learning of models for recognition. In Proc. ECCV, R. P. Wuertz and T. Lourens. Corner detection in color images by multiscale combination of end-stopped cortical cells. In International Conference on Artificial Neuronal Networks, pages , 1997.
3D Model based Object Class Detection in An Arbitrary View
3D Model based Object Class Detection in An Arbitrary View Pingkun Yan, Saad M. Khan, Mubarak Shah School of Electrical Engineering and Computer Science University of Central Florida http://www.eecs.ucf.edu/
More informationLocal features and matching. Image classification & object localization
Overview Instance level search Local features and matching Efficient visual recognition Image classification & object localization Category recognition Image classification: assigning a class label to
More informationProbabilistic Latent Semantic Analysis (plsa)
Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References
More informationObject class recognition using unsupervised scale-invariant learning
Object class recognition using unsupervised scale-invariant learning Rob Fergus Pietro Perona Andrew Zisserman Oxford University California Institute of Technology Goal Recognition of object categories
More informationObject Class Recognition by Unsupervised Scale-Invariant Learning
Object Class Recognition by Unsupervised Scale-Invariant Learning R. Fergus 1 P. Perona 2 A. Zisserman 1 1 Dept. of Engineering Science 2 Dept. of Electrical Engineering University of Oxford California
More informationScale-Invariant Object Categorization using a Scale-Adaptive Mean-Shift Search
in DAGM 04 Pattern Recognition Symposium, Tübingen, Germany, Aug 2004. Scale-Invariant Object Categorization using a Scale-Adaptive Mean-Shift Search Bastian Leibe ½ and Bernt Schiele ½¾ ½ Perceptual Computing
More informationA Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow
, pp.233-237 http://dx.doi.org/10.14257/astl.2014.51.53 A Study on SURF Algorithm and Real-Time Tracking Objects Using Optical Flow Giwoo Kim 1, Hye-Youn Lim 1 and Dae-Seong Kang 1, 1 Department of electronices
More informationMore Local Structure Information for Make-Model Recognition
More Local Structure Information for Make-Model Recognition David Anthony Torres Dept. of Computer Science The University of California at San Diego La Jolla, CA 9093 Abstract An object classification
More informationAnalecta Vol. 8, No. 2 ISSN 2064-7964
EXPERIMENTAL APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS IN ENGINEERING PROCESSING SYSTEM S. Dadvandipour Institute of Information Engineering, University of Miskolc, Egyetemváros, 3515, Miskolc, Hungary,
More informationCanny Edge Detection
Canny Edge Detection 09gr820 March 23, 2009 1 Introduction The purpose of edge detection in general is to significantly reduce the amount of data in an image, while preserving the structural properties
More informationA Bayesian Framework for Unsupervised One-Shot Learning of Object Categories
A Bayesian Framework for Unsupervised One-Shot Learning of Object Categories Li Fei-Fei 1 Rob Fergus 1,2 Pietro Perona 1 1. California Institute of Technology 2. Oxford University Single Object Object
More informationImage Segmentation and Registration
Image Segmentation and Registration Dr. Christine Tanner (tanner@vision.ee.ethz.ch) Computer Vision Laboratory, ETH Zürich Dr. Verena Kaynig, Machine Learning Laboratory, ETH Zürich Outline Segmentation
More informationFeature Tracking and Optical Flow
02/09/12 Feature Tracking and Optical Flow Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Many slides adapted from Lana Lazebnik, Silvio Saverse, who in turn adapted slides from Steve
More informationJiří Matas. Hough Transform
Hough Transform Jiří Matas Center for Machine Perception Department of Cybernetics, Faculty of Electrical Engineering Czech Technical University, Prague Many slides thanks to Kristen Grauman and Bastian
More informationPractical Tour of Visual tracking. David Fleet and Allan Jepson January, 2006
Practical Tour of Visual tracking David Fleet and Allan Jepson January, 2006 Designing a Visual Tracker: What is the state? pose and motion (position, velocity, acceleration, ) shape (size, deformation,
More informationDistinctive Image Features from Scale-Invariant Keypoints
Distinctive Image Features from Scale-Invariant Keypoints David G. Lowe Computer Science Department University of British Columbia Vancouver, B.C., Canada lowe@cs.ubc.ca January 5, 2004 Abstract This paper
More informationRandomized Trees for Real-Time Keypoint Recognition
Randomized Trees for Real-Time Keypoint Recognition Vincent Lepetit Pascal Lagger Pascal Fua Computer Vision Laboratory École Polytechnique Fédérale de Lausanne (EPFL) 1015 Lausanne, Switzerland Email:
More informationSelf Organizing Maps: Fundamentals
Self Organizing Maps: Fundamentals Introduction to Neural Networks : Lecture 16 John A. Bullinaria, 2004 1. What is a Self Organizing Map? 2. Topographic Maps 3. Setting up a Self Organizing Map 4. Kohonen
More informationEnsemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification
More informationBuild Panoramas on Android Phones
Build Panoramas on Android Phones Tao Chu, Bowen Meng, Zixuan Wang Stanford University, Stanford CA Abstract The purpose of this work is to implement panorama stitching from a sequence of photos taken
More informationFace Recognition in Low-resolution Images by Using Local Zernike Moments
Proceedings of the International Conference on Machine Vision and Machine Learning Prague, Czech Republic, August14-15, 014 Paper No. 15 Face Recognition in Low-resolution Images by Using Local Zernie
More informationComparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data
CMPE 59H Comparison of Non-linear Dimensionality Reduction Techniques for Classification with Gene Expression Microarray Data Term Project Report Fatma Güney, Kübra Kalkan 1/15/2013 Keywords: Non-linear
More informationModelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
More informationCS 534: Computer Vision 3D Model-based recognition
CS 534: Computer Vision 3D Model-based recognition Ahmed Elgammal Dept of Computer Science CS 534 3D Model-based Vision - 1 High Level Vision Object Recognition: What it means? Two main recognition tasks:!
More informationIncremental PCA: An Alternative Approach for Novelty Detection
Incremental PCA: An Alternative Approach for Detection Hugo Vieira Neto and Ulrich Nehmzow Department of Computer Science University of Essex Wivenhoe Park Colchester CO4 3SQ {hvieir, udfn}@essex.ac.uk
More informationVisual Categorization with Bags of Keypoints
Visual Categorization with Bags of Keypoints Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray Xerox Research Centre Europe 6, chemin de Maupertuis 38240 Meylan, France
More informationAutomatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 269 Class Project Report
Automatic 3D Reconstruction via Object Detection and 3D Transformable Model Matching CS 69 Class Project Report Junhua Mao and Lunbo Xu University of California, Los Angeles mjhustc@ucla.edu and lunbo
More informationClassification of Fingerprints. Sarat C. Dass Department of Statistics & Probability
Classification of Fingerprints Sarat C. Dass Department of Statistics & Probability Fingerprint Classification Fingerprint classification is a coarse level partitioning of a fingerprint database into smaller
More informationRecognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang
Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform
More informationDiscovering objects and their location in images
Discovering objects and their location in images Josef Sivic Bryan C. Russell Alexei A. Efros Andrew Zisserman William T. Freeman Dept. of Engineering Science CS and AI Laboratory School of Computer Science
More informationClassifying Manipulation Primitives from Visual Data
Classifying Manipulation Primitives from Visual Data Sandy Huang and Dylan Hadfield-Menell Abstract One approach to learning from demonstrations in robotics is to make use of a classifier to predict if
More informationThe Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
More informationColour Image Segmentation Technique for Screen Printing
60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone
More informationFinding people in repeated shots of the same scene
Finding people in repeated shots of the same scene Josef Sivic 1 C. Lawrence Zitnick Richard Szeliski 1 University of Oxford Microsoft Research Abstract The goal of this work is to find all occurrences
More informationNovelty Detection in image recognition using IRF Neural Networks properties
Novelty Detection in image recognition using IRF Neural Networks properties Philippe Smagghe, Jean-Luc Buessler, Jean-Philippe Urban Université de Haute-Alsace MIPS 4, rue des Frères Lumière, 68093 Mulhouse,
More informationExample: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
More informationObject Recognition. Selim Aksoy. Bilkent University saksoy@cs.bilkent.edu.tr
Image Classification and Object Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr Image classification Image (scene) classification is a fundamental
More informationTouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service
TouchPaper - An Augmented Reality Application with Cloud-Based Image Recognition Service Feng Tang, Daniel R. Tretter, Qian Lin HP Laboratories HPL-2012-131R1 Keyword(s): image recognition; cloud service;
More informationOpen-Set Face Recognition-based Visitor Interface System
Open-Set Face Recognition-based Visitor Interface System Hazım K. Ekenel, Lorant Szasz-Toth, and Rainer Stiefelhagen Computer Science Department, Universität Karlsruhe (TH) Am Fasanengarten 5, Karlsruhe
More informationMVA ENS Cachan. Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr
Machine Learning for Computer Vision 1 MVA ENS Cachan Lecture 2: Logistic regression & intro to MIL Iasonas Kokkinos Iasonas.kokkinos@ecp.fr Department of Applied Mathematics Ecole Centrale Paris Galen
More informationQUALITY TESTING OF WATER PUMP PULLEY USING IMAGE PROCESSING
QUALITY TESTING OF WATER PUMP PULLEY USING IMAGE PROCESSING MRS. A H. TIRMARE 1, MS.R.N.KULKARNI 2, MR. A R. BHOSALE 3 MR. C.S. MORE 4 MR.A.G.NIMBALKAR 5 1, 2 Assistant professor Bharati Vidyapeeth s college
More informationEnvironmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
More informationTHE development of methods for automatic detection
Learning to Detect Objects in Images via a Sparse, Part-Based Representation Shivani Agarwal, Aatif Awan and Dan Roth, Member, IEEE Computer Society 1 Abstract We study the problem of detecting objects
More informationConvolution. 1D Formula: 2D Formula: Example on the web: http://www.jhu.edu/~signals/convolve/
Basic Filters (7) Convolution/correlation/Linear filtering Gaussian filters Smoothing and noise reduction First derivatives of Gaussian Second derivative of Gaussian: Laplacian Oriented Gaussian filters
More informationMake and Model Recognition of Cars
Make and Model Recognition of Cars Sparta Cheung Department of Electrical and Computer Engineering University of California, San Diego 9500 Gilman Dr., La Jolla, CA 92093 sparta@ucsd.edu Alice Chu Department
More informationDetermining optimal window size for texture feature extraction methods
IX Spanish Symposium on Pattern Recognition and Image Analysis, Castellon, Spain, May 2001, vol.2, 237-242, ISBN: 84-8021-351-5. Determining optimal window size for texture feature extraction methods Domènec
More informationSignature verification using Kolmogorov-Smirnov. statistic
Signature verification using Kolmogorov-Smirnov statistic Harish Srinivasan, Sargur N.Srihari and Matthew J Beal University at Buffalo, the State University of New York, Buffalo USA {srihari,hs32}@cedar.buffalo.edu,mbeal@cse.buffalo.edu
More informationOn-line Boosting and Vision
On-line Boosting and Vision Helmut Grabner and Horst Bischof Institute for Computer Graphics and Vision Graz University of Technology {hgrabner, bischof}@icg.tu-graz.ac.at Abstract Boosting has become
More informationRecognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28
Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object
More informationSolving Geometric Problems with the Rotating Calipers *
Solving Geometric Problems with the Rotating Calipers * Godfried Toussaint School of Computer Science McGill University Montreal, Quebec, Canada ABSTRACT Shamos [1] recently showed that the diameter of
More informationClass #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
More informationFast Matching of Binary Features
Fast Matching of Binary Features Marius Muja and David G. Lowe Laboratory for Computational Intelligence University of British Columbia, Vancouver, Canada {mariusm,lowe}@cs.ubc.ca Abstract There has been
More informationThe goal is multiply object tracking by detection with application on pedestrians.
Coupled Detection and Trajectory Estimation for Multi-Object Tracking By B. Leibe, K. Schindler, L. Van Gool Presented By: Hanukaev Dmitri Lecturer: Prof. Daphna Wienshall The Goal The goal is multiply
More informationImage Classification for Dogs and Cats
Image Classification for Dogs and Cats Bang Liu, Yan Liu Department of Electrical and Computer Engineering {bang3,yan10}@ualberta.ca Kai Zhou Department of Computing Science kzhou3@ualberta.ca Abstract
More informationjorge s. marques image processing
image processing images images: what are they? what is shown in this image? What is this? what is an image images describe the evolution of physical variables (intensity, color, reflectance, condutivity)
More informationMorphological segmentation of histology cell images
Morphological segmentation of histology cell images A.Nedzved, S.Ablameyko, I.Pitas Institute of Engineering Cybernetics of the National Academy of Sciences Surganova, 6, 00 Minsk, Belarus E-mail abl@newman.bas-net.by
More informationDocument Image Retrieval using Signatures as Queries
Document Image Retrieval using Signatures as Queries Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Harish Srinivasan, Chen Huang CEDAR, University at Buffalo(SUNY) Amherst, New York 14228 Gady Agam and
More informationJournal of Industrial Engineering Research. Adaptive sequence of Key Pose Detection for Human Action Recognition
IWNEST PUBLISHER Journal of Industrial Engineering Research (ISSN: 2077-4559) Journal home page: http://www.iwnest.com/aace/ Adaptive sequence of Key Pose Detection for Human Action Recognition 1 T. Sindhu
More informationIMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS
IMPLICIT SHAPE MODELS FOR OBJECT DETECTION IN 3D POINT CLOUDS Alexander Velizhev 1 (presenter) Roman Shapovalov 2 Konrad Schindler 3 1 Hexagon Technology Center, Heerbrugg, Switzerland 2 Graphics & Media
More informationCategorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,
More information3 An Illustrative Example
Objectives An Illustrative Example Objectives - Theory and Examples -2 Problem Statement -2 Perceptron - Two-Input Case -4 Pattern Recognition Example -5 Hamming Network -8 Feedforward Layer -8 Recurrent
More informationSURF: Speeded Up Robust Features
SURF: Speeded Up Robust Features Herbert Bay 1, Tinne Tuytelaars 2, and Luc Van Gool 12 1 ETH Zurich {bay, vangool}@vision.ee.ethz.ch 2 Katholieke Universiteit Leuven {Tinne.Tuytelaars, Luc.Vangool}@esat.kuleuven.be
More informationL25: Ensemble learning
L25: Ensemble learning Introduction Methods for constructing ensembles Combination strategies Stacked generalization Mixtures of experts Bagging Boosting CSCE 666 Pattern Analysis Ricardo Gutierrez-Osuna
More informationPrincipal components analysis
CS229 Lecture notes Andrew Ng Part XI Principal components analysis In our discussion of factor analysis, we gave a way to model data x R n as approximately lying in some k-dimension subspace, where k
More informationCees Snoek. Machine. Humans. Multimedia Archives. Euvision Technologies The Netherlands. University of Amsterdam The Netherlands. Tree.
Visual search: what's next? Cees Snoek University of Amsterdam The Netherlands Euvision Technologies The Netherlands Problem statement US flag Tree Aircraft Humans Dog Smoking Building Basketball Table
More informationVEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS
VEHICLE LOCALISATION AND CLASSIFICATION IN URBAN CCTV STREAMS Norbert Buch 1, Mark Cracknell 2, James Orwell 1 and Sergio A. Velastin 1 1. Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE,
More informationAssessment. Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall
Automatic Photo Quality Assessment Presenter: Yupu Zhang, Guoliang Jin, Tuo Wang Computer Vision 2008 Fall Estimating i the photorealism of images: Distinguishing i i paintings from photographs h Florin
More informationPredict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons
More informationStatistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
More informationBildverarbeitung und Mustererkennung Image Processing and Pattern Recognition
Bildverarbeitung und Mustererkennung Image Processing and Pattern Recognition 1. Image Pre-Processing - Pixel Brightness Transformation - Geometric Transformation - Image Denoising 1 1. Image Pre-Processing
More informationCrater detection with segmentation-based image processing algorithm
Template reference : 100181708K-EN Crater detection with segmentation-based image processing algorithm M. Spigai, S. Clerc (Thales Alenia Space-France) V. Simard-Bilodeau (U. Sherbrooke and NGC Aerospace,
More informationA Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms
A Comparative Study between SIFT- Particle and SURF-Particle Video Tracking Algorithms H. Kandil and A. Atwan Information Technology Department, Faculty of Computer and Information Sciences, Mansoura University,El-Gomhoria
More informationA Learning Based Method for Super-Resolution of Low Resolution Images
A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method
More informationMedical Image Segmentation of PACS System Image Post-processing *
Medical Image Segmentation of PACS System Image Post-processing * Lv Jie, Xiong Chun-rong, and Xie Miao Department of Professional Technical Institute, Yulin Normal University, Yulin Guangxi 537000, China
More information3D OBJECT MODELING AND RECOGNITION IN PHOTOGRAPHS AND VIDEO
3D OBJECT MODELING AND RECOGNITION IN PHOTOGRAPHS AND VIDEO Fredrick H. Rothganger, Ph.D. Computer Science University of Illinois at Urbana-Champaign, 2004 Jean Ponce, Adviser This thesis introduces a
More informationThe use of computer vision technologies to augment human monitoring of secure computing facilities
The use of computer vision technologies to augment human monitoring of secure computing facilities Marius Potgieter School of Information and Communication Technology Nelson Mandela Metropolitan University
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
More informationLecture 2: The SVM classifier
Lecture 2: The SVM classifier C19 Machine Learning Hilary 2015 A. Zisserman Review of linear classifiers Linear separability Perceptron Support Vector Machine (SVM) classifier Wide margin Cost function
More informationTracking in flussi video 3D. Ing. Samuele Salti
Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking
More informationMachine Learning Final Project Spam Email Filtering
Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE
More informationExtend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du fdu@cs.ubc.ca University of British Columbia
More informationAdmin stuff. 4 Image Pyramids. Spatial Domain. Projects. Fourier domain 2/26/2008. Fourier as a change of basis
Admin stuff 4 Image Pyramids Change of office hours on Wed 4 th April Mon 3 st March 9.3.3pm (right after class) Change of time/date t of last class Currently Mon 5 th May What about Thursday 8 th May?
More informationTensor Methods for Machine Learning, Computer Vision, and Computer Graphics
Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics Part I: Factorizations and Statistical Modeling/Inference Amnon Shashua School of Computer Science & Eng. The Hebrew University
More informationsiftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service
siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service Ahmad Pahlavan Tafti 1, Hamid Hassannia 2, and Zeyun Yu 1 1 Department of Computer Science, University of Wisconsin -Milwaukee,
More informationSegmentation & Clustering
EECS 442 Computer vision Segmentation & Clustering Segmentation in human vision K-mean clustering Mean-shift Graph-cut Reading: Chapters 14 [FP] Some slides of this lectures are courtesy of prof F. Li,
More informationVisual Structure Analysis of Flow Charts in Patent Images
Visual Structure Analysis of Flow Charts in Patent Images Roland Mörzinger, René Schuster, András Horti, and Georg Thallinger JOANNEUM RESEARCH Forschungsgesellschaft mbh DIGITAL - Institute for Information
More informationApplication of Face Recognition to Person Matching in Trains
Application of Face Recognition to Person Matching in Trains May 2008 Objective Matching of person Context : in trains Using face recognition and face detection algorithms With a video-surveillance camera
More informationLinear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
More informationHigh Quality Image Magnification using Cross-Scale Self-Similarity
High Quality Image Magnification using Cross-Scale Self-Similarity André Gooßen 1, Arne Ehlers 1, Thomas Pralow 2, Rolf-Rainer Grigat 1 1 Vision Systems, Hamburg University of Technology, D-21079 Hamburg
More informationVision based Vehicle Tracking using a high angle camera
Vision based Vehicle Tracking using a high angle camera Raúl Ignacio Ramos García Dule Shu gramos@clemson.edu dshu@clemson.edu Abstract A vehicle tracking and grouping algorithm is presented in this work
More informationFootwear Print Retrieval System for Real Crime Scene Marks
Footwear Print Retrieval System for Real Crime Scene Marks Yi Tang, Sargur N. Srihari, Harish Kasiviswanathan and Jason J. Corso Center of Excellence for Document Analysis and Recognition (CEDAR) University
More informationTracking-Learning-Detection
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 6, NO., JANUARY 200 Tracking-Learning-Detection Zdenek Kalal, Krystian Mikolajczyk, and Jiri Matas, Abstract This paper investigates
More informationVector and Matrix Norms
Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty
More informationLeast-Squares Intersection of Lines
Least-Squares Intersection of Lines Johannes Traa - UIUC 2013 This write-up derives the least-squares solution for the intersection of lines. In the general case, a set of lines will not intersect at a
More informationEuler Vector: A Combinatorial Signature for Gray-Tone Images
Euler Vector: A Combinatorial Signature for Gray-Tone Images Arijit Bishnu, Bhargab B. Bhattacharya y, Malay K. Kundu, C. A. Murthy fbishnu t, bhargab, malay, murthyg@isical.ac.in Indian Statistical Institute,
More informationSurgical Tools Recognition and Pupil Segmentation for Cataract Surgical Process Modeling
Surgical Tools Recognition and Pupil Segmentation for Cataract Surgical Process Modeling David Bouget, Florent Lalys, Pierre Jannin To cite this version: David Bouget, Florent Lalys, Pierre Jannin. Surgical
More informationFacebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
More informationMachine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
More informationData, Measurements, Features
Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are
More information