WIDE-BASELINE MATTE PROPAGATION FOR INDOOR SCENES


M. Sarim¹, A. Hilton², J.-Y. Guillemaut³
University of Surrey, Guildford, UK.
¹ m.farooqui@surrey.ac.uk, ² a.hilton@surrey.ac.uk, ³ j.guillemaut@surrey.ac.uk

Abstract

Digital image matting is the process of extracting foreground objects from an image. This is extremely challenging for natural images and videos because of the ill-posed nature of the problem. Initial user interaction is required to aid the algorithms in identifying the definite foreground and background regions. Recently, techniques have been developed to estimate the alpha matte of an image using multi-view images of a foreground object. However, these algorithms can only handle narrow-baseline views with small intensity and structural variations in the foreground. In this paper, we propose a novel non-parametric approach to generate alpha mattes for wide-baseline multi-view images with different inter-view foreground appearance.

Keywords: Digital matting, alpha matte, multiple view, trimap, wide-baseline.

1 Introduction

Digital image matting is a classical problem of computer vision in which a foreground object is extracted from an image along with its pixel-wise opacity, to form a composite with a desired background. The problem has been extensively studied because of the increasing use of special effects in the media industry. An image can be thought of as a composite of three layers, namely foreground, background and opacity, the latter generally referred to as an alpha matte. A composite image was first mathematically formulated in terms of these layers by Porter and Duff [17] as

I = αF + (1 − α)B.   (1)

Equation (1) is known as the compositing equation, where I, F and B are the composite, foreground and background layers, while α represents the alpha matte. The alpha matte is an image layer giving each pixel's foreground opacity in the range [0, 1].
The value α = 0 or α = 1 marks a definite background or foreground pixel respectively, while 0 < α < 1 represents a mixed pixel whose blending proportion is defined by α. Equation (1) cannot be solved directly because it is under-constrained: in RGB colour space we have to solve for seven unknowns (α and the three colour channels of each of F and B) given only the three equations corresponding to the RGB channels. The equation is constrained in a studio environment by using a homogeneous known background colour, typically blue or green [20]. The assumption that the foreground colour distribution differs from the background colour then provides a straightforward solution of the compositing equation for alpha. However, in natural scenes these constraints are not available and the solution of equation (1) becomes extremely challenging. In natural images the constraints on the foreground and background regions are provided by the user in the form of a trimap. A trimap is typically a hand-drawn segmentation of an image into three regions, namely definite foreground, definite background and the unknown region, represented by white, black and gray on the trimap lattice respectively. A typical trimap of a natural image is shown in Fig 1 along with the estimated alpha matte and the new composite. Matting algorithms then utilise the statistics of the definite known regions to estimate alpha values in the unknown region, where pixels are usually a blend of foreground and background colour. Recently, techniques [10, 12, 13, 16, 25, 26] have been developed that exploit multiple-view statistics to estimate alpha. The main limitations of these approaches are their inability to handle wide-baseline views and their requirement for epipolar constraints. These algorithms work on the fundamental assumption that the foreground appearance, in terms of intensity and shape, is invariant across the views.
This assumption only holds for narrow-baseline views with similar projections, such as those from a camera array. The problem becomes more difficult for wide-baseline views captured by a surrounding camera setup because: (1) there is a significant change in foreground projection even between adjacent views, due to occlusion and projective distortion; (2) variations in luminance due to incident light and shadows cause the appearance to change with viewpoint; and (3) the background changes with viewpoint. In this paper we present a novel non-parametric approach to estimate an alpha matte for wide-baseline views. Previously, inpainting techniques [8, 7] and view interpolation [9] have successfully used similar non-parametric approaches to represent local image statistics in a single view. Our algorithm uses mean shift clustering [5, 6] to propagate a key-view trimap across multiple views without using epipolar constraints. Once a trimap is transferred to a neighbouring view of the key image, template-based non-parametric matting algorithms are applied to extract the alpha matte.

Figure 1: (a) Original image, (b) trimap, (c) estimated alpha matte and (d) new composite. Images are taken from the data-set provided by [24].

Since our technique only relies on the user-aided information available in the key view, it can handle images captured by ordinary uncalibrated cameras without fixation constraints. A fixation constraint is the assumption that the foreground object is centrally located across multiple views. The approach significantly reduces the user interaction required to extract alpha mattes for multiple wide-baseline views, which can later be used for 3D modelling and reconstruction or object insertion.

2 Related work

2.1 Single view matting

Natural image matting is a well-studied field of computer vision. Unlike studio images, natural images place no constraints on foreground and background colour. Therefore, to initialise an algorithm, user interaction is required to aid the definition of foreground and background regions in an image. Once the definite foreground and background layers are identified by the user, algorithms exploit the global or local statistics of these regions to estimate the alpha values of the undefined region. Approaches like [4, 11, 18] fit statistical models to the local foreground and background pixels; alpha values for the unknown pixels are computed using these local models. An isotropic mixture-of-Gaussians approach was proposed by Ruzon and Tomasi [18] to model the local known regions; the alpha value for an unknown pixel is computed using these mixture-of-Gaussians distributions. Hillman et al. [11] extended the idea of [18] by using anisotropic distributions, as the intensity variation in an image forms prolate rather than spherical clusters in colour space. They utilised principal component analysis to identify the major axes of these anisotropic clusters, which are then used to estimate the alpha value of a local unknown pixel. Chuang et al.
[4] formulated the matting problem in the well-known Bayesian framework. They used a modelling approach similar to [11] for the local known pixels, but unlike [11] they also considered the already estimated foreground and background colours of unknown pixels within a predefined spatial window. The technique developed by Berman et al. [2], now available as a Corel plug-in named Knockout, assumed the nearby regions to be locally smooth; the alpha value of an unknown pixel is computed as a weighted average of the local foreground and background colour values. The strong assumptions made by these techniques regarding the smoothness and correlation of the nearby known pixels introduce a requirement for a precise trimap. Since these techniques are heavily biased toward the colour distribution of the local known regions, they tend to suffer when the local foreground and background clusters overlap. To avoid the errors caused by this local dependency, approaches like [1, 23] fit Gaussian mixture models to the known foreground and background regions globally. Misclassification of colour samples is the fundamental limitation of sample-based techniques. Algorithms like [15, 22] use local affinities to alleviate this problem. Poisson matting [22] assumes that the intensity variations in the foreground and background regions are locally smooth; the alpha values are computed by solving the Poisson equation with the matte gradient field. Levin et al. [15] utilised the local smoothness assumption to fit a linear model to the foreground and background colours, resulting in a closed-form solution for alpha. Robust matting [24] uses local colour sampling as well as an affinity approach similar to [15]. The algorithm uses optimised colour sampling to extract high-confidence colour samples, which are then combined with the affinity term to obtain a matting energy function; alpha values are estimated by minimising this energy function.
Although affinity-based approaches overcome the limitation of sample misclassification, they are prone to the accumulation of small errors in the final alpha matte because of the propagating manner in which they estimate alpha values. Recently, a non-parametric template-based matting technique was proposed in [19], which uses a known or globally inpainted background plate. The foreground colour for an unknown pixel is estimated as the median colour of the centre pixels of the few most similar local foreground templates. Since template matching preserves spatial information along with colour, the algorithm is robust on highly textured natural images. The template-based approach tends to produce errors in regions where the inpainted background is not similar to the true background.

Figure 2: Flow chart of the wide-baseline multi-view alpha matting.

2.2 Multiple view matting

All the techniques mentioned above use single-view information to extract an alpha matte. If multiple views are available, an algorithm has more information at its disposal to better estimate an alpha matte for each view. Nearly all multi-view matting techniques assume the foreground is invariant across the views while the background differs. Using this assumption, they formulate the matting problem in a triangular fashion [20], with pixels having a single foreground colour and multiple background colours. Approaches [12, 13] have used pixel variance across the views to extract a variance image; this reference image is then thresholded to generate the trimap, and the nearby variance information of the known regions is used to estimate the final alpha matte. Both of these techniques have a very narrow baseline of around 5 cm. Joshi et al. [13] estimate an alpha matte for a single reference view, while Hyun et al. [12] extended their approach by sharing the trimap across multiple views; alpha mattes are generated by merging the foreground edges in the normal and histogram-equalised views. Wexler et al. [25] estimated the alpha matte using the relative motion of foreground and background. They assume that a rigid foreground is sweeping over a background in multiple images. They constructed a clean background plate from the planar background motion and then formulated the problem in a Bayesian framework to estimate the alpha matte. Their technique suffers for blurred foreground regions and requires planar motion to construct the clean background plate. Won et al. [26] build a high-dimensional feature space from multi-resolution rectified stereo pairs in a Gaussian pyramid. They use locally linear embedding to construct a trimap, after which Bayesian matting [4] is employed to extract the final alpha matte. Hasinoff et al.
[10] formulate the matting problem as estimating the 3D boundary curve and foreground colour that best fit the multi-view images. They used depth information across multiple views to estimate the background and foreground boundary colours. Their results are prone to stereo inaccuracies and do not exploit the colour statistics of an image. In defocus video matting, McGuire et al. [16] used a specialised setup comprising three imaging sensors sharing the same centre of projection. The multiple views captured from this setup, which are focused separately on the foreground and background regions, aid automatic trimap generation. They assume that the foreground and background depths and the camera parameters are known, and formulate the matting problem as the minimisation of a quadratic error function of alpha and foreground colour. Their method cannot handle fast-moving blurred regions. A graph-cut optimisation approach is used by Campbell et al. [3] to perform a binary segmentation of a given view into foreground and background. They used fixation constraints to obtain seed pixels from wide-baseline views, from which the initial Gaussian mixture model of the foreground colour is constructed. The process is iterated to improve the colour model until convergence. Image edges are then combined with the obtained colour model and a graph-cut algorithm is applied to achieve the final segmentation. Since they rely on a Gaussian mixture colour model, their technique suffers when the foreground and background colour distributions overlap. All multi-view matting algorithms to date assume a narrow baseline (small separation in view orientation) between views, such that the foreground has similar appearance. In this paper we address the problem of wide-baseline matting for camera views with large (> 30°) separation, such that there are large changes in foreground appearance.
Our approach does not require any camera calibration or specialised setup and does not make hard assumptions on the foreground colour and position across multiple views. The algorithm can handle wide-baseline views having significantly different foreground appearance.

3 Wide-baseline multi-view alpha matte estimation

Our algorithm is composed of two main steps: (1) inter-view trimap propagation, and (2) alpha matte estimation using a non-parametric matting algorithm. An overview of our algorithm is shown in Fig 2. We have used three high definition cameras in a roughly 90° arc with 45° between views to capture the foreground. To represent the multiple views we use the notation [I_l, I_c, I_r] for the left, centre and right view respectively. The main assumption of our approach is a static and known background for all the views, represented correspondingly as [B_l, B_c, B_r].

3.1 Inter-view trimap propagation

Figure 3: (a) Centre view I_c, (b) centre view background B_c, (c) bimap and (d) user defined trimap T_c.

To propagate a trimap through multiple views, the user initially has to define a trimap T_c for the centre view I_c, which defines the definite foreground and background pixels. The available backgrounds are free of foreground shadow, which causes errors in the proper labelling of regions in the views [I_l, I_r] that are contaminated by the foreground shadow. To overcome this problem we model the shadow region present in the centre view as background by difference keying the view I_c against the pure background. Since we are dealing with wide-baseline views with an orientation difference of 45°, making the backgrounds across the views largely different, difference keying also helps to narrow down the search region for trimap propagation. Since the pure background of every view is known, the trimap T_c can be initialised by performing a binary segmentation of I_c, removing the background B_c from it. The centre view I_c and its background B_c are shown in Fig 3(a,b). This segmentation splits I_c into two regions: (1) the foreground, blended and shadow-contaminated background pixels, and (2) the definite background pixels; let us call the result a bimap. The subtraction is performed patch-wise rather than pixel-wise, to avoid background noise and the mis-labelling of fine blended foreground structures as background. Image I_c can be represented as a function of Euclidean coordinates as I_c(x, y), and likewise its background as B_c(x, y). If a square patch of size m is used, the subtraction is performed according to the equation

S(x, y) = Σ_{s=−g}^{g} Σ_{t=−h}^{h} (I(x + s, y + t) − B(x + s, y + t))²,   (2)

where S(x, y) is the subtraction map, a sum-of-squared-differences function, and g = h = (m − 1)/2. The definite background pixels, represented by blue in Fig 3(c), are labelled using a background distance threshold τ_b as

T_c(x, y) = background if S(x, y) ≤ τ_b, not background otherwise.   (3)

The not-background region B̄, represented by green in Fig 3(c), consists of the definite foreground, blended and shadow-contaminated background pixels.
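The patch-wise subtraction of equations (2) and (3) can be sketched as follows. This is a minimal NumPy illustration rather than the authors' implementation, and the patch size and threshold values are placeholders:

```python
import numpy as np

def subtraction_map(I, B, m=5):
    """Patch-wise sum of squared differences between image I and its
    background plate B (equation (2)); m is an odd patch size and
    g = (m - 1) / 2 is the patch half-width."""
    g = (m - 1) // 2
    H, W = I.shape[:2]
    # Per-pixel squared RGB difference, then summed over each m x m patch.
    diff2 = ((I.astype(np.float64) - B.astype(np.float64)) ** 2).sum(axis=2)
    padded = np.pad(diff2, g, mode="edge")
    S = np.zeros((H, W))
    for s in range(-g, g + 1):
        for t in range(-g, g + 1):
            S += padded[g + s:g + s + H, g + t:g + t + W]
    return S

def definite_background(S, tau_b):
    """Equation (3): a pixel is definite background when S(x, y) <= tau_b;
    everything else belongs to the not-background region of the bimap."""
    return S <= tau_b
```

Thresholding S with τ_b yields the bimap; a suitable τ_b depends on sensor noise and scene contrast.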
The bimap is converted into a trimap T_c by the user manually defining the shadow region as background. To avoid the user interaction of defining the shadow region explicitly, a foreground extraction technique [14] could be used. Fig 3(d) shows the refined trimap T_c, where the foreground, blended and shadow regions are represented in the traditional trimap form of white, gray and black respectively.

3.1.1 Template clustering using mean shift

An inter-view trimap propagation, T_l ← T_c → T_r, could be achieved by brute-force template matching between I_c and [I_l, I_r], where T_l and T_r are the propagated trimaps for the left and right views respectively. Two main limitations make the brute-force implementation prohibitively expensive: (1) the wide-baseline camera setup projects a different aspect of the foreground into the left and right views, and (2) to reliably estimate wide-baseline correspondences an initial surface reconstruction, such as the visual hull, is required [21], which in turn requires the foreground segmentation to be known a priori. Since this work is focused on obtaining that segmentation, a coarse reconstruction cannot be performed to constrain the search and the orientation of surface patches. To alleviate these problems, the mean shift algorithm [5, 6] is employed to reduce the template search space in the centre view I_c. We define two template spaces, namely the foreground template space T^f and the background template space T^b, constructed by placing a square patch of size n at every foreground and background pixel of I_c respectively, in accordance with T_c. Each template effectively has 3n² dimensions in RGB space. Both template spaces are clustered individually using the mean shift algorithm with a spherical cluster-window radius of r_c in RGB colour space and a mean shift vector threshold ε of 0.1.
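As a rough illustration of this clustering step, the sketch below implements a minimal flat-kernel mean shift over template vectors. It is a simplification of [5, 6] (no kernel weighting, naive neighbour search), and the radius value in the usage is hypothetical:

```python
import numpy as np

def mean_shift(points, r_c, eps=0.1, max_iter=100):
    """Minimal flat-kernel mean shift. Each point is shifted to the mean of
    its neighbours within radius r_c until the shift falls below eps; the
    converged modes are then merged into cluster centres (playing the role
    of the mean templates of the foreground/background template spaces)."""
    points = np.asarray(points, dtype=np.float64)
    modes = points.copy()
    for i in range(len(modes)):
        m = modes[i]
        for _ in range(max_iter):
            d = np.linalg.norm(points - m, axis=1)
            new_m = points[d <= r_c].mean(axis=0)
            shift = np.linalg.norm(new_m - m)
            m = new_m
            if shift < eps:
                break
        modes[i] = m
    # Merge modes closer than r_c into a single cluster centre.
    centres = []
    for m in modes:
        if not any(np.linalg.norm(m - c) < r_c for c in centres):
            centres.append(m)
    return np.array(centres)
```

Each 3n²-dimensional template would be one row of `points`; clustering the foreground and background template spaces separately yields the two sets of mean templates.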
This template grouping reduces the template search space to the order of 10² clusters. Mean shift is performed as

[C^f_k, c^{f,m}_k] = meanshift(T^f, r_c, ε)
[C^b_l, c^{b,m}_l] = meanshift(T^b, r_c, ε),   (4)

where the foreground and background clusters are represented by C^f_k and C^b_l respectively, while c^{f,m}_k and c^{b,m}_l denote their mean templates; k = 1, …, n_f and l = 1, …, n_b, where n_f and n_b are the numbers of foreground and background clusters formed.

3.1.2 Trimap label propagation

Although the mean shift clustering reduces the search space considerably, a further reduction in computational cost can be achieved by labelling the definite background in the left and right views, [I_l, I_r], by subtracting their respective backgrounds [B_l, B_r]. We use an approach similar to equation (3) to classify each of these views into definite background and not-background regions, again producing a bimap.

Figure 4: (a) Right image I_r, (b) bimap of the view I_r after subtracting the background B_r, (c) unrefined trimap T_r after label propagation to the green region of the bimap (white: foreground, yellow: background, red: unknown pixels) and (d) the refined trimap in the traditional (white, gray, black) representation.

Fig 4(b) shows the bimap for the right view I_r: blue represents the definite background while green corresponds to the not-background region. The problem is now reduced to populating the trimap labels of the not-background region B̄ in the trimaps [T_l, T_r] corresponding to the views [I_l, I_r]. Consider one view at a time, say I_r. The trimap propagation T_c → T_r is achieved by comparing every template in the not-background region B̄ of T_r to the foreground and background mean template cluster spaces c^{f,m}_k and c^{b,m}_l respectively. For a B̄ pixel p, a template P is extracted by localising a square patch of dimension n; the template must be dimensionally consistent with c^{f,m}_k and c^{b,m}_l for comparison. The patch P is then compared to the foreground and background mean template cluster spaces individually. The minimum normalised sum of squared differences (NSSD) over the two search spaces is given by

d_f(p) = min_{k=1,…,n_f} (1/n²) Δ(P, c^{f,m}_k)
d_b(p) = min_{l=1,…,n_b} (1/n²) Δ(P, c^{b,m}_l),   (5)

where d_f(p) and d_b(p) denote the minimum NSSD of pixel p to the mean foreground and background cluster spaces. The function Δ(A, B) gives the sum of squared differences in RGB space between the templates A and B, while n² is the number of pixels in the patch, used to normalise the SSD. The process is iterated for all the pixels in the not-background region B̄ of T_r. We can visualise these minimum normalised sums of squared differences as difference images; let us denote the foreground and background difference images by D_f and D_b respectively.
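The per-pixel computation of D_f and D_b, and the ratio test that follows in equation (6), can be sketched as below. The threshold defaults mirror the values reported in the results section but are otherwise placeholders:

```python
import numpy as np

def nssd(P, template, n2):
    """Normalised sum of squared differences between two RGB templates."""
    return ((P - template) ** 2).sum() / n2

def propagate_label(P, fg_means, bg_means, n2, eps_f=0.9, eps_b=0.3):
    """Equations (5) and (6): take the minimum NSSD of patch P to the
    foreground and background mean-template clusters, then threshold
    the ratio D_b / D_f to assign a trimap label."""
    d_f = min(nssd(P, c, n2) for c in fg_means)
    d_b = min(nssd(P, c, n2) for c in bg_means)
    ratio = d_b / max(d_f, 1e-12)  # guard against a perfect foreground match
    if ratio >= eps_f:
        return "foreground"
    if ratio <= eps_b:
        return "background"
    return "unknown"
```

A large ratio means the patch is far from every background cluster but close to a foreground one, and vice versa; intermediate ratios are left for the matting stage to resolve.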
The trimap label for every B̄ pixel is assigned by thresholding the ratio of the difference images, separately for the foreground and background. For a B̄ pixel p the label is assigned as

T_r(p) = foreground if D_b/D_f ≥ ε_f, background if D_b/D_f ≤ ε_b, unknown otherwise,   (6)

where ε_f and ε_b are the foreground and background thresholds used for classification. The algorithm is iterated until all the B̄ pixels are assigned a trimap label; an example is shown in Fig 4(c) for the right image I_r of Fig 4(a). For the sake of visibility, the foreground, background and unknown pixels in the B̄ region in Fig 4(c) are represented by white, yellow and red respectively. The trimap obtained suffers from some misclassification which, if not rectified, can lead to an erroneous alpha matte. Therefore a trimap refinement step is necessary prior to the final alpha matte estimation.

3.1.3 Trimap refinement

The misclassification is mainly caused by: (1) image noise, (2) the presence of specular surfaces and (3) overlap of the foreground and background distributions in colour space. We use morphological operations to remove the small errors in the trimap T_r caused by image noise. To rectify the large erroneous regions, caused by specular reflection and the intersection of the foreground and background colour distributions, we assume that the foreground is opaque. Initially, the regions having an area less than a predefined area threshold ε_a are identified. Let us take one of these regions as R, regardless of its type (foreground, background or unknown). The region R is then dilated to obtain the surrounding pixels R_s. If all the pixels in R_s belong to the foreground or to the background, the region R is assigned the corresponding label; otherwise it is labelled as an unknown region. Mathematically this correction can be written as

R = foreground if R_s ⊆ foreground, background if R_s ⊆ background, unknown otherwise.   (7)
A refined trimap of an image, in the traditional white, gray and black representation, is shown in Fig 4(d). Once the trimap is refined we can extract the final alpha matte by estimating the foreground colour for all the unknown pixels and using the background colour from the available background plate.

3.2 Alpha matte estimation

Given a trimap for the wide-baseline views, we estimate the alpha matte using the non-parametric approach introduced in [19]. We have utilised a non-parametric approach because of

its strong mechanism for representing local image features, colours and textures, which attempts to preserve the spatial information of an image.

3.2.1 Foreground colour estimation

A square patch of size n is localised at every unknown and foreground pixel, separately, to construct template spaces for the unknown and foreground regions; let us denote these by U and F respectively. To estimate the foreground colour f(p) for a pixel p in the unknown region, we consider the patch u_p associated with it and find the most similar patch f_q in the foreground template space F. The colour of the foreground pixel q is assigned as the foreground colour f(p) of the unknown pixel p. The templates in F associated with pixels at the foreground boundary also contain unknown pixels; to avoid their effect, the comparison is performed only over the pure foreground pixels present in the patches of F. Like most previous matting techniques, we assume that the foreground colour in the unknown region comes from the nearby known foreground region. We initially define a minimum size r_i for the circular search region of every pixel in the unknown region. To obtain the final size r_s of the search region for a pixel p, the spatial distance r_p between p and the nearest foreground pixel is computed and added to the minimum size r_i; the radius of the circular foreground search region for pixel p is therefore given by r_s = r_i + r_p, as shown by the green region in Fig 5. The main reason to introduce the distance r_p is to avoid wrong estimates of the foreground colour for pixels lying near the far edge of the unknown region. Let us denote by F(r_s) all the patches in the template space F that are spatially contained in the search region r_s.
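The matching, robust median selection and alpha computation of this section can be sketched as follows. This is an illustrative simplification: patches are flat vectors with no pure-foreground masking, and the least-squares form of the alpha estimate is one way to generalise the per-channel equation (10) to RGB vectors:

```python
import numpy as np

def robust_foreground_colour(u_p, fg_patches, fg_centres, N=3):
    """Sort foreground patches by their SSD to the unknown patch u_p and
    return the channel-wise median of the centre-pixel colours of the N
    best matches (the median over the set tau)."""
    costs = ((np.asarray(fg_patches) - u_p) ** 2).sum(axis=1)
    best = np.argsort(costs)[:N]
    return np.median(np.asarray(fg_centres)[best], axis=0)

def estimate_alpha(c, f, b):
    """Rearranged compositing equation: alpha = (c - b) / (f - b), solved
    in a least-squares sense over the RGB channels and clipped to [0, 1]."""
    fb = np.asarray(f, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    denom = float(fb @ fb)
    if denom < 1e-12:
        return 0.0  # foreground and background colours coincide
    cb = np.asarray(c, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.clip((cb @ fb) / denom, 0.0, 1.0))
```

Taking the median over the N best matches, rather than the single best, suppresses the influence of noisy or mismatched patches.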
The most similar patch f_q to the unknown patch u_p can be found as

f_q = argmin_{f_i ∈ F(r_s)} (1/n_f) Δ(u_p, f_i),   (8)

where Δ(u_p, f_i) is defined as in equation (5) and n_f is the number of foreground pixels present in the patch f_i, used for normalisation to ensure the costs are comparable. The presence of noise in the foreground region leads to segmentation artifacts, so a more robust approach is required to estimate the foreground colour.

3.2.2 Robust foreground colour estimation

A normalised sum-of-squared-differences vector D in RGB space for a patch u_p is constructed as

D_i = (1/n_f) Δ(u_p, f_i),  f_i ∈ F(r_s).   (9)

To robustly estimate the foreground colour for pixel p, the difference vector D is sorted such that D_j ≤ D_{j+1}. We then consider the centre-pixel colours of the N most similar patches in the foreground template space F(r_s), τ = {f^c_1, f^c_2, …, f^c_N}. The foreground colour f(p) for pixel p is estimated as the median of τ, that is f(p) = µ_{1/2}(τ). In this paper we use the three most similar patches, that is N = 3. The algorithm is iterated for all the pixels in the unknown region.

Figure 5: The green portion shows the search area in the foreground region for the unknown pixel p.

3.2.3 Alpha estimation

The alpha value for a pixel p is estimated by rearranging the compositing equation (1) as

α(p) = (c(p) − b(p)) / (f(p) − b(p)),   (10)

where c(p) and b(p) are the composite and background colours of pixel p, taken from the image I_r and its background image B_r. Equation (10) is evaluated for all the unknown pixels in the trimap T_r to yield the final alpha matte α_r for the right image I_r. The same trimap label propagation and alpha matte estimation processes are performed to generate the alpha matte α_l for the left image I_l.

4 Results and evaluation

We used three different scenes for qualitative and quantitative evaluation. All the images were captured by high definition cameras in a studio.
The cameras are placed in a circular arc in front of the foreground object, with pairs of views roughly 45° apart in angular separation. For this paper, the parameter values of our algorithm are set as m = 5, ε_f = 0.9 and ε_b = 0.3. The algorithm is implemented in Matlab and has a runtime of 12 to 13 minutes for a pair of images, the major proportion of which is spent on high-dimensional template clustering and foreground colour estimation. For evaluating the estimated mattes quantitatively,

Figure 6: Views from two different dance scenes along with their propagated trimaps, estimated alpha mattes and the ground truths.

Figure 7: Views from an office scene along with their propagated trimaps, estimated alpha mattes and the ground truths.

the ground truth mattes are generated using the Closed-form matting technique [15]. Initially, precise trimaps for all the views are drawn by the user, and the Closed-form algorithm is then used to estimate the ground truth matte individually for each view.

4.1 Qualitative evaluation

Figs 6 and 7 show the three different scenes along with their propagated trimaps, estimated alpha mattes and the ground truths. For the dance scenes in Fig 6, our algorithm is able to propagate the labels correctly even though, in the centre view of the first scene, the boy's trousers are largely occluded. Our matting technique easily removed the shadow region present near the model's feet, which is initially identified as an unknown region during trimap propagation. The images in Fig 7 are difficult because of the presence of a large shadow region. The trimap propagation algorithm classified the larger part of the shadow region as background, but the portion close to the chair and the girl's feet has a strong shadow, and the black trousers, shoes and chair made it difficult to tag this region as background. The matting algorithm also performed less well in this region, as there is no colour information available to exploit and the local foreground region also contains black. The matting technique produced good alpha mattes for the rest of the foreground region. The alpha mattes estimated by our technique do not have visible artifacts compared to the ground truth in the dance scenes. The mattes are consistent across views and do not suffer from visible segmentation inaccuracies.
4.2 Quantitative evaluation

For quantitative analysis we use two error measures: (1) the mean absolute error (MAE), and (2) the number of pixels whose error is greater than 90% of the maximum absolute error present in the matte, denoted NME. The MAE reflects the overall error present in the matte, while the NME captures large misclassifications of foreground and background pixels. Fig 8 shows the mean absolute error of the alpha mattes against the ground truth, with alpha values scaled to [0, 255]. It is clear from the chart that the errors are mainly produced in the shadow regions, as the centre view in all the scenes has a small error. The office scene has a larger error than the dance images because the local foreground colour distribution is similar to the strong shadow region.
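The two error measures can be sketched as a short function; this assumes mattes scaled to [0, 255], as in the evaluation:

```python
import numpy as np

def matte_errors(alpha_est, alpha_gt):
    """Mean absolute error (MAE) and the number of pixels whose absolute
    error exceeds 90% of the maximum absolute error in the matte (NME)."""
    err = np.abs(np.asarray(alpha_est, dtype=np.float64) -
                 np.asarray(alpha_gt, dtype=np.float64))
    mae = float(err.mean())
    nme = int((err > 0.9 * err.max()).sum()) if err.max() > 0 else 0
    return mae, nme
```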

Figure 8: Mean absolute error in the three views (right R, centre C and left L) of the different scenes.

Figure 9: Number of pixels having error >90% of the maximum absolute error in the three views (right R, centre C and left L) of the different scenes.

The number of pixels having an error greater than 90% of the maximum absolute error is plotted in Fig 9. For the dance scenes, the algorithm showed robustness against foreground and background pixel misclassification in the regions free from strong shadow. Again, the technique suffered in the office image in classifying pixels in the shadow-contaminated background region with a similar local foreground colour distribution. Overall, the majority of pixels are correctly classified with low error.

5 Conclusion

We have presented a novel approach for matte propagation across wide-baseline views without using epipolar or fixation constraints. Previous multi-view matting techniques are limited to narrow-baseline views having very small variation in foreground appearance, and they require epipolar or fixation constraints to deal with the correspondence problem. It is clear from the evaluation and the visual analysis of the alpha mattes that our technique is capable of producing good alpha mattes for wide-baseline views. The technique is robust against shadows, but has a limitation for background pixels that are contaminated by strong shadow and have black local foreground pixels; for such pixels the estimated foreground and background colours are not distinct enough to compute proper alpha values. Future research will concentrate on building a better model for shadows and optimising the technique for the aforementioned problems that occur in rare scenes. The technique will also be extended to deal with the challenges presented by outdoor scenes.

Acknowledgement

This research was executed with the financial support of the EU IST FP7 project i3dpost.

References

[1] X. Bai and G. Sapiro.
Geodesic matting: A framework for fast interactive image and video segmentation and matting. Int. J. Comput. Vision, 82(2).

[2] A. Berman, A. Dadourian, and P. Vlahos. Method of removing from an image the background surrounding a selected object. U.S. Patent 6,134,346.

[3] N. D. F. Campbell, G. Vogiatzis, C. Hernandez, and R. Cipolla. Automatic 3D object segmentation in multiple views using volumetric graph-cuts. Image and Vision Computing, September.

[4] Y. Y. Chuang, B. Curless, D. H. Salesin, and R. Szeliski. A Bayesian approach to digital matting. In Proceedings of IEEE CVPR '01, volume 2, December.

[5] D. Comaniciu and P. Meer. Robust analysis of feature spaces: Color image segmentation.

[6] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(5).

[7] A. Criminisi, P. Pérez, and K. Toyama. Object removal by exemplar-based inpainting. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2.

[8] A. Efros and T. Leung. Texture synthesis by non-parametric sampling. In IEEE International Conference on Computer Vision.

[9] A. Fitzgibbon, Y. Wexler, and A. Zisserman. Image-based rendering using image-based priors. In International Conference on Computer Vision (ICCV).

[10] Samuel W. Hasinoff, Sing Bing Kang, and Richard Szeliski. Boundary matting for view synthesis, 2004.

[11] P. Hillman, J. Hannah, and D. Renshaw. Alpha channel estimation in high resolution images and image sequences. In IEEE CVPR.

[12] M. H. Hyun, S. Y. Kim, and Y. S. Ho. Multi-view image matting and compositing using trimap sharing for natural 3-D scene generation. In 3DTV08.

[13] N. Joshi, W. Matusik, and S. Avidan. Natural video matting using camera arrays. ACM Trans. Graph., 25(3).

[14] H. Kim and A. Hilton. Region-based foreground extraction. In Conference on Visual Media Production (CVMP).

[15] A. Levin, D. Lischinski, and Y. Weiss. A closed form solution to natural image matting. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1:61–68.

[16] M. McGuire, W. Matusik, H. Pfister, J. F. Hughes, and F. Durand. Defocus video matting. ACM Trans. Graph., 24(3).

[17] T. Porter and T. Duff. Compositing digital images. In ACM SIGGRAPH '84: Proceedings of the 11th Annual Conference on Computer Graphics and Interactive Techniques.

[18] M. A. Ruzon and C. Tomasi. Alpha estimation in natural images. In CVPR, pages 18–25, June.

[19] M. Sarim, A. Hilton, and J.-Y. Guillemaut. Non-parametric patch based video matting. British Machine Vision Conference (BMVC).

[20] A. R. Smith and J. F. Blinn. Blue screen matting. In ACM SIGGRAPH '96: Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques.

[21] J. Starck, G. Miller, and A. Hilton. Volumetric stereo with silhouette and feature constraints. British Machine Vision Conference (BMVC), 3.

[22] J. Sun, J. Jia, C.-K. Tang, and H.-Y. Shum. Poisson matting. ACM Transactions on Graphics, 23(3).

[23] J. Wang and M. F. Cohen. An iterative optimization approach for unified image segmentation and matting. In ICCV '05: Proceedings of the Tenth IEEE International Conference on Computer Vision, Washington, DC, USA. IEEE Computer Society.

[24] J. Wang and M. F. Cohen. Optimized color sampling for robust matting. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1–8, 2007.

[25] Y. Wexler, A. W. Fitzgibbon, and A. Zisserman. Bayesian estimation of layers from multiple images. In ECCV '02: Proceedings of the 7th European Conference on Computer Vision, Part III, London, UK. Springer-Verlag.

[26] K. H. Won, S. Y. Park, and S. K. Jung. Natural image matting based on neighbor embedding.

More information

Determining optimal window size for texture feature extraction methods

Determining optimal window size for texture feature extraction methods IX Spanish Symposium on Pattern Recognition and Image Analysis, Castellon, Spain, May 2001, vol.2, 237-242, ISBN: 84-8021-351-5. Determining optimal window size for texture feature extraction methods Domènec

More information

Speed Performance Improvement of Vehicle Blob Tracking System

Speed Performance Improvement of Vehicle Blob Tracking System Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu, nevatia@usc.edu Abstract. A speed

More information

Reconstructing 3D Pose and Motion from a Single Camera View

Reconstructing 3D Pose and Motion from a Single Camera View Reconstructing 3D Pose and Motion from a Single Camera View R Bowden, T A Mitchell and M Sarhadi Brunel University, Uxbridge Middlesex UB8 3PH richard.bowden@brunel.ac.uk Abstract This paper presents a

More information

Automated Process for Generating Digitised Maps through GPS Data Compression

Automated Process for Generating Digitised Maps through GPS Data Compression Automated Process for Generating Digitised Maps through GPS Data Compression Stewart Worrall and Eduardo Nebot University of Sydney, Australia {s.worrall, e.nebot}@acfr.usyd.edu.au Abstract This paper

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical

More information

LIBSVX and Video Segmentation Evaluation

LIBSVX and Video Segmentation Evaluation CVPR 14 Tutorial! 1! LIBSVX and Video Segmentation Evaluation Chenliang Xu and Jason J. Corso!! Computer Science and Engineering! SUNY at Buffalo!! Electrical Engineering and Computer Science! University

More information

Fusing Time-of-Flight Depth and Color for Real-Time Segmentation and Tracking

Fusing Time-of-Flight Depth and Color for Real-Time Segmentation and Tracking Fusing Time-of-Flight Depth and Color for Real-Time Segmentation and Tracking Amit Bleiweiss 1 and Michael Werman 1 School of Computer Science The Hebrew University of Jerusalem Jerusalem 91904, Israel

More information

Object Tracking System Using Motion Detection

Object Tracking System Using Motion Detection Object Tracking System Using Motion Detection Harsha K. Ingle*, Prof. Dr. D.S. Bormane** *Department of Electronics and Telecommunication, Pune University, Pune, India Email: harshaingle@gmail.com **Department

More information

Probabilistic Latent Semantic Analysis (plsa)

Probabilistic Latent Semantic Analysis (plsa) Probabilistic Latent Semantic Analysis (plsa) SS 2008 Bayesian Networks Multimedia Computing, Universität Augsburg Rainer.Lienhart@informatik.uni-augsburg.de www.multimedia-computing.{de,org} References

More information

Cees Snoek. Machine. Humans. Multimedia Archives. Euvision Technologies The Netherlands. University of Amsterdam The Netherlands. Tree.

Cees Snoek. Machine. Humans. Multimedia Archives. Euvision Technologies The Netherlands. University of Amsterdam The Netherlands. Tree. Visual search: what's next? Cees Snoek University of Amsterdam The Netherlands Euvision Technologies The Netherlands Problem statement US flag Tree Aircraft Humans Dog Smoking Building Basketball Table

More information

Tracking in flussi video 3D. Ing. Samuele Salti

Tracking in flussi video 3D. Ing. Samuele Salti Seminari XXIII ciclo Tracking in flussi video 3D Ing. Tutors: Prof. Tullio Salmon Cinotti Prof. Luigi Di Stefano The Tracking problem Detection Object model, Track initiation, Track termination, Tracking

More information

OBJECT TRACKING USING LOG-POLAR TRANSFORMATION

OBJECT TRACKING USING LOG-POLAR TRANSFORMATION OBJECT TRACKING USING LOG-POLAR TRANSFORMATION A Thesis Submitted to the Gradual Faculty of the Louisiana State University and Agricultural and Mechanical College in partial fulfillment of the requirements

More information

Common Core Unit Summary Grades 6 to 8

Common Core Unit Summary Grades 6 to 8 Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity- 8G1-8G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations

More information