Stereo+Kinect for High Resolution Stereo Correspondences

Gowri Somanath, University of Delaware; Scott Cohen and Brian Price, Adobe Research; Chandra Kambhamettu, University of Delaware

Abstract

In this work, we combine the complementary depth sensors Kinect and stereo image matching to obtain high quality correspondences. Our goal is to obtain a dense disparity map at the spatial and depth resolution of the stereo cameras (4-12 MP). We propose a global optimization scheme, where both the data and smoothness costs are derived using sensor confidences and low resolution geometry from Kinect. A spatially varying search range is used to limit the number of potential disparities at each pixel. The smoothness prior is based on the available low resolution depth from Kinect rather than image gradients, and thus performs better both in textured areas with smooth depth and in textureless areas with depth gradients. We also propose a spatially varying smoothness weight to better handle occlusion areas and the relative contribution of the two energy terms. We demonstrate how the two sensors can be effectively fused to obtain correct scene depth in ambiguous areas, as well as fine structural details in textured areas.

1. Introduction

Recent years have seen the increasing popularity of 3D content and sensors, and many commercial and consumer level capture and display devices are stereoscopic. Many applications are in entertainment, where existing high resolution still or video sensors have been extended for stereo capture. Example applications include consistent segmentation, object extraction, depth manipulation and view synthesis. The focus of our work is to obtain high quality stereo correspondences for such multi-megapixel stereo images/videos towards the above applications. The two broad challenges we face are the scene and the scale of the problem. The bane of stereo is matching in ambiguous regions with low or repeated texture, which abound in natural scenes. The second challenge is the computational complexity due to the large image size (4-12 Mega-Pixels (MP)) and disparity range (hundreds of integer disparity levels). (Gowri Somanath was supported by Adobe Research for this work.)

Figure 1. Top: The setup used in our experiments. Bottom: DSLR image of a sample scene and the depth map from Kinect, where black indicates areas with no depth reading from Kinect.

In this paper we address these challenges through the fusion of stereo image matching with a complementary depth sensor, such as Kinect. Though Kinect does not suffer from ambiguity in low or repeated textures, its spatial and depth resolution is at least an order of magnitude lower than the cameras commonly used for the applications above. On the other hand, a calibrated stereo setup using high resolution cameras can provide higher depth resolution. Figure 1 illustrates the complementary nature of the two sensors. The repeated chessboard pattern and the single colored boards pose a challenge for stereo matching. Areas with low reflectance, such as the black wall, cannot be resolved by Kinect. Both Kinect and stereo matching can obtain depth estimates in regions with non-ambiguous texture, such as those on the toys and cloths. However, the low depth resolution of the Kinect does not recover fine structural details. We thus propose a stereo algorithm that combines the information from the stereo RGB images and the low resolution depth information from Kinect to obtain a dense disparity map at the resolution of the stereo images.
Global optimization based algorithms have often been found most suitable for obtaining the smooth and dense depth maps essential for the target applications. However, they suffer from large memory and computation time requirements, and fail to scale to the large disparity volumes arising from 4-12 MP images and hundreds of disparity levels. In our experiments on such large problems, alpha expansion [1] on a traditional stereo formulation took several hours to converge.

Previous works have only been reported on images of less than 2 MP. We obtain depth maps of high spatial and depth resolution through a global optimization framework. The confidences of the sensors and the low resolution geometry information from the Kinect are used to derive both the data and smoothness costs in our energy function. The main contribution of our work is a framework to obtain high resolution correspondences through the fusion of stereo matching and Kinect. Using the depth estimates from Kinect and the confidences of both sensors, the data cost is calculated at only a sparse set of labels. The matching cost also combines image consistency with the geometry prior available from Kinect. The smoothness prior is based on combined depth and image gradient information. This offers a significant advantage over the traditional use of image gradients in both textured and non-textured areas: for example, there can be geometrically smooth regions with strong texture gradients, or a single colored object with a smooth depth gradient. We demonstrate how our proposed changes to the data and smoothness terms effectively combine the advantages of the two systems and recover depth better than either sensor in isolation. In addition, the sparse label set reduces the time and memory complexity of alpha expansion, which therefore converges orders of magnitude faster. As our primary goal is to obtain correspondences at the resolution of the stereo images, our scheme is in line with the growing trend towards scalable algorithms for large disparity volumes. The image resolution provides spatial resolution, while the large number of disparity levels aids better reconstruction of geometry. While most works are restricted to 1-2 MP, we make a large jump to 4-12 MP images. The large spatial and depth resolution gap introduces challenges in the registration of the two systems, leading to errors in re-projection/alignment and to sparsity (only about 10% of stereo image pixels have depth information from Kinect). The proposed scheme has been designed to handle these effects, and can easily be adapted to any other source of depth.

2. Related Work

The stereo literature is vast. Here we discuss related works, under three categories, that have employed alternate depth sensors in combination with single or stereo cameras. The first set of schemes obtains multiple samples from a single moving depth sensor to improve accuracy or density [8, 2, 10]. Structure from Motion and tracking are used to register the multiple scans. Though these schemes can provide a high quality depth map as the final result, there is no clear relation between the final accuracy and the number of samples required. There is an inherent assumption that the sensor provides a depth at every point in the scene, which may not be true for certain surface colors/materials. The second category of work combines a depth sensor with a color image from a higher resolution camera for super-resolution [3, 14, 9]. Color information is used to obtain the final depth map by up-sampling, assuming image edges to be potential depth edges. Though these methods obtain spatial super-resolution, there is no increase in depth resolution or accuracy. The third group of work combines a depth sensor with a stereo camera system [5, 15, 13, 12] and is the closest to our line of work.
Similar to the papers discussed above, most of these works assume that fairly accurate depth information is available for each surface in the scene. In [5], a low-resolution Photonic Mixer Device (PMD) is used with a stereo camera. Intensity from the PMD is used to determine a binary confidence mask: the value from the PMD is used at pixels of high intensity, while the remaining pixels are matched through stereo. Thus there is inherently no increase in depth resolution, even in textured areas. In [15], a Time-of-Flight (ToF) sensor is combined with a stereo setup. Belief Propagation is used for fusion, with a data term formed from a weighted linear combination of stereo matching costs and depth from ToF. Unlike our scheme, matching costs are calculated at all disparities at each pixel, and the weights are not based on sensor confidences. In [13], a ToF sensor is combined with a stereo camera. The ToF depth is up-sampled using joint bilateral filtering, and a confidence map is obtained based on the ToF signal strength at each pixel. A stereo confidence map is computed based on local image features. A final weight map is derived as the product of the two, and used to combine the respective cost volumes. The final depth map is determined by a greedy approach. Though the method employs sensor confidence for fusion, the lack of a global optimization can lead to noisy results in regions where both sensors have low confidence. Recently, [12] proposed a global stereo scheme using sparse ground control points (GCPs), which are high confidence depth values obtained from sparse image matching or depth sensors. Like some of the earlier techniques, the depth value at a GCP is assumed to be correct and there is no provision for using sensor confidence. Their energy has three terms: the first two are standard data and smoothness terms from color consistency, and the third is the GCP energy. The sparse depth at GCPs is interpolated using an adaptive propagation algorithm, and the GCP energy penalizes disparity assignments that diverge from the interpolated value at each pixel. Since a scalar parameter is used to control the deviation, it can lead to wrong estimates on surfaces where GCPs are absent or very sparse, resulting in invalid interpolation. In addition, the high computational cost required the authors to resize the images to less than 2 MP. In the next section we provide an overview of how we overcome the above limitations, followed by details of the individual components and results.

3. Proposed Method

The goal of our work is to fuse the information from the Kinect and traditional stereo matching to leverage the advantages of both: that is, to obtain depth through Kinect in ambiguous regions, and to increase depth resolution and precision through stereo matching of the high-resolution images. We combine the information using an energy minimization approach. Given the set of image pixels P and labels L = {L_1, L_2, ..., L_max} corresponding to disparities, an image labeling f assigns a disparity f_p \in L to each pixel p \in P. The final labeling is obtained by minimizing the following energy using alpha expansion and Graph Cuts [1]:

E(f) = \sum_{p \in P} D(p, f_p) + \sum_{(p,q) \in N} V(p, q, f_p, f_q).   (1)

The data cost D(p, f_p) is the cost of assigning label f_p to pixel p. The smoothness cost V(p, q, f_p, f_q) is the cost of labeling neighboring pixels p and q as f_p and f_q. Stereo matching algorithms have used this framework before: the data cost D(p, f_p) is traditionally computed using the color difference between the left and right images, and the smoothness term V(p, q, f_p, f_q) is computed using the color similarity of neighboring pixels. We enhance both terms in this model by including information from the Kinect.

The data cost D(p, f_p) is improved in two ways. First, the number of possible labels is greatly reduced, to values around the disparities indicated by Kinect. Second, the labels can be biased directly to be similar to the Kinect output. This helps in areas such as the flat planes in Figure 1, where the stereo color information is completely ambiguous while the Kinect gives fairly good depth within its precision limits. It also significantly accelerates the algorithm, since fewer labels are considered at each pixel. Details are given in Section 3.2.

The smoothness term V(p, q, f_p, f_q) is set up to combine Kinect and image information in two ways. Our first modification is the use of an image combining color and depth information to derive the neighborhood costs, which has two effects. First, it allows suppression of non-depth edges in the image: we use depth discontinuities suggested confidently by Kinect, instead of differences in color, to decide where not to require smoothness in the disparity labeling. Many images have color edges that do not correspond to depth discontinuities, such as the checkerboard pattern in Figure 1, and these edges can confuse the stereo algorithm. The Kinect depth map, correctly, does not register these edges as depth discontinuities. Unfortunately, the depth discontinuities produced by the Kinect are not spatially precise and are at a lower resolution, so we align the Kinect depth edges to the high-resolution image color edges. The second advantage of the combined image is the introduction of depth gradients not visible in flat colored regions: the traditional formulation may miss depth gradients in textureless regions where no color difference is observed, so we use the confident Kinect disparities to guide the label gradients. Our second modification to the smoothness term is a spatially varying relative weight between the data and smoothness costs, to better handle occlusion and alignment error, as detailed in Section 3.3.
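To make the optimization concrete, the following is a minimal sketch (not the authors' implementation) of evaluating the energy of Eqn. 1 for a given labeling over a 4-connected grid; D is assumed to be a precomputed cost volume, V any pairwise function, and all names are illustrative.

```python
import numpy as np

def energy(f, D, V):
    """Evaluate E(f) = sum_p D(p, f_p) + sum_{(p,q) in N} V(p, q, f_p, f_q)
    for a labeling f (H x W integer array), a data cost volume D
    (H x W x num_labels), and a pairwise function V over 4-neighbors."""
    H, W = f.shape
    # Data term: cost of the chosen label at each pixel.
    rows, cols = np.indices((H, W))
    total = D[rows, cols, f].sum()
    # Smoothness term over horizontal and vertical neighbor pairs.
    for y in range(H):
        for x in range(W):
            if x + 1 < W:
                total += V((y, x), (y, x + 1), f[y, x], f[y, x + 1])
            if y + 1 < H:
                total += V((y, x), (y + 1, x), f[y, x], f[y + 1, x])
    return total
```

Alpha expansion repeatedly proposes to relabel pixels with a single label alpha and accepts the move only if it lowers this energy, so any reduction in the per-pixel label sets directly shrinks the work per sweep.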
3.1. Setup and Initialization

Before describing the details of our framework, we discuss the setup and calibration process. We use two Canon EOS 7D cameras, with the Kinect mounted as shown in Figure 1. The Kinect depth map (640x480 pixels) has far lower resolution than the multi-megapixel stereo images. Figure 1 shows a sample scene and the Kinect depth values. The scene shows various cases where the sensors complement each other. Kinect does not provide depth readings on the black walls and some parts of the table, as indicated by the black regions. The single colored boards and the chessboard pattern are challenging for stereo matching. Objects such as the toys and textured cloths are regions where both stereo and Kinect perform fairly well. As our results will demonstrate, our scheme obtains more detail in this last type of region given the high resolution images, and we are able to guide the disparity optimization in texture-less areas using the Kinect information and obtain the depth gradients on the boards.

The stereo pair is calibrated using [11] and rectified using [4]. To calibrate the Kinect and stereo coordinate systems, we capture multiple calibration board images from the IR sensor and the stereo pair. Given the individual intrinsic and extrinsic parameters, we reconstruct 3D points in the Kinect coordinate system and the corresponding corners in the stereo images. To transfer the valid Kinect depths to the stereo images, we first estimate the rotation, translation and scale that align the two 3D point clouds. We found that large image alignment errors (re-projection error onto the stereo images) can arise if the 3D point alignment error is optimized. Since Kinect accuracy is a function of surface distance, orientation and reflectance, so is the error. Techniques have been proposed to reduce errors due to distance, but it is not always possible to correct those due to orientation or reflectance. In order to keep the algorithm general and avoid scene specific calibrations, we adapt our stereo algorithm to handle the alignment errors. However, we also found that such errors can be minimized by using a dense set of poses and orientations of the calibration board, and estimating the transformation by optimizing the re-projection error onto the stereo images. Once the Kinect 3D points are transformed to the stereo coordinate system, we project them onto both the left and right views to obtain the disparity map. Figure 2 shows the transferred depth for the sample scene.
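The transfer step can be sketched as follows under the usual pinhole model, assuming a rectified pair with focal length fx and baseline B so that disparity d = fx*B/Z; (R, t, s) is the estimated similarity transform, and all function and variable names are illustrative rather than the authors' code.

```python
import numpy as np

def transfer_kinect_to_stereo(pts_kinect, R, t, s, K_rect, fx, baseline, H, W):
    """Map Kinect 3D points into the rectified stereo frame and produce a
    sparse disparity map at stereo resolution.
    pts_kinect: (N, 3) points in the Kinect coordinate system.
    R, t, s: rotation (3x3), translation (3,), scale aligning Kinect to stereo.
    K_rect: 3x3 intrinsics of the rectified reference camera."""
    # Similarity transform into the stereo coordinate system.
    pts = s * (pts_kinect @ R.T) + t
    # Pinhole projection into the reference image.
    uvw = pts @ K_rect.T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    Z = pts[:, 2]
    disparity = fx * baseline / Z        # d = f*B/Z for a rectified pair
    K_sparse = np.full((H, W), np.nan)   # NaN marks pixels with no Kinect data
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H) & (Z > 0)
    K_sparse[v[valid], u[valid]] = disparity[valid]
    return K_sparse
```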

Figure 2. (a) Binary map indicating the pixels in the stereo image where we have depth values from Kinect. (b) Nearest-neighbor up-sampling of the transferred depth values from Kinect. (c) and (d) show a small part of the image and the corresponding depth values (in color) from Kinect prior to up-sampling; white areas indicate missing values. (Best viewed on a computer screen.)

Figure 2(a) shows a binary map, where the dark pixels indicate those where a Kinect depth value was transferred. Figure 2(c-d) shows a small region and illustrates the sparsity of the transferred values. Note that the density of the transferred Kinect depth is non-uniform and varies with distance, surface orientation and reflectance. Depending on the viewpoint and occlusion, some depth boundaries also contain multiple values in a neighborhood. In the following sections we detail how we handle these effects and their impact on the results.

3.2. Data cost

Traditionally, the data cost is calculated for each disparity label using the differences in color between the rectified left and right stereo images I_l and I_r and/or the differences in image gradients G_l and G_r. For a pixel p, the traditional matching cost for assigning label f_p is calculated as

D_tr(p, f_p) = \alpha C_I(p, f_p) + \beta C_G(p, f_p),  f_p \in L,   (2)

where \alpha and \beta = 1 - \alpha are scalar weights, and C_I, C_G are truncated costs measuring the color and gradient differences:

C_I(p, f_p) = \min(\tau_c, \|I_r(p) - I_l(p - f_p)\|),
C_G(p, f_p) = \min(\tau_g, \|G_r(p) - G_l(p - f_p)\|).

By integrating the Kinect information, we improve this in two ways. First, we limit the search range of possible disparities at pixel p to a set L_p, a range around the local Kinect estimates. Second, we directly bias the stereo disparities f_p to be similar to the Kinect disparities, which is useful in texture-less regions and in ambiguous regions with a repeating pattern. Thus we define our data cost as

D(p, f_p) = D_tr(p, f_p) + \rho C_K(p, f_p) if f_p \in L_p, and \infty otherwise,   (3)

where C_K(p, f_p) penalizes deviations from the Kinect disparities, and \rho is a scalar weight. The Kinect gives good estimates of the disparities at many scene points, although with limited depth resolution. We alter the search range according to a confidence measure for the Kinect samples.

Kinect based term: The term penalizing deviations of stereo disparities from Kinect disparities is defined as

C_K(p, f_p) = |f_p - K_u(p)|^2 / (2 S(p)^2),

where K_u(p) is the Kinect disparity at stereo image pixel p, and S(p) is the search range obtained using Kinect and image confidence. The Kinect disparity K_u(p) in the stereo image is obtained as follows. To avoid clutter in notation, we describe the confidence and cost calculation with respect to the reference image I_r without the use of subscripts. Let K be the sparse disparity map obtained by transferring the valid Kinect depth values to the reference image (see Figure 2(a)), and K_n the corresponding nearest-neighbor up-sampled map (see Figure 2(b)). To smooth the up-sampled map and align the depth edges closer to the corresponding image edges, we obtain a map K_u through guided filtering of K_n using the stereo image as the guide [6]. The cost C_K penalizes deviation of disparities around the Kinect-suggested value based on the relative confidence of the Kinect and the image matching result.
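A minimal sketch of this data term follows, assuming the sparse transferred map and its guided-filtered up-sampling K_u from the previous steps, a per-pixel search range S, and grayscale images for simplicity; the truncated costs follow Eqn. 2 and the Kinect bias follows Eqn. 3, while the names and default parameters are illustrative.

```python
import numpy as np

def data_cost(p, fp, I_r, I_l, G_r, G_l, K_u, S, L_p,
              alpha=0.5, tau_c=30.0, tau_g=10.0, rho=100.0):
    """Data cost of Eqn. 3 for pixel p = (y, x) and candidate disparity fp."""
    if fp not in L_p:
        return np.inf            # labels outside the search set are forbidden
    y, x = p
    if x - fp < 0:
        return np.inf            # matching pixel falls outside the left image
    # Truncated color and gradient consistency costs (Eqn. 2).
    C_I = min(tau_c, abs(float(I_r[y, x]) - float(I_l[y, x - fp])))
    C_G = min(tau_g, abs(float(G_r[y, x]) - float(G_l[y, x - fp])))
    D_tr = alpha * C_I + (1.0 - alpha) * C_G
    # Kinect bias: quadratic penalty scaled by the local search range (C_K).
    C_K = (fp - K_u[y, x]) ** 2 / (2.0 * S[y, x] ** 2)
    return D_tr + rho * C_K
```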
This is controlled through the search range S(p) at a pixel, which also controls the set of potential disparities L_p at the pixel. As detailed below, pixels which can be confidently matched through image information are given a larger search range. This ensures that we can obtain higher depth resolution than the Kinect in textured areas. On the other hand, pixels in texture-less areas are given a lower S(p), so that there is a larger penalty for deviation from the Kinect disparity.

Disparity Search Ranges, Sensor Confidences: For each pixel, we limit the potential disparities to a range around each sparse Kinect disparity present in a neighborhood of the pixel. The search range at each pixel is calculated as S(p) = \max(S_k(p), S_i(p)), where S_k and S_i are derived from Kinect and image confidence as follows. To obtain the Kinect confidence based search map S_k, we first measure the density of valid values in K over neighborhoods of pixels. Due to the sparsity of K, some neighborhoods have fewer Kinect suggested disparities. In addition, the alignment errors can lead to multiple candidates (foreground and background) at depth boundaries, as illustrated in Figure 2(d). Thus in regions of both low and high density, we would like stereo matching to test a wider range of labels, and around both foreground and background values at boundaries. We thus compute the absolute value of the measured density minus its mode over the image, and smooth the result. Given this filtered density map M, we set S_k(p) = \min(m_s, m_s M(p)) with limit parameter m_s.
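A sketch of this density-based search map, assuming K_sparse from the transfer step; the box filter size, the mode computation via rounding, and the Gaussian smoothing are our own illustrative choices, not the paper's exact recipe.

```python
import numpy as np
from scipy import ndimage, stats

def kinect_search_map(K_sparse, m_s=30, nbhd=15, smooth_sigma=5.0):
    """S_k(p) = min(m_s, m_s * M(p)), where M is the smoothed absolute
    deviation of the local density of valid Kinect samples from its mode."""
    valid = (~np.isnan(K_sparse)).astype(float)
    density = ndimage.uniform_filter(valid, size=2 * nbhd + 1)
    # Mode of the (rounded) density values over the whole image.
    mode = stats.mode(np.round(density, 2).ravel(), keepdims=False).mode
    M = ndimage.gaussian_filter(np.abs(density - mode), smooth_sigma)
    return np.minimum(m_s, m_s * M), M
```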

To estimate the image confidence based search map S_i, we examine the uniqueness of a 5x5 patch. The score is based on the lowest color-based error of the patch along the scanline. A region with low texture or a repeating pattern has neighboring patches with very low error, and thus cannot be confidently matched; in such regions, we limit the number of labels tested around the value suggested by Kinect. On the other hand, in regions of high (non-repeating) texture, we allow a larger search to extract fine geometry not observed by Kinect. We derive the color error based score map U as

U(p) = \min_{q \in R(p), q \neq p} \sum_{(p+\delta) \in N(p)} \|I_r(p + \delta) - I_r(q + \delta)\|,

where R(p) = [\max(1, p - m_s), \min(w, p + m_s)] is the set of pixels along the scanline, w is the width of the image, and N(p) is a neighborhood around p. Given the score map U, we calculate

S_i(p) = 1 + (1 - e^{-U(p)/\sigma_u}) m_s.

Given the search map S, we determine the subset L_p \subseteq L of potential labels for pixel p as the union of intervals around the disparities suggested by Kinect:

L_p = \bigcup_{q \in N(p)} [K(q) - S(p), K(q) + S(p)].

Based on the scene depth, image resolution, sparsity, and the alignment error discussed before, we use a fixed-size neighborhood N. For pixels with no Kinect data we test all disparities.

Figure 3. The different maps derived from Kinect and stereo image information for data and smoothness cost estimation: (a) search map S, (b) smoothness map \lambda, (c) traditional smoothness cost, (d) our smoothness cost. The display mapping is black (low) to white (high value).

Figure 3(a) shows the search map for the sample scene from Figure 1. Note how the ambiguous regions have a lower search range compared to edges and textured regions.
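The patch-uniqueness score and the resulting label set can be sketched as follows, under the same illustrative naming as above; the 5x5 patch, m_s = 30 and the union of intervals follow the text, while the loop structure and neighborhood size are our own simplifications. S_i(p) = 1 + (1 - exp(-U(p)/sigma_u)) * m_s then follows directly from U.

```python
import numpy as np

def uniqueness_score(I_r, p, m_s=30, half=2):
    """U(p): lowest sum of absolute differences between the 5x5 patch at p
    and patches at other positions on the same scanline (within +/- m_s)."""
    y, x = p
    H, W = I_r.shape
    patch = I_r[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best = np.inf
    for q in range(max(half, x - m_s), min(W - half, x + m_s + 1)):
        if q == x:
            continue
        cand = I_r[y - half:y + half + 1, q - half:q + half + 1].astype(float)
        best = min(best, np.abs(patch - cand).sum())
    return best

def label_set(K_sparse, S, p, num_labels, nbhd=7):
    """L_p: union of intervals [K(q) - S(p), K(q) + S(p)] over valid Kinect
    disparities K(q) near p; all labels if none are present."""
    y, x = p
    nb = K_sparse[max(0, y - nbhd):y + nbhd + 1, max(0, x - nbhd):x + nbhd + 1]
    ks = nb[~np.isnan(nb)]
    if ks.size == 0:
        return set(range(num_labels))
    s = int(S[y, x])
    labels = set()
    for k in ks:
        labels.update(range(max(0, int(k) - s), min(num_labels, int(k) + s + 1)))
    return labels
```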
3.3. Smoothness Cost

The smoothness term V(p, q, f_p, f_q) has traditionally been derived from the color similarity of neighboring pixels:

V_tr(p, q, f_p, f_q) = \lambda |f_p - f_q| e^{-\|I_r(p) - I_r(q)\|/\sigma_i}.   (4)

This is based on the assumption that depth discontinuities coincide with image edges. The converse of this heuristic, however, is not necessarily true. The Kinect provides regions of discontinuity, and also indicates areas with smooth depth gradients. We use this information in two ways. First, we derive the gradients from a filtered and up-sampled depth map from Kinect. This eliminates image gradients that do not correspond to depth gradients and allows better label transitions. Second, regions with discontinuities are also indicative of occlusion; since our aim is to obtain a dense and smooth depth map, we give the smoothness term a larger weight in such regions through a spatially varying weight. Thus we define our smoothness term as

V(p, q, f_p, f_q) = \lambda(p) |f_p - f_q| e^{-\|J(p) - J(q)\|/\sigma_s}.   (5)

We obtain the spatially varying smoothness weight \lambda(p) and the map J guiding the label gradients as follows. Let M_{[a,b]} denote the filtered Kinect density map M (detailed in the previous section) normalized to the range [a, b]. To recap, this map M indicates regions where the Kinect values can be trusted (lower values) and regions that were ambiguous to Kinect (higher values). We calculate

J(p) = M_{[0,1]}(p) I_rg(p) + (1 - M_{[0,1]}(p)) K_u(p),

where I_rg is the grayscale version of I_r. This map derives the smoothness cost from the gradients of the low resolution scene depth, after suitable up-sampling and filtering, wherever the Kinect information can be confidently used; in other regions it falls back on the image information, as in the traditional formulation. Figure 3(c-d) demonstrates the advantage of our map over the traditional cost derived from image gradients. As can be observed in Figure 3(c) (a plot of e^{-\|I_r(p) - I_r(q)\|/\sigma_i}), various strong image edges are present within geometrically smooth regions, such as the chessboard pattern and the textures on the cloths. In contrast, our processing retains only the depth edges and suppresses image edges that do not correspond to depth gradients, as shown by the plot of e^{-\|J(p) - J(q)\|/\sigma_s} in Figure 3(d). More importantly, the use of the low resolution Kinect depth in the component K_u provides depth gradients in texture-less areas such as the single colored boards. Traditionally, the weight \lambda is a fixed scalar, but we observed that different regions benefit from varying contributions of the data and smoothness costs. In regions where the image texture is confident, the data cost should contribute more; regions near depth edges, with the attendant occlusion and alignment error, benefit from a larger smoothness weight.
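A sketch of the combined guidance map and the smoothness term of Eqn. 5, assuming M01 is the filtered density map normalized to [0, 1] and lam is the spatially varying weight map \lambda described next; sigma_s = 10 follows the text, other names are illustrative.

```python
import numpy as np

def guidance_map(M01, I_rg, K_u):
    """J = M01 * grayscale image + (1 - M01) * up-sampled Kinect disparity:
    image gradients where Kinect is unreliable, depth gradients elsewhere."""
    return M01 * I_rg.astype(float) + (1.0 - M01) * K_u

def smoothness_cost(p, q, fp, fq, J, lam, sigma_s=10.0):
    """V(p, q, fp, fq) of Eqn. 5 with the spatially varying weight lam(p)."""
    w = np.exp(-abs(float(J[p]) - float(J[q])) / sigma_s)
    return lam[p] * abs(fp - fq) * w
```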

Since the search range and data cost have already been tuned to balance the image and Kinect confidence, we handle the latter factor in the smoothness weight map, by mapping regions of low Kinect confidence to a larger smoothness cost. Thus \lambda = M_{[s_1, s_2]}, where s_1 and s_2 are based on the desired low and high smoothness weights. Figure 3(b) shows the map for the sample scene.

Parameters: For all our experiments \sigma_u = 50, m_s = 30, \alpha = 0.5, \rho = 100, \sigma_s = 10 and s_1 = 10, with s_2 set to the desired high smoothness weight.

4. Experiments and results

We present various qualitative results in this section and a quantitative comparison in the supplementary file. All images are best viewed on a computer monitor at high zoom, or as the high resolution images in the supplementary file. Alpha expansion on the traditional stereo formulation took several hours to converge, while our fusion formulation completes in minutes. Recent fusion methods using global optimization, such as [12], were only demonstrated on resized 2 MP images due to their high computational needs, and thus cannot be run on our 4-12 MP images. The latest local methods, such as [7], though scalable, cannot solve the ambiguity in texture-less or repeating-texture regions, as shown in our comparisons. Patch-based matching cannot fully overcome the matching ambiguity either.

We start with the sample scene from Figure 1 to illustrate the differences between the various schemes, since it contains surfaces where the two sensors behave differently. We use integer disparities; the sample scene contains 300 labels for the image cropped after rectification (to exclude boundaries and blank areas introduced by the rectification warping).

Figure 4. Comparison of results from different schemes on the sample scene from Figure 1: (a) greedy on traditional DC, (b) greedy on our DC, (c) greedy on filtered DC [7], (d) joint bilateral up-sampling, (e) traditional GC, (f) our result. Here DC stands for data cost and GC for Graph Cuts optimization.

Figure 5. Two views of the 3D point cloud reconstructions of the Santa figurine from our results (left) and Kinect nearest-neighbor up-sampling (right). The reconstruction shown here is part of the Santa scene shown in Figure 6.

Figure 4 compares the different schemes. The greedy approach assigns the label with minimum data cost to each pixel. Figure 4(a) shows the greedy algorithm applied to the cost volume from the traditional scheme (Eqn. 2). Figure 4(b) shows the greedy answer from the proposed data cost (Eqn. 3). This already improves the result in texture-less regions where Kinect information is available, such as the green board; however, ambiguous regions of the wall with no Kinect information still suffer from noisy and incorrect labeling. Recently, [7] proposed cost volume filtering using the image as a guide [6]. Though the method has been used successfully to recover sharp depth discontinuities and small structures, it does not address the problem in texture-less areas; also, as discussed before, image edges that do not correspond to depth discontinuities can lead to incorrect geometry, as demonstrated in Figure 4(c). Joint bilateral up-sampling of the transferred Kinect depth suffers from some of the same problems, as shown in Figure 4(d). Due to the sparsity of the Kinect samples, we must use a large sigma and range for the bilateral filtering, which in turn over-smooths some regions. The mixed depth values at boundaries discussed before lead to further errors. Since up-sampling methods do not allow preference of one value over another, such errors and those resulting from sensor noise cannot be fully corrected.
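For reference, this is the kind of edge-aware up-sampling being compared: a minimal sketch using OpenCV's ximgproc module (an assumption: it requires the opencv-contrib-python package), where K_n is the nearest-neighbor up-sampled Kinect disparity and I_r the reference image, as in Section 3.2; the radius and eps values are illustrative.

```python
import cv2
import numpy as np

def upsample_kinect(K_n, I_r, radius=16, eps=1e-2):
    """Edge-aware smoothing of the up-sampled Kinect disparities, using the
    high resolution image as the guide (guided filtering, He et al. [6])."""
    guide = cv2.cvtColor(I_r, cv2.COLOR_BGR2GRAY)
    return cv2.ximgproc.guidedFilter(guide, K_n.astype(np.float32), radius, eps)
```

As the text notes, no amount of filtering lets an up-sampler prefer one conflicting boundary value over another; that choice requires a data term and optimization.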

We tested different parameter values and show the best results. A global optimization can overcome some of these problems. In Figure 4(e), we show the results of traditional stereo. Though noise is reduced, the repeating texture of the chessboard and the lack of gradients on the single colored boards lead to incorrect labeling; for example, the tilt of the green board is not captured. As shown in Figure 4(f), our result overcomes the problems discussed above. Note the correct depth gradients in the texture-less regions, the correct depth edges compared to the up-sampled result, and the low noise compared to the greedy approaches.

We illustrate the recovery of fine structures in Figure 6. Rows 1-2 show parts of the sample scene; note the correct recovery of the indentation on the right side of the gasoline can, and the fine details of the scarf on the teddy. Rows 3-4 show results from the Santa scene. Observe that our results provide better gradients on the two textured cloths: they were pinned at two ends, resulting in a U-like depression, which is recovered better in both our results and traditional stereo than in the Kinect and its up-sampled results. Rows 5-6 show a scene with a specular, concave ceramic bowl. Note the correct recovery of the concave shape on the top half of the bowl compared to the traditional stereo formulation; the thin edges of the bowl and the folds of the cloth below it are recovered correctly, in contrast to the Kinect and up-sampled results. In row 7, note the correct disparity for the repeating pattern (chessboards) compared to traditional stereo. Our results also recover better depth boundaries than the up-sampling schemes. Figure 5 shows 3D reconstructions from our method and from Kinect up-sampling for the Santa figurine. The black arrows point to regions where our method captured better geometry: for example, the spring structure in the mid-center of Santa, the correct recovery of the top of the head and the hand, and more depth levels in the face and nose area. The comparisons illustrate that Kinect alone, or its up-sampled output, is not sufficient to obtain good correspondences at the resolution of the stereo images, which is the goal of our work. It can, however, be used effectively in regions where traditional stereo matching fails.

5. Conclusion

In this paper we showed the effective fusion of Kinect and stereo sensors to obtain high quality correspondences at the spatial and depth resolution of the stereo images. We proposed novel data and smoothness costs for a global optimization framework that combines depth information from Kinect with stereo matching. Compared to the traditional stereo formulation, the optimization was modified by limiting the number of potential disparity labels at each pixel given the Kinect depth estimate, biasing the final disparity toward the Kinect-suggested geometry, and encouraging depth discontinuities to align with the Kinect data. Our results show a gain in depth resolution, recovery of fine geometry, and correct depths in ambiguous regions, overcoming the weaknesses of both sensors through effective fusion. A potential future direction is to fit low dimensional parametric models, such as planes or spheres, to parts of the Kinect data, and use them to enforce a stronger geometry/gradient prior in the smoothness term.

References

[1] Y. Boykov, O. Veksler, and R. Zabih.
Fast approximate energy minimization via graph cuts. PAMI, 23(11).
[2] Y. Cui, S. Schuon, D. Chan, S. Thrun, and C. Theobalt. 3D shape scanning with a time-of-flight camera. In CVPR.
[3] J. Diebel and S. Thrun. An application of Markov random fields to range sensing. In NIPS.
[4] A. Fusiello and L. Irsara. Quasi-Euclidean uncalibrated epipolar rectification. In ICPR.
[5] U. Hahne and M. Alexa. Depth imaging by combining time-of-flight and on-demand stereo. In DAGM Workshop on Dynamic 3D Imaging.
[6] K. He, J. Sun, and X. Tang. Guided image filtering. In ECCV.
[7] A. Hosni, C. Rhemann, M. Bleyer, C. Rother, and M. Gelautz. Fast cost-volume filtering for visual correspondence and beyond. PAMI.
[8] R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohli, J. Shotton, S. Hodges, and A. W. Fitzgibbon. KinectFusion: Real-time dense surface mapping and tracking. In ISMAR.
[9] J. Park, H. Kim, Y.-W. Tai, M. Brown, and I. Kweon. High quality depth map upsampling for 3D-ToF cameras. In ICCV.
[10] S. Schuon, C. Theobalt, J. Davis, and S. Thrun. LidarBoost: Depth superresolution for ToF 3D shape scanning. In CVPR.
[11] K. H. Strobl, W. Sepp, S. Fuchs, C. Paredes, M. Smisek, and K. Arbter. DLR CalDe and DLR CalLab.
[12] L. Wang and R. Yang. Global stereo matching leveraged by sparse ground control points. In CVPR.
[13] Q. Yang, K.-H. Tan, W. B. Culbertson, and J. G. Apostolopoulos. Fusion of active and passive sensors for fast 3D capture. In IEEE International Workshop on Multimedia Signal Processing.
[14] Q. Yang, R. Yang, J. Davis, and D. Nister. Spatial-depth super resolution for range images. In CVPR.
[15] J. Zhu, L. Wang, J. Gao, and R. Yang. Spatial-temporal fusion for high accuracy depth maps using dynamic MRFs. PAMI.

Figure 6. Results and comparisons on different scenes (columns: images, our result, traditional stereo, nearest neighbor, joint bilateral up-sampling). The scenes were approximately 2-3 meters from the cameras, and the number of disparity labels runs to several hundred levels. The images are best viewed on a computer screen. Rows 1-2: details of structures recovered in the sample scene of Figures 1 and 4. Rows 3-4: results and details for the Santa scene; 3D reconstructions are shown in Figure 5. Rows 5-6: results and details for the Bowl scene. Row 7: results for the Room scene.


More information

RGB-D Mapping: Using Kinect-Style Depth Cameras for Dense 3D Modeling of Indoor Environments

RGB-D Mapping: Using Kinect-Style Depth Cameras for Dense 3D Modeling of Indoor Environments RGB-D Mapping: Using Kinect-Style Depth Cameras for Dense 3D Modeling of Indoor Environments Peter Henry 1, Michael Krainin 1, Evan Herbst 1, Xiaofeng Ren 2, Dieter Fox 1 Abstract RGB-D cameras (such as

More information

Tutorial for Tracker and Supporting Software By David Chandler

Tutorial for Tracker and Supporting Software By David Chandler Tutorial for Tracker and Supporting Software By David Chandler I use a number of free, open source programs to do video analysis. 1. Avidemux, to exerpt the video clip, read the video properties, and save

More information

V-PITS : VIDEO BASED PHONOMICROSURGERY INSTRUMENT TRACKING SYSTEM. Ketan Surender

V-PITS : VIDEO BASED PHONOMICROSURGERY INSTRUMENT TRACKING SYSTEM. Ketan Surender V-PITS : VIDEO BASED PHONOMICROSURGERY INSTRUMENT TRACKING SYSTEM by Ketan Surender A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science (Electrical Engineering)

More information

Determining optimal window size for texture feature extraction methods

Determining optimal window size for texture feature extraction methods IX Spanish Symposium on Pattern Recognition and Image Analysis, Castellon, Spain, May 2001, vol.2, 237-242, ISBN: 84-8021-351-5. Determining optimal window size for texture feature extraction methods Domènec

More information

Camera Resolution Explained

Camera Resolution Explained Camera Resolution Explained FEBRUARY 17, 2015 BY NASIM MANSUROV Although the megapixel race has been going on since digital cameras had been invented, the last few years in particular have seen a huge

More information

Automatic 3D Mapping for Infrared Image Analysis

Automatic 3D Mapping for Infrared Image Analysis Automatic 3D Mapping for Infrared Image Analysis i r f m c a d a r a c h e V. Martin, V. Gervaise, V. Moncada, M.H. Aumeunier, M. irdaouss, J.M. Travere (CEA) S. Devaux (IPP), G. Arnoux (CCE) and JET-EDA

More information

Subspace Analysis and Optimization for AAM Based Face Alignment

Subspace Analysis and Optimization for AAM Based Face Alignment Subspace Analysis and Optimization for AAM Based Face Alignment Ming Zhao Chun Chen College of Computer Science Zhejiang University Hangzhou, 310027, P.R.China zhaoming1999@zju.edu.cn Stan Z. Li Microsoft

More information

Segmentation & Clustering

Segmentation & Clustering EECS 442 Computer vision Segmentation & Clustering Segmentation in human vision K-mean clustering Mean-shift Graph-cut Reading: Chapters 14 [FP] Some slides of this lectures are courtesy of prof F. Li,

More information

EECS 556 Image Processing W 09. Interpolation. Interpolation techniques B splines

EECS 556 Image Processing W 09. Interpolation. Interpolation techniques B splines EECS 556 Image Processing W 09 Interpolation Interpolation techniques B splines What is image processing? Image processing is the application of 2D signal processing methods to images Image representation

More information

Automatic Restoration Algorithms for 35mm film

Automatic Restoration Algorithms for 35mm film P. Schallauer, A. Pinz, W. Haas. Automatic Restoration Algorithms for 35mm film. To be published in Videre, Journal of Computer Vision Research, web: http://mitpress.mit.edu/videre.html, 1999. Automatic

More information

Robot Perception Continued

Robot Perception Continued Robot Perception Continued 1 Visual Perception Visual Odometry Reconstruction Recognition CS 685 11 Range Sensing strategies Active range sensors Ultrasound Laser range sensor Slides adopted from Siegwart

More information

Image Segmentation and Registration

Image Segmentation and Registration Image Segmentation and Registration Dr. Christine Tanner (tanner@vision.ee.ethz.ch) Computer Vision Laboratory, ETH Zürich Dr. Verena Kaynig, Machine Learning Laboratory, ETH Zürich Outline Segmentation

More information

Real-Time Stereo Reconstruction in Robotically Assisted Minimally Invasive Surgery

Real-Time Stereo Reconstruction in Robotically Assisted Minimally Invasive Surgery Real-Time Stereo Reconstruction in Robotically Assisted Minimally Invasive Surgery Abstract. The recovery of tissue structure and morphology during robotic assisted surgery is an important step towards

More information

Accurate and robust image superresolution by neural processing of local image representations

Accurate and robust image superresolution by neural processing of local image representations Accurate and robust image superresolution by neural processing of local image representations Carlos Miravet 1,2 and Francisco B. Rodríguez 1 1 Grupo de Neurocomputación Biológica (GNB), Escuela Politécnica

More information

Epipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R.

Epipolar Geometry. Readings: See Sections 10.1 and 15.6 of Forsyth and Ponce. Right Image. Left Image. e(p ) Epipolar Lines. e(q ) q R. Epipolar Geometry We consider two perspective images of a scene as taken from a stereo pair of cameras (or equivalently, assume the scene is rigid and imaged with a single camera from two different locations).

More information

B2.53-R3: COMPUTER GRAPHICS. NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions.

B2.53-R3: COMPUTER GRAPHICS. NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions. B2.53-R3: COMPUTER GRAPHICS NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions. 2. PART ONE is to be answered in the TEAR-OFF ANSWER

More information

Palmprint Recognition. By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap

Palmprint Recognition. By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap Palmprint Recognition By Sree Rama Murthy kora Praveen Verma Yashwant Kashyap Palm print Palm Patterns are utilized in many applications: 1. To correlate palm patterns with medical disorders, e.g. genetic

More information

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to:

Data Storage. Chapter 3. Objectives. 3-1 Data Types. Data Inside the Computer. After studying this chapter, students should be able to: Chapter 3 Data Storage Objectives After studying this chapter, students should be able to: List five different data types used in a computer. Describe how integers are stored in a computer. Describe how

More information

VOLUMNECT - Measuring Volumes with Kinect T M

VOLUMNECT - Measuring Volumes with Kinect T M VOLUMNECT - Measuring Volumes with Kinect T M Beatriz Quintino Ferreira a, Miguel Griné a, Duarte Gameiro a, João Paulo Costeira a,b and Beatriz Sousa Santos c,d a DEEC, Instituto Superior Técnico, Lisboa,

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

Geometric Camera Parameters

Geometric Camera Parameters Geometric Camera Parameters What assumptions have we made so far? -All equations we have derived for far are written in the camera reference frames. -These equations are valid only when: () all distances

More information

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang

Recognizing Cats and Dogs with Shape and Appearance based Models. Group Member: Chu Wang, Landu Jiang Recognizing Cats and Dogs with Shape and Appearance based Models Group Member: Chu Wang, Landu Jiang Abstract Recognizing cats and dogs from images is a challenging competition raised by Kaggle platform

More information

AUTOMATIC CROWD ANALYSIS FROM VERY HIGH RESOLUTION SATELLITE IMAGES

AUTOMATIC CROWD ANALYSIS FROM VERY HIGH RESOLUTION SATELLITE IMAGES In: Stilla U et al (Eds) PIA11. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences 38 (3/W22) AUTOMATIC CROWD ANALYSIS FROM VERY HIGH RESOLUTION SATELLITE IMAGES

More information

Introduction. C 2009 John Wiley & Sons, Ltd

Introduction. C 2009 John Wiley & Sons, Ltd 1 Introduction The purpose of this text on stereo-based imaging is twofold: it is to give students of computer vision a thorough grounding in the image analysis and projective geometry techniques relevant

More information

Advanced Methods for Pedestrian and Bicyclist Sensing

Advanced Methods for Pedestrian and Bicyclist Sensing Advanced Methods for Pedestrian and Bicyclist Sensing Yinhai Wang PacTrans STAR Lab University of Washington Email: yinhai@uw.edu Tel: 1-206-616-2696 For Exchange with University of Nevada Reno Sept. 25,

More information

Scanners and How to Use Them

Scanners and How to Use Them Written by Jonathan Sachs Copyright 1996-1999 Digital Light & Color Introduction A scanner is a device that converts images to a digital file you can use with your computer. There are many different types

More information

Colour Image Segmentation Technique for Screen Printing

Colour Image Segmentation Technique for Screen Printing 60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone

More information

Vision based Vehicle Tracking using a high angle camera

Vision based Vehicle Tracking using a high angle camera Vision based Vehicle Tracking using a high angle camera Raúl Ignacio Ramos García Dule Shu gramos@clemson.edu dshu@clemson.edu Abstract A vehicle tracking and grouping algorithm is presented in this work

More information

Choosing a digital camera for your microscope John C. Russ, Materials Science and Engineering Dept., North Carolina State Univ.

Choosing a digital camera for your microscope John C. Russ, Materials Science and Engineering Dept., North Carolina State Univ. Choosing a digital camera for your microscope John C. Russ, Materials Science and Engineering Dept., North Carolina State Univ., Raleigh, NC One vital step is to choose a transfer lens matched to your

More information