Efficient Depth Edge Detection Using Structured Light

Jiyoung Park, Cheolhwon Kim, Juneho Yi
School of Information and Communication Engineering, Sungkyunkwan University, Korea
Biometric Engineering Research Center
{jiyp, ani4one, jhyi}@ece.skku.ac.kr

Matthew Turk
Computer Science Department, University of California, Santa Barbara, CA 93106
mturk@cs.ucsb.edu

Abstract. This research features a novel approach that efficiently detects depth edges in real-world scenes. Depth edges play a very important role in many computer vision problems because they represent object contours. We strategically project structured light and exploit the distortion of the light pattern in the structured light image along depth discontinuities to reliably detect depth edges. The distortion along depth discontinuities may not occur, or may not be large enough to detect, depending on the distance from the camera or projector. For practical application of the proposed approach, we present methods that guarantee the occurrence of the distortion along depth discontinuities for a continuous range of object locations. Experimental results show that the proposed method accurately detects depth edges of human hand and body shapes as well as general objects.

1 Introduction

Object contours are valuable information in image analysis problems such as object recognition and tracking. Object contours can be represented by depth discontinuities (a.k.a. depth edges); traditional Canny edge detection, however, cannot distinguish between texture edges and depth edges. We describe a structured light based framework for reliably capturing depth edges in real-world scenes without dense 3D reconstruction.

Depth edges directly represent shape features that are valuable information in computer vision [1, 2]. Unfortunately, few research results have been reported that provide only depth discontinuities without computing 3D information at every pixel of the input image; most effort has instead been devoted to stereo vision in order to obtain depth information. In fact, stereo methods for 3D reconstruction fail in textureless regions and along occluding edges with low intensity variation [3]. Recently, the use of structured light was reported to compute 3D coordinates at every pixel in the input image [4, 5]. However, this approach needs a number of structured light images, which makes it hard to apply in real time. One notable technique was reported recently for non-photorealistic rendering [6]: the authors capture a sequence of images in which different light sources illuminate the scene from various positions, and then use the shadows in each image to assemble a depth edge map. The technique has been applied to fingerspelling recognition [7]. Although very attractive, it only works where shadows can be reliably created. In contrast, our method is shadow free. In addition, with a slight modification of the imaging system so that it captures white light and structured light images at the same time, our method can easily be applied to dynamic scenes where the camera moves.

The remainder of this paper is organized as follows. In Section 2, we describe the procedure for detecting depth edges in a patterned image. Section 3 presents our methods that guarantee the occurrence of the distortion along depth discontinuities for a continuous range of object locations. We report our experimental results in Section 4. Finally, conclusions and future work are discussed in Section 5.

2 Detecting depth edges

We detect depth edges by projecting structured light onto a scene and exploiting the distortion of the light pattern in the structured light image along depth discontinuities. Fig. 1 illustrates the basic method. First, as can be seen in Fig. 1(a), we project white light and structured light consecutively onto the scene where depth edges are to be detected. Second, we extract the horizontal patterns simply by differencing the white light and structured light images; we call this difference image the patterned image (see Fig. 1(b)). Third, we exploit the distortion of the light pattern along depth edges using 2D Gabor filtering, which is known to be useful for segregating textural regions. The amplitude response of the Gabor filter is very low where distortion of the light pattern occurs. We then accurately locate the depth edges using edge information from the white light image. Fig. 1(c) illustrates this process, and a final depth edge map is shown in Fig. 1(d).

Fig. 1 The basic idea of computing a depth edge map: (a) capture of a white light and a structured light image, (b) patterned image, (c) detection of depth edges by applying a Gabor filter to the patterned image together with edge information from the white light image, (d) final depth edge map.

However, the distortion along depth discontinuities may not occur, or may not be sufficient to detect, depending on the distance from the camera or projector. Fig. 2 shows an example situation: along the depth edges between objects A and B and between objects C and D, the distortion of the pattern almost disappears, making it infeasible to detect these depth edges with a Gabor filter. For practical application of the proposed approach, it is essential to have a solution that guarantees the occurrence of the distortion along depth discontinuities irrespective of object location.
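To make the pipeline above concrete, the following sketch applies the three steps (differencing, Gabor filtering, thresholding of the amplitude response) to a pair of captures. It is a minimal illustration, not the authors' implementation; the filter parameters and the amplitude threshold are assumptions that would need tuning to the stripe width in use.

```python
import cv2
import numpy as np

def depth_edge_map(white_img, structured_img, stripe_width_px, amp_frac=0.2):
    """Sketch of the Section 2 pipeline: patterned image -> Gabor amplitude
    -> low-amplitude regions intersected with white-light edges.

    white_img, structured_img: grayscale uint8 images of the same scene/view.
    stripe_width_px: approximate width of one horizontal stripe in pixels.
    amp_frac: fraction of the peak amplitude below which we assume the
              pattern is distorted (an assumed value, not from the paper).
    """
    # Steps 1-2: the patterned image is the difference of the two captures.
    patterned = cv2.absdiff(structured_img, white_img).astype(np.float32)

    # Step 3: a Gabor filter tuned to the horizontal stripes. The stripes
    # repeat roughly every 2*stripe_width_px, so use that as the wavelength;
    # theta = pi/2 makes the filter respond to horizontal bands.
    ksize = int(4 * stripe_width_px) | 1          # odd kernel size
    kernel = cv2.getGaborKernel((ksize, ksize), sigma=stripe_width_px,
                                theta=np.pi / 2, lambd=2 * stripe_width_px,
                                gamma=0.5, psi=0)
    amplitude = np.abs(cv2.filter2D(patterned, cv2.CV_32F, kernel))

    # Low amplitude response marks distorted pattern, i.e. depth edge candidates.
    candidates = amplitude < amp_frac * amplitude.max()

    # Localize depth edges where candidates coincide with intensity edges
    # in the white light image.
    edges = cv2.Canny(white_img, 50, 150) > 0
    return candidates & edges
```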

Fig. 2 Disappearance of the distortion along depth edges depending on the distance of an object from the camera and projector: (a) white light image, (b) patterned image, (c) amplitude response of a Gabor filter applied to the patterned image.

3 Detectable range of depth edges

We have shown that depth edges can be detected easily by exploiting the distortion along depth discontinuities in the patterned image. However, as previously mentioned, the distortion may not occur, or may not be sufficient to detect, depending on the distance of the depth edges from the camera or projector. In this section, we present methods that guarantee the occurrence of the distortion for a continuous range of object locations.

3.1 Reliably detectable distortion

In order to compute the exact range in which depth edges are detectable, we model the imaging geometry of the camera, projector and object as illustrated in Fig. 3. The solid line represents a light ray from the projector. When structured light is projected onto object points A and B, they are imaged at different locations in the image plane due to their different depth values; that is, distortion of the horizontal pattern occurs along the depth discontinuity. The amount of distortion is denoted by $\Delta$. Note that the widths of the horizontal stripes projected onto object locations A and B are the same in the image plane although the two points have different depth values, because the perspective effects of the camera and projector cancel each other out. From this model, with $a$ and $b = a + r$ the distances to the two object points, we can derive the following equation using similar triangles:

$$\Delta = fd\left(\frac{1}{a} - \frac{1}{b}\right) = \frac{fdr}{a(a+r)} \qquad (1)$$

In order for a depth edge to be detectable by applying a Gabor filter, the disparity $\Delta$ of the same horizontal stripe in the image plane should be above a certain amount. We have confirmed through experiments that an offset of at least 2/3 of the stripe width $w$ is necessary for reliable detection of the distortion. Since the pattern repeats with period $2w$, the range of $\Delta$ for reliable detection of pattern distortion can be written:

$$2wk + \frac{2w}{3} < \Delta < 2wk + \frac{4w}{3}, \qquad k = 0, 1, \ldots \qquad (2)$$

Equation (2) shows that there are ranges where we cannot detect depth edges, due to the lack of distortion, depending on the distance of a depth edge from the camera or projector. Therefore, for practical application of the proposed approach, we need to guarantee that we are operating within these detectable regions.
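As a quick numerical check of equations (1) and (2), the helpers below evaluate the distortion $\Delta$ for a given geometry and test whether it falls inside a detectable band. The function names and sample numbers (borrowed loosely from Setup A in Section 4) are illustrative, not from the paper.

```python
def distortion(f, d, a, r):
    """Equation (1): pattern offset Delta for object points at distances
    a and a + r, with camera-projector baseline d and focal length f."""
    return f * d * r / (a * (a + r))

def is_detectable(delta, w):
    """Equation (2): Delta is detectable when it lands in
    (2wk + 2w/3, 2wk + 4w/3) for some k, i.e. when its remainder
    modulo the 2w pattern period lies in (2w/3, 4w/3)."""
    rem = delta % (2 * w)
    return 2 * w / 3 < rem < 4 * w / 3

# Illustrative numbers, loosely following Setup A (f = 3 m, d = 17.3 cm):
delta = distortion(f=3.0, d=0.173, a=2.5, r=0.10)  # ~0.0080 m
print(is_detectable(delta, w=0.0084))              # True for this geometry
```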

Fig. 3 Imaging geometry and the amount of distortion: (a) spatial relation of the camera, projector and two object points, viewed from the side, (b) the magnitude of the pattern distortion, $\Delta$, in a real image.

3.2 Extending the detectable range of depth edges

We propose two methods to extend the range in which detection of the distortion is guaranteed.

The first method is based on a single camera and projector setup and uses several structured light images with different widths of horizontal stripes. We add structured light patterns whose spatial frequency is successively halved, i.e. $w_2 = 2w_1$, $w_3 = 2w_2$, $w_4 = 2w_3, \ldots$ When $n$ such structured light images are used, the range of detectable distortion $\Delta$ is:

$$\frac{2}{3}w_1 < \Delta < 2^n \frac{2}{3} w_1 \qquad (3)$$

The second method exploits an additional camera or projector. As illustrated in Fig. 4, this method is equivalent to adding a new curve, $d = d_2$ (the dotted line), different from $d_1$; recall that $d$ denotes the distance between the camera and the projector. The new detectable range is created by partially overlapping the ranges given by the two curves. A and B respectively represent detectable and undetectable ranges of $\Delta$; they correspond to the X and Y regions of $a$. When $a_1 > a_2$ and $a_4 > a_3$, the undetectable range B in X overlaps with the detectable range A in Y; similarly, the undetectable range B in Y overlaps with the detectable range A in X. Therefore, if we consider both X and Y, we can extend the range in which detection of the distortion is guaranteed. Writing the detectable range of $\Delta$ in the general form $\gamma wk + \alpha w < \Delta < \gamma wk + \beta w$, $k = 0, 1, \ldots$, the conditions $a_1 > a_2$ and $a_4 > a_3$ are satisfied when equation (4) holds:

$$\frac{\alpha + \gamma + \gamma k}{\beta + \gamma k}\, d_1 < d_2 < \frac{\beta + \gamma k}{\alpha + \gamma k}\, d_1 \qquad (4)$$

3.3 Computation of the detectable range of depth edges

The detectable range of depth edges, $[a_{min}, a_{max}]$, is computed in the following two steps.

- Step 1: Determination of the stripe width $w$ of the structured light. First, we set $a_{max}$ to the distance from the camera to the farthest background. Given $r_{min}$, $w$ can be computed by equation (5), which is derived from equation (1):

$$w = \frac{3 f d_1 r_{min}}{2 a_{max}(a_{max} + r_{min})} \qquad (5)$$

Thus, given $a_{max}$ and $r_{min}$, we can compute the ideal stripe width of the structured light. Using this structured light, we can detect the depth edges of all object points that are located in the range $a = [0, a_{max}]$ and are apart from each other by no less than $r_{min}$.
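A small sketch of Step 1 together with the multi-pattern extension of Section 3.2: it computes the stripe width from equation (5) and lists the successively doubled widths with the combined detectable band of equation (3). The function names are mine, and the sample numbers echo Setup A in Section 4.

```python
def stripe_width(f, d1, r_min, a_max):
    """Equation (5): ideal stripe width for a farthest background at a_max
    and a minimum object separation of r_min."""
    return 3 * f * d1 * r_min / (2 * a_max * (a_max + r_min))

def pattern_widths(w1, n):
    """Section 3.2, first method: n patterns whose spatial frequency is
    successively halved, i.e. stripe widths w1, 2*w1, 4*w1, ..."""
    return [w1 * 2**i for i in range(n)]

# Setup A numbers from Section 4: f = 3 m, d1 = 17.3 cm, r_min = 10 cm, a_max = 3 m.
w1 = stripe_width(3.0, 0.173, 0.10, 3.0)   # ~0.0084 m, matching the 0.84 cm reported
widths = pattern_widths(w1, n=3)           # three patterns, as in Setup A
lo, hi = 2 * w1 / 3, 2**3 * (2 * w1 / 3)   # equation (3): combined detectable band
```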

Fig. 4 Extending the detectable range using an additional camera or projector.

Fig. 5 Computation of the detectable range of depth edges.

- Step 2: Computation of the minimum of the detectable range, $a_{min}$. Given $w$ from Step 1, we can compute $a_{min}$, which corresponds to the upper limit $u$ of $\Delta$, as shown in Fig. 5. We described two methods for extending the detectable range of depth edges in Section 3.2; the expression for $u$ differs depending on which method is used. After determining $u$ and $r_{max}$, $a_{min}$ can be computed from equation (1). Here $r_{max}$ denotes the maximum distance between object points in the range $[a_{min}, a_{max}]$ that guarantees the occurrence of the distortion along depth discontinuities. Clearly, the distance between any two object points in this range is bounded by $a_{max} - a_{min}$. Therefore, when $a_{min}$ and $r_{max}$ satisfy equation (6), we are guaranteed to detect the depth edges of all object points located in the range $[a_{min}, a_{max}]$ that are apart from each other by no less than $r_{min}$ and no more than $r_{max}$:

$$a_{max} - a_{min} = r_{max} \qquad (6)$$

In this case, $u$, $a_{min}$ and $r_{max}$ are related by:

$$u = \frac{f d_1 r_{max}}{a_{min}(a_{min} + r_{max})} \qquad (7)$$

Then $r_{max}$ becomes:

$$r_{max} = \frac{a_{max}^2\, u}{f d_1 + a_{max} u} \qquad (8)$$

Substituting equation (8) into equation (6), we obtain:

$$a_{min} = \frac{f d_1 a_{max}}{f d_1 + u\, a_{max}} \qquad (9)$$

In this way, we can employ structured light of the optimal spatial resolution for a given application. Furthermore, we can use this method in an active way to collect information about the scene.

4 Experimental results

For capturing the structured light images, we used an HP xb31 DLP projector and a Canon IXY 500 digital camera. In this section, we present experimental results for two different experimental setups.
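As a numerical cross-check of Step 2 against the Setup A parameters reported below, the sketch evaluates equations (8) and (9). How $u$ is chosen for a given setup is my assumption here (the top of the combined band of equation (3) for three patterns); small differences from the reported $a_{min}$ are expected from rounding of $w_1$.

```python
def detectable_minimum(f, d1, a_max, u):
    """Equations (8) and (9): r_max and a_min for an upper limit u of the
    detectable distortion band."""
    r_max = a_max**2 * u / (f * d1 + a_max * u)    # equation (8)
    a_min = f * d1 * a_max / (f * d1 + u * a_max)  # equation (9)
    # Equation (6) holds by construction: a_max - a_min == r_max.
    return a_min, r_max

# Setup A: u assumed to be the top of equation (3)'s band with n = 3 patterns.
w1 = 0.0084                                        # ~0.84 cm, from equation (5)
u = 2**3 * (2 * w1 / 3)
a_min, r_max = detectable_minimum(3.0, 0.173, 3.0, u)
# a_min comes out near the ~2.33 m reported for Setup A below.
```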

A. One camera and one projector. To extend the detectable range of depth edges, this setup uses the first method (explained in Section 3.2), which simply employs additional structured light patterns whose spatial frequency is halved. Fig. 6 shows the result of depth edge detection using three structured light images with different stripe widths. Setting $f = 3$ m, $a_{max} = 3$ m, $d = 17.3$ cm and $r_{min} = 10$ cm, $w_1$ and $a_{min}$ are determined as 0.84 cm and 2.33 m, respectively. The individual Gabor amplitude maps (Fig. 6(c)-(e)) show that we cannot detect all the depth edges in the scene using a single structured light image. However, combining the results from the three cases, we obtain the final Gabor amplitude map of Fig. 6(f). The result in Fig. 6(g) shows that this method is capable of detecting the depth edges of all the objects located in the detectable range. We also compare the result with the output of the traditional Canny edge detector in Fig. 6(h).

B. One camera and two projectors. We can apply both extension methods when using a single camera and two projectors. The detectable range can be extended further than in Setup A when the same number of structured light images is used. Fig. 7 shows the experimental result when two structured light patterns are used for each projector. With $f = 3$ m, $a_{max} = 3$ m, $d_1 = 17.3$ cm, $d_2 = 20.7$ cm and $r_{min} = 10$ cm, $w_1$ and $a_{min}$ are determined as 0.84 cm and 1.94 m, respectively. Fig. 7(b)-(e) show how the four structured light images play complementary roles in producing the final depth edge map.

Fig. 6 Detecting depth edges using a single camera and projector: (a) front view, (b) side view, (c) $w = w_1 = 0.84$ cm, (d) $w = 2w_1 = 1.68$ cm, (e) $w = 4w_1 = 3.36$ cm, (f) combined Gabor amplitude map, (g) depth edges, (h) Canny edges.

Fig. 7 Detecting depth edges using a single camera and two projectors: (a) front view, (b) Structured Light I, Projector I, (c) Structured Light I, Projector II, (d) Structured Light II, Projector I, (e) Structured Light II, Projector II, (f) combined Gabor amplitude map, (g) depth edges, (h) Canny edges.

Fig. 8 shows the results of detecting hand and human body contours. Our method accurately detects depth edges, eliminating inner texture edges, using only a single camera and one structured light image. The results show that our method is effectively applicable to gesture recognition.
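The "combined Gabor amplitude map" step can be read as a per-pixel fusion of the individual maps: a depth edge shows up as a low-amplitude dip in at least one map, so taking the pixelwise minimum of the normalized maps preserves every dip. This is my reading of the figures, not code from the paper.

```python
import numpy as np

def combine_amplitude_maps(maps):
    """Fuse Gabor amplitude maps computed from patterns of different
    stripe widths. The pixelwise minimum of the normalized maps keeps
    every low-amplitude dip; this fusion rule is an assumption, since
    the paper does not spell out the operator."""
    normed = [m / (m.max() + 1e-12) for m in maps]  # scale each map to [0, 1]
    return np.minimum.reduce(normed)
```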

Fig. 8 (a) Detection of depth edges for hand gestures in fingerspelling, (b) detection of human body contours for gesture recognition. From left to right: white light image, Gabor amplitude map, depth edges and Canny edges.

5 Conclusions

We have proposed a new approach that uses structured light to efficiently compute depth edges. Through a modeled imaging geometry and mathematical analysis, we have also presented two setups that guarantee the occurrence of the distortion along depth discontinuities for a continuous range of object locations. These methods make the proposed approach practically applicable to real-world scenes, and we have demonstrated very promising experimental results. We have also observed that infrared projectors show the same distortion characteristics in patterned images, which allows us to apply the same analysis developed for LCD projectors directly to infrared projectors. By bypassing dense 3D reconstruction, which is computationally expensive, our methods can easily be extended to dynamic scenes as well. We believe that this research will contribute to substantial improvement of many computer vision solutions that rely on shape features.

Acknowledgement

This work was supported in part by the Korea Science and Engineering Foundation (KOSEF) through the Biometrics Engineering Research Center (BERC) at Yonsei University.

References

1. T. A. Cass, Robust Affine Structure Matching for 3D Object Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1264-1265, 1998.
2. I. Weiss and M. Ray, Model-based Recognition of 3D Objects from Single Images, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 116-128, 2001.
3. T. Frohlinghaus and J. M. Buhmann, Regularizing Phase-based Stereo, Proceedings of the 13th International Conference on Pattern Recognition, pp. 451-455, 1996.
4. S. H. Lee, J. M. Choi, D. S. Kim, B. C. Jung, J. K. Na, and H. M. Kim, An Active 3D Robot Camera for Home Environment, Proceedings of the 4th IEEE Sensors Conference, 2004.
5. D. Scharstein and R. Szeliski, High-Accuracy Stereo Depth Maps Using Structured Light, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 195-202, 2003.
6. R. Raskar, K. H. Tan, R. Feris, J. Yu, and M. Turk, Non-photorealistic Camera: Depth Edge Detection and Stylized Rendering Using Multi-Flash Imaging, Proceedings of ACM SIGGRAPH, Vol. 23, pp. 679-688, 2004.
7. R. Feris, M. Turk, R. Raskar, K. Tan, and G. Ohashi, Exploiting Depth Discontinuities for Vision-based Fingerspelling Recognition, IEEE Workshop on Real-Time Vision for Human-Computer Interaction, 2004.