Introduction to Computer Vision for Robotics. Overview. Jan-Michael Frahm

Overview:
- Camera model
- Multi-view geometry
- Camera pose estimation
- Feature tracking & matching
- Robust pose estimation

Homogeneous coordinates

Unified notation: include the origin $o$ in the affine basis $(e_1, e_2, e_3)$. A point $M$ with affine coordinates $(a_1, a_2, a_3)$ is then written with the affine basis matrix:

$$M = \begin{pmatrix} e_1 & e_2 & e_3 & o \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ 1 \end{pmatrix}$$

These are the homogeneous coordinates of $M$.

Properties of affine transformations

A transformation $T_{\mathrm{affine}}$ combines a linear mapping and a coordinate shift in homogeneous coordinates: the linear mapping is a $3 \times 3$ matrix $A$, the coordinate shift a translation 3-vector $t$:

$$M' = T_{\mathrm{affine}} M, \qquad T_{\mathrm{affine}} = \begin{pmatrix} A_{3 \times 3} & t_3 \\ 0^T & 1 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & t_x \\ a_{21} & a_{22} & a_{23} & t_y \\ a_{31} & a_{32} & a_{33} & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

- Parallelism is preserved.
- Ratios of length, area, and volume are preserved.
- Transformations can be concatenated: if $M_1 = T_1 M$ and $M_2 = T_2 M_1$, then $M_2 = T_2 T_1 M = T M$.
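To make the notation concrete, here is a minimal NumPy sketch; the matrices, translation, and test point are arbitrary illustrative values:

```python
import numpy as np

def affine_transform(A, t):
    """Build the 4x4 homogeneous matrix T_affine = [[A, t], [0^T, 1]]
    from a 3x3 linear map A and a translation 3-vector t."""
    T = np.eye(4)
    T[:3, :3] = A
    T[:3, 3] = t
    return T

# Concatenation is plain matrix multiplication: M2 = T2 @ T1 @ M.
T1 = affine_transform(np.diag([2.0, 2.0, 2.0]), np.array([1.0, 0.0, 0.0]))
T2 = affine_transform(np.eye(3), np.array([0.0, 5.0, 0.0]))
M = np.array([1.0, 1.0, 1.0, 1.0])   # homogeneous point (X, Y, Z, 1)^T
print(T2 @ T1 @ M)                   # scale, then translate -> [3. 7. 2. 1.]
```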

Projective geometry

Projective space $P^2$ is the space of rays emerging from a viewpoint $O$:
- $O$ forms the projection center for all rays,
- rays $v$ emerge from the viewpoint into the scene,
- a ray $g$ is called a projective point, defined as a scaled $v$:

$$g(\lambda) = \lambda v = \lambda \begin{pmatrix} x \\ y \\ w \end{pmatrix}, \qquad \lambda \in R, \; v \in R^3$$

Projective and homogeneous points

Given a plane $\Pi$ ($R^2$) embedded in $P^2$ at coordinates $w = 1$:
- the viewing ray $g$ intersects the plane at $v$ (homogeneous coordinates),
- all points on ray $g$ project onto the same homogeneous point $v$,
- the projection of $g$ onto $\Pi$ is defined by scaling:

$$v = \frac{g}{\lambda} = \frac{g}{w} = \begin{pmatrix} x/w \\ y/w \\ 1 \end{pmatrix}$$
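A small Python sketch of this normalization; the helper name and the sample values are ours, for illustration:

```python
import numpy as np

def normalize(g):
    """Project a homogeneous point g = (x, y, w)^T onto the plane w = 1."""
    if np.isclose(g[2], 0.0):
        raise ValueError("w = 0: a direction (point at infinity), not affine")
    return g / g[2]

g = np.array([4.0, 2.0, 2.0])
print(normalize(g))          # -> [2. 1. 1.]
print(normalize(3.0 * g))    # same projective point -> [2. 1. 1.]
```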

Finite and infinite points

All rays $g$ that are not parallel to $\Pi$ intersect it at an affine point $v$ on $\Pi$. A ray $g_\infty$ with $w = 0$ does not intersect $\Pi$; hence $v_\infty$ is not an affine point but a direction. Directions have the coordinates $(x, y, z, 0)^T$. Projective space combines affine space with infinite points (directions).

Affine and projective transformations

An affine transformation leaves infinite points at infinity:

$$M'_\infty = T_{\mathrm{affine}} M_\infty, \qquad \begin{pmatrix} X' \\ Y' \\ Z' \\ 0 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & t_x \\ a_{21} & a_{22} & a_{23} & t_y \\ a_{31} & a_{32} & a_{33} & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 0 \end{pmatrix}$$

A projective transformation moves infinite points into finite affine space:

$$M' = T_{\mathrm{projective}} M_\infty, \qquad \lambda \begin{pmatrix} X_p \\ Y_p \\ Z_p \\ 1 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & t_x \\ a_{21} & a_{22} & a_{23} & t_y \\ a_{31} & a_{32} & a_{33} & t_z \\ w_{41} & w_{42} & w_{43} & w_{44} \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 0 \end{pmatrix}$$

Example: parallel lines intersect at the horizon (the line of infinite points). We can see this intersection because of perspective projection.

Pinhole Camera (Camera obscura)

[Figures: interior of a camera obscura (Sunday Magazine, 1838); a camera obscura (France, 1830).]

Pinhole camera model: light from the object passes through the aperture of the lens along the view direction onto the image sensor. The model is characterized by the focal length $f$ and the image center $(c_x, c_y)$.

Perspective projection

Perspective projection in $P^3$ models the pinhole camera:
- scene geometry is affine $R^3$ space with coordinates $M = (X, Y, Z, 1)^T$,
- the camera focal point (camera center) is at $O = (0, 0, 0, 1)^T$, with the viewing direction along $Z$ (the optical axis),
- the image plane $(x, y)$ in $\Pi(P^2)$ is aligned with $(X, Y)$ at $Z_0 = 1$.

A scene point $M$ projects onto the point $M_p$ on the plane surface:

$$M_p = \begin{pmatrix} x_p \\ y_p \\ Z_0 \\ 1 \end{pmatrix} = \begin{pmatrix} X/Z \\ Y/Z \\ 1 \\ 1 \end{pmatrix}$$

Projective transformation

A projective transformation maps $M$ onto $M_p$ in $P^3$ space:

$$\rho M_p = T_p M, \qquad \rho \begin{pmatrix} x_p \\ y_p \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

with $\rho = Z$ the projective scale factor. The projective transformation linearizes the projection.

Perspective projection (continued)

Dimension reduction from $P^3$ into $P^2$ by projection onto $\Pi(P^2)$:

$$\rho m_p = D_p T_p M = P_0 M, \qquad D_p = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad \rho = Z$$

so the perspective projection from $P^3$ onto $P^2$ is $P_0 = D_p T_p = [I_3 \mid 0]$.

Image plane and image sensor

A sensor with picture elements (pixels) is added onto the image plane:
- pixel coordinates $m = (x, y)^T$,
- pixel scale $f = (f_x, f_y)^T$,
- image center $c = (c_x, c_y)^T$.

Image-sensor mapping: $m = K m_p$. Pixel coordinates are related to image coordinates by the affine transformation $K$ with five parameters:

$$K = \begin{pmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}$$

- the image center $c$ lies on the optical axis,
- the distance $Z_p$ (focal length) and the pixel size determine the pixel resolution $f_x, f_y$,
- the image skew $s$ models the angle between pixel rows and columns.
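A sketch of the sensor mapping $m = K m_p$ in NumPy; all calibration numbers are made-up placeholders:

```python
import numpy as np

# Five-parameter calibration matrix: pixel scale (fx, fy), skew s,
# image center (cx, cy). Values are illustrative only.
fx, fy, s, cx, cy = 800.0, 800.0, 0.0, 320.0, 240.0
K = np.array([[fx,  s, cx],
              [0., fy, cy],
              [0., 0., 1.]])

m_p = np.array([0.1, -0.05, 1.0])   # point on the image plane (x, y, 1)^T
m = K @ m_p                          # homogeneous pixel coordinates
print(m / m[2])                      # -> [400. 200. 1.]
```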

Projection in general pose

Projection: $\rho m_p = P M$, for a camera with rotation $R$ and projection center $C$ given in world coordinates:

$$T_{\mathrm{cam}} = \begin{pmatrix} R & C \\ 0^T & 1 \end{pmatrix}, \qquad T_{\mathrm{scene}} = T_{\mathrm{cam}}^{-1} = \begin{pmatrix} R^T & -R^T C \\ 0^T & 1 \end{pmatrix}$$

Projection matrix P

The camera projection matrix $P$ combines:
- the inverse affine transformation $T_{\mathrm{cam}}^{-1}$ from the general pose to the origin (scene pose transformation $T_{\mathrm{scene}}$),
- the perspective projection $P_0 = [I_3 \mid 0]$ to the image plane at $Z_0 = 1$,
- the affine mapping $K$ from image to sensor coordinates (sensor calibration):

$$\rho m = P M, \qquad P = K P_0 T_{\mathrm{scene}} = K R^T [I \mid -C]$$
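A minimal sketch of this composition, assuming NumPy; $K$, $R$, and $C$ below are arbitrary example values:

```python
import numpy as np

def camera_matrix(K, R, C):
    """3x4 projection matrix P = K R^T [I | -C] for a camera with
    orientation R and projection center C in world coordinates."""
    Rt = R.T
    return K @ np.hstack([Rt, -Rt @ C.reshape(3, 1)])

def project(P, X):
    """Project a 3D point X (3-vector) to pixel coordinates."""
    m = P @ np.append(X, 1.0)   # rho * (x, y, 1)^T
    return m[:2] / m[2]

K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
R = np.eye(3)                     # camera looking along +Z
C = np.array([0., 0., -5.])       # camera 5 units behind the world origin
P = camera_matrix(K, R, C)
print(project(P, np.array([1., 0., 0.])))   # -> [480. 240.]
```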

2-view geometry: the fundamental matrix F

Projection onto two views, with camera 0 placed at the origin ($C_0 = 0$):

$$\rho_0 m_0 = P_0 M = K_0 R_0^T [I \mid 0] M, \qquad \rho_1 m_1 = P_1 M = K_1 R_1^T [I \mid -C_1] M$$

Substituting the first projection into the second eliminates $M$:

$$\rho_1 m_1 = K_1 R_1^T R_0 K_0^{-1} \rho_0 m_0 - K_1 R_1^T C_1 = \rho_0 H m_0 + e_1$$

with the homography $H = K_1 R_1^T R_0 K_0^{-1}$ and the epipole $e_1 = -K_1 R_1^T C_1$, the projection of the center of camera 0 into view 1.

The fundamental matrix F

The projective points $e_1$ and $H m_0$ define a plane in camera 1 (the epipolar plane $\Pi_e$). The epipolar plane intersects the image plane in a line (the epipolar line $u_e$), and the corresponding point $m_1$ lies on that line: $m_1^T u_e = 0$. Since the points $e_1$, $m_1$, $H m_0$ are all collinear, the collinearity constraint applies:

$$m_1^T (e_1 \times H m_0) = 0$$

Writing the cross product with the skew-symmetric matrix

$$[e]_\times = \begin{pmatrix} 0 & -e_z & e_y \\ e_z & 0 & -e_x \\ -e_y & e_x & 0 \end{pmatrix}$$

gives the $3 \times 3$ fundamental matrix $F = [e_1]_\times H$ and the epipolar constraint

$$m_1^T F m_0 = 0$$
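The construction can be checked numerically. A sketch, assuming NumPy; the camera setup is an arbitrary example:

```python
import numpy as np

def skew(e):
    """Cross-product (skew-symmetric) matrix [e]_x."""
    return np.array([[0., -e[2], e[1]],
                     [e[2], 0., -e[0]],
                     [-e[1], e[0], 0.]])

K0 = K1 = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
R0, C0 = np.eye(3), np.zeros(3)                # camera 0 at the origin
R1, C1 = np.eye(3), np.array([1., 0., 0.])     # camera 1 shifted along X

H = K1 @ R1.T @ R0 @ np.linalg.inv(K0)         # homography from the slide
e1 = -K1 @ R1.T @ C1                           # epipole in view 1
F = skew(e1) @ H                               # fundamental matrix

M = np.array([0.5, 0.2, 4.0, 1.0])             # an arbitrary 3D scene point
P0 = K0 @ R0.T @ np.hstack([np.eye(3), -C0.reshape(3, 1)])
P1 = K1 @ R1.T @ np.hstack([np.eye(3), -C1.reshape(3, 1)])
m0, m1 = P0 @ M, P1 @ M
print(m1 @ F @ m0)                             # ~ 0: epipolar constraint holds
```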

Estimation of F from correspondences

Given a set of corresponding points, solve linearly for the 9 elements of $F$ in projective coordinates:
- since the epipolar constraint is homogeneous up to scale, only eight elements are independent,
- since the operator $[e]_\times$, and hence $F$, has rank 2, $F$ has only 7 independent parameters (all epipolar lines intersect at $e$),
- each correspondence gives one collinearity constraint, so $F$ can be solved with a minimum of 7 correspondences,
- for $N > 7$ correspondences, minimize the point-line distances:

$$\sum_{n=1}^{N} (m_{1,n}^T F m_{0,n})^2 \to \min \qquad \text{subject to} \quad \det(F) = 0 \;\; \text{(rank-2 constraint)}$$

The essential matrix E

$F$ is the most general constraint on an image pair. If the camera calibration matrix $K$ is known, then more constraints are available. With normalized coordinates $\tilde m = K^{-1} m$:

$$\tilde m_1^T E \tilde m_0 = 0, \qquad (K_1 \tilde m_1)^T F (K_0 \tilde m_0) = \tilde m_1^T \underbrace{(K_1^T F K_0)}_{E} \tilde m_0, \qquad E = [e]_\times R$$

$E$ holds the relative orientation of a calibrated camera pair. It has 5 degrees of freedom: 3 from the rotation matrix $R$, 2 from the direction of translation $e$, the epipole.
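A sketch of the linear estimate in NumPy, under the usual eight-point formulation: each correspondence contributes one row of a homogeneous system, the null vector gives $F$, and the rank-2 constraint is enforced by zeroing the smallest singular value. Inputs are $N \times 3$ homogeneous pixel points:

```python
import numpy as np

def estimate_F(m0, m1):
    """Linear (eight-point style) estimate of F from homogeneous
    correspondences m0, m1 (arrays of shape Nx3, N >= 8)."""
    # One row per correspondence: kron(m1, m0) . vec(F) = m1^T F m0 = 0
    A = np.array([np.kron(p1, p0) for p0, p1 in zip(m0, m1)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)      # null vector, reshaped row-major
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                    # enforce det(F) = 0 (rank 2)
    return U @ np.diag(S) @ Vt
```

In practice the pixel coordinates should first be rescaled (Hartley's normalization) for numerical conditioning, and the linear estimate is then refined nonlinearly on the inliers.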

Estimation of P from E

From $E$ we can obtain a camera projection matrix pair. With the SVD $E = U \,\mathrm{diag}(1, 1, 0)\, V^T$, set $P_0 = [I_{3 \times 3} \mid 0_3]$; there are four choices for $P_1$:

$$P_1 = [U W V^T \mid +u_3] \quad \text{or} \quad [U W V^T \mid -u_3] \quad \text{or} \quad [U W^T V^T \mid +u_3] \quad \text{or} \quad [U W^T V^T \mid -u_3]$$

with $u_3$ the third column of $U$ and

$$W = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Of the four possible configurations, only one places the reconstructed 3D points in front of both cameras.

3D feature reconstruction

A corresponding point pair $(m_0, m_1)$ is projected from a 3D feature point $M$, and $M$ is reconstructed from $(m_0, m_1)$ by triangulation: $M$ lies where the distance $d$ between the two viewing rays is minimal, with the constraints $l_0^T d = 0$ and $l_1^T d = 0$ (the connecting segment is perpendicular to both rays). Alternatively, minimize the reprojection error:

$$(m_0 - P_0 M)^2 + (m_1 - P_1 M)^2 \to \min$$
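A sketch of the four-fold ambiguity in NumPy; the sign fixes keep the candidate rotations proper:

```python
import numpy as np

def poses_from_E(E):
    """Enumerate the four candidate (R, t) pairs from an essential matrix.
    A full pipeline would triangulate a point with each candidate and keep
    the one that yields positive depth in both cameras."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:      # ensure proper rotations (det = +1)
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0., -1., 0.],
                  [1., 0., 0.],
                  [0., 0., 1.]])
    u3 = U[:, 2]
    return [(U @ W @ Vt, u3), (U @ W @ Vt, -u3),
            (U @ W.T @ Vt, u3), (U @ W.T @ Vt, -u3)]
```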

Multi-view tracking

- 2D match: image correspondence $(m_i, m_{i+1})$,
- 3D match: correspondence transfer $(m_i, M)$ via $P_i$,
- 3D pose estimation of $P_i$ with $\| m_i - P_i M \| \to \min$.

Minimize the global reprojection error over all $N$ views and $K$ points:

$$\sum_{i=1}^{N} \sum_{k=1}^{K} \| m_{k,i} - P_i M_k \|^2 \to \min$$

Correspondences: matching vs. tracking

Image-to-image correspondences are essential to 3D reconstruction. Two strategies (sketched in code below):
- KLT tracker: extract features in the first image and find the same features back in the next view [Lucas & Kanade 1981], [Shi & Tomasi 1994]; differences between consecutive frames are small, but the accumulated difference can be large.
- SIFT matcher: extract features independently in each image and then match them by comparing descriptors [Lowe 2004].
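A sketch of both strategies, assuming OpenCV (version 4.4 or later, where SIFT is in the main module) and two grayscale frames on disk; the file names are placeholders:

```python
import cv2

img0 = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Tracking (KLT): detect once, follow the features into the next frame.
p0 = cv2.goodFeaturesToTrack(img0, maxCorners=500, qualityLevel=0.01,
                             minDistance=7)            # [Shi & Tomasi 1994]
p1, status, err = cv2.calcOpticalFlowPyrLK(img0, img1, p0, None)

# Matching (SIFT): detect independently, compare descriptors. [Lowe 2004]
sift = cv2.SIFT_create()
kp0, des0 = sift.detectAndCompute(img0, None)
kp1, des1 = sift.detectAndCompute(img1, None)
pairs = cv2.BFMatcher().knnMatch(des0, des1, k=2)
good = [p[0] for p in pairs                            # Lowe's ratio test
        if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
```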

SIFT detector

A scale- and image-plane-rotation-invariant feature descriptor [Lowe 2004]: image content is transformed into local feature coordinates that are invariant to translation, rotation, scale, and other imaging parameters.

Difference of Gaussians for scale invariance

A difference-of-Gaussian (DoG) pyramid, built by blurring the image with Gaussians at a constant ratio of scales and subtracting neighboring levels, is a close approximation to Lindeberg's scale-normalized Laplacian [Lindeberg 1998].
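A minimal sketch of the DoG construction, assuming NumPy and SciPy; $\sigma_0 = 1.6$ and $k = \sqrt{2}$ are typical choices:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_stack(img, sigma0=1.6, k=np.sqrt(2.0), levels=5):
    """Blur with a constant ratio k between successive scales and subtract
    neighboring levels: each difference approximates the scale-normalized
    Laplacian at that scale."""
    blurred = [gaussian_filter(img.astype(float), sigma0 * k ** i)
               for i in range(levels)]
    return [b1 - b0 for b0, b1 in zip(blurred, blurred[1:])]

dogs = dog_stack(np.random.rand(64, 64))   # 4 DoG layers for a test image
```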

Keypoint localization

Blur and subtract to build the DoG stack, then detect the maxima and minima of the difference-of-Gaussian in scale space. Fit a quadratic to the surrounding values for sub-pixel and sub-scale interpolation (Brown & Lowe, 2002), using the Taylor expansion around the sample point:

$$D(\mathbf{x}) = D + \frac{\partial D}{\partial \mathbf{x}}^T \mathbf{x} + \frac{1}{2} \mathbf{x}^T \frac{\partial^2 D}{\partial \mathbf{x}^2} \mathbf{x}$$

The offset of the extremum follows by setting the derivative to zero (use finite differences for the derivatives):

$$\hat{\mathbf{x}} = -\left( \frac{\partial^2 D}{\partial \mathbf{x}^2} \right)^{-1} \frac{\partial D}{\partial \mathbf{x}}$$

Orientation normalization

A histogram of local gradient directions (over $0 \ldots 2\pi$) is computed at the selected scale, and the principal orientation is assigned at the peak of the smoothed histogram. Each key then specifies stable 2D coordinates ($x$, $y$, scale, orientation).
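A sketch of the refinement in one dimension (one axis of the scale-space stack); the full method applies the same formula jointly in $x$, $y$, and scale:

```python
def subpixel_offset(d_minus, d0, d_plus):
    """Offset of the extremum of a quadratic fitted through three samples,
    via finite differences: x_hat = -g / h."""
    g = 0.5 * (d_plus - d_minus)       # central-difference gradient
    h = d_plus - 2.0 * d0 + d_minus    # second-difference Hessian
    return -g / h

print(subpixel_offset(1.0, 2.0, 1.5))  # ~0.167: peak shifted toward d_plus
```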

Example of keypoint detection

Threshold on the value at the DoG peak and on the ratio of principal curvatures (Harris approach):
(a) 233x189 image
(b) 832 DoG extrema
(c) 729 left after the peak-value threshold
(d) 536 left after testing the ratio of principal curvatures
(courtesy of D. Lowe)

SIFT vector formation

Thresholded image gradients are sampled over a 16x16 array of locations in scale space to create an array of orientation histograms: 8 orientations x a 4x4 histogram array = 128 dimensions. (The slide illustrates the construction with a smaller 2x2 histogram array.) [Lowe 2004]

[Figure: SIFT feature detections on an example image.]

Robust data selection: RANSAC

Estimation of a plane from point data {potential plane points}:
1. Select m samples.
2. Compute the n-parameter solution.
3. Evaluate it on {potential plane points}, giving {inlier samples}.
4. Best solution so far? If yes, keep it.
5. Repeat from step 1 until $1 - \left(1 - \left(\#\text{inliers} / |\{\text{potential plane points}\}|\right)^n\right)^{\text{steps}} > 0.99$, then return the best solution, {inliers}, {outliers}.
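A sketch of this loop for plane fitting, assuming NumPy; the threshold and confidence values are illustrative:

```python
import numpy as np

def ransac_plane(points, threshold=0.01, confidence=0.99, max_iters=1000):
    """RANSAC plane fit: sample minimal sets of 3 points, count inliers by
    point-plane distance, and stop once the confidence bound from the slide
    exceeds `confidence`. Returns a boolean inlier mask."""
    rng = np.random.default_rng()
    best = np.zeros(len(points), dtype=bool)
    n = 3                                        # minimal sample for a plane
    for steps in range(1, max_iters + 1):
        s = points[rng.choice(len(points), n, replace=False)]
        normal = np.cross(s[1] - s[0], s[2] - s[0])
        if np.linalg.norm(normal) < 1e-12:
            continue                             # degenerate (collinear) sample
        normal /= np.linalg.norm(normal)
        inliers = np.abs((points - s[0]) @ normal) < threshold
        if inliers.sum() > best.sum():
            best = inliers
        w = best.sum() / len(points)             # inlier ratio so far
        if 1.0 - (1.0 - w ** n) ** steps > confidence:
            break
    return best

pts = np.vstack([np.column_stack([np.random.rand(80, 2), np.zeros(80)]),
                 np.random.rand(20, 3)])         # plane z = 0 plus outliers
print(ransac_plane(pts).sum())                   # ~80 inliers recovered
```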

RANSAC: evaluating hypotheses

Each plane hypothesis is evaluated with a robust cost function of the residuals: a residual $\varepsilon$ below a threshold contributes quadratically, while larger residuals contribute only a bounded penalty, so outliers cannot dominate the score. [Table: residuals $\varepsilon_1 \ldots \varepsilon_n$ of every potential point under each plane hypothesis.]

Robust pose estimation with a calibrated camera

Bootstrap from {2D-2D correspondences}: RANSAC with the 5-point algorithm estimates $E$; from $E$, estimate $P_0, P_1$; nonlinear refinement with all inliers yields $P_0, P_1$; triangulating points then gives known 3D points. For each further view, use the {2D-3D correspondences}: RANSAC with the 3-point algorithm estimates $P$, and nonlinear refinement with all inliers yields $P$.
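A sketch of this bootstrap using OpenCV's equivalents of the named solvers (essential-matrix RANSAC, then PnP RANSAC); synthetic noise-free data stands in for real feature matches, and all numbers are illustrative:

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
X = rng.uniform([-1, -1, 4], [1, 1, 8], (50, 3))   # random 3D points
C1 = np.array([1.0, 0.0, 0.0])                     # center of camera 1

def proj(Xc):
    """Pinhole projection of Nx3 camera-frame points to pixels."""
    return (Xc[:, :2] / Xc[:, 2:]) @ K[:2, :2].T + K[:2, 2]

pts0 = proj(X)            # view 0: camera at the origin, R = I
pts1 = proj(X - C1)       # view 1: camera translated by C1

# 2D-2D bootstrap: essential-matrix RANSAC (5-point solver inside OpenCV),
# then the pose that puts the triangulated points in front of both cameras.
E, mask = cv2.findEssentialMat(pts0, pts1, K, method=cv2.RANSAC,
                               prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts0, pts1, K)

# 2D-3D stage: RANSAC PnP for a later view (here simply view 1 again).
ok, rvec, tvec, inliers = cv2.solvePnPRansac(X, pts1, K, None)
```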

References

[Lowe 2004] David Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," IJCV, 60(2), 2004, pp. 91-110.
[Lucas & Kanade 1981] Bruce D. Lucas and Takeo Kanade, "An Iterative Image Registration Technique with an Application to Stereo Vision," Proceedings of the International Joint Conference on Artificial Intelligence, 1981.
[Shi & Tomasi 1994] Jianbo Shi and Carlo Tomasi, "Good Features to Track," IEEE Conference on Computer Vision and Pattern Recognition, 1994.
[Baker & Matthews 2004] S. Baker and I. Matthews, "Lucas-Kanade 20 Years On: A Unifying Framework," International Journal of Computer Vision, 56(3):221-255, March 2004.
[Fischler & Bolles 1981] M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," CACM, 24(6), June 1981.
[Mikolajczyk 2003] K. Mikolajczyk and C. Schmid, "A Performance Evaluation of Local Descriptors," CVPR 2003.
[Lindeberg 1998] T. Lindeberg, "Feature Detection with Automatic Scale Selection," International Journal of Computer Vision, vol. 30, no. 2, 1998.
[Hartley 2003] R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd edition, Cambridge University Press, 2003.