Introduction to 3D Reconstruction and Stereo Vision Francesco Isgrò 3D Reconstruction and Stereo p.1/37

What are we going to see today?
- Shape from X
- Range data
- 3D sensors
- Active triangulation
- Stereo vision
  - epipolar constraint
  - fundamental matrix
- more and more in the next classes...

Range Data
- Range images are a special class of digital images.
- Each pixel expresses the distance of a visible point in the scene from a known reference frame.
- A range image reproduces the 3D structure of a scene.
- It is best thought of as a sampled surface.

Representation of range data
Range data can be represented in two basic forms:
- xyz form, or cloud of points
  - a list of 3D coordinates in a given reference frame
  - no specific order is required
- $r_{ij}$ form
  - a matrix of depth values of points along the directions of the x and y image axes
  - the points follow a specific order, given by their x and y indices
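
The two forms can be related with a small sketch. It assumes, purely for illustration, that the $r_{ij}$ matrix is sampled on a regular grid with spacings `dx` and `dy`, and that a missing measurement is marked with `None`; both choices are hypothetical and not part of the slides.

```python
def range_image_to_cloud(r, dx=1.0, dy=1.0):
    """Convert an r_ij depth matrix into an unordered list of (x, y, z) points.

    Assumes a regular sampling grid: column j lies at x = j * dx and
    row i lies at y = i * dy (an illustrative assumption).
    """
    cloud = []
    for i, row in enumerate(r):            # i indexes the y direction
        for j, depth in enumerate(row):    # j indexes the x direction
            if depth is None:              # hypothetical "no measurement" marker
                continue
            cloud.append((j * dx, i * dy, depth))
    return cloud


r = [[2.0, 2.1, None],
     [2.0, 2.2, 2.4]]
cloud = range_image_to_cloud(r, dx=0.5, dy=0.5)
print(len(cloud), cloud[-1])
```

Going from the $r_{ij}$ form to the xyz form discards the grid ordering, which is why the reverse conversion is not possible in general without resampling.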

Range sensors
A range sensor is a device used to acquire range data.
Range sensors may measure:
- depth at one point only
- shape or surface profiles
- full surfaces

We distinguish between
- active range sensors: project energy (e.g. light, sonar pulses) onto the scene and detect its position to perform the measurement; or exploit the effects of controlled changes of some sensor parameters (e.g. focus)
- passive range sensors: rely only on image intensities to perform the measurement

Active
- Radar and sonar
- Moiré interferometry
- Focusing/defocusing
- Zoom
- Active triangulation
- Motion

Passive
- Stereopsis
- Motion
- Contours
- Texture
- Shading

Active triangulation
A beam of light strikes the surface of the scene, and some of the light bounces toward a sensor (camera).
The centre of the imaged reflection is triangulated against the laser line of sight.
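
[Figure: active triangulation geometry. A light projector emits a plane of light at angle θ; the camera, with focal length f, sits at baseline distance b from the projector; the scene point P(X, Y, Z) is imaged at (x, y).]

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \frac{b}{f\cot\theta - x}\begin{bmatrix} x \\ y \\ f \end{bmatrix}$$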
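
As a worked instance of the triangulation relation $[X, Y, Z]^T = \frac{b}{f\cot\theta - x}[x, y, f]^T$, the following sketch computes the 3D position of one stripe point. The function name and the numeric values are illustrative assumptions, not taken from the slides.

```python
import math


def triangulate_stripe_point(x, y, f, b, theta):
    """Return (X, Y, Z) for an image point (x, y) lit by the light plane.

    f: focal length, b: camera-projector baseline, theta: angle of the
    light plane (radians). All lengths are in the same unit.
    """
    denom = f / math.tan(theta) - x        # f * cot(theta) - x
    if abs(denom) < 1e-12:
        raise ValueError("line of sight is parallel to the light plane")
    scale = b / denom
    return (scale * x, scale * y, scale * f)


# Illustrative values: unit focal length and baseline, 45 degree light plane.
X, Y, Z = triangulate_stripe_point(x=0.5, y=0.2, f=1.0, b=1.0, theta=math.pi / 4)
print(X, Y, Z)   # approximately (1.0, 0.4, 2.0)
```

The denominator shows why accuracy degrades when the line of sight becomes nearly parallel to the plane of light: small errors in x are then strongly amplified.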

The stripe points in the images must be easy to identify.
Typical solutions:
- projecting laser light, making the stripe brighter than the rest of the image (concavities may create reflections)
- projecting a black line onto a matte white or grey object (its location may be confused with other dark patches, e.g. shadows)

The IMPACT system

Stereopsis
Stereo vision refers to the ability to infer information on the 3D structure and distance of a scene from two (or more) images taken from different viewpoints.
A stereo system must solve two main problems:
- Finding correspondences: which parts in the left and right images are projections of the same scene element
- Reconstruction: determining the 3D structure from the correspondences
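
A simple stereo system
[Figure: (a) two scene points P and Q and the ambiguous matches P', Q' seen from the left and right images $I_l$ and $I_r$; (b) a simple stereo rig with optical centres $O_l$ and $O_r$ separated by the baseline T, focal length f, principal points $c_l$ and $c_r$, and image points $p_l$, $p_r$ with coordinates $x_l$, $x_r$.]

$$Z = f\,\frac{T}{d}, \qquad d = x_r - x_l$$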
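
A minimal sketch of the depth relation $Z = fT/d$ with $d = x_r - x_l$, using the sign convention of the slide. The function name and the numbers (focal length in pixels, baseline in metres) are illustrative assumptions.

```python
def depth_from_disparity(x_l, x_r, f, T):
    """Depth Z of a point from its image coordinates in a simple stereo rig.

    x_l, x_r: horizontal image coordinates measured from the principal
    points c_l and c_r; f: focal length; T: baseline.
    """
    d = x_r - x_l                      # disparity, as defined on the slide
    if d == 0:
        raise ValueError("zero disparity: the point is at infinity")
    return f * T / d


# Illustrative values: f = 700 pixels, T = 0.12 m, disparity of 21 pixels.
Z = depth_from_disparity(x_l=10.0, x_r=31.0, f=700.0, T=0.12)
print(Z)   # close to 4.0 (metres)
```

Depth is inversely proportional to disparity, so a fixed matching error of one pixel costs much more accuracy for distant points than for near ones.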

Parameters of a stereo system
- Intrinsic parameters
  - characterise the mapping of an image point from camera to pixel coordinates in each camera
  - two full-rank 3×3 matrices $A$ and $A'$
- Extrinsic parameters
  - describe the relative position and orientation of the two cameras
  - a rotation matrix $R$ and a translation vector $T$

The two projection matrices can be written as
$$\tilde{Q} = A[I;\, 0], \qquad \tilde{Q}' = A'[R;\, T].$$
In the uncalibrated case as
$$\tilde{Q} = [I;\, 0], \qquad \tilde{Q}' = [Q';\, \tilde{q}'].$$

Homography of a plane
Given a plane τ in space:
$$H : \tau \to \pi, \qquad H' : \tau \to \pi', \qquad H_\tau : \pi \to \pi'$$

In the case of Euclidean cameras, if τ = (n, d):
$$H_\tau = A'\left(R - \frac{t\,n^T}{d}\right)A^{-1}$$
The homography of the plane at infinity:
$$H_\infty = \lim_{d\to\infty} A'\left(R - \frac{t\,n^T}{d}\right)A^{-1} = A' R A^{-1}$$

Epipolar geometry
[Figure: epipolar geometry. Image planes π and π', optical centres O and O', scene point P with images p and p', epipoles e and e', the epipolar plane and the two epipolar lines.]
- an epipolar line is the image of an optical ray
- an epipole is the image in one camera of the centre of the other camera
- all epipolar lines intersect in the epipole

Fundamental matrix
The relation F associating each point p on π with its corresponding epipolar line $\lambda_p$ is projective linear.
The matrix governing this relation is called the fundamental matrix:
$$F = [\tilde{e}']_\times\, Q' Q^{-1}$$

Given two corresponding points p and p':
- the epipolar line $\lambda_p$ is $Fp$
- p' is on $\lambda_p$
- therefore $p^T F^T p' = 0$, i.e. $p'^T F p = 0$

Properties of the fundamental matrix
- F is rank-2
- it has 7 degrees of freedom
- F encodes the geometry of two cameras
- $F = [\tilde{e}']_\times H_\tau$
- $\lambda_{p'} = F^T p'$
- $F\tilde{e} = 0$ and $F^T\tilde{e}' = 0$

The fundamental matrix in the calibrated case
$$F = [A' t]_\times A' R A^{-1} = A'^{-T} [t]_\times R A^{-1}$$
(the second equality holds up to a scale factor).
Introducing the essential matrix
$$E = [t]_\times R$$
$$F = A'^{-T} E A^{-1}$$
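
The calibrated-case relations $E = [t]_\times R$ and $F = A'^{-T} E A^{-1}$ can be checked numerically. The sketch below uses NumPy; the intrinsic matrices, rotation, translation and scene point are made-up values, and the second camera is assumed to see a point P of the first camera frame at $RP + t$.

```python
import numpy as np


def skew(v):
    """Matrix [v]_x such that skew(v) @ w equals the cross product v x w."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])


def fundamental_from_calibration(A, A_prime, R, t):
    """Return (F, E) with E = [t]_x R and F = A'^-T E A^-1."""
    E = skew(t) @ R
    F = np.linalg.inv(A_prime).T @ E @ np.linalg.inv(A)
    return F, E


# Made-up calibration and relative pose (illustrative values only).
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
A_prime = A.copy()
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s],
              [0.0, 1.0, 0.0],
              [-s, 0.0, c]])                 # small rotation about the y axis
t = np.array([-0.5, 0.0, 0.05])

F, E = fundamental_from_calibration(A, A_prime, R, t)

# Project one scene point in both cameras and evaluate p'^T F p.
P = np.array([0.3, -0.2, 4.0])               # point in the first camera frame
p = A @ P                                    # homogeneous pixel coordinates
p_prime = A_prime @ (R @ P + t)
residual = float(p_prime @ F @ p)
print(residual)                              # zero up to rounding error
```

The residual vanishes for any scene point, which is the epipolar constraint expressed in pixel coordinates.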

Estimation of the fundamental matrix
The fundamental matrix can be computed once a set of corresponding points $(p_i, p_i')$ among the images has been determined.
- Linear solution
- Least Squares
- Non-linear estimation
- Robust estimation

Linear solution
- F is 3×3, determined up to a scale factor.
- We know that for corresponding points $p_i'^T F p_i = 0$.
- 8 correspondences are enough to build a linear system of 8 equations in 9 unknowns, $Af = 0$, where f is the vector of the entries of F taken row by row.
- Each row of A is
  $$[x_i x_i',\; y_i x_i',\; x_i',\; x_i y_i',\; y_i y_i',\; y_i',\; x_i,\; y_i,\; 1]$$

- Coordinates of corresponding points are not exact.
- We use more than 8 points in order to minimise the effect of the error in the coordinates.
- We solve $Af = 0$, where A is n×9.
- It is solved by using the Singular Value Decomposition of A.

Singular Value Decomposition
Any m×n matrix A (with m ≥ n) can be written as
$$A = U W V^T$$
- U is m×n and has orthonormal columns
- W is an n×n non-negative diagonal matrix
- V is n×n and orthogonal
- the entries $w_j$ on the diagonal of W are called singular values
- the columns of V corresponding to $w_j = 0$ span the null space of A

Solution by SVD
- A must be rank-8, as the null space must be of dimension 1.
- Because of the noise corrupting the coordinates of the points, in general A is full rank.
- The solution is the column vector $v_j$ of V corresponding to the smallest singular value $w_j$.
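
The linear solution can be sketched with NumPy as follows. The design-matrix rows follow the ordering $[x x', y x', x', x y', y y', y', x, y, 1]$, so the solution vector stacks the entries of F row by row. The synthetic scene, rotation and translation are made-up values used only to check the result, and no data normalisation or rank-2 correction is applied here.

```python
import numpy as np


def build_design_matrix(p, p_prime):
    """Stack one row per correspondence; p and p_prime are (n, 2) arrays."""
    x, y = p[:, 0], p[:, 1]
    xp, yp = p_prime[:, 0], p_prime[:, 1]
    return np.column_stack([x * xp, y * xp, xp,
                            x * yp, y * yp, yp,
                            x, y, np.ones(len(p))])


def estimate_F_linear(p, p_prime):
    """Solve A f = 0 in the least-squares sense via SVD (n >= 8 points)."""
    A = build_design_matrix(p, p_prime)
    _, _, Vt = np.linalg.svd(A)
    # Rows of Vt are the columns of V; the last one belongs to the
    # smallest singular value.
    return Vt[-1].reshape(3, 3)


# Synthetic, noise-free check with made-up geometry.
rng = np.random.default_rng(0)
P = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([-0.5, 0.02, 0.05])
P2 = P @ R.T + t                         # points in the second camera frame
p = P[:, :2] / P[:, 2:3]                 # normalised image coordinates
p_prime = P2[:, :2] / P2[:, 2:3]

F = estimate_F_linear(p, p_prime)
hp = np.column_stack([p, np.ones(len(p))])
hpp = np.column_stack([p_prime, np.ones(len(p_prime))])
residuals = np.abs(np.sum((hpp @ F) * hp, axis=1))   # |p'^T F p| per point
print(residuals.max())
```

Because F is only defined up to scale, the SVD returns it with unit Frobenius norm and an arbitrary sign.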

Data normalisation
The linear algorithm performs badly when pixel coordinates are used, because A is badly conditioned.
Performance is improved by normalising the data in both images:
- translate the data so that the centroid is the origin
- scale the coordinates so that the average distance from the origin is $\sqrt{2}$
In this way the average point is $[1, 1, 1]^T$.
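
A possible sketch of the normalisation step with NumPy. It returns the normalised points together with the 3×3 similarity that maps homogeneous pixel coordinates to normalised ones; the function name and the sample points are illustrative assumptions.

```python
import numpy as np


def normalise_points(pts):
    """Translate the centroid to the origin and scale so that the average
    distance from the origin is sqrt(2).

    pts: (n, 2) array. Returns (normalised points, 3x3 transformation).
    """
    centroid = pts.mean(axis=0)
    centred = pts - centroid
    mean_dist = np.sqrt((centred ** 2).sum(axis=1)).mean()
    s = np.sqrt(2.0) / mean_dist
    T = np.array([[s, 0.0, -s * centroid[0]],
                  [0.0, s, -s * centroid[1]],
                  [0.0, 0.0, 1.0]])
    return centred * s, T


# Made-up pixel coordinates.
pts = np.array([[100.0, 200.0], [400.0, 50.0], [320.0, 240.0], [10.0, 470.0]])
normalised, T = normalise_points(pts)
print(normalised.mean(axis=0), np.linalg.norm(normalised, axis=1).mean())
```

If the two images are normalised with transformations $T$ and $T'$ and the linear algorithm returns $F_n$, the matrix for the original pixel coordinates is recovered as $F = T'^T F_n T$.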

Correcting F
When F is computed linearly, the rank-2 condition does not hold in general.
F can be replaced by the closest rank-2 matrix F' (in the Frobenius norm) using SVD:
$$F = U\,\mathrm{diag}(r, s, t)\,V^T, \qquad r \ge s \ge t$$
Set
$$F' = U\,\mathrm{diag}(r, s, 0)\,V^T$$
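
The rank-2 correction takes a few lines with NumPy. In this sketch the input matrix is random, standing in for a linearly estimated F.

```python
import numpy as np


def enforce_rank2(F):
    """Return the closest rank-2 matrix to F in the Frobenius norm."""
    U, w, Vt = np.linalg.svd(F)      # singular values in decreasing order
    w[2] = 0.0                       # drop the smallest singular value
    return U @ np.diag(w) @ Vt


rng = np.random.default_rng(1)
F = rng.normal(size=(3, 3))          # stand-in for a full-rank estimate
F2 = enforce_rank2(F)

dropped = np.linalg.svd(F, compute_uv=False)[2]
print(np.linalg.norm(F - F2), dropped)   # the two values agree
```

The Frobenius distance between the two matrices equals the discarded singular value, so the correction is small whenever the linear estimate is already close to singular.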

Linear regression
Let us suppose that the following set of n observations is given:
$$\begin{array}{c|ccccc}
y_1 & x_{11} & x_{12} & \cdots & x_{1,p-1} & x_{1p} \\
y_2 & x_{21} & x_{22} & \cdots & x_{2,p-1} & x_{2p} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
y_n & x_{n1} & x_{n2} & \cdots & x_{n,p-1} & x_{np}
\end{array}$$
We assume that, for each $i = 1, \dots, n$,
$$y_i = x_{i1}\theta_1 + \dots + x_{ip}\theta_p + e_i$$
where $e_i$ is the error term, normally distributed with mean zero and unknown standard deviation σ.

- An estimator tries to determine the vector θ of the model parameters.
- The estimated values $\hat\theta$ are called regression coefficients.
- The values $\hat{y}_i = x_{i1}\hat\theta_1 + \dots + x_{ip}\hat\theta_p$ are called predicted values of $y_i$, and the values $r_i = y_i - \hat{y}_i$ are called residuals.
- Usually the estimator determines the vector θ which minimises a function of the residuals.

Maximum likelihood estimator
We want to maximise the probability that a set of parameters generates the data.
Assuming Gaussian noise, the probability is
$$P \propto \prod_i \exp\left(-\frac{1}{2}\,\frac{r_i^2}{\sigma^2}\right)\Delta y$$
where $\Delta y$ is a fixed small interval around each observation.

Maximising it is equivalent to minimising the negative of its logarithm
$$\sum_i \frac{r_i^2}{2\sigma^2} - n\log\Delta y$$
Since n, σ and $\Delta y$ are constants, this amounts to minimising
$$\sum_i \frac{r_i^2}{\sigma^2}$$
This method is called Least Squares.

Least Squares and SVD
- In general, SVD can be used to solve overdetermined systems $Ax = b$.
- It can be shown that with SVD we find the vector $\hat{x}$ solving
  $$\min_x \|Ax - b\|^2$$
- SVD gives an optimal solution in a Least Squares sense.
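
A short sketch of least squares through SVD with NumPy, assuming A has full column rank so that every singular value is non-zero. The line-fitting data are made up, and `np.linalg.lstsq` is used only as a reference to compare against.

```python
import numpy as np


def lstsq_svd(A, b):
    """Least-squares solution of A x = b via the thin SVD A = U W V^T.

    Assumes full column rank (no zero singular values).
    """
    U, w, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ ((U.T @ b) / w)        # x = V W^-1 U^T b


# Fit y = a x + c to slightly perturbed samples of y = 2 x + 1.
x = np.arange(6.0)
y = 2.0 * x + 1.0 + np.array([0.1, -0.1, 0.05, -0.05, 0.02, -0.02])
A = np.column_stack([x, np.ones_like(x)])

x_hat = lstsq_svd(A, y)
x_ref = np.linalg.lstsq(A, y, rcond=None)[0]
print(x_hat, x_ref)
```

When some singular values are zero or very small, the usual remedy is to replace their reciprocals by zero instead of dividing, which this sketch deliberately leaves out.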

Non-linear methods
Better results can be obtained by using the linear method as an initial estimate for an iterative process with a different cost function.
- Minimising the distance from the epipolar lines
  $$\min_F \sum_i \left( d^2(p_i', F p_i) + d^2(p_i, F^T p_i') \right)$$
- Minimising the distance from the reprojected points
  $$\min_{F,\,P_{F,i}} \sum_i \left( \|p_i - \tilde{Q}_F P_{F,i}\|^2 + \|p_i' - \tilde{Q}_F' P_{F,i}\|^2 \right)$$
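
The first cost function, the sum of squared distances of each point from the epipolar line of its match, can be evaluated as in the NumPy sketch below. Only the cost is shown; the iterative minimisation itself is left to whatever optimiser is available. The example F corresponds to a pure translation along the x axis, an illustrative choice in which matching points must share the same y coordinate.

```python
import numpy as np


def symmetric_epipolar_cost(F, p, p_prime):
    """Sum over i of d^2(p'_i, F p_i) + d^2(p_i, F^T p'_i).

    p, p_prime: (n, 2) arrays of corresponding points.
    """
    hp = np.column_stack([p, np.ones(len(p))])
    hpp = np.column_stack([p_prime, np.ones(len(p_prime))])
    lines_second = hp @ F.T            # row i is F p_i (line in image 2)
    lines_first = hpp @ F              # row i is F^T p'_i (line in image 1)
    algebraic = np.sum(hpp * lines_second, axis=1)      # p'_i^T F p_i
    d2_second = algebraic ** 2 / (lines_second[:, 0] ** 2 + lines_second[:, 1] ** 2)
    d2_first = algebraic ** 2 / (lines_first[:, 0] ** 2 + lines_first[:, 1] ** 2)
    return float(np.sum(d2_second + d2_first))


# F = [t]_x for t = (1, 0, 0): epipolar lines are the image rows.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
p = np.array([[10.0, 5.0], [20.0, 7.0]])
p_prime = np.array([[3.0, 5.0], [12.0, 9.0]])    # second match is 2 rows off
cost = symmetric_epipolar_cost(F, p, p_prime)
print(cost)   # 8.0: the second pair contributes 2^2 in each image
```

Unlike the algebraic residual $p'^T F p$ minimised by the linear method, this cost is measured in image units, so it does not change when F is rescaled.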

Is Least Squares enough?
- Given a set of data, we call outliers those points which deviate from the distribution followed by the majority of the data.
- For these points the distribution of the residuals is different from the one assumed for the error term.
- Least Squares assumes the noise is Gaussian.
- Even if only one data point does not follow this assumption, the estimate can be far from the correct value.
- In our case of point correspondences we can have two types of outliers, due to
  - bad locations
  - false matches
- Methods based on robust statistics do exist.
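
The effect of a single outlier on Least Squares can be seen on a toy line-fitting problem. The data below are made up: ten exact samples of y = 2x + 1, one of which is then replaced by a gross error.

```python
import numpy as np

x = np.arange(10.0)
y_clean = 2.0 * x + 1.0

y_outlier = y_clean.copy()
y_outlier[9] = 100.0                     # one grossly wrong observation

# np.polyfit with degree 1 performs a least-squares line fit and
# returns [slope, intercept].
slope_clean, intercept_clean = np.polyfit(x, y_clean, 1)
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, 1)

print(slope_clean, slope_outlier)        # about 2.0 versus about 6.4
```

One bad point out of ten is enough to triple the estimated slope, which is the behaviour that motivates robust estimation for point correspondences containing false matches.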