Feature Descriptors, Detection and Matching


Goal: Encode distinctive local structure at a collection of image points, for matching between images despite modest changes in viewing conditions (changes in scale, orientation, contrast, etc.).

Key Issues:
- Feature point detection
- Feature descriptors

Potential Applications:
- Image panoramas (image matching and registration)
- Long-range motion (tracking by detection)
- Stereoscopic vision / 3D reconstruction
- Object recognition

Readings:
- D. Lowe (2004) Distinctive image features from scale-invariant keypoints. IJCV, 60(2): 91-110.
- K. Mikolajczyk and C. Schmid (2004) Scale and affine invariant interest point detectors. IJCV, 60(1): 63-86.
- M. Brown and D. Lowe (2007) Automatic panoramic image stitching using invariant features. IJCV, 74(1): 59-73.

Matlab Tutorials: SIFTtutorial/tutorial.m (utvis)

(CSC2503: Feature Descriptors, Detection and Matching, © D.J. Fleet and A.D. Jepson, 2011)

Panoramic Images Using Local Features

1) Detect local features in each of the input images.
2) Find corresponding feature pairs in different images.
3) Use the feature correspondences to align the images.

Local Image Features

Idea: Encode the image structure in spatial neighbourhoods (i.e., what the image patch looks like) at a set of feature points chosen at selected scales/orientations (i.e., where to do the encoding).

What's a good feature?
- Locality: Small regions are less sensitive to view-dependent image deformations, and other parts of the object can be occluded.
- Pose invariance: The feature-point detector can select a canonical position, scale and orientation for subsequent matching.
- Distinctiveness: The feature descriptors should permit a high detection rate (0.7-0.9) at a low false positive rate (e.g., 10^{-3}).
- Repeatability: We should be able to detect the same points despite changes in viewing conditions.

Applications: 3D reconstruction from multiple views, motion tracking, object recognition, image retrieval, robot navigation, etc.

Feature Points in Laplacian Scale-Space

Consider extremal points in the magnitude of the difference-of-Gaussian (DOG) filtered images (i.e., in the Laplacian pyramid):

    DOG(\vec{x}, \sigma) \equiv [G(\vec{x}, \sigma) - G(\vec{x}, \rho\sigma)] * I(\vec{x}),

where ρ > 1 is the spacing of adjacent scales (typically ρ is 2^{1/4} or 2^{1/3}). That is, find locations x and scales σ at which |DOG(x, σ)| is maximal.

Remarks:
- Contrast changes scale DOG(x, σ), but do not move its extremal points.
- Extremal points are roughly covariant with scale and translation, and independent of orientation (about the feature point).
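To make this concrete, here is a minimal Python sketch of DOG extrema detection, assuming a grayscale float image in [0, 1] and SciPy; the function name and the parameter defaults (sigma0, rho, the contrast threshold) are illustrative choices, not those of any particular implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_extrema(img, sigma0=1.6, rho=2**0.25, n_scales=8, thresh=0.02):
    # Blur the image at a geometric progression of scales.
    sigmas = [sigma0 * rho**i for i in range(n_scales + 1)]
    blurred = [gaussian_filter(img, s) for s in sigmas]
    # DOG(x, sigma_i) = [G(., sigma_i) - G(., rho * sigma_i)] * I.
    dog = np.stack([blurred[i] - blurred[i + 1] for i in range(n_scales)])
    # Keep points where |DOG| is a local maximum over space AND scale,
    # and exceeds a small contrast threshold.
    mag = np.abs(dog)
    is_max = (mag == maximum_filter(mag, size=3)) & (mag > thresh)
    scale_idx, ys, xs = np.nonzero(is_max)
    return xs, ys, np.asarray(sigmas)[scale_idx]
```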

Harris Corner Points

For distinctiveness it is useful to ensure that the neighbourhood of each feature point has sufficiently rich image texture.

Harris Corner Points: Compute the 2x2 orientation tensor

    T(\vec{x}, \sigma) \equiv G(\vec{x}, 2\sigma) * \left[ \begin{pmatrix} I_x(\vec{x}, \sigma) \\ I_y(\vec{x}, \sigma) \end{pmatrix} \begin{pmatrix} I_x(\vec{x}, \sigma) & I_y(\vec{x}, \sigma) \end{pmatrix} \right],

where I_x(\vec{x}, \sigma) = G_x(\vec{x}, \sigma) * I(\vec{x}), and similarly for I_y.

Locations where both eigenvalues of T, λ₁ and λ₂, are large (w.r.t. the width of the Gaussian support) will have high contrast and a wide range of orientations. Therefore, either threshold or find local maxima in

    R \equiv \lambda_1 \lambda_2 - k (\lambda_1 + \lambda_2)^2,

where k is an empirical constant (typically between 0.04 and 0.06).

Extremal points of the Laplacian pyramid at x and σ that do not have sufficiently large R are therefore culled.
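Under the same assumptions as the sketch above (grayscale float image, SciPy), a hedged sketch of the Harris response; the identity R = det(T) − k·trace(T)² avoids computing the eigenvalues explicitly.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(img, sigma=1.0, k=0.05):
    # Gaussian-derivative estimates of I_x and I_y at scale sigma
    # (axis 0 is y, axis 1 is x).
    Ix = gaussian_filter(img, sigma, order=(0, 1))
    Iy = gaussian_filter(img, sigma, order=(1, 0))
    # Entries of the orientation tensor T, smoothed at scale 2*sigma.
    Sxx = gaussian_filter(Ix * Ix, 2 * sigma)
    Sxy = gaussian_filter(Ix * Iy, 2 * sigma)
    Syy = gaussian_filter(Iy * Iy, 2 * sigma)
    # R = lam1*lam2 - k*(lam1 + lam2)^2 = det(T) - k*trace(T)^2.
    return (Sxx * Syy - Sxy * Sxy) - k * (Sxx + Syy) ** 2
```

Extrema of the DOG pyramid would then be culled wherever this response falls below a chosen threshold.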

Empirical Performance

Natural images are warped computationally, using parametric deformations, so ground-truth feature correspondence is known.

Repeatability Rate: The number of matching pairs of points (to within a specified tolerance), divided by the average number of detected points in the two views.

- SIFT: Extrema with respect to spatial position and scale in DOG(x, σ) (Lowe, 2004).
- Harris-Laplacian: Extrema with respect to position in R(x, σ), with R(x, σ) sufficiently large, and with respect to scale in DOG(x, σ) (Mikolajczyk & Schmid, 2001).

In practice, it is useful to consider extrema of DOG(x, σ) with respect to x and σ for which R(x, σ) is sufficiently large.
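Given the ground-truth warp, the repeatability rate reduces to a few lines. A sketch, assuming keypoints as N×2 arrays of (x, y) coordinates and a 3×3 homography H mapping image 1 into image 2, and ignoring the bookkeeping needed to restrict both point sets to the common field of view:

```python
import numpy as np

def repeatability(pts1, pts2, H, tol=1.5):
    # Warp the detections from image 1 into image 2 via the known H.
    p = np.column_stack([pts1, np.ones(len(pts1))]) @ H.T
    warped = p[:, :2] / p[:, 2:3]
    # A point "repeats" if some detection in image 2 lies within tol pixels.
    d = np.linalg.norm(warped[:, None, :] - pts2[None, :, :], axis=2)
    n_matched = int(np.sum(d.min(axis=1) <= tol))
    # Divide by the average number of detections in the two views.
    return n_matched / (0.5 * (len(pts1) + len(pts2)))
```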

Canonical Local Orientation

We also want the feature descriptor to be defined with respect to a local canonical orientation. In this way we can build a descriptor that is (approximately) invariant to scale, position, and orientation.

Two ways to define local orientation:
- Use the leading eigenvector of T(x, σ), if λ₁/λ₂ is sufficiently large.
- Find the highest peak in the orientation histogram of local gradient magnitudes.

[Figure: arrows show the direction and scale of detected SIFT features.]

The critical property of a feature point detector is that it identifies image positions and scales (x, σ) of the same points on an object, despite significant changes in the imaging geometry, lighting, and noise.
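A sketch of the second option, assuming a square patch has already been cut out around the keypoint at the detected scale; Lowe's implementation additionally smooths the histogram, interpolates the peak position, and creates extra keypoints for secondary peaks within 80% of the highest one, all omitted here.

```python
import numpy as np

def dominant_orientation(patch, n_bins=36):
    # Gradient magnitude and angle over the patch (axis 0 is y).
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)                  # in (-pi, pi]
    # Magnitude-weighted histogram of gradient orientations.
    hist, edges = np.histogram(ang, bins=n_bins,
                               range=(-np.pi, np.pi), weights=mag)
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])  # bin centre, in radians
```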

Feature Descriptors

Given a feature point at location x, scale σ, and orientation θ, we describe the image structure in a neighbourhood of x, aligned with θ, and proportional in size to σ. To facilitate matching, the descriptor should be distinctive and insensitive to local image deformations.

SIFT: The scale-invariant feature transform of a neighbourhood is a 128-dimensional vector of histograms of image gradients. The region, at the appropriate scale and orientation, is divided into a 4x4 square grid, each cell of which yields a histogram with 8 orientation bins.

Remarks:
- Spatial histograms give some insensitivity to deformation.
- Other descriptors can be formed, e.g., from higher-order Gaussian derivative filters, steerable filters, or the phase of bandpass filters.
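A simplified sketch of the 4x4x8 layout, assuming a 16x16 patch already resampled at the detected scale and rotated to the canonical orientation; full SIFT adds Gaussian weighting of the magnitudes, trilinear interpolation between bins, and clamping of large components, all omitted here.

```python
import numpy as np

def sift_like_descriptor(patch16):
    # Gradients over a 16x16 canonical patch.
    gy, gx = np.gradient(patch16.astype(float))
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    desc = []
    for i in range(4):
        for j in range(4):
            # One 4x4 cell -> one 8-bin orientation histogram.
            cell = (slice(4 * i, 4 * i + 4), slice(4 * j, 4 * j + 4))
            hist, _ = np.histogram(ang[cell], bins=8,
                                   range=(-np.pi, np.pi), weights=mag[cell])
            desc.append(hist)
    v = np.concatenate(desc)                # 4 * 4 * 8 = 128 dimensions
    return v / (np.linalg.norm(v) + 1e-12)  # unit length for matching
```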

Viewpoint Insensitivity

A set of local image features is extracted from a model image. Each feature encodes a record of its:
- Position: the pixel location x;
- Scale: the particular value of σ;
- Orientation: the dominant orientation in the local image neighbourhood;
- Local Image Structure in Canonical Coordinates: encoded in terms of gradient histograms (e.g., SIFT), and/or other properties.

The local image structure is encoded relative to the position, scale and orientation determined by the feature point detector.

Given a test image, local features can be extracted in the same manner. The features from the test image can be compared directly to the features obtained from the model image, despite changes in position, scale and orientation.

Partial invariance to viewing geometry is a consequence of encoding the local image structure relative to the position, scale and orientation at the feature point. We rely on the feature point detector to recover these quantities consistently in both the model and test images.

Feature Point Representation and Indexing

The similarity between two SIFT feature vectors is given by the Euclidean distance between them (if the vectors are normalized to unit length, the angle between the vectors can also be used). Matching between two images involves computing the distance between all possible pairs of detected features, and selecting as matching pairs those features whose nearest neighbour is closer than some threshold.

However, SIFT is typically used for matching new images against a large database of known objects, or for aligning large collections of images. In either case, the number of features to be matched is potentially very large, and computing the distance between every possible pair of features quickly becomes impractical, if not impossible.

Therefore, SIFT matching under realistic conditions relies on special data structures or approximate nearest-neighbour algorithms. Typical data structures include k-d trees, a variant of binary trees that recursively divides the data space into smaller hyper-boxes to speed up search, as in the sketch below.
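A sketch using SciPy's k-d tree, assuming descriptors are stored as rows of unit-normalized arrays d1 and d2; the function name and the distance threshold are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def match_features(d1, d2, max_dist=0.6):
    # Index the second image's descriptors once, then query each
    # descriptor from the first image for its nearest neighbour.
    tree = cKDTree(d2)
    dist, idx = tree.query(d1, k=1)
    # Keep only pairs whose nearest-neighbour distance is below threshold.
    keep = dist < max_dist
    return np.column_stack([np.nonzero(keep)[0], idx[keep]])
```

For descriptors as high-dimensional as SIFT's 128 components, exact k-d tree search degrades, which is why approximate variants (e.g., Lowe's best-bin-first search) are used in practice.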

Panoramic Image Stitching

Stitching together multiple images of a scene into a single photograph:
1. Detect SIFT features in all input images.
2. Match SIFT features between all pairs of images.
3. Estimate homographies for image pairs with matching points.
4. Bundle adjustment to refine the global alignment.
5. Warp and pyramid-blend to create the panorama.
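A condensed sketch of steps 1-3 for a single image pair using OpenCV (cv2.SIFT_create is available in opencv-python since version 4.4); the 0.75 ratio test is Lowe's heuristic for pruning ambiguous matches rather than the plain distance threshold described earlier, and bundle adjustment and blending are beyond this sketch.

```python
import cv2
import numpy as np

def pairwise_homography(img1, img2):
    # Step 1: detect SIFT features (inputs are 8-bit grayscale images).
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    # Step 2: match descriptors, keeping unambiguous nearest neighbours.
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    src = np.float32([k1[m.queryIdx].pt for m in good])
    dst = np.float32([k2[m.trainIdx].pt for m in good])
    # Step 3: robustly estimate the homography with RANSAC.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H
```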

Panoramic Image Stitching

[Figures: half of the input images aligned; all input images aligned; final result after pyramid blending.]

Alignment Errors and Blending

The alignment process assumes that the camera rotates around its optical centre while the input images are taken. This is generally not true unless a special tripod head is used. In practice we can cope with small translations and off-centre rotations with the help of pyramid blending (see the Pyramid Notes).

[Figures: linear blending vs. pyramid blending.]

This also makes the process robust to errors in the estimation of the alignment parameters for the set of input images, as the sketch below illustrates.
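A toy Laplacian-pyramid blend of two pre-aligned, equal-sized images with a vertical seam down the middle; this loosely follows the classic Burt-Adelson construction and is only meant to illustrate why pyramid blending hides small misalignments better than linear blending. The function name and level count are illustrative.

```python
import cv2
import numpy as np

def pyramid_blend(a, b, levels=5):
    # Gaussian pyramids of both (pre-aligned) images.
    ga, gb = [a.astype(np.float32)], [b.astype(np.float32)]
    for _ in range(levels):
        ga.append(cv2.pyrDown(ga[-1]))
        gb.append(cv2.pyrDown(gb[-1]))
    out = None
    for i in range(levels, -1, -1):        # blend coarse to fine
        if i == levels:
            la, lb = ga[i], gb[i]          # coarsest residual
        else:
            # Laplacian level = Gaussian level - upsampled coarser level.
            size = ga[i].shape[1::-1]
            la = ga[i] - cv2.pyrUp(ga[i + 1], dstsize=size)
            lb = gb[i] - cv2.pyrUp(gb[i + 1], dstsize=size)
        # Splice the two halves together at every pyramid level.
        half = la.shape[1] // 2
        level = np.concatenate([la[:, :half], lb[:, half:]], axis=1)
        if out is None:
            out = level
        else:
            out = level + cv2.pyrUp(out, dstsize=level.shape[1::-1])
    return np.clip(out, 0, 255).astype(np.uint8)
```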

SIFT Invariance and Panoramas

Since SIFT features are (approximately) invariant to rotation, translation, and scaling, we can stitch images that were taken with different camera orientations, and even with varying focal lengths.

[Figures: input images and the final panorama.]

Matching the Valbonne Church

[Figures: matches with changes in scale and position; and matches with changes in scale, position, 3D viewpoint, and brightness.]