Local Features Tutorial: Nov. 8, 04

Local Features Tutorial

References:
- Matlab SIFT tutorial (from course webpage)
- Lowe, David G., "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, Vol. 60, No. 2, 2004, pp. 91-110

Local Features Tutorial

Previous week: view-based models for object recognition
- The problem: build a model that captures general properties of eye appearance that we can use to identify eyes (the approach is general and does not depend on the particular object class).
- Generalized model of eye appearance based on PCA. Images taken from the same pose and normalized for contrast.
- Demonstrated to be useful for classification. Key property: the model can find instances of eyes it has never seen before.

Local Features Tutorial

Today: local features for object recognition
- The problem: obtain a representation that allows us to find a particular object we've encountered before (i.e. "find Paco's mug" as opposed to "find a mug").
- Local features are based on the appearance of the object at particular interest points.
- Features should be reasonably invariant to illumination changes, and ideally also to scaling, rotation, and minor changes in viewing direction.
- In addition, we can use local features for matching; this is useful for tracking and 3D scene reconstruction.

Local Features Tutorial

Key properties of a good local feature:
- Must be highly distinctive; a good feature should allow for correct object identification with a low probability of mismatch. Question: how do we identify image locations that are distinctive enough?
- Should be easy to extract.
- Invariance: a good local feature should be tolerant to
  - image noise
  - changes in illumination
  - uniform scaling
  - rotation
  - minor changes in viewing direction
  Question: how do we construct the local feature to achieve invariance to the above?
- Should be easy to match against a (large) database of local features.

SIFT features

Scale Invariant Feature Transform (SIFT) is an approach for detecting and extracting local feature descriptors that are reasonably invariant to changes in illumination, image noise, rotation, scaling, and small changes in viewpoint.

Detection stages for SIFT features:
- Scale-space extrema detection
- Keypoint localization
- Orientation assignment
- Generation of keypoint descriptors

In the following pages we'll examine these stages in detail.
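Before walking through the stages, here is what the end-to-end pipeline looks like in practice. This is a minimal sketch using VLFeat's MATLAB toolbox rather than the course's own tutorial code; it assumes VLFeat is installed and on the path, and the image file is just an example:

    % SIFT end-to-end with VLFeat (assumption: VLFeat is on the path);
    % vl_sift expects a single-precision grayscale image.
    I = single(rgb2gray(imread('peppers.png')));
    [f, d] = vl_sift(I);
    % f is 4 x K: each column holds [x; y; scale; orientation]
    % d is 128 x K: the corresponding 128-element keypoint descriptors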

Scale-space extrema detection

Interest points for SIFT features correspond to local extrema of difference-of-Gaussian filters at different scales.

Given a Gaussian-blurred image

    L(x, y, σ) = G(x, y, σ) ∗ I(x, y),

where

    G(x, y, σ) = (1 / (2πσ²)) exp(−(x² + y²) / (2σ²))

is a variable-scale Gaussian, the result of convolving an image with a difference-of-Gaussian filter is given by

    D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) ∗ I(x, y) = L(x, y, kσ) − L(x, y, σ).   (1)

This is just the difference of the Gaussian-blurred images at scales kσ and σ.

Figure 1: Diagram showing the blurred images at different scales, and the computation of the difference-of-Gaussian images (from Lowe, 2004; see the reference at the beginning of the tutorial).

The first step toward the detection of interest points is the convolution of the image with Gaussian filters at different scales, and the generation of difference-of-Gaussian images from the differences of adjacent blurred images.
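As a concrete illustration, the following minimal sketch builds the blurred images and DoGs for a single octave (sigma0 = 1.6 and s = 3 are typical but arbitrary choices here; imgaussfilt requires the Image Processing Toolbox):

    % Build one octave of Gaussian-blurred images and their DoGs.
    I = im2double(rgb2gray(imread('peppers.png')));
    s = 3;                        % scales per octave
    k = 2^(1/s);                  % ratio between adjacent scales
    sigma0 = 1.6;                 % base scale of the octave
    L = cell(1, s + 3);
    for i = 1:s + 3
        L{i} = imgaussfilt(I, sigma0 * k^(i - 1));  % blurred images
    end
    D = cell(1, s + 2);
    for i = 1:s + 2
        D{i} = L{i + 1} - L{i};   % difference-of-Gaussian images
    end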

Scale-space extrema detection

The convolved images are grouped by octave (an octave corresponds to doubling the value of σ), and the value of k is selected so that we obtain a fixed number of blurred images per octave. This also ensures that we obtain the same number of difference-of-Gaussian images per octave.

Note: the difference-of-Gaussian filter provides an approximation to the scale-normalized Laplacian of Gaussian, σ²∇²G. The difference-of-Gaussian filter is in effect a tunable bandpass filter.
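The justification for this approximation, following Lowe (2004), comes from the diffusion equation satisfied by the Gaussian:

    ∂G/∂σ = σ∇²G,

so a finite-difference approximation of the σ-derivative gives

    G(x, y, kσ) − G(x, y, σ) ≈ (kσ − σ) ∂G/∂σ = (k − 1) σ²∇²G.

The DoG therefore matches the scale-normalized Laplacian up to the constant factor (k − 1), which is the same at all scales and does not affect the locations of extrema.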

Scale-space extrema detection

Figure 2: Local extrema detection; the marked pixel is compared against its 26 neighbors in a 3×3×3 neighborhood that spans adjacent DoG images (from Lowe, 2004).

Interest points (called keypoints in the SIFT framework) are identified as local maxima or minima of the DoG images across scales. Each pixel in the DoG images is compared to its 8 neighbors at the same scale, plus the 9 corresponding neighbors at each of the two neighboring scales. If the pixel is a local maximum or minimum, it is selected as a candidate keypoint.
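In code, the 26-neighbor test reduces to comparing a pixel against the surrounding 3×3×3 cube. A minimal sketch, assuming D is the cell array of DoG images from the earlier octave sketch, i is an interior scale index, and (r, c) an interior pixel:

    % True if D{i}(r, c) is strictly greater or strictly smaller than
    % all 26 neighbors in the 3x3x3 cube spanning adjacent DoG images.
    function tf = isExtremum(D, i, r, c)
    cube = cat(3, D{i-1}(r-1:r+1, c-1:c+1), ...
                  D{i  }(r-1:r+1, c-1:c+1), ...
                  D{i+1}(r-1:r+1, c-1:c+1));
    nbrs = cube(:);
    nbrs(14) = [];                          % drop the center pixel itself
    v = D{i}(r, c);
    tf = all(v > nbrs) || all(v < nbrs);    % strict maximum or minimum
    end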

Scale-space extrema detection

For each candidate keypoint:
- Interpolation of nearby data is used to accurately determine its position.
- Keypoints with low contrast are removed.
- Responses along edges are eliminated.
- The keypoint is assigned an orientation.

To determine the keypoint orientation, a gradient-orientation histogram is computed in the neighborhood of the keypoint (using the Gaussian image at the scale closest to the keypoint's scale). The contribution of each neighboring pixel is weighted by its gradient magnitude and by a Gaussian window with a σ that is 1.5 times the scale of the keypoint. Peaks in the histogram correspond to dominant orientations. A separate keypoint is created for the direction corresponding to the histogram maximum, and for any other direction within 80% of the maximum value. All the properties of the keypoint are measured relative to the keypoint orientation; this provides invariance to rotation.
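A minimal sketch of the orientation-histogram computation, assuming L is the Gaussian image closest to the keypoint's scale sigma, (r, c) its position, and rad the neighborhood radius (imgradientxy and fspecial need the Image Processing Toolbox):

    % Gradient-orientation histogram for keypoint orientation assignment.
    [Gx, Gy] = imgradientxy(L, 'central');
    mag = hypot(Gx, Gy);                      % gradient magnitude
    ang = atan2(Gy, Gx);                      % gradient orientation, (-pi, pi]
    w = fspecial('gaussian', 2*rad + 1, 1.5 * sigma);  % Gaussian window
    h = zeros(1, 36);                         % 36 bins of 10 degrees
    for dr = -rad:rad
        for dc = -rad:rad
            b = floor((ang(r + dr, c + dc) + pi) / (2*pi) * 36) + 1;
            b = min(b, 36);                   % guard against ang == pi
            h(b) = h(b) + mag(r + dr, c + dc) * w(dr + rad + 1, dc + rad + 1);
        end
    end
    peaks = find(h >= 0.8 * max(h));          % dominant orientations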

SIFT feature representation

Once a keypoint orientation has been selected, the feature descriptor is computed as a set of orientation histograms on 4×4 pixel neighborhoods. The orientation histograms are relative to the keypoint orientation, and the orientation data come from the Gaussian image closest in scale to the keypoint's scale. Just like before, the contribution of each pixel is weighted by the gradient magnitude and by a Gaussian with σ 1.5 times the scale of the keypoint.

Figure 3: SIFT feature descriptor (from Lowe, 2004).

Histograms contain 8 bins each, and each descriptor contains a 4×4 array of histograms around the keypoint. This leads to a SIFT feature vector with 4 × 4 × 8 = 128 elements. This vector is normalized to enhance invariance to changes in illumination.
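The normalization can be made concrete. Lowe (2004) normalizes to unit length, clamps large components, and renormalizes; a minimal sketch, assuming d holds the 128-element descriptor as a double vector:

    % Descriptor normalization for illumination invariance (Lowe, 2004).
    d = d / norm(d);     % unit length cancels affine changes in contrast
    d = min(d, 0.2);     % limit the influence of large gradient magnitudes
    d = d / norm(d);     % renormalize after clamping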

SIFT feature matching
- Find the nearest neighbor in a database of SIFT features from training images.
- For robustness, use the ratio of the distance to the nearest neighbor to the distance to the second-nearest neighbor.
- Finding the neighbor with minimum Euclidean distance is an expensive search.
- Use an approximate, fast method to find the nearest neighbor with high probability.
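A minimal sketch of the exhaustive version of this matching (the 0.8 ratio threshold follows Lowe, 2004; D1 and D2 are assumed to be n1-by-128 and n2-by-128 descriptor matrices, and mink requires MATLAB R2017b or later):

    % Nearest-neighbor matching with the ratio test.
    matches = zeros(0, 2);
    for i = 1:size(D1, 1)
        d2 = sum((D2 - D1(i, :)).^2, 2);      % squared distances to all of D2
        [best, j] = mink(d2, 2);              % two closest neighbors
        if best(1) < (0.8^2) * best(2)        % ratio test on squared distances
            matches(end + 1, :) = [i, j(1)];  %#ok<AGROW>
        end
    end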

Recognition using SIFT features
- Compute SIFT features on the input image.
- Match these features to the SIFT feature database.
- Each keypoint specifies 4 parameters: 2D location, scale, and orientation.
- To increase recognition robustness: use the Hough transform to identify clusters of matches that vote for the same object pose (see the sketch after this list).
- Each keypoint votes for the set of object poses that are consistent with the keypoint's location, scale, and orientation.
- Locations in the Hough accumulator that accumulate at least 3 votes are selected as candidate object/pose matches.
- A verification step matches the training image for the hypothesized object/pose to the image using a least-squares fit to the hypothesized location, scale, and orientation of the object.
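A minimal sketch of the voting step, assuming f1 and f2 are 4-by-K keypoint matrices ([x; y; scale; orientation], as returned by vl_sift) for the input image and the model, and matches is the n-by-2 index list from the ratio test. The orientation and scale bin widths (30 degrees, a factor of 2) follow Lowe (2004); the 64-pixel location bin is a simplification of his location binning:

    % Hough voting over broad pose bins.
    votes = containers.Map('KeyType', 'char', 'ValueType', 'double');
    for m = 1:size(matches, 1)
        kI = f1(:, matches(m, 1));            % image keypoint
        kM = f2(:, matches(m, 2));            % matched model keypoint
        dth = mod(kI(4) - kM(4), 2*pi);       % orientation difference
        sr  = kI(3) / kM(3);                  % scale ratio
        key = sprintf('%d_%d_%d_%d', round(dth / (pi/6)), round(log2(sr)), ...
                      round(kI(1) / 64), round(kI(2) / 64));
        if isKey(votes, key)
            votes(key) = votes(key) + 1;      % another vote for this pose bin
        else
            votes(key) = 1;
        end
    end
    % pose bins with at least 3 votes become candidate object/pose matches
    k3 = votes.keys;
    candidates = k3(cell2mat(votes.values) >= 3);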

SIFT matlab tutorial

Gaussian-blurred images and difference-of-Gaussian images.

Figure 4: Gaussian and DoG images grouped by octave.

SIFT matlab tutorial

Keypoint detection

Figure 5: a) Maxima of the DoG across scales. b) Remaining keypoints after removal of low-contrast points. c) Remaining keypoints after removal of edge responses.

SIFT matlab tutorial

Final keypoints with selected orientation and scale

Figure 6: Extracted keypoints; arrows indicate scale and orientation.

SIFT matlab tutorial

Warped image and extracted keypoints

Figure 7: Warped image and extracted keypoints.

The Hough transform of matched SIFT features yields the transformation that aligns the original and warped images.

Computed affine transformation from rotated image to original image:

    >> disp(aff);
        0.7060   -0.7052  128.4230
        0.7057    0.7100 -128.9491
             0         0    1.0000

Actual transformation from rotated image to original image:

    >> disp(a);
        0.7071   -0.7071  128.6934
        0.7071    0.7071 -128.6934
             0         0    1.0000

SIFT matlab tutorial

Matching and alignment of different views using local features.

Figure 8: Two views of Wadham College and the affine transformation for alignment (panels: Original View, Reference View, Aligned View, Reference minus Aligned View).

SIFT matlab tutorial

Object recognition with SIFT

Figure 9: Cellphone examples with different poses and occlusion (each example shows Image, Model, and Location panels).

SIFT matlab tutorial

Object recognition with SIFT

Figure 10: Book example; what happens when we match similar features outside the object?

Closing Comments
- SIFT features are reasonably invariant to rotation, scaling, and illumination changes.
- We can use them for matching and object recognition, among other things.
- They are robust to occlusion: as long as we can see at least 3 features from the object, we can compute its location and pose.
- Matching is efficient on-line; recognition can be performed in close to real time (at least for small object databases).

Closing Comments

Questions:
- Do local features solve the object recognition problem?
- How about distinctiveness? How do we deal with false positives outside the object of interest? (See Figure 10.)
- Can we learn new object models without photographing them under special conditions?
- How does this approach compare to the object recognition method proposed by Murase and Nayar? Recall that their model consists of a PCA basis for each object, generated from images taken under diverse illumination and viewing directions, and a representation of the manifold described by the training images in this eigenspace (see the tutorial on Eigen Eyes).